Mamba
Mamba is a new state space model architecture showing promising performance on information-dense data such as language modeling, where previous subquadratic models fall short of Transformers.
This backend:
- provides support for Mamba models
- requires the CUDA runtime
Note: This is an experimental backend and may change in the future.
Example
Warning: Make sure to change the syntax line to #syntax=ghcr.io/sozercan/aikit:latest in the examples below.
https://github.com/sozercan/aikit/blob/main/test/aikitfile-mamba.yaml
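For orientation, here is a minimal sketch of what an aikitfile for this backend might look like. It only illustrates the points stated above (the syntax line, the CUDA runtime, and the mamba backend). The model name and the contents of the config block are hypothetical placeholders, not values copied from the linked file, so treat the linked file as the tested reference.

```yaml
#syntax=ghcr.io/sozercan/aikit:latest
apiVersion: v1alpha1
# The Mamba backend requires the CUDA runtime.
runtime: cuda
backends:
  - mamba
# Assumption: the model configuration below is illustrative only.
# Replace the placeholders with the values from the linked example file.
config: |
  - name: my-mamba-model            # hypothetical model name
    backend: mamba
    parameters:
      model: "<model-id-or-path>"   # placeholder, not a real model reference
```

Keeping the runtime and backend declarations at the top level makes the CUDA requirement explicit in the file itself, rather than leaving it implicit in the model configuration.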