# Pre-made Models
AIKit comes with pre-made models that you can use out of the box!
If a specific model isn't included, you can always create your own images and host them in a container registry of your choice!
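As a rough sketch of that workflow (the image name, registry, and file name below are illustrative; see the AIKit documentation for the aikitfile format), a custom model image can be built with Docker BuildKit and pushed to a registry:

```bash
# Build a model image from an aikitfile in the current directory
# (tag and registry are placeholders; substitute your own)
docker buildx build . -t ghcr.io/your-org/your-model:latest -f aikitfile.yaml --load

# Push the image to the container registry of your choice
docker push ghcr.io/your-org/your-model:latest
```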
## CPU
| Model | Optimization | Parameters | Command | Model Name | License |
|---|---|---|---|---|---|
| 🦙 Llama 3.2 | Instruct | 1B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.2:1b` | `llama-3.2-1b-instruct` | Llama |
| 🦙 Llama 3.2 | Instruct | 3B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.2:3b` | `llama-3.2-3b-instruct` | Llama |
| 🦙 Llama 3.1 | Instruct | 8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:8b` | `llama-3.1-8b-instruct` | Llama |
| 🦙 Llama 3.1 | Instruct | 70B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:70b` | `llama-3.1-70b-instruct` | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b` | `mixtral-8x7b-instruct` | Apache |
| 🅿️ Phi 3.5 | Instruct | 3.8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/phi3.5:3.8b` | `phi-3.5-3.8b-instruct` | MIT |
| 🔡 Gemma 2 | Instruct | 2B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/gemma2:2b` | `gemma-2-2b-instruct` | Gemma |
| ⌨️ Codestral 0.1 | Code | 22B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b` | MNPL |
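These images serve an OpenAI-compatible API on port 8080, and the Model Name column is the value to pass in the `model` field of requests. As a minimal sketch, assuming the Llama 3.2 1B container from the table above is running locally:

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.2-1b-instruct",
    "messages": [{"role": "user", "content": "Explain what AIKit does in one sentence."}]
  }'
```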
## NVIDIA CUDA
| Model | Optimization | Parameters | Command | Model Name | License |
|---|---|---|---|---|---|
| 🦙 Llama 3.2 | Instruct | 1B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.2:1b` | `llama-3.2-1b-instruct` | Llama |
| 🦙 Llama 3.2 | Instruct | 3B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.2:3b` | `llama-3.2-3b-instruct` | Llama |
| 🦙 Llama 3.1 | Instruct | 8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.1:8b` | `llama-3.1-8b-instruct` | Llama |
| 🦙 Llama 3.1 | Instruct | 70B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.1:70b` | `llama-3.1-70b-instruct` | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b` | `mixtral-8x7b-instruct` | Apache |
| 🅿️ Phi 3.5 | Instruct | 3.8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/phi3.5:3.8b` | `phi-3.5-3.8b-instruct` | MIT |
| 🔡 Gemma 2 | Instruct | 2B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/gemma2:2b` | `gemma-2-2b-instruct` | Gemma |
| ⌨️ Codestral 0.1 | Code | 22B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b` | MNPL |
| 📸 Flux 1 Dev | Text to image | 12B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/flux1:dev` | `flux-1-dev` | FLUX.1 [dev] Non-Commercial License |
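The Flux 1 Dev image generates images rather than text. A minimal sketch, assuming it is exposed through the OpenAI-compatible image generation endpoint (the prompt and size below are illustrative):

```bash
curl http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flux-1-dev",
    "prompt": "A watercolor painting of a lighthouse at dawn",
    "size": "1024x1024"
  }'
```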
Please see the models folder for pre-made model definitions.
If models are not offloaded to GPU VRAM, a minimum of 8GB of RAM is required for 7B models, 16GB of RAM for 13B models, and 32GB of RAM for 8x7B models.
All pre-made models include CUDA v12 libraries and are used with NVIDIA GPU acceleration. If a supported NVIDIA GPU is not found in your system, AIKit will automatically fall back to CPU with the most optimized runtime available (`avx2`, `avx`, or `fallback`).
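To check whether GPU offload is actually in use, one rough approach (the container name below is illustrative) is to name the container, inspect its startup logs for the backend that was loaded, and watch `nvidia-smi` on the host while a request is being served:

```bash
# Run a model with a known container name ("aikit-llama" is a placeholder)
docker run -d --rm --name aikit-llama --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.1:8b

# Inspect the startup logs for the backend/runtime that was loaded
docker logs aikit-llama

# While a request is in flight, nvidia-smi should show a process using GPU
# memory if offloading is active; otherwise inference is running on the CPU
nvidia-smi
```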
## Deprecated Models
The following pre-made models are deprecated and are no longer updated. Their images will remain pullable if needed.
If you need one of these specific models, you can always create your own images and host them in a container registry of your choice!
### CPU
| Model | Optimization | Parameters | Command | Model Name | License |
|---|---|---|---|---|---|
| 🐬 Orca 2 | | 13B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/orca2:13b` | | Microsoft Research |
| 🅿️ Phi 2 | Instruct | 2.7B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/phi2:2.7b` | | MIT |
| 🅿️ Phi 3 | Instruct | 3.8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/phi3:3.8b` | `phi-3-3.8b` | MIT |
| 🦙 Llama 3 | Instruct | 8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3:8b` | `llama-3-8b-instruct` | Llama |
| 🦙 Llama 3 | Instruct | 70B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3:70b` | `llama-3-70b-instruct` | Llama |
| 🦙 Llama 2 | Chat | 7B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama2:7b` | `llama-2-7b-chat` | Llama |
| 🦙 Llama 2 | Chat | 13B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama2:13b` | `llama-2-13b-chat` | Llama |
| 🔡 Gemma 1.1 | Instruct | 2B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/gemma:2b` | `gemma-2b-instruct` | Gemma |
### NVIDIA CUDA
| Model | Optimization | Parameters | Command | Model Name | License |
|---|---|---|---|---|---|
| 🐬 Orca 2 | | 13B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/orca2:13b-cuda` | | Microsoft Research |
| 🅿️ Phi 2 | Instruct | 2.7B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/phi2:2.7b-cuda` | | MIT |
| 🅿️ Phi 3 | Instruct | 3.8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/phi3:3.8b` | `phi-3-3.8b` | MIT |
| 🦙 Llama 3 | Instruct | 8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3:8b` | `llama-3-8b-instruct` | Llama |
| 🦙 Llama 3 | Instruct | 70B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3:70b` | `llama-3-70b-instruct` | Llama |
| 🦙 Llama 2 | Chat | 7B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama2:7b` | `llama-2-7b-chat` | Llama |
| 🦙 Llama 2 | Chat | 13B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama2:13b` | `llama-2-13b-chat` | Llama |
| 🔡 Gemma 1.1 | Instruct | 2B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/gemma:2b` | `gemma-2b-instruct` | Gemma |