
Pre-made Models

AIKit comes with pre-made models that you can use out of the box!

If a model you need isn't included, you can always create your own images and host them in a container registry of your choice!

CPU

| Model | Optimization | Parameters | Command | Model Name | License |
|---|---|---|---|---|---|
| 🦙 Llama 3.2 | Instruct | 1B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.2:1b` | `llama-3.2-1b-instruct` | Llama |
| 🦙 Llama 3.2 | Instruct | 3B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.2:3b` | `llama-3.2-3b-instruct` | Llama |
| 🦙 Llama 3.1 | Instruct | 8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:8b` | `llama-3.1-8b-instruct` | Llama |
| 🦙 Llama 3.1 | Instruct | 70B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.1:70b` | `llama-3.1-70b-instruct` | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b` | `mixtral-8x7b-instruct` | Apache |
| 🅿️ Phi 3.5 | Instruct | 3.8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/phi3.5:3.8b` | `phi-3.5-3.8b-instruct` | MIT |
| 🔡 Gemma 2 | Instruct | 2B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/gemma2:2b` | `gemma-2-2b-instruct` | Gemma |
| ⌨️ Codestral 0.1 | Code | 22B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b` | MNPL |
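Once a container from the table above is running, it serves an OpenAI-compatible API on the mapped port. As a quick sanity check (assuming the `llama3.2:1b` image and the default `8080` mapping from the table), you can send a chat completion request; the `model` field matches the Model Name column:

```shell
# Start the smallest CPU model from the table above
docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3.2:1b

# Query the OpenAI-compatible chat completions endpoint
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.2-1b-instruct",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'
```

The response follows the standard OpenAI chat completions schema, so existing OpenAI client libraries can be pointed at `http://localhost:8080/v1` without code changes.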

NVIDIA CUDA

| Model | Optimization | Parameters | Command | Model Name | License |
|---|---|---|---|---|---|
| 🦙 Llama 3.2 | Instruct | 1B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.2:1b` | `llama-3.2-1b-instruct` | Llama |
| 🦙 Llama 3.2 | Instruct | 3B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.2:3b` | `llama-3.2-3b-instruct` | Llama |
| 🦙 Llama 3.1 | Instruct | 8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.1:8b` | `llama-3.1-8b-instruct` | Llama |
| 🦙 Llama 3.1 | Instruct | 70B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3.1:70b` | `llama-3.1-70b-instruct` | Llama |
| Ⓜ️ Mixtral | Instruct | 8x7B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/mixtral:8x7b` | `mixtral-8x7b-instruct` | Apache |
| 🅿️ Phi 3.5 | Instruct | 3.8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/phi3.5:3.8b` | `phi-3.5-3.8b-instruct` | MIT |
| 🔡 Gemma 2 | Instruct | 2B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/gemma2:2b` | `gemma-2-2b-instruct` | Gemma |
| ⌨️ Codestral 0.1 | Code | 22B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/codestral:22b` | `codestral-22b` | MNPL |
| 📸 Flux 1 Dev | Text to image | 12B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/flux1:dev` | `flux-1-dev` | FLUX.1 [dev] Non-Commercial License |
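The text-to-image model is queried through the images endpoint rather than chat completions. A minimal sketch, assuming the `flux1:dev` container above is running with the default port mapping and the standard OpenAI-compatible image generation route:

```shell
# Request an image from the Flux 1 Dev model (GPU required)
curl http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flux-1-dev",
    "prompt": "A cat sitting on a windowsill at sunset"
  }'
```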
note

Please see the models folder for pre-made model definitions.

If the model is not offloaded to GPU VRAM, a minimum of 8GB of RAM is required to run 7B models, 16GB to run 13B models, and 32GB to run 8x7B models.

All pre-made models include CUDA v12 libraries for NVIDIA GPU acceleration. If a supported NVIDIA GPU is not found in your system, AIKit automatically falls back to the CPU with the most optimized runtime available (avx2, avx, or fallback).

Deprecated Models

The following pre-made models are deprecated and no longer updated. Their images will remain pullable if needed.

If you need these specific models, you can always create your own images and host them in a container registry of your choice!

CPU

| Model | Optimization | Parameters | Command | Model Name | License |
|---|---|---|---|---|---|
| 🐬 Orca 2 | | 13B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/orca2:13b` | | Microsoft Research |
| 🅿️ Phi 2 | Instruct | 2.7B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/phi2:2.7b` | | MIT |
| 🅿️ Phi 3 | Instruct | 3.8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/phi3:3.8b` | `phi-3-3.8b` | |
| 🦙 Llama 3 | Instruct | 8B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3:8b` | `llama-3-8b-instruct` | |
| 🦙 Llama 3 | Instruct | 70B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama3:70b` | `llama-3-70b-instruct` | |
| 🦙 Llama 2 | Chat | 7B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama2:7b` | `llama-2-7b-chat` | |
| 🦙 Llama 2 | Chat | 13B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/llama2:13b` | `llama-2-13b-chat` | |
| 🔡 Gemma 1.1 | Instruct | 2B | `docker run -d --rm -p 8080:8080 ghcr.io/sozercan/gemma:2b` | `gemma-2b-instruct` | |

NVIDIA CUDA

| Model | Optimization | Parameters | Command | Model Name | License |
|---|---|---|---|---|---|
| 🐬 Orca 2 | | 13B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/orca2:13b-cuda` | | Microsoft Research |
| 🅿️ Phi 2 | Instruct | 2.7B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/phi2:2.7b-cuda` | | MIT |
| 🅿️ Phi 3 | Instruct | 3.8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/phi3:3.8b` | `phi-3-3.8b` | |
| 🦙 Llama 3 | Instruct | 8B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3:8b` | `llama-3-8b-instruct` | |
| 🦙 Llama 3 | Instruct | 70B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama3:70b` | `llama-3-70b-instruct` | |
| 🦙 Llama 2 | Chat | 7B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama2:7b` | `llama-2-7b-chat` | |
| 🦙 Llama 2 | Chat | 13B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/llama2:13b` | `llama-2-13b-chat` | |
| 🔡 Gemma 1.1 | Instruct | 2B | `docker run -d --rm --gpus all -p 8080:8080 ghcr.io/sozercan/gemma:2b` | `gemma-2b-instruct` | |