Llama-3.2-3B-Instruct (GGUF / Apple Silicon)
Processby mm-team · 5/18/2026
Llama 3.2 3B Instruct Q4_K_M via llama-server on Apple Silicon (M1/M2/M3). Metal GPU acceleration. OpenAI-compatible API on port 8080.
↓ 9 downloads ♥ 3 likes ⚡ 6 boots macos
config.json
{
"server_type": "process",
"hardware_type": "macos",
"script": "scripts/process/macos/setup.sh",
"env": {
"MODEL_PATH": "models/llama-3.2-3b-instruct-q4_k_m.gguf",
"HOST": "127.0.0.1",
"PORT": "8080",
"GPU_LAYERS": "99",
"CTX_SIZE": "4096"
},
"health_check": "http://127.0.0.1:8080/health",
"startup_timeout_secs": 45
}