Snap Library
Community-verified AI model configurations
Llama-3.2-3B-Instruct (GGUF / CUDA)
processLlama 3.2 3B Instruct Q4_K_M via llama-server. Runs on RTX 3060 or better (12 GB VRAM). OpenAI-compatible API on port 8080.
↓ 12 ♥ 4 ⚡ 8 cuda
Llama-3.2-3B-Instruct (GGUF / Apple Silicon)
processLlama 3.2 3B Instruct Q4_K_M via llama-server on Apple Silicon (M1/M2/M3). Metal GPU acceleration. OpenAI-compatible API on port 8080.
↓ 9 ♥ 3 ⚡ 6 macos
Qwen2.5-7B-Instruct (vLLM / Docker / CUDA)
dockerQwen2.5 7B Instruct served via vLLM in Docker. Requires 16 GB VRAM (RTX 3080 / 4070 or better). OpenAI-compatible API.
↓ 21 ♥ 7 ⚡ 15 cuda
SDXL Turbo txt2img (ComfyUI Workflow)
comfyui-workflowSDXL Turbo text-to-image workflow for ComfyUI. Fast single-step generation. Tested on RTX 3080 (10 GB VRAM). Generates 1024x1024 in ~1s.
↓ 35 ♥ 11 ⚡ 28 cuda
LFM-2B (GGUF / CUDA)
processLiquid Foundation Model 2B in GGUF format via llama-server. Ultra-fast on any CUDA GPU. Only ~2 GB VRAM needed. OpenAI-compatible API on port 8080.
↓ 18 ♥ 5 ⚡ 14 cuda