Local Models
Run AI completely offline on your machine. No network latency, full privacy, no API costs.
Ollama
Recommended. The easiest way to run local models. GRID auto-detects Ollama whenever the Ollama server is running.
1. Install from ollama.ai
2. Run `ollama serve` in a terminal
3. Pull a model: `ollama pull qwen2.5-coder`
4. GRID will auto-detect and list available models (you can confirm the server is reachable with the check below)
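If GRID doesn't list any models, a quick way to confirm Ollama is up is to query its local REST API directly. This is a minimal sketch assuming Ollama's default address of http://localhost:11434; it uses the `/api/tags` endpoint, which returns the models you have pulled.

```python
# Sketch: list locally pulled Ollama models via the /api/tags endpoint.
# Assumes Ollama's default address http://localhost:11434.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

def list_ollama_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Return the names of models available to the local Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        data = json.load(resp)
    return [model["name"] for model in data.get("models", [])]

if __name__ == "__main__":
    try:
        print("Available models:", list_ollama_models())
    except OSError:
        print("Ollama does not appear to be running. Start it with `ollama serve`.")
```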
Recommended Models
| Model | Size | Best For |
|---|---|---|
| qwen2.5-coder:7b | 4.7 GB | Fast coding, autocomplete |
| codellama:13b | 7.4 GB | Balanced quality/speed |
| deepseek-coder-v2 | 8.9 GB | Complex reasoning |
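After pulling one of these, you can smoke-test it with a single prompt before relying on it in GRID. The sketch below uses Ollama's `/api/generate` endpoint; the model name and prompt are examples only, so substitute whichever model you pulled.

```python
# Sketch: send one non-streamed prompt to a pulled model through Ollama's
# /api/generate endpoint. Model name and prompt are illustrative examples.
import json
import urllib.request

def generate(model: str, prompt: str, base_url: str = "http://localhost:11434") -> str:
    """Return a single completion from a local Ollama model."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

if __name__ == "__main__":
    print(generate("qwen2.5-coder:7b", "Write a Python one-liner that reverses a string."))
```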
Other Options
LM Studio
GUI-based model manager. Set the endpoint to localhost:1234 (LM Studio's default local server port).
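LM Studio's local server speaks the OpenAI API, so you can confirm it is reachable before pointing GRID at it by listing its models through the standard `/v1/models` route. This sketch assumes the default port 1234; adjust the URL if you changed it in LM Studio's server settings.

```python
# Sketch: verify LM Studio's local server by listing its models through
# the OpenAI-compatible /v1/models route (default port 1234).
import json
import urllib.request

LM_STUDIO_URL = "http://localhost:1234/v1"

with urllib.request.urlopen(f"{LM_STUDIO_URL}/models") as resp:
    models = json.load(resp)["data"]

for model in models:
    print(model["id"])
```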
vLLM
Production-grade, high-throughput serving. Use its OpenAI-compatible endpoint, which defaults to http://localhost:8000/v1.
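Because both LM Studio and vLLM expose the OpenAI chat completions protocol, any client (including GRID) talks to them the same way. The sketch below targets vLLM's default address; the model name is a placeholder and must match whatever model your vLLM server was launched with.

```python
# Sketch: send a chat completion to a vLLM server through its
# OpenAI-compatible API. Base URL is vLLM's default; the model name must
# match the model the server is serving.
import json
import urllib.request

VLLM_URL = "http://localhost:8000/v1"

def chat(model: str, message: str, base_url: str = VLLM_URL) -> str:
    """Return the assistant reply for a single user message."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": message}],
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # Replace with the model your vLLM server is actually serving.
    print(chat("Qwen/Qwen2.5-Coder-7B-Instruct", "Explain list comprehensions in one sentence."))
```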