## Overview

Inferencer is a local AI model runner for macOS.
## Features

| Feature | Description |
|---|---|
| Local inference | Run models on-device |
| Model library | Browse and download models |
| Chat interface | Conversational UI |
| API server | Local API endpoint |
| GPU acceleration | Metal/MPS support |
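The API server in the table above implies a local HTTP endpoint. As an illustration only, here is a minimal sketch of posting a chat request to such an endpoint; the port (`8080`), path (`/v1/chat/completions`), and OpenAI-style payload shape are assumptions common to local runners, not documented behavior of Inferencer.

```python
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "local-model") -> dict:
    """Assemble an OpenAI-style chat payload (format assumed, not documented)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def send_chat_request(prompt: str,
                      url: str = "http://localhost:8080/v1/chat/completions") -> dict:
    # Hypothetical endpoint; adjust to whatever Inferencer actually exposes.
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```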
## Supported Formats

| Format | Description |
|---|---|
| GGUF | llama.cpp quantized models |
| MLX | Apple MLX framework |
| CoreML | Apple CoreML models |
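One concrete way to tell these formats apart on disk: GGUF files begin with the 4-byte ASCII magic `GGUF` followed by a little-endian version number, which makes them easy to detect. A standard-library-only sketch (not Inferencer's own detection logic):

```python
import struct

def is_gguf(header: bytes) -> bool:
    # GGUF files start with the ASCII magic "GGUF",
    # followed by a little-endian uint32 format version.
    return len(header) >= 8 and header[:4] == b"GGUF"

def gguf_version(header: bytes) -> int:
    # Read the uint32 version that immediately follows the magic.
    return struct.unpack_from("<I", header, 4)[0]

def detect_format(path: str) -> str:
    # Minimal detector: MLX and CoreML models are commonly shipped as
    # directory bundles rather than single files, so only GGUF is sniffed here.
    with open(path, "rb") as f:
        header = f.read(8)
    return "GGUF" if is_gguf(header) else "unknown"
```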
## Performance Tips

| Tip | Description |
|---|---|
| Quantization | Q4_K_M is a common balance of quality and size |
| GPU layers | Offload as many layers to the GPU as memory allows |
| Context size | Smaller contexts cut memory use and speed up inference |
| Batch size | Larger batches raise generation throughput |
| Metal | Verify Metal/MPS acceleration is actually in use |
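The context-size tip can be made concrete: a transformer's KV cache grows linearly with context length, so halving the context halves that memory. A back-of-envelope estimate (the layer count, head count, and head dimension below are illustrative of a Llama-style model, not Inferencer specifics):

```python
def kv_cache_bytes(n_layers: int, n_ctx: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    # Keys and values are each cached per layer, per position, per KV head:
    # 2 (K and V) * layers * context * kv_heads * head_dim * element size.
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_elem

# Example: a Llama-style model with grouped-query attention
# (32 layers, 8 KV heads, head dim 128) at 4096 context in fp16:
mib = kv_cache_bytes(32, 4096, 8, 128) / (1024 ** 2)  # → 512.0 MiB
```

Doubling the context to 8192 doubles this to 1 GiB, which is why trimming the context is one of the cheapest ways to reclaim memory and speed on smaller Macs.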