Inferencer Cheatsheet
Overview

Inferencer is a local AI model runner for macOS.

Features

- Local inference: Run models on-device
- Model library: Browse and download models
- Chat interface: Conversational UI
- API server: Local API endpoint
- GPU acceleration: Metal/MPS support
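The local API server can be exercised with an ordinary HTTP client. As a sketch only, this assumes an OpenAI-compatible chat endpoint; the base URL, port, path, and model name below are placeholders, not confirmed by this cheatsheet — check Inferencer's own documentation for the actual values.

```python
import json

# Hypothetical endpoint; port and path are assumptions.
BASE_URL = "http://localhost:8080/v1/chat/completions"

# Build an OpenAI-style chat request body.
payload = {
    "model": "local-model",  # placeholder model identifier
    "messages": [
        {"role": "user", "content": "Hello from Inferencer!"},
    ],
    "temperature": 0.7,
}

body = json.dumps(payload)
print(body)

# To actually send it (requires the server to be running):
# import urllib.request
# req = urllib.request.Request(
#     BASE_URL, data=body.encode(),
#     headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())
```

Keeping the request shape OpenAI-compatible means existing client libraries can usually be pointed at the local server by changing only the base URL.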

Supported Formats

- GGUF: llama.cpp quantized models
- MLX: Apple MLX framework
- CoreML: Apple Core ML models
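GGUF files are easy to recognize on disk: the format begins with the 4-byte magic `GGUF` followed by a little-endian version number. A minimal sketch of detecting one (the dummy file written here is just a header, not a loadable model):

```python
import struct

def is_gguf(path):
    # A GGUF file starts with the ASCII magic bytes "GGUF".
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

# Write a dummy header for demonstration: magic + uint32 version.
with open("dummy.gguf", "wb") as f:
    f.write(b"GGUF" + struct.pack("<I", 3))

print(is_gguf("dummy.gguf"))  # True
```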

Performance Tips

- Quantization: Use Q4_K_M for a good size/quality balance
- GPU layers: Offload as many layers to the GPU as memory allows
- Context size: Reduce for faster inference
- Batch size: Increase for higher throughput
- Metal: Ensure GPU acceleration is enabled
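The quantization tip comes down to arithmetic: weight memory is roughly parameter count times bits per weight divided by eight. The bits-per-weight figures below are approximate averages for llama.cpp quant types (actual file sizes vary per model), so treat this as an estimation sketch, not exact sizing.

```python
# Approximate average bits per weight for common quant types.
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,
    "Q4_K_M": 4.85,  # approximate; varies by model architecture
    "Q2_K": 2.6,
}

def weight_gb(params, quant):
    # bytes = params * bits/8; convert to gigabytes (1e9 bytes).
    return params * BITS_PER_WEIGHT[quant] / 8 / 1e9

# Estimate weight memory for a 7B-parameter model.
for quant in BITS_PER_WEIGHT:
    print(f"7B @ {quant}: {weight_gb(7e9, quant):.1f} GB")
```

For a 7B model this puts Q4_K_M near 4 GB versus 14 GB at F16, which is why Q4_K_M is the usual starting point on memory-constrained machines.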