Inferencer
Local model inference runner
Inferencer Cheatsheet
Overview
Inferencer is a local AI model runner for macOS.
Features
| Feature | Description |
|---|---|
| Local inference | Run models on-device |
| Model library | Browse and download models |
| Chat interface | Conversational UI |
| API server | Local API endpoint |
| GPU acceleration | Metal/MPS support |
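Many local runners expose their API server as an OpenAI-compatible HTTP endpoint; as a hedged sketch, here is what building a chat request for such an endpoint looks like. The URL, port, path, and model name below are assumptions for illustration, not confirmed Inferencer defaults — check the app's API server settings for the actual values.

```python
import json

# Hypothetical endpoint: host, port, and path are assumptions,
# not documented Inferencer defaults.
API_URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> bytes:
    """Build the JSON body for an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "stream": False,
    }
    return json.dumps(payload).encode("utf-8")

body = build_chat_request("llama-3-8b-q4_k_m", "Hello!")  # model name is illustrative
# Sending it (requires the API server to be running):
#   import urllib.request
#   req = urllib.request.Request(API_URL, data=body,
#                                headers={"Content-Type": "application/json"})
#   print(urllib.request.urlopen(req).read().decode())
```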
Supported Formats
| Format | Description |
|---|---|
| GGUF | llama.cpp quantized models |
| MLX | Apple MLX framework |
| CoreML | Apple CoreML models |
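GGUF files are easy to identify programmatically: every GGUF file begins with the four ASCII bytes `GGUF` followed by a little-endian uint32 version. A minimal sketch of a header check (the demo writes a synthetic header; a real downloaded model works the same way):

```python
import os
import struct
import tempfile

GGUF_MAGIC = b"GGUF"  # every GGUF file starts with these four bytes

def read_gguf_version(path: str) -> int:
    """Return the GGUF version, or raise if the file is not GGUF."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != GGUF_MAGIC:
            raise ValueError(f"not a GGUF file (magic={magic!r})")
        (version,) = struct.unpack("<I", f.read(4))  # little-endian uint32
        return version

# Demo with a synthetic 8-byte header:
path = os.path.join(tempfile.gettempdir(), "fake.gguf")
with open(path, "wb") as f:
    f.write(GGUF_MAGIC + struct.pack("<I", 3))
print(read_gguf_version(path))  # → 3
```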
Performance Tips
| Tip | Description |
|---|---|
| Quantization | Use Q4_K_M for a good size/quality balance |

| GPU layers | Maximize GPU offload |
| Context size | Reduce for faster inference |
| Batch size | Increase for throughput |
| Metal | Ensure GPU acceleration is enabled |
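The quantization tip comes down to simple arithmetic: weight-file size is roughly parameters × bits-per-weight. The bits-per-weight figures below are ballpark approximations (block overhead included), not exact format specs, so treat the estimates as rough guides when sizing a model against available memory.

```python
# Approximate bits per weight for common precisions/quantizations.
# These are ballpark figures, not exact specifications.
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,
    "Q4_K_M": 4.85,
}

def weights_gib(n_params: float, quant: str) -> float:
    """Estimated weight-file size in GiB for a model with n_params parameters."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1024**3

for q in ("F16", "Q8_0", "Q4_K_M"):
    print(f"7B @ {q}: {weights_gib(7e9, q):.1f} GiB")
```

For a 7B model this works out to roughly 13 GiB at F16 versus about 4 GiB at Q4_K_M, which is why Q4_K_M is the usual starting point on memory-constrained machines.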
Inferencer Shortcuts
General
| Shortcut | Action |
|---|---|
| ⌘N | New session |
| ⌘K | Quick model switch |
| ⌘, | Settings |
| ⌘Enter | Send/Generate |
| ⌘L | Clear context |
| ⌘/ | Toggle sidebar |
Model Management
| Shortcut | Action |
|---|---|
| ⌘⇧M | Model browser |
| ⌘⇧D | Download model |
| ⌘⇧I | Model info |
| ⌘R | Reload model |