Second Brain Workflow

The personal knowledge pipeline: document ingestion via Paperless-NGX, semantic search and knowledge graph via the Life Archive RAG system, and structured knowledge management in Tana.

Last updated: March 2026. The previous version referenced old infrastructure (TrueNAS, Proxmox at .230, paperless-ai), all now decommissioned.

Architecture Overview

Source documents (PDFs, email attachments, Evernote exports, magazines)
    ↓
Paperless-NGX (Mac Studio :8100)
    OCR, full-text index, tagging, archival
    ↓
Life Archive RAG Pipeline (Mac Studio :8900)
    gte-Qwen2-7B embeddings → LanceDB vectors → knowledge graph
    Multi-strategy retrieval: dense + SPLADE + QA pairs + KG + HyDE
    ↓
Query interfaces
    Claude MCP tools (life_archive_search, entity_lookup, etc.)
    HTTP API (:8900)  ·  KG web explorer (:1313/kg/)
    ↓
Tana: structured knowledge workspace
    394+ contacts, 85+ plant species, homelab docs, tasks

Paperless-NGX

Runs on Mac Studio via Docker at http://192.168.8.180:8100.

Start/stop:

cd ~/paperless-ngx/docker
docker compose up -d
docker compose down

Key volumes:

Host Path                   Purpose
~/paperless-ngx/data/       SQLite DB and search index
~/paperless-ngx/media/      Stored documents
~/paperless-ngx/consume/    Drop files here to ingest (polls every 10s)
~/paperless-ngx/export/     Bulk export output

Ingest a file: Drop it in ~/paperless-ngx/consume/. Paperless picks it up within ~10 seconds (the polling interval), then OCRs, tags, and indexes it automatically.
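
As a rough sketch of the drop-and-wait flow, the helper below copies a file into the consume folder and then polls until the file disappears. It assumes Paperless removes a file from the consume folder once it has consumed it successfully. The function name ingest_and_wait, the CONSUME_DIR variable, and the timeout argument are illustrative and are not Paperless settings.

```shell
# Sketch only: copy a file into the consume folder, then wait for it to vanish.
# CONSUME_DIR and the timeout argument are illustrative, not Paperless settings.
CONSUME_DIR="${CONSUME_DIR:-$HOME/paperless-ngx/consume}"

ingest_and_wait() {
  src="$1"
  timeout="${2:-60}"   # seconds to wait before giving up

  [ -f "$src" ] || { echo "not a file: $src" >&2; return 2; }
  [ -d "$CONSUME_DIR" ] || { echo "missing consume folder: $CONSUME_DIR" >&2; return 2; }

  dest="$CONSUME_DIR/$(basename "$src")"
  cp "$src" "$dest" || return 2

  # Assumption: the consumer deletes the file after a successful import,
  # so "file is gone" is treated as "picked up".
  waited=0
  while [ -e "$dest" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "still in consume folder after ${timeout}s: $dest" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "picked up: $(basename "$src")"
}
```

Example call, with a hypothetical file path: ingest_and_wait ~/Downloads/scan.pdf 120. A return code of 1 means the file was still sitting in the folder when the timeout expired, which is worth checking against the Paperless container logs.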

Paperless API:

curl -H "Authorization: Token 9838fafecb452b514ee0cfcc84ce42df718d4984" \
  http://localhost:8100/api/documents/
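
For full-text search rather than a plain listing, the documents endpoint accepts a query parameter. The sketch below wraps that call. PAPERLESS_URL, PAPERLESS_TOKEN, and the function name paperless_search are illustrative names; the token is read from the environment so it does not have to be typed into each command.

```shell
# Sketch: full-text search against the Paperless API.
# PAPERLESS_URL and PAPERLESS_TOKEN are illustrative environment variables.
PAPERLESS_URL="${PAPERLESS_URL:-http://localhost:8100}"

paperless_search() {
  query="$1"
  [ -n "${PAPERLESS_TOKEN:-}" ] || { echo "set PAPERLESS_TOKEN first" >&2; return 2; }

  # -G sends the url-encoded data as a query string on a GET request;
  # -f makes curl return non-zero on HTTP error responses.
  curl -fsS -G \
    -H "Authorization: Token $PAPERLESS_TOKEN" \
    -H "Accept: application/json" \
    --data-urlencode "query=$query" \
    "$PAPERLESS_URL/api/documents/"
}
```

Example: paperless_search "water bill 2025". The response should be paginated JSON, so piping it through a JSON formatter makes the matches easier to read.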

Note: The paperless-ai container (AI auto-tagger) is stopped because it was causing memory pressure. Manual tagging or tag rules handle classification instead.

Life Archive

See the full Life Archive page for complete documentation. Summary:

Metric                      Value
Total documents             ~74K in LanceDB
Paragraphs indexed          ~2.69M
Knowledge graph entities    ~276K
Sources                     Evernote, emails, magazines, Tana nodes, Paperless docs
Embedding model             gte-Qwen2-7B on Apple MPS (port 1235)
Query API                   FastAPI at port 8900
MCP server                  Streamable HTTP at port 8901

Check service status:

launchctl list | grep beedifferent
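
The raw launchctl list output has three columns: PID, last exit status, and label, with a PID of "-" for a job that is loaded but not running. As a small sketch, the filter below turns the matching rows into a running/stopped summary. The function name summarize_agents is illustrative.

```shell
# Sketch: summarize `launchctl list` rows whose label contains a pattern.
# Reads the listing on stdin so it can be fed from launchctl or a saved file.
summarize_agents() {
  awk -v pat="$1" '
    index($3, pat) {
      # A PID of "-" means the job is loaded but has no running process.
      state = ($1 == "-") ? "stopped" : "running"
      printf "%-8s pid=%-6s last_exit=%-4s %s\n", state, $1, $2, $3
    }'
}
```

Usage: launchctl list | summarize_agents beedifferent. A stopped job with a non-zero last exit status is a good first candidate for a look at its logs.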

Quick search:

curl -X POST http://localhost:8900/search \
  -H "Content-Type: application/json" \
  -d '{"query": "your question here"}'
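
A query that contains double quotes or backslashes would break the inline JSON in the command above. The sketch below escapes those two characters before posting. The names search_body, life_archive_search, and LIFE_ARCHIVE_URL are illustrative. The request shape (a single query field) is taken from the example above, and the response format is not described here, so the output is printed as returned.

```shell
# Sketch: build the JSON body for /search with minimal escaping.
# Only backslashes and double quotes are handled; newlines and other
# control characters in the query are NOT escaped.
search_body() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"query": "%s"}' "$escaped"
}

# LIFE_ARCHIVE_URL is an illustrative override; the default matches the
# local API address used above.
life_archive_search() {
  curl -sS -X POST "${LIFE_ARCHIVE_URL:-http://localhost:8900}/search" \
    -H "Content-Type: application/json" \
    -d "$(search_body "$1")"
}
```

Example: life_archive_search 'notes about the "north pond" survey'.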

Evernote Source Material

Evernote exports (155 notebooks) were converted using Yarle and are staged at ~/Sync/ED/life_archive/:

Directory            Contents
Evernote/            Source ENEX exports
EmailAttachments/    ~9,490 extracted email attachments

Yarle configs:

  • ~/paperless-ngx/yarle_config_tana.json: Tana Internal Format output
  • ~/paperless-ngx/yarle_config_paperless.json: HTML/MD for Paperless ingestion

Both passes were completed. Evernote notes are fully indexed in both Paperless and the Life Archive.

Tana

Tana is the structured knowledge layer: everything that needs relationships, fields, and queries rather than just search.

Two workspaces:

  • Main / BeeDifferent: contacts (~394), general knowledge, tasks
  • Brownsville: 93-acre property management (85+ plant species, 9 habitat zones, 10 custom supertags, ecological tracking)

MCP integration: tana-local MCP server connects Claude directly to both workspaces. See MCP Servers.

Key supertags: Contact, Place, Thing, Event, Activity, Resource, Concept, Plant (55 fields), Habitat Zone, Observation.

Common Tasks

Ingest a document into Paperless:

cp /path/to/document.pdf ~/paperless-ngx/consume/

Search the Life Archive: Ask Claude directly; the life_archive_search MCP tool is connected. Alternatively, use the HTTP API, documented at http://192.168.8.180:8900/docs.

Check Life Archive services:

launchctl list | grep beedifferent
curl http://localhost:8900/health
curl http://localhost:8901/mcp
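
The two curl checks above can be folded into one pass/fail probe. This is a sketch under one assumption: any HTTP response counts as "up", since the MCP endpoint may answer a plain GET with an error status even when it is healthy. The function names probe and check_life_archive are illustrative.

```shell
# Sketch: report whether each Life Archive endpoint answers over HTTP.
# curl prints 000 as the status code when it gets no HTTP response at all
# (connection refused, timeout, and so on).
probe() {
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1") || code=000
  if [ -z "$code" ] || [ "$code" = "000" ]; then
    echo "DOWN  $1"
    return 1
  fi
  echo "UP    $1 (HTTP $code)"
}

check_life_archive() {
  failures=0
  for url in \
    http://localhost:8900/health \
    http://localhost:8901/mcp
  do
    probe "$url" || failures=$((failures + 1))
  done
  # Succeed only if every endpoint answered.
  [ "$failures" -eq 0 ]
}
```

Usage: check_life_archive || echo "at least one service is not answering". The non-zero exit status makes it usable from a cron job or a shell prompt hook.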

Restart embed server:

launchctl kickstart -k gui/$(id -u)/com.beedifferent.embed-server

View logs:

tail -f ~/Sync/ED/life_archive/http_api.stdout.log
tail -f ~/Sync/ED/life_archive/http_api.stderr.log