Can I Run It?
- Hardware to Run a 70B Model Locally: VRAM, the 48GB Wall, and Your Real Options
What it actually takes to run a 70B model at home: the VRAM math, why 48GB is the practical floor at Q4, and the four hardware paths (dual 3090, used A6000, Apple Silicon, or cloud).
- Hardware to Run a 7B/8B Model Locally: RTX 3090, Apple M3 Max, and Budget Options
Benchmark-backed hardware guide for running 7B and 8B parameter models locally. Covers RTX 3090, Apple M3 Max, RTX 3060, and Apple M4 — with first-party Apple M4 benchmarks, community throughput data, VRAM requirements, and honest trade-offs.
- The Local-AI Hardware Buying Framework
A constraint-first framework for choosing hardware to run AI models locally. Covers VRAM, memory bandwidth, quantization, Apple Silicon, and budget paths — so you buy once and regret nothing.
- Quantization: What It Means for Local AI and Why It Matters
Quantization reduces the numerical precision of a model's weights to shrink its memory footprint. It is usually the deciding factor in whether a 7B or 70B model fits in your GPU's VRAM, and it also affects how fast the model runs.
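
As a rough illustration of the VRAM math these guides refer to, the sketch below estimates the memory needed for model weights alone at a given precision. The function name `weight_memory_gb` is hypothetical, and the estimate assumes exactly the nominal bits per weight. Real quantization formats store some extra scaling data, and the KV cache and runtime overhead come on top, so treat the results as a lower bound rather than a full requirement.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every weight takes exactly `bits_per_weight` bits.
    KV cache, activations, and runtime overhead are NOT included.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Compare the two model sizes discussed above at three common precisions.
for name, params in [("7B", 7), ("70B", 70)]:
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{name} at {label}: ~{weight_memory_gb(params, bits):.1f} GB")
```

At a nominal 4 bits, a 70B model's weights alone come to about 35 GB, which is why a single 24GB card is not enough and why headroom beyond the raw weight size matters.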