To run Gemma 3 12B locally at FP16 precision (unquantized 16-bit weights), you need at least 29.6 GB of GPU VRAM.
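As a rough sanity check on a figure like this, the weights alone at 16-bit precision take about 2 bytes per parameter, and the rest is runtime overhead (KV cache, activations, framework buffers). The sketch below is only an estimate under stated assumptions: the nominal 12 billion parameter count is taken from the model name, and `overhead_fraction` is a hypothetical knob, not the value this page uses.

```python
def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # billions of parameters * bytes per parameter = gigabytes
    return params_billion * bytes_per_param


def estimate_vram_gb(params_billion: float,
                     bytes_per_param: float,
                     overhead_fraction: float = 0.2) -> float:
    """Weights plus a flat overhead fraction.

    overhead_fraction is an assumed placeholder; real overhead depends on
    context length, batch size, and the inference runtime.
    """
    return weights_gb(params_billion, bytes_per_param) * (1.0 + overhead_fraction)


if __name__ == "__main__":
    # Nominal 12B parameters at FP16 (2 bytes each).
    print(f"weights only: {weights_gb(12, 2):.1f} GB")
    print(f"with assumed 20% overhead: {estimate_vram_gb(12, 2):.1f} GB")
```

The weights-only term is the hard floor; the overhead term is the part that varies between tools, which is why different calculators report slightly different totals for the same model.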
Dual-GPU via tensor parallelism. Best cost per GB at this tier.
Single-card 48 GB pro GPU. Clean setup, no multi-GPU overhead.
Data-centre GPU with HBM2e memory. Higher memory bandwidth, so noticeably faster token generation.
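To compare options like these against the 29.6 GB requirement, a simple fit check can help. The sketch below assumes tensor parallelism splits the memory load evenly across GPUs and ignores communication buffers and any per-GPU replicated state, so treat it as an optimistic lower bound. The function names and the example card sizes are illustrative assumptions, not configurations listed on this page.

```python
def per_gpu_share_gb(required_gb: float, n_gpus: int) -> float:
    """Approximate per-GPU memory under an even tensor-parallel split."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return required_gb / n_gpus


def fits(required_gb: float, vram_per_gpu_gb: float, n_gpus: int = 1) -> bool:
    """True if each GPU's share fits in its VRAM (optimistic: no extra buffers)."""
    return per_gpu_share_gb(required_gb, n_gpus) <= vram_per_gpu_gb


if __name__ == "__main__":
    required = 29.6  # GB, the FP16 figure quoted above
    # Example sizes below are hypothetical, chosen only for illustration.
    for label, vram, n in [("1 x 24 GB", 24, 1),
                           ("2 x 24 GB", 24, 2),
                           ("1 x 48 GB", 48, 1)]:
        print(f"{label}: {'fits' if fits(required, vram, n) else 'does not fit'}")
```

In practice it is worth leaving some headroom per GPU, since a share that only just fits leaves no room for longer contexts or larger batches.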
Use the interactive calculator to compare Gemma 3 12B across all available formats.