To run Phi-3.5 Mini 3.8B locally at Q4_K_M quantization, you need at least 6.4 GB of GPU VRAM.
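For a rough sense of where a VRAM figure like this comes from, the sketch below adds up three parts: quantized weights, the KV cache, and a fixed runtime overhead. It is a back-of-the-envelope estimate under stated assumptions, not the calculator's actual method, and it is not expected to reproduce the 6.4 GB figure exactly. The function name `estimate_vram_gb`, the roughly 4.85 bits per weight for Q4_K_M, the model shape (32 layers, KV width 3072), the 4096-token context, the fp16 KV cache, and the 0.5 GB overhead are all assumptions chosen for illustration.

```python
def estimate_vram_gb(
    params_billion: float,
    bits_per_weight: float,
    n_layers: int,
    kv_dim: int,
    context_tokens: int,
    kv_bytes: int = 2,         # assumed fp16 KV cache (2 bytes per value)
    overhead_gb: float = 0.5,  # assumed allowance for runtime buffers
) -> float:
    """Rough VRAM estimate in GB (1 GB = 1e9 bytes)."""
    # Quantized weights: parameter count times average bits per weight.
    weights_gb = params_billion * 1e9 * bits_per_weight / 8 / 1e9
    # KV cache: one key and one value vector per layer per token.
    kv_cache_gb = 2 * n_layers * kv_dim * context_tokens * kv_bytes / 1e9
    return weights_gb + kv_cache_gb + overhead_gb


# Assumed inputs for Phi-3.5 Mini 3.8B at Q4_K_M with a 4096-token context.
estimate = estimate_vram_gb(
    params_billion=3.8,
    bits_per_weight=4.85,  # assumed average for Q4_K_M
    n_layers=32,           # assumed model shape
    kv_dim=3072,           # assumed KV width (heads * head size)
    context_tokens=4096,   # assumed context length
)
print(f"Estimated VRAM: {estimate:.1f} GB")
```

The KV cache term grows linearly with context length, so longer contexts can push the requirement well above what the weights alone suggest.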
Used market gem. Tight on VRAM but viable for this workload.
Strong inference GPU. Handles 7-13B models comfortably.
Best consumer GPU. Breezes through 13B models at any quantization.
Use the interactive calculator to compare Phi-3.5 Mini 3.8B across all available formats.