
ZSE Quantization Guide: NF4 vs INT4 vs INT8

ZSE Team · February 22, 2026 · 8 min read


Choosing the right quantization type is key to balancing output quality, inference speed, and memory usage.

Available Quantization Types

NF4 (NormalFloat4) - Default

zse convert model -o model.zse --quant nf4

Bits: 4

Quality: ★★★★☆ (best 4-bit)

Size: ~0.56GB per billion params

Use case: Most models, production deployments

NF4 uses an asymmetric, quantile-based quantization grid matched to the roughly normal distribution of trained neural-network weights, so its 16 levels are concentrated where weight values actually cluster instead of being spread uniformly.
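To make the idea concrete, here is a minimal NumPy sketch of quantile-based 4-bit quantization. The helper names (`nf4_codebook`, `quantize_block`, `dequantize_block`) are illustrative, not zse's API, and the production NF4 codebook (e.g. in bitsandbytes) uses fixed precomputed constants rather than quantiles computed on the fly:

```python
import numpy as np
from statistics import NormalDist

def nf4_codebook():
    """Approximate the 16 NF4 levels: quantiles of a standard normal,
    rescaled so the grid spans [-1, 1] and includes an exact zero.
    Illustrative reconstruction, not the exact production constants."""
    nd = NormalDist()
    # 8 negative levels, then 8 non-negative levels (zero appears once,
    # which is what makes the grid asymmetric)
    neg = [nd.inv_cdf(p) for p in np.linspace(0.03, 0.5, 8, endpoint=False)]
    pos = [nd.inv_cdf(p) for p in np.linspace(0.5, 0.97, 8)]
    levels = np.array(neg + pos)
    return levels / np.abs(levels).max()

def quantize_block(w, levels):
    """Absmax-scale a block into [-1, 1], then snap each weight to the
    nearest codebook level. Returns 4-bit indices plus the fp scale."""
    scale = np.abs(w).max()
    idx = np.abs(w[:, None] / scale - levels[None, :]).argmin(axis=1)
    return idx.astype(np.uint8), scale

def dequantize_block(idx, scale, levels):
    """Look up each index in the codebook and undo the absmax scaling."""
    return levels[idx] * scale
```

Because the levels follow the weight distribution, typical (near-zero) weights land on a dense part of the grid, which is where NF4's quality edge over uniform INT4 comes from.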

INT4

zse convert model -o model.zse --quant int4

Bits: 4

Quality: ★★★☆☆

Size: ~0.53GB per billion params

Use case: Maximum compression, less sensitive tasks
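For contrast, plain INT4 snaps weights to a uniform integer grid. A common scheme (sketched below under assumed parameters: blocks of 64 weights, one fp16 absmax scale per block; zse's internal layout may differ) looks like this:

```python
import numpy as np

def int4_quantize(w, block=64):
    """Symmetric absmax INT4: each block of `block` weights shares one
    fp16 scale; values snap to uniform integer steps in [-7, 7]."""
    w = w.reshape(-1, block)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scale = np.maximum(scale, 1e-12)  # guard against all-zero blocks
    q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)
    return q, scale.astype(np.float16)

def int4_dequantize(q, scale):
    """Undo the per-block scaling to recover approximate weights."""
    return q.astype(np.float32) * scale.astype(np.float32)
```

The uniform grid spends as many levels on rare outlier magnitudes as on the dense cluster near zero, which is why INT4 trails NF4 on quality at essentially the same size.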

INT8

zse convert model -o model.zse --quant int8

Bits: 8

Quality: ★★★★★ (near FP16)

Size: ~1.1GB per billion params

Use case: When quality is critical
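INT8's quality edge is easy to see numerically: with 127 uniform steps per side instead of 7, the round-trip error of absmax quantization drops by roughly an order of magnitude. A quick sketch (synthetic weights, not a benchmark of zse itself):

```python
import numpy as np

def absmax_roundtrip_error(w, steps):
    """Relative L2 error of symmetric absmax quantization with `steps`
    integer levels per side (7 for INT4, 127 for INT8)."""
    scale = np.abs(w).max() / steps
    deq = np.clip(np.round(w / scale), -steps, steps) * scale
    return np.linalg.norm(deq - w) / np.linalg.norm(w)

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)
err4 = absmax_roundtrip_error(w, 7)
err8 = absmax_roundtrip_error(w, 127)
```

This is also visible in the benchmark table below: INT8's perplexity and MMLU are nearly indistinguishable from FP16.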

FP16 (No Quantization)

zse convert model -o model.zse --quant fp16

Bits: 16

Quality: ★★★★★ (original)

Size: ~2GB per billion params

Use case: Fine-tuning, debugging
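The per-billion-parameter sizes above follow from simple arithmetic: payload bits per weight plus a small per-block scale overhead. The sketch below assumes blocks of 64 weights with an fp16 scale (the actual zse container stores additional metadata, which is why the guide's NF4 and INT8 figures run slightly higher):

```python
def size_gb_per_billion(bits, block=64, scale_bits=16):
    """Approximate storage per billion parameters: payload bits plus one
    scale per block. Bytes per weight equals GB per 1e9 weights."""
    bits_per_weight = bits + scale_bits / block
    return bits_per_weight / 8

# FP16 needs no scales; the quantized formats share one fp16 scale per block
for name, bits, sb in [("fp16", 16, 0), ("int8", 8, 16),
                       ("nf4", 4, 16), ("int4", 4, 16)]:
    print(f"{name}: ~{size_gb_per_billion(bits, scale_bits=sb):.2f} GB per billion params")
```

4-bit formats work out to about 4.25 bits per weight (~0.53 GB per billion), matching the INT4 figure; NF4's extra codebook/quantization metadata accounts for the small gap up to ~0.56.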

Quality Comparison (Qwen 7B)

| Quant | Perplexity | MMLU  | Size  |
|-------|------------|-------|-------|
| FP16  | 5.38       | 64.8% | 14GB  |
| INT8  | 5.39       | 64.7% | 7.5GB |
| NF4   | 5.42       | 64.2% | 4.2GB |
| INT4  | 5.51       | 63.5% | 4.0GB |

Recommendations

General use: NF4 (best quality/size ratio)

Code generation: INT8 (higher precision helps)

Embeddings: INT8 or FP16

Chat/creative: NF4 is plenty
