Linear 8-Bit Quantization
This section describes the following:
- Quantization Overview: Overview of linear quantization and considerations on model size and performance.
- Post-Training Quantization: Weight quantization of Core ML models using ct.optimize.coreml.linear_quantize_weights.
- Training-Time Quantization: Linear quantization of PyTorch models with data and fine-tuning, using ct.optimize.torch.quantization.LinearQuantizer.
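
As a rough illustration of what "linear 8-bit quantization" means arithmetically, the following sketch maps floating-point weights to signed 8-bit integers with a single scale and then maps them back. It is plain Python, not the coremltools implementation: the function names quantize_symmetric and dequantize are hypothetical, and it assumes a symmetric, per-tensor scheme, which is only one of the possible variants.

```python
def quantize_symmetric(weights, num_bits=8):
    """Map floats to signed integers using one scale for the whole tensor.

    Hypothetical helper for illustration; assumes a symmetric scheme
    (zero point fixed at 0).
    """
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to [-qmax, qmax].
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integers and the scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)

# Each restored value differs from the original by at most half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Storing each weight as one 8-bit integer plus a shared scale, instead of a 16-bit or 32-bit float, is where the reduction in model size comes from; the rounding error bounded by half a quantization step is the accuracy cost.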