Quantization
QuantizeStep
Bases: PipelineStep
Pipeline step for quantizing a model using ONNX Runtime.
Source code in textforge/quantize.py
__init__()
convert_to_onnx(output_path)
Convert the model to ONNX format.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| output_path | str | Directory containing the model to be converted. | required |
Source code in textforge/quantize.py
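The docs do not show how the export itself is implemented. As a rough sketch, one common way to export a Hugging Face text model directory to ONNX is via Hugging Face Optimum; the directory path, the `ORTModelForSequenceClassification` task class, and the resulting `model.onnx` file name are assumptions for illustration, not necessarily what `convert_to_onnx` does internally.

```python
# Illustrative sketch only -- textforge's convert_to_onnx may differ.
from optimum.onnxruntime import ORTModelForSequenceClassification

output_path = "./artifacts/my-model"  # hypothetical model directory

# export=True re-exports the PyTorch weights found in output_path to ONNX.
ort_model = ORTModelForSequenceClassification.from_pretrained(output_path, export=True)

# Writes model.onnx (plus config files) back into the same directory.
ort_model.save_pretrained(output_path)
```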
convert_to_onnx_q8(output_path)
Quantize the ONNX model to 8-bit precision.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| output_path | str | Directory containing the ONNX model. | required |
Source code in textforge/quantize.py
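Because the class is documented as using ONNX Runtime, 8-bit quantization most plausibly goes through `onnxruntime.quantization`. Below is a minimal sketch of dynamic (weight-only) INT8 quantization; the input and output file names are assumptions, not guaranteed to match textforge's naming.

```python
from pathlib import Path

from onnxruntime.quantization import QuantType, quantize_dynamic

output_path = "./artifacts/my-model"                    # hypothetical model directory
onnx_model = Path(output_path) / "model.onnx"           # assumed name of the exported model
quantized_model = Path(output_path) / "model_q8.onnx"   # assumed name for the 8-bit model

# Dynamic quantization stores weights as 8-bit integers;
# activations are quantized on the fly at inference time.
quantize_dynamic(
    model_input=str(onnx_model),
    model_output=str(quantized_model),
    weight_type=QuantType.QInt8,
)
```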
run(output_path)
Run the quantization pipeline.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| output_path | str | Directory where the model files are located. | required |
Returns:

| Name | Type | Description |
|---|---|---|
| | str | The output path after quantization is complete. |
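Assuming the class can be imported from `textforge.quantize` (the module listed above), a typical end-to-end call might look like the sketch below; the model directory path is hypothetical.

```python
from textforge.quantize import QuantizeStep

step = QuantizeStep()

# "./artifacts/my-model" is a hypothetical directory containing the trained model files.
result_path = step.run("./artifacts/my-model")
print(result_path)  # the output path after quantization is complete
```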