Quantize Bias for Conv/Gemm on Quantized Model #22889
base: main
Conversation
Force-pushed from 60f58bc to 4860244
// Bias is quantized to int32.
ONNX_NAMESPACE::TypeProto int32_type_proto;
Does this data type work in general, or could it be EP specific? I.e., if the EP uses the quantized op, is it universally expected that the bias input is int32 when the data type is, say, 8-bit int for the quantized Conv/Gemm?
In https://github.com/onnx/onnx/blob/main/docs/Operators.md#QLinearConv, T4 (for bias) is tensor(int32) only. The same holds for the QGemm schema, so I guess this works in general?
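For what it's worth, the constraint can be checked programmatically. This is a minimal standalone sketch, not code from this PR, assuming the onnx schema headers are available; it looks up "QLinearConv" in the default ONNX domain and prints the types allowed for T4 (expected output: tensor(int32)).

#include <iostream>
#include "onnx/defs/schema.h"

int main() {
  // Look up the latest registered QLinearConv schema.
  const auto* schema = ONNX_NAMESPACE::OpSchemaRegistry::Schema("QLinearConv");
  if (schema == nullptr) return 1;
  // Walk the type constraints and print the allowed types for T4 (bias).
  for (const auto& constraint : schema->typeConstraintParams()) {
    if (constraint.type_param_str == "T4") {
      for (const auto& type_str : constraint.allowed_type_strs) {
        std::cout << type_str << std::endl;  // expected: tensor(int32)
      }
    }
  }
  return 0;
}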
// Bias DQ node produces output to Conv/Gemm node's input_2, with scale = input_scale_0 * input_scale_1, zp = 0.
NodeArg& bias_dq_node_arg =
    graph.GetOrCreateNodeArg(graph.GenerateNodeArgName(node.Name() + "_bias_dq"), &bias_dq_type);
Node& dp_node = graph.AddNode(graph.GenerateNodeName(node.Name() + "_bias_dq"), QDQ::DQOpName, "Bias DQ node",
dq_node?
Some quantized models leave the bias of Conv/Gemm nodes in float rather than quantizing it. This PR creates a sub-graph that quantizes the bias for such Conv/Gemm nodes with scale = scale_input_0 * scale_input_1 and zp = 0. We only do this when the bias is an initializer, so that ConstantFolding will fold the sub-graph into a real quantized int32 bias initializer during the next round of graph optimization.
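As a rough illustration of what the folded sub-graph ends up computing, here is a minimal standalone sketch; QuantizeBias is a hypothetical helper, not code from this PR. The float bias is divided by input_scale * weight_scale, rounded, and stored as int32, with the zero point fixed at 0.

#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of this PR: quantize a float bias to int32
// using scale = input_scale * weight_scale and zero point 0, which is the
// result the Q/DQ sub-graph folds down to after ConstantFolding.
std::vector<int32_t> QuantizeBias(const std::vector<float>& bias,
                                  float input_scale, float weight_scale) {
  const float bias_scale = input_scale * weight_scale;
  std::vector<int32_t> quantized(bias.size());
  for (size_t i = 0; i < bias.size(); ++i) {
    quantized[i] = static_cast<int32_t>(std::lround(bias[i] / bias_scale));
  }
  return quantized;
}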