Quantize Bias for Conv/Gemm on Quantized Model #22889

Open
wants to merge 4 commits into
base: main

Contributor

@centwang centwang commented Nov 19, 2024

Some quantized models leave the bias of Conv/Gemm nodes in float instead of quantizing it. This PR creates a sub-graph that quantizes the bias of such Conv/Gemm nodes with scale = scale_input_0 * scale_input_1 and zp = 0. We only do this when the bias is an initializer, so that ConstantFolding can fold the sub-graph into a real quantized int32 bias initializer in the next round of graph optimization.
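For illustration, the arithmetic that the folded sub-graph ends up performing can be sketched as below. This is only a sketch under assumptions: the helper names `QuantizeBias` and `DequantizeBias` are hypothetical and not part of ONNX Runtime, the scales are assumed to be per-tensor (a per-channel weight scale would need one bias scale per output channel), and rounding is round-to-nearest with saturation to the int32 range.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper (not an ONNX Runtime API): quantize a float bias to int32
// with scale = input_scale * weight_scale and zero point 0.
std::vector<int32_t> QuantizeBias(const std::vector<float>& bias,
                                  float input_scale, float weight_scale) {
  const double bias_scale =
      static_cast<double>(input_scale) * static_cast<double>(weight_scale);
  constexpr double kMin = std::numeric_limits<int32_t>::min();
  constexpr double kMax = std::numeric_limits<int32_t>::max();

  std::vector<int32_t> quantized;
  quantized.reserve(bias.size());
  for (float b : bias) {
    // Round to nearest, then saturate to the int32 range before casting.
    const double q = std::nearbyint(static_cast<double>(b) / bias_scale);
    quantized.push_back(static_cast<int32_t>(std::clamp(q, kMin, kMax)));
  }
  return quantized;
}

// Hypothetical helper: what a bias DQ node with the same scale and zp = 0 yields.
std::vector<float> DequantizeBias(const std::vector<int32_t>& quantized,
                                  float input_scale, float weight_scale) {
  const double bias_scale =
      static_cast<double>(input_scale) * static_cast<double>(weight_scale);
  std::vector<float> bias;
  bias.reserve(quantized.size());
  for (int32_t q : quantized) {
    bias.push_back(static_cast<float>(q * bias_scale));
  }
  return bias;
}
```

With input_scale = 0.5 and weight_scale = 0.25 the bias scale is 0.125, so a float bias of 1.0 maps to 8 and dequantizes back to exactly 1.0, while 0.3 maps to 2 and dequantizes to 0.25, which shows the rounding error this quantization introduces.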

@centwang centwang marked this pull request as ready for review November 26, 2024 04:47
Comment on lines +90 to +91
// Bias is quantized to int32.
ONNX_NAMESPACE::TypeProto int32_type_proto;
Contributor

Does this data type work in general, or could it be EP-specific? That is, if the EP uses the quantized op, is the bias input universally expected to be int32 when the data type of the quantized Conv/Gemm is, say, 8-bit int?

Contributor Author

In https://github.com/onnx/onnx/blob/main/docs/Operators.md#QLinearConv, T4 (the bias type) is tensor(int32) only. The QGemm schema has the same constraint, so I guess this works in general?
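One way to see why an int32 bias with scale = input_scale * weight_scale and zp = 0 fits 8-bit quantized ops: products of 8-bit values are commonly accumulated in int32, and that accumulator already carries the combined scale, so the bias can be added to it directly. The sketch below assumes per-tensor scales and zero points of 0 for both input and weight; the function names are hypothetical, and it is not meant to describe how any particular EP implements QLinearConv or QGemm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration: one output element of an int8 dot product.
// The int32 accumulator has scale input_scale * weight_scale, which matches
// the bias scale, so the quantized bias is added without any rescaling.
int32_t AccumulateWithBias(const std::vector<int8_t>& x,
                           const std::vector<int8_t>& w, int32_t bias_q) {
  assert(x.size() == w.size());
  int32_t acc = bias_q;
  for (std::size_t i = 0; i < x.size(); ++i) {
    acc += static_cast<int32_t>(x[i]) * static_cast<int32_t>(w[i]);
  }
  return acc;
}

// Convert the accumulator back to a real value for comparison with float math.
float DequantizeAccumulator(int32_t acc, float input_scale, float weight_scale) {
  return static_cast<float>(acc) * input_scale * weight_scale;
}
```

For example, with input_scale = 0.5 and weight_scale = 0.25, the real inputs {1.0, -2.0}, weights {2.0, 1.0}, and bias 1.0 quantize to {2, -4}, {8, 4}, and 8. The accumulator is 8, which dequantizes to 1.0, the same as the float computation 1.0 * 2.0 + (-2.0) * 1.0 + 1.0.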

// Bias DQ node produces output to Conv/Gemm node's input_2, with scale = input_scale_0 * input_scale_1, zp = 0.
NodeArg& bias_dq_node_arg =
graph.GetOrCreateNodeArg(graph.GenerateNodeArgName(node.Name() + "_bias_dq"), &bias_dq_type);
Node& dp_node = graph.AddNode(graph.GenerateNodeName(node.Name() + "_bias_dq"), QDQ::DQOpName, "Bias DQ node",
Contributor

Should dp_node be dq_node?
