
[Performance] how to set the threads when using TRT EP #22913

Open
noahzn opened this issue Nov 21, 2024 · 2 comments
Labels

ep:TensorRT (issues related to TensorRT execution provider) · performance (issues related to performance regressions) · platform:jetson (issues related to the NVIDIA Jetson platform)

Comments

noahzn commented Nov 21, 2024

Describe the issue

I notice multiple threads when using ONNX Runtime with the TRT EP. Is this normal behavior?

[Screenshot: process monitor showing multiple onnxruntime threads]

The documentation says:

Set number of intra-op threads
Onnxruntime sessions utilize multi-threading to parallelize computation inside each operator.

By default with intra_op_num_threads=0 or not set, each session will start with the main thread on the 1st core (not affinitized). Then extra threads per additional physical core are created, and affinitized to that core (1 or 2 logical processors).
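
In the Python API that setting is applied through `SessionOptions`, roughly like this (a minimal sketch; `model.onnx` is a placeholder path):

```python
import onnxruntime as ort

# Sketch: cap the intra-op CPU thread pool at session creation.
sess_options = ort.SessionOptions()
sess_options.intra_op_num_threads = 2

# Since ORT 1.9, GPU builds require the providers list to be set explicitly.
session = ort.InferenceSession(
    "model.onnx",  # placeholder path
    sess_options=sess_options,
    providers=["CPUExecutionProvider"],
)
```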

I'm using the TRT EP, although my providers list also includes CUDAExecutionProvider and CPUExecutionProvider. How can I set the number of threads for the TRT EP? Thanks.
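
For reference, the session is created with a provider priority list along these lines (sketch; the model path is a placeholder):

```python
import onnxruntime as ort

# TRT first, then CUDA, then CPU: ORT assigns each node to the first
# provider in this list that supports it and falls back down the list.
providers = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]
session = ort.InferenceSession("model.onnx", providers=providers)
```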

To reproduce

No code can be provided.

Urgency

No response

Platform

Other / Unknown

OS Version

JetPack=5.1.2

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

onnxruntime-gpu=1.17.0

ONNX Runtime API

Python

Architecture

ARM64

Execution Provider

TensorRT

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

No

noahzn added the performance label on Nov 21, 2024
github-actions bot added the platform:jetson and ep:TensorRT labels on Nov 21, 2024
yf711 (Contributor) commented Nov 25, 2024

@skottmckay

skottmckay (Contributor) commented:
AFAIK the intra-op threads are used for CPU-based kernels, and the thread pools are created during inference session initialization.

Not sure if the TRT EP uses that count to somehow control the number of threads TRT uses, but I wouldn't necessarily expect those threads to show up in the system threads.

Is there a problem/concern with CPU-based threads being created but potentially sitting idle if TRT is being used to run the model?
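
If the idle threads themselves are the concern, one option (a sketch, assuming the observed threads belong to ORT's CPU thread pools rather than to TRT) is to shrink those pools when TRT runs the whole model:

```python
import onnxruntime as ort

sess_options = ort.SessionOptions()
# Shrink the CPU thread pools created at session initialization.
# TRT-executed nodes don't use them, so this mainly trims idle threads;
# any nodes that fall back to the CPU EP will then run single-threaded.
sess_options.intra_op_num_threads = 1
sess_options.inter_op_num_threads = 1

session = ort.InferenceSession(
    "model.onnx",  # placeholder path
    sess_options=sess_options,
    providers=[
        "TensorrtExecutionProvider",
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ],
)
```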
