PyTorch vs. TurboMind inference speed comparison #2538
zhuchen1109
started this conversation in
General
Replies: 2 comments 10 replies
-
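There is still some gap at the moment.
For data, see https://github.com/InternLM/lmdeploy/actions/runs/11051851005. RPS generally differs by about 15%, and the gap on long texts is still large. If you have any suggestions or thoughts on this, feel free to discuss.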
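For anyone who wants to reproduce this kind of comparison locally, here is a minimal sketch of how an RPS gap like the "about 15%" above could be computed. It is not the benchmark script behind the linked run. `measure_rps` and `relative_gap` are hypothetical helper names, and the two stand-in backends just sleep; in practice you would pass a callable that wraps each engine (for example an lmdeploy `pipeline` built with `PytorchEngineConfig` or `TurbomindEngineConfig`, assuming your installed version exposes those).

```python
import time
from typing import Callable, Sequence


def measure_rps(generate: Callable[[Sequence[str]], object],
                prompts: Sequence[str]) -> float:
    """Time one batched call and return completed requests per second."""
    start = time.perf_counter()
    generate(prompts)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        raise ValueError("elapsed time too small to measure")
    return len(prompts) / elapsed


def relative_gap(rps_fast: float, rps_slow: float) -> float:
    """Fraction by which the slower backend trails the faster one."""
    return (rps_fast - rps_slow) / rps_fast


if __name__ == "__main__":
    prompts = ["hello"] * 8

    # Hypothetical stand-ins for two engines; replace with real calls.
    def backend_a(batch: Sequence[str]) -> None:
        time.sleep(0.05)

    def backend_b(batch: Sequence[str]) -> None:
        time.sleep(0.06)

    rps_a = measure_rps(backend_a, prompts)
    rps_b = measure_rps(backend_b, prompts)
    fast, slow = max(rps_a, rps_b), min(rps_a, rps_b)
    print(f"gap: {relative_gap(fast, slow):.1%}")
```

A single timed call is noisy, so for real numbers you would want warm-up runs, several repetitions, and separate measurements for short and long prompts, since the long-text gap is reported to be larger.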
-
One more question: are there plans to support chunked prefill?
-
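The new version of the PyTorch engine supports CUDA graph, and the speedup is noticeable. Has the PyTorch engine now caught up with or surpassed TurboMind in inference speed? Is there any official comparison data for reference?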