
RuntimeError: [TM][ERROR] CUDA runtime error: operation not supported /lmdeploy/src/turbomind/utils/allocator.h:197. Calling terminal.py directly raises the error below. Is this a CUDA version mismatch? Currently on CUDA 12.1, PyTorch 2.2. #243

Open
xiangcaoximilu opened this issue Nov 8, 2024 · 4 comments

Comments

@xiangcaoximilu

<frozen importlib._bootstrap>:1047: ImportWarning: _SixMetaPathImporter.find_spec() not found; falling back to find_module()
<frozen importlib._bootstrap>:1047: ImportWarning: _SixMetaPathImporter.find_spec() not found; falling back to find_module()
<frozen importlib._bootstrap>:1047: ImportWarning: _SixMetaPathImporter.find_spec() not found; falling back to find_module()
<frozen importlib._bootstrap>:1047: ImportWarning: _SixMetaPathImporter.find_spec() not found; falling back to find_module()
/usr/local/lib/python3.11/dist-packages/lmdeploy/model.py:1716: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
logger.warn(f'Did not find a chat template matching {query}.')
2024-11-08 08:40:10,742 - lmdeploy - WARNING - Did not find a chat template matching /hy-tmp.
Device does not support bfloat16. Set float16 forcefully
[TM][WARNING] [LlamaTritonModel] max_context_token_num = 32776.
2024-11-08 08:40:12,461 - lmdeploy - WARNING - get 227 model params
Convert to turbomind format: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.11/dist-packages/lmdeploy/serve/openai/api_server.py", line 1285, in serve
    VariableInterface.async_engine = pipeline_class(
                                     ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/lmdeploy/serve/async_engine.py", line 218, in __init__
    self.gens_set.add(self.engine.create_instance())
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/lmdeploy/turbomind/turbomind.py", line 356, in create_instance
    return TurboMindInstance(self, cuda_stream_id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/lmdeploy/turbomind/turbomind.py", line 388, in __init__
    future.result()
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/lmdeploy/turbomind/turbomind.py", line 397, in _create_model_instance
    model_inst = self.tm_model.model_comm.create_model_instance(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: [TM][ERROR] CUDA runtime error: operation not supported /lmdeploy/src/turbomind/utils/allocator.h:197

@Harold-lkk
Collaborator

Are you on a Mac?

@xiangcaoximilu
Author

No, it's not a Mac. It's a cloud server with an M40 (24 GB). Is the hardware too old?
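
For what it's worth, the M40 is a Maxwell-generation card (compute capability 5.2), while recent lmdeploy/TurboMind builds target newer architectures (roughly sm_70 / Volta and up), and the allocator may rely on CUDA features, such as stream-ordered memory pools, that older GPUs or drivers do not support. That would produce exactly this "operation not supported" error regardless of the CUDA toolkit version. A minimal diagnostic sketch, assuming PyTorch with CUDA is installed:

import torch

# Minimal diagnostic sketch: report each visible GPU's compute capability.
# "operation not supported" from the TurboMind allocator usually points at
# the GPU/driver lacking a required CUDA feature rather than a toolkit
# version mismatch, so the device architecture is the first thing to check.
if not torch.cuda.is_available():
    raise SystemExit("CUDA is not available in this environment")

for idx in range(torch.cuda.device_count()):
    major, minor = torch.cuda.get_device_capability(idx)
    print(f"GPU {idx}: {torch.cuda.get_device_name(idx)}, "
          f"compute capability {major}.{minor}")
    # An M40 reports 5.2 (Maxwell); TurboMind wheels are built for roughly
    # sm_70 and newer, so a 5.x device would explain the failure here.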

@xiangcaoximilu
Author

Has anyone managed to deploy this successfully? Which lmdeploy version did you use?

@braisedpork1964
Collaborator

lmdeploy 0.5.3
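
To confirm which version is actually installed in your environment, a quick check using only the standard library (querying the "lmdeploy" distribution from PyPI):

from importlib.metadata import version

# Print the installed lmdeploy version; with the release named above,
# this should output "0.5.3".
print(version("lmdeploy"))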
