<frozen importlib._bootstrap>:1047: ImportWarning: _SixMetaPathImporter.find_spec() not found; falling back to find_module()
<frozen importlib._bootstrap>:1047: ImportWarning: _SixMetaPathImporter.find_spec() not found; falling back to find_module()
<frozen importlib._bootstrap>:1047: ImportWarning: _SixMetaPathImporter.find_spec() not found; falling back to find_module()
<frozen importlib._bootstrap>:1047: ImportWarning: _SixMetaPathImporter.find_spec() not found; falling back to find_module()
/usr/local/lib/python3.11/dist-packages/lmdeploy/model.py:1716: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
  logger.warn(f'Did not find a chat template matching {query}.')
2024-11-08 08:40:10,742 - lmdeploy - WARNING - Did not find a chat template matching /hy-tmp.
Device does not support bfloat16. Set float16 forcefully
[TM][WARNING] [LlamaTritonModel] max_context_token_num = 32776.
2024-11-08 08:40:12,461 - lmdeploy - WARNING - get 227 model params
Convert to turbomind format: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.11/dist-packages/lmdeploy/serve/openai/api_server.py", line 1285, in serve
    VariableInterface.async_engine = pipeline_class(
                                     ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/lmdeploy/serve/async_engine.py", line 218, in __init__
    self.gens_set.add(self.engine.create_instance())
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/lmdeploy/turbomind/turbomind.py", line 356, in create_instance
    return TurboMindInstance(self, cuda_stream_id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/lmdeploy/turbomind/turbomind.py", line 388, in __init__
    future.result()
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/lmdeploy/turbomind/turbomind.py", line 397, in _create_model_instance
    model_inst = self.tm_model.model_comm.create_model_instance(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: [TM][ERROR] CUDA runtime error: operation not supported /lmdeploy/src/turbomind/utils/allocator.h:197
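For context on the failure itself: a CUDA "operation not supported" error raised from the TurboMind allocator usually points at the driver or device lacking a feature the allocator relies on (stream-ordered / memory-pool allocation is one known cause of this error code in general; the exact call at allocator.h:197 is not shown in the log). A minimal diagnostic sketch, assuming PyTorch and nvidia-smi are available in the same container; none of this is lmdeploy API, it only dumps the device and driver facts that typically explain the error:

# Minimal diagnostic sketch, not an lmdeploy API: report the device name,
# compute capability, and driver version. The bf16 fallback in the log above
# already suggests a pre-sm_80 GPU.
import subprocess

import torch

print('torch:', torch.__version__, '| built for CUDA:', torch.version.cuda)
print('device:', torch.cuda.get_device_name(0))
print('compute capability:', torch.cuda.get_device_capability(0))
print('driver:', subprocess.run(
    ['nvidia-smi', '--query-gpu=driver_version', '--format=csv,noheader'],
    capture_output=True, text=True).stdout.strip())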
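Separately, the earlier warning "Did not find a chat template matching /hy-tmp" can be avoided by naming the chat template explicitly instead of letting lmdeploy infer it from the model path. A sketch using lmdeploy's Python pipeline API; the model_name value 'llama3' is only a placeholder assumption, substitute the template that matches the checkpoint stored under /hy-tmp:

# Hedged sketch: pass an explicit chat template so lmdeploy does not try to
# guess one from the directory name '/hy-tmp'. 'llama3' is a placeholder.
from lmdeploy import ChatTemplateConfig, pipeline

pipe = pipeline('/hy-tmp',
                chat_template_config=ChatTemplateConfig(model_name='llama3'))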