
[Bug]: I am trying to run ollama/dolphin-phi available on ollama, but /chat/{chat_id}/question throws {"error":"model 'llama2' not found, try pulling it first"} #2056

Closed
shubham-dayma opened this issue Jan 22, 2024 · 9 comments
Labels
area: backend, bug, Stale, type: dependencies

Comments

@shubham-dayma

What happened?

I am trying to run the ollama/dolphin-phi model on ollama, but /chat/{chat_id}/question throws {"error":"model 'llama2' not found, try pulling it first"}. I don't want to pull llama2 into my ollama instance because of the limited resources available on my system.

Relevant log output

INFO:     192.168.29.176:50344 - "POST /chat/9ba4c7d0-80a0-4798-a2b0-5749748775c7/question?brain_id=f92b9784-847c-48df-a016-fb0933fc2323 HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 428, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/applications.py", line 276, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 91, in __call__
    await self.simple_response(scope, receive, send, request_headers=headers)
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 146, in simple_response
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 237, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 163, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/code/modules/chat/controller/chat_routes.py", line 171, in create_question_handler
    chat_answer = gpt_answer_generator.generate_answer(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/code/llm/knowledge_brain_qa.py", line 122, in generate_answer
    model_response = qa(
                     ^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/chains/base.py", line 312, in __call__
    raise e
  File "/usr/local/lib/python3.11/site-packages/langchain/chains/base.py", line 306, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/usr/local/lib/python3.11/site-packages/langchain/chains/conversational_retrieval/base.py", line 151, in _call
    docs = self._get_docs(new_question, inputs, run_manager=_run_manager)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/chains/conversational_retrieval/base.py", line 308, in _get_docs
    docs = self.retriever.get_relevant_documents(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/retrievers.py", line 211, in get_relevant_documents
    raise e
  File "/usr/local/lib/python3.11/site-packages/langchain_core/retrievers.py", line 204, in get_relevant_documents
    result = self._get_relevant_documents(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/vectorstores.py", line 656, in _get_relevant_documents
    docs = self.vectorstore.similarity_search(query, **self.search_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/code/vectorstore/supabase.py", line 32, in similarity_search
    vectors = self._embedding.embed_documents([query])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/embeddings/ollama.py", line 190, in embed_documents
    embeddings = self._embed(instruction_pairs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/embeddings/ollama.py", line 175, in _embed
    embeddings = self._process_emb_response(prompt)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/embeddings/ollama.py", line 160, in _process_emb_response
    raise ValueError(
ValueError: Error raised by inference API HTTP code: 404, {"error":"model 'llama2' not found, try pulling it first"}
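
For what it's worth, the same 404 can be reproduced outside Quivr with a few lines (a sketch; it assumes langchain's OllamaEmbeddings and an Ollama server at the default http://localhost:11434, with llama2 not pulled):

from langchain.embeddings import OllamaEmbeddings

# No `model` is passed, so the class default ("llama2") is used; because llama2
# is not pulled, Ollama's embeddings endpoint returns 404 and langchain raises
# the same ValueError shown in the traceback above.
emb = OllamaEmbeddings(base_url="http://localhost:11434")
emb.embed_documents(["test"])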

Twitter / LinkedIn details

No response

shubham-dayma added the bug label on Jan 22, 2024

dosubot bot commented Jan 22, 2024

🤖

Hello @shubham-dayma! 👋 I'm Dosu, a friendly bot here to help while a human maintainer gets to your issue. I'm great at solving bugs, answering questions, and helping folks like you become contributors.

Consider me a friendly helper while we wait for a human touch. I'm currently diving into the issue you've posted and will have a full response for you shortly. Hang tight! 🚀

dosubot bot added the area: backend and type: dependencies labels on Jan 22, 2024
@StanGirard
Collaborator

ollama pull dolphin-phi my friend ;)

@shubham-dayma
Author

@StanGirard Thanks for looking into this. I already had dolphin-phi in ollama. I checked the ollama logs and found that, while initializing langchain.embeddings.ollama.OllamaEmbeddings, we are not passing the model param, so it defaults to llama2. I changed this code as below and it worked.

@property
def embeddings(self):
    if self.brain_settings.ollama_api_base_url:
        return OllamaEmbeddings(
            base_url=self.brain_settings.ollama_api_base_url,
            model=self.model.replace('ollama/', "")
        )  # pyright: ignore reportPrivateUsage=none
    else:
        return OpenAIEmbeddings()

Now I am wondering how I should get the brain's model in backend/models/settings.py and update the initialization of OllamaEmbeddings there.
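
Something along these lines is what I have in mind (just a sketch, meant to live in backend/models/settings.py next to the existing BrainSettings class; the model parameter and the 'ollama/' prefix handling are my assumptions, not the project's current API):

from langchain.embeddings import OllamaEmbeddings, OpenAIEmbeddings


def get_embeddings(model: str | None = None):
    # Sketch: let callers pass the brain's model down instead of relying on
    # OllamaEmbeddings' default ("llama2"). BrainSettings is the settings class
    # already defined in this module; `model` is a hypothetical argument.
    settings = BrainSettings()
    if settings.ollama_api_base_url:
        return OllamaEmbeddings(
            base_url=settings.ollama_api_base_url,
            model=(model or "llama2").replace("ollama/", ""),
        )
    return OpenAIEmbeddings()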


dosubot bot commented Jan 24, 2024

🤖

Based on the context provided, it seems like you've already found a solution to the issue by modifying the OllamaEmbeddings initialization in quivr_rag.py. To apply a similar fix in backend/models/settings.py, you would need to have access to the model name in the get_embeddings function.

One way to achieve this could be to modify the BrainSettings class to include the model name, and then use this when initializing OllamaEmbeddings. Here's an example of how you could do this:

from langchain.embeddings import OllamaEmbeddings, OpenAIEmbeddings
from pydantic import BaseSettings  # pydantic v1 style, matching the Config class below


class BrainSettings(BaseSettings):
    ...
    model: str = "llama2"  # Add this line

    class Config:
        env_file = ".env"


def get_embeddings():
    settings = BrainSettings()
    if settings.ollama_api_base_url:
        embeddings = OllamaEmbeddings(
            base_url=settings.ollama_api_base_url,
            model=settings.model,  # Use the model from settings here
        )
    else:
        embeddings = OpenAIEmbeddings()
    return embeddings

This solution assumes that the OllamaEmbeddings class accepts a model parameter during initialization. If it doesn't, you might need to modify the OllamaEmbeddings class to accept and use this parameter.
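
If you want to confirm that assumption quickly, a minimal check could look like this (a sketch, using the langchain version shown in the traceback, where OllamaEmbeddings accepts keyword arguments at init):

from langchain.embeddings import OllamaEmbeddings

# Instantiating without `model` shows the default ("llama2" in this version);
# passing one explicitly confirms the parameter is accepted at init time.
print(OllamaEmbeddings().model)
print(OllamaEmbeddings(model="dolphin-phi").model)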

Please note that this is a potential solution based on the information provided and the context of the Quivr repository. It might not be the best or the only solution. If you have any further questions or need more help, feel free to ask.

For more information, you can refer to the Quivr repository and the FastAPI documentation on settings management.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

@weihang-wills

@StanGirard Thanks for looking into this. I already had dolphin-phi in ollama. I checked the ollama logs and found that, while initializing langchain.embeddings.ollama.OllamaEmbeddings, we are not passing the model param, so it defaults to llama2. I changed this code as below and it worked.

@property
def embeddings(self):
    if self.brain_settings.ollama_api_base_url:
        return OllamaEmbeddings(
            base_url=self.brain_settings.ollama_api_base_url,
            model=self.model.replace('ollama/', "")
        )  # pyright: ignore reportPrivateUsage=none
    else:
        return OpenAIEmbeddings()

Now I am wondering how I should get the brain's model in backend/models/settings.py and update the initialization of OllamaEmbeddings there.

embeddings = OllamaEmbeddings(model="XXX")

Set model to your own model, e.g. "qwen:7b" or "llama2", and that will solve it.
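
For example, a minimal sketch (assuming a local Ollama server at the default http://localhost:11434 and a model that has already been pulled with ollama pull):

from langchain.embeddings import OllamaEmbeddings

# The model name must match one that actually exists in your Ollama install
# (whatever `ollama list` shows), e.g. "dolphin-phi" or "qwen:7b".
embeddings = OllamaEmbeddings(
    base_url="http://localhost:11434",
    model="dolphin-phi",
)
print(len(embeddings.embed_query("hello world")))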

@StanGirard
Collaborator

StanGirard commented Mar 18, 2024 via email

Exactly!

@wwjCMP

wwjCMP commented May 13, 2024

https://github.com/StanGirard/quivr/blob/31ce6c05660db8088da96f7bd6f5a0f5f60ffaf7/backend/llm/rags/quivr_rag.py#L67

How can I set up a local embedded model if the latest version does not have this file?

@mkhludnev

I came up with a fix: #2690

Contributor

Thanks for your contributions, we'll be closing this issue as it has gone stale. Feel free to reopen if you'd like to continue the discussion.

github-actions bot added the Stale label on Sep 17, 2024
github-actions bot closed this as not planned on Sep 22, 2024