Takes too much VRAM to transcribe audios #13
Hi,

Thank you for your work.

I tested CrisperWhisper today with a 2-minute audio on an NVIDIA A100 GPU. The model's VRAM footprint is only 3.5 GB, which is great. However, when processing the 2-minute audio, I get a CUDA out-of-memory error as GPU usage climbs above 40 GB.

Is this something that will be fixed soon? If not, what would be the best way to handle long audios?

Comments
How are you running this exactly? 40 GB should definitely be more than sufficient VRAM :) Have you tried running the example code from the repo, like this, replacing `'your_audio_path.mp3'` with your actual audio?

```python
import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
model_id = "nyrahealth/CrisperWhisper"

# Argument lists below were truncated in the original comment; they are
# filled in following the repo's README example.
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
).to(device)

processor = AutoProcessor.from_pretrained(model_id)

pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    chunk_length_s=30,
    batch_size=16,
    return_timestamps="word",
    torch_dtype=torch_dtype,
    device=device,
)

hf_pipeline_output = pipe('your_audio_path.mp3')
```

If this does not solve your issue, please send me some code so I can reproduce the issue :)
Thank you for your response. I have given my code below; I am running it in Google Colab on an A100 GPU. It is the same code you sent, after installing the required libraries and logging into Hugging Face, so only the install cell differs:

```python
!pip install torch torchaudio

# From Laurin — the rest is identical to the snippet above:
# device = "cuda:0" if torch.cuda.is_available() else "cpu"
# model_id = "nyrahealth/CrisperWhisper"
# ...
```

I still get a CUDA OOM error when transcribing a 2-minute audio.
Same on an A100 80G, running `python transcribe.py --f audio.aac`:

> An error occurred while transcribing the audio: CUDA out of memory. Tried to allocate 768.00 MiB. GPU 0 has a total capacity of 79.14 GiB of which 164.75 MiB is free. Including non-PyTorch memory, this process has 78.97 GiB memory in use. Of the allocated memory 73.51 GiB is allocated by PyTorch, and 4.96 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
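For reference, the fragmentation workaround suggested in the error message can be applied like this; a minimal sketch, assuming the variable is set before any CUDA memory is allocated:

```python
import os

# Must run before the first CUDA allocation. Note this only mitigates
# fragmentation; it does not lower the model's actual peak memory demand.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch  # the allocator reads the variable on first CUDA use
```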
Okay, lowering the batch size (for example to 1 in the extreme case) to fit your GPU and/or adjusting the beam size should resolve your issue. Could you try this out and let me know how it went? The number of beams can be adjusted via the `generate_kwargs` argument:
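A minimal sketch of what that could look like, reusing the pipeline setup from the first comment (the exact `batch_size` and `num_beams` values here are illustrative):

```python
# Rebuild the pipeline with the smallest batch size (one 30 s chunk at a time).
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    chunk_length_s=30,
    batch_size=1,  # extreme case: lowest memory use, slowest throughput
    return_timestamps="word",
    device=device,
)

# Reduce the beam size through generate_kwargs (num_beams=1 is greedy decoding).
hf_pipeline_output = pipe("your_audio_path.mp3", generate_kwargs={"num_beams": 1})
```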