Solver diverges when using CUDA #22
Comments
Thank you for your feedback! Could you please provide us with scripts and the LP instance in …
I also ran into this problem and did a little bit of debugging to figure out why the CPU and GPU versions behave differently. While I have not fully figured it out, I did find something. However, I did these tests in a rush and might have messed up, so don't trust my findings. I'm using CUDA 12.4.1 on Windows 10.
After changing to CUDA 12.3.2, the problem with …
Changing …
Bummer! It's probably a different problem then. I forgot to mention one thing before: when changing …
Downgraded CUDA from 12.4 to 12.3 (and gcc/g++ from 13 to 12) and now I get a different error instead, e.g. when running on the example .mps.
I also had the case of my problem blowing up the step size. I ended up downgrading to HiGHS v1.6.0 and cuda-toolkit-12-3 on a clean Ubuntu 22.04 machine; a sketch of the kind of steps involved is below.
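As a rough sketch of what a downgrade of this kind can look like on a clean Ubuntu 22.04 machine (the repository URL, exact package names, and compiler pinning below are assumptions based on NVIDIA's public apt instructions, not the commenter's own notes):

```sh
# Sketch: pin CUDA 12.3 and gcc/g++ 12 on Ubuntu 22.04 via NVIDIA's apt repo.
# Package names and versions are assumptions; adjust to what apt actually offers.
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get install --no-install-recommends cuda-toolkit-12-3

# Build with gcc/g++ 12 instead of 13.
sudo apt-get install gcc-12 g++-12
export CC=gcc-12
export CXX=g++-12
```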
I almost don't want to admit this, but it seems like rebooting did the trick after downgrading to 12.3.
Hi, I am from NVIDIA and case 4601974 was reported to us. I am looking into reproducing this in house on CUDA 12.4, but I will need one of the LP instances as an .mps file that can reproduce the issue. If it is not confidential, can anyone share the .mps file with [email protected]? Otherwise, running with CUSPARSE_LOG_LEVEL=5 can produce useful logs if none of the .mps files can be shared. Thanks.
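For reference, a minimal sketch of enabling that logging for a single run: `CUSPARSE_LOG_LEVEL` and `CUSPARSE_LOG_FILE` are cuSPARSE's documented logging environment variables, while the `plc` invocation and its `-fname` flag below are assumptions to be adjusted to your own instance and build:

```sh
# Turn on verbose cuSPARSE logging (level 5 = API trace) and write it to a file.
export CUSPARSE_LOG_LEVEL=5
export CUSPARSE_LOG_FILE=cusparse_trace.log
./bin/plc -fname your_instance.mps   # -fname assumed; check the binary's usage output
```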
It looks like the instance …
Thanks! Also, another user volunteered to provide us a .mps case. We can reproduce the issue in house and will report back here with a conclusion after the investigation.
I think this issue is solved by #32.
I have a large-ish LP which `plc` can solve just fine when compiled with `-DBUILD_CUDA=OFF`, but with `-DBUILD_CUDA=ON` it looks like the step sizes are blowing up(?) in the output. Can I somehow control those? Setting `-eLineSearchMethod 0` seems like it would make it easier to control the steps, but that gives a number of errors.
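For anyone trying to reproduce the comparison, here is a sketch of building and running both variants. The `-DBUILD_CUDA` and `-eLineSearchMethod 0` options are quoted from the report above; the out-of-source build layout and the `-fname` option are assumptions:

```sh
# Build the CPU-only and the CUDA variants side by side, then run the same
# instance through both.
cmake -S . -B build-cpu  -DBUILD_CUDA=OFF
cmake --build build-cpu
cmake -S . -B build-cuda -DBUILD_CUDA=ON
cmake --build build-cuda

./build-cpu/bin/plc  -fname model.mps                        # solves fine
./build-cuda/bin/plc -fname model.mps                        # diverges as reported
./build-cuda/bin/plc -fname model.mps -eLineSearchMethod 0   # fixed step size; reportedly errors out
```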