
Documentation/advice on differences of direct vs codspeed runner test execution #93

mih opened this issue Feb 20, 2024 · 2 comments


mih commented Feb 20, 2024

In datalad/datalad-next#644 I have added benchmarks and a codspeed Github action. Benchmarks run fine locally and in the action via python -m pytest --codspeed datalad_next (see "Debug" step).

However, when executed within CodSpeedHQ/action@v2 the execution fails, because a required command is not found. This is independent of whether that command is installed via the method in the PR, or as an Ubuntu system package.

Can you advise on what to do in this case? Are there additional requirements to be met for compatibility with the codspeed runner?

Thanks in advance!

adriencaccia (Member) commented:

Since the action runs the provided run command under valgrind (plus some additional instrumentation), it may be that git-annex is not available in that modified environment.
You can enable step debug logging to get more info about the exact command that was run by the action, which will help with the investigation.
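In case it helps: step debug logging is a standard GitHub Actions switch (not CodSpeed-specific), enabled by setting the `ACTIONS_STEP_DEBUG` secret or variable to `true` on the repository, for example via the `gh` CLI:

```shell
# Enable GitHub Actions step debug logging for the current repository.
# Either a secret or a repository variable with this name works.
gh secret set ACTIONS_STEP_DEBUG --body "true"
# or, as a repository variable:
gh variable set ACTIONS_STEP_DEBUG --body "true"
```

Subsequent workflow runs will then include `##[debug]` lines showing the exact commands the action executes.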

mih commented Feb 21, 2024

Thanks for the pointer. This makes clear what is being executed. I am posting the relevant output for my use case below (slightly reformatted for readability).

ARCH="x86_64"
CODSPEED_ENV="runner"
PATH="/tmp/codspeed_introspected_node:
  /opt/hostedtoolcache/Python/3.12.2/x64/bin:
  /opt/hostedtoolcache/Python/3.12.2/x64:
  /tmp/dl-build-4f3gnv48/git-annex.linux:
  /opt/hostedtoolcache/Python/3.12.2/x64/bin:
  /opt/hostedtoolcache/Python/3.12.2/x64:
  /snap/bin:
  /home/runner/.local/bin:
  /opt/pipx_bin:
  /home/runner/.cargo/bin:
  /home/runner/.config/composer/vendor/bin:
  /usr/local/.ghcup/bin:
  /home/runner/.dotnet/tools:
  /usr/local/sbin:
  /usr/local/bin:
  /usr/sbin:
  /usr/bin:
  /sbin:
  /bin:
  /usr/games:
  /usr/local/games:
  /snap/bin"
PYTHONHASHSEED="0"
PYTHONMALLOC="malloc"
"setarch" "x86_64" "-R"
  "valgrind" "-q"
    "--tool=callgrind"
    "--trace-children=yes"
    "--cache-sim=yes"
    "--I1=32768,8,64" "--D1=32768,8,64" "--LL=8388608,16,64"
    "--instr-atstart=no"
    "--collect-systime=nsec"
    "--compress-strings=no"
    "--combine-dumps=yes"
    "--dump-line=no"
    "--trace-children-skip=*esbuild"
    "--obj-skip=/opt/hostedtoolcache/Python/3.12.2/x64/lib/libpython3.12.so.1.0"
    "--obj-skip=/usr/local/bin/node"
    "--callgrind-out-file

This indicates that the PATH is properly propagated.

I tried reproducing the behavior locally with this valgrind call pattern. On the test system (Debian sid), git-annex is installed as an official system package, with no PATH manipulation necessary:

❯ apt-cache policy git-annex
git-annex:
  Installed: 10.20230802-1
  Candidate: 10.20230802-1
  Version table:
 *** 10.20230802-1 500
        500 http://deb.debian.org/debian sid/main amd64 Packages
❯ ldd /usr/bin/git-annex
        linux-vdso.so.1 (0x00007ffd309bf000)
        libyaml-0.so.2 => /lib/x86_64-linux-gnu/libyaml-0.so.2 (0x00007f35c97c5000)
        libsqlite3.so.0 => /lib/x86_64-linux-gnu/libsqlite3.so.0 (0x00007f35c9655000)
        libmagic.so.1 => /lib/x86_64-linux-gnu/libmagic.so.1 (0x00007f35c9629000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f35c960a000)
        libgmp.so.10 => /lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f35c9586000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f35c93a2000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f35c92c3000)
        libffi.so.8 => /lib/x86_64-linux-gnu/libffi.so.8 (0x00007f35c92b6000)
        liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f35c9286000)
        libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007f35c9273000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f35c9808000)

The benchmarks run fine with --trace-children=no. Here is the full command:

valgrind -q --tool=callgrind --trace-children=no --cache-sim=yes "--I1=32768,8,64" "--D1=32768,8,64" "--LL=8388608,16,64" --instr-atstart=no --collect-systime=nsec --compress-strings=no --combine-dumps=yes --dump-line=no "--trace-children-skip=*esbuild" python -m pytest --codspeed datalad_next
...
=============================== 2 passed, 391 deselected, 12 warnings in 15.64s ===============================
valgrind -q --tool=callgrind --trace-children=no --cache-sim=yes           -m  10,21s user 9,18s system 100% cpu 19,367 total

Switching to --trace-children=yes (while leaving everything else constant) causes the tests to fail.

valgrind -q --tool=callgrind --trace-children=yes --cache-sim=yes "--I1=32768,8,64" "--D1=32768,8,64" "--LL=8388608,16,64" --instr-atstart=no --collect-systime=nsec --compress-strings=no --combine-dumps=yes --dump-line=no "--trace-children-skip=*esbuild" python -m pytest --codspeed datalad_next
...
=========================================== short test summary info ===========================================
ERROR datalad_next/iter_collections/tests/test_itergitstatus.py::test_status_smrec - datalad.support.exceptions.IncompleteResultsError: Command did not complete successfully. 36 failed:
ERROR datalad_next/iter_collections/tests/test_itergitstatus.py::test_status_monorec - datalad.support.exceptions.IncompleteResultsError: Command did not complete successfully. 36 failed:
========================= 391 deselected, 12 warnings, 2 errors in 364.92s (0:06:04) ==========================
valgrind -q --tool=callgrind --trace-children=yes --cache-sim=yes           -  362,93s user 32,89s system 105% cpu 6:14,51 total

Importantly, the local failure pattern differs from what happens in the GitHub Actions CI run. It looks like some kind of race condition.

These particular benchmarks call out to various Git and git-annex command-line tools. It seems that wrapping these subprocesses in valgrind significantly changes their behavior.

That being said: from a benchmarking perspective, I am not interested in what these external tools do internally, only in the performance of the Python code that calls them. Would it be sensible to turn off the subprocess tracing? And if so, can this be done from the outside?
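For illustration, the kind of invocation I have in mind is below. The extra `*/git,*/git-annex` skip patterns are my guess at a possible mechanism: valgrind's `--trace-children-skip` accepts a comma-separated list of glob patterns matched against the child command, and the action already uses it to skip `*esbuild`.

```shell
# Hypothetical sketch: the action's callgrind invocation, with
# --trace-children-skip extended so git/git-annex children run
# natively instead of under callgrind. Requires valgrind, git-annex,
# and the datalad_next test environment.
valgrind -q --tool=callgrind --trace-children=yes --cache-sim=yes \
  "--I1=32768,8,64" "--D1=32768,8,64" "--LL=8388608,16,64" \
  --instr-atstart=no --collect-systime=nsec --compress-strings=no \
  --combine-dumps=yes --dump-line=no \
  "--trace-children-skip=*esbuild,*/git,*/git-annex" \
  python -m pytest --codspeed datalad_next
```

Whether the action exposes a way to inject such extra skip patterns is exactly my question.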
