
[Inference] use fp8 cuda core gemm kernel when M<=4 #15713

Triggered via pull request (synchronize) on November 26, 2024 03:39 by @zhink
Pull request: #9423 (head: zhink:develop)
Status: Success
Total duration: 31m 14s

Workflow: tests.yml (on: pull_request)

Annotations

1 warning
Test: The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3, actions/setup-python@v4. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
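The warning flags Node 16-based action pins in tests.yml. Below is a minimal sketch of the usual remedy, assuming a simple checkout-and-test layout; the workflow name, step names, Python version, and test command are placeholders, not taken from the actual workflow. The substantive change is bumping actions/checkout to v4 and actions/setup-python to v5, both of which run on the Node 20 runtime.

    # Hypothetical tests.yml excerpt; only the version bumps matter here.
    name: tests
    on: pull_request

    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - name: Checkout repository
            uses: actions/checkout@v4        # was actions/checkout@v3; v4 runs on Node 20
          - name: Set up Python
            uses: actions/setup-python@v5    # was actions/setup-python@v4; v5 runs on Node 20
            with:
              python-version: "3.10"         # placeholder; not taken from the real tests.yml
          - name: Run tests
            run: python -m pytest            # placeholder command; not taken from the real tests.yml

In most workflows the inputs of these actions are unchanged across these major versions, so the version bump is usually the only edit needed to clear the warning.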