CUDA: fix race condition in MMQ stream-k fixup #13299

JohannesGaessler · 2025-05-04T12:12:11Z

Follow-up to #13294. I forgot that the stream-k fixup would suffer from the same problem.

CUDA: fix race condition in MMQ stream-k fixup

ba4e521

github-actions bot added the "Nvidia GPU" (issues specific to Nvidia GPUs) and "ggml" (changes relating to the ggml tensor library for machine learning) labels May 4, 2025

This was referenced May 4, 2025

Eval bug: Heavy nondeterminism in Qwen3 MoE (CUDA) #13280

Closed

Eval bug: b5237 broke Llama Scout #13287

Closed

slaren approved these changes May 4, 2025

JohannesGaessler merged commit 93c4e23 into ggml-org:master May 4, 2025
46 checks passed

This was linked to issues May 4, 2025

Eval bug: Heavy nondeterminism in Qwen3 MoE (CUDA) #13280

Closed

Eval bug: b5237 broke Llama Scout #13287

Closed
