Issues: vllm-project/vllm
[Installation]: compilation of flash-attn e4m3 kernels fails due to layout incompatibility in copy_traits.hpp
installation · #17597 · opened May 2, 2025 by m-rds

[Bug]: fp8 w8a8 quantized Qwen2.5-VL hits AssertionError
bug · #17595 · opened May 2, 2025 by cebtenzzre

[Bug]: torch._inductor.exc.InductorError: TypeError: cannot pickle 'torch._C.DispatchKeySet' object
bug, torch.compile · #17593 · opened May 2, 2025 by zou3519

[Bug]: vLLM pre-commit hook doesn't work with git worktree
bug · #17592 · opened May 2, 2025 by zou3519

[Bug]: Cannot load Gemma3 27b QAT GGUF on RTX 5090
bug · #17587 · opened May 2, 2025 by FremyCompany

[Feature]: benchmarks for vllm, it should support OpenAI Chat Completions API
feature request · #17586 · opened May 2, 2025 by xiayu98020214

[Bug]: Mistral tool parser & streaming: corrupt tool_calls completions
bug · #17585 · opened May 2, 2025 by hibukipanim

[Bug]: Qwen3 FP8 on 0.8.5: type fp8e4nv not supported in this architecture.
bug · #17581 · opened May 2, 2025 by AlexBefest

[Bug]: V1 Engine generates corrupt responses on large batch inference with long sequences and fails in seed control
bug · #17580 · opened May 2, 2025 by Cakeyan

[Feature]: support for fp8 marlin with MoE
feature request · #17579 · opened May 2, 2025 by ehartford

[Usage]: Support Qwen3 inference in vLLM==0.8.5 with CUDA 11.8 (currently only vLLM==0.6.1.post1 works)
usage · #17578 · opened May 2, 2025 by YihongT

[Bug]: ValueError: The output_size of gate's and up's weight = 192 is not divisible by weight quantization block_n = 128.
bug · #17569 · opened May 2, 2025 by zzlgreat
[Performance]: Single-request throughput is 30 t/s, but concurrent requests only reach 1.5 t/s
performance · #17568 · opened May 2, 2025 by nvliajia
[Bug]: Function calling does not work with Mistral Small
bug · #17557 · opened May 1, 2025 by menardorama

[Bug]: top_k: 0 in generation_config.json can't disable top-k sampling
bug · #17553 · opened May 1, 2025 by toslunar
[Feature]: Support HF-style chat template for multi-modal data in offline chat
feature request, good first issue · #17551 · opened May 1, 2025 by DarkLight1337

[Usage]: understanding the vllm's gpu_memory_utilization and cuda graph memory requirement
usage · #17549 · opened May 1, 2025 by mces89

[Bug]: failed to run LMCache example for v0
bug · #17545 · opened May 1, 2025 by gaowayne

[Bug]: cached_get_processor is not actually cached
bug · #17543 · opened May 1, 2025 by Zazzle516

[Performance]: Performance comparison for v1 engine and v0 engine
performance · #17540 · opened May 1, 2025 by hustxiayang

[Usage]: CUDA Error with Qwen3-32B Model When Processing larger tokens it leads to model went to non responsive condition / stability concerns
usage · #17534 · opened May 1, 2025 by nskpro-cmd

[Bug]: AttributeError: 'MultiprocExecutor' object has no attribute 'workers' when VLLM_USE_V1=1 on rocm platform serve deepseek-r1 671B
bug, rocm · #17533 · opened May 1, 2025 by GuoxiangZu

[Bug]: Bad requests are not captured as traces
bug · #17528 · opened May 1, 2025 by frzifus

[Bug]: Training with vllm not supports Qwen3
bug · #17527 · opened May 1, 2025 by fly-dragon211

[Bug]: '_OpNamespace' '_C' object has no attribute 'rms_norm' on docker environment
bug · #17526 · opened May 1, 2025 by erkintelnyx