Issues: vllm-project/vllm
[Installation]: compilation of flash-attn e4m3 kernels fails due to layout incompatibility in copy_traits.hpp
installation · #17597 · opened May 2, 2025 by m-rds

[Bug]: fp8 w8a8 quantized Qwen2.5-VL hits AssertionError
bug · #17595 · opened May 2, 2025 by cebtenzzre

[Bug]: torch._inductor.exc.InductorError: TypeError: cannot pickle 'torch._C.DispatchKeySet' object
bug, torch.compile · #17593 · opened May 2, 2025 by zou3519

[Bug]: vLLM pre-commit hook doesn't work with git worktree
bug · #17592 · opened May 2, 2025 by zou3519

[Bug]: Cannot load Gemma3 27b QAT GGUF on RTX 5090
bug · #17587 · opened May 2, 2025 by FremyCompany

[Feature]: benchmarks for vllm, it should support OpenAI Chat Completions API
feature request · #17586 · opened May 2, 2025 by xiayu98020214

[Bug]: Mistral tool parser & streaming: corrupt tool_calls completions
bug · #17585 · opened May 2, 2025 by hibukipanim

[Bug]: Qwen3 FP8 on 0.8.5: type fp8e4nv not supported in this architecture.
bug · #17581 · opened May 2, 2025 by AlexBefest

[Bug]: V1 Engine generates corrupt responses on large batch inference with long sequences and fails in seed control
bug · #17580 · opened May 2, 2025 by Cakeyan

[Feature]: support for fp8 marlin with MoE
feature request · #17579 · opened May 2, 2025 by ehartford

[Usage]: Support Qwen3 inference in vLLM==0.8.5 with CUDA 11.8 (currently only vLLM==0.6.1.post1 works)
usage · #17578 · opened May 2, 2025 by YihongT

[Bug]: ValueError: The output_size of gate's and up's weight = 192 is not divisible by weight quantization block_n = 128.
bug · #17569 · opened May 2, 2025 by zzlgreat
[Performance]: Single-request throughput is 30 t/s, but concurrent requests only reach 1.5 t/s
performance · #17568 · opened May 2, 2025 by nvliajia
[Bug]: Function calling does not work with Mistral Small
bug · #17557 · opened May 1, 2025 by menardorama

[Bug]: top_k: 0 in generation_config.json can't disable top-k sampling
bug · #17553 · opened May 1, 2025 by toslunar
[Feature]: Support HF-style chat template for multi-modal data in offline chat
feature request, good first issue · #17551 · opened May 1, 2025 by DarkLight1337

[Usage]: understanding the vllm's gpu_memory_utilization and cuda graph memory requirement
usage · #17549 · opened May 1, 2025 by mces89

[Bug]: failed to run LMCache example for v0
bug · #17545 · opened May 1, 2025 by gaowayne

[Bug]: cached_get_processor is not actually cached
bug · #17543 · opened May 1, 2025 by Zazzle516

[Performance]: Performance comparison for v1 engine and v0 engine
performance · #17540 · opened May 1, 2025 by hustxiayang

[Usage]: CUDA Error with Qwen3-32B Model When Processing larger tokens it leads to model went to non responsive condition / stability concerns
usage · #17534 · opened May 1, 2025 by nskpro-cmd

[Bug]: AttributeError: 'MultiprocExecutor' object has no attribute 'workers' when VLLM_USE_V1=1 on rocm platform serve deepseek-r1 671B
bug, rocm · #17533 · opened May 1, 2025 by GuoxiangZu

[Bug]: Bad requests are not captured as traces
bug · #17528 · opened May 1, 2025 by frzifus

[Bug]: Training with vllm not supports Qwen3
bug · #17527 · opened May 1, 2025 by fly-dragon211

[Bug]: '_OpNamespace' '_C' object has no attribute 'rms_norm' on docker environment
bug · #17526 · opened May 1, 2025 by erkintelnyx