Pull requests: NVIDIA/TensorRT-LLM

Pull requests list

- Draft: test: [CI] remove closed bugs (#4046, opened May 4, 2025 by xinhe-nv; draft)
- fix: apply rope twice in Qwen3. (#4040, opened May 3, 2025 by yuxianq)
- chore: cleanup llmapi for 1.0 (#4039, opened May 2, 2025 by hchings)
- [Deepseek] Add fp8 kvcache test (#4038, opened May 2, 2025 by hlu1)
- Add disaggregated serving accuracy tests (#4036, opened May 2, 2025 by Tabrizian)
- Draft: feat: non-invasive pipeline parallelism (#4034, opened May 2, 2025 by yuxianq)
- [DRAFT] setting attention prior (#4031, opened May 2, 2025 by vklimkov-nvidia)
- fix: instantiate decoder early in pytorch (#4029, opened May 2, 2025 by dcampora)
- feat: Enable AutoDeploy to llm-eval example (#4020, opened May 2, 2025 by meenchen)
- feat:add slurm support and add b40 to test-db (#4019, opened May 2, 2025 by yuanjingx87)
- TorchLLM: Pass local dir to processor creation (#4018, opened May 1, 2025 by milesial)
- [Deepseek] Refactor Deepseek Decoder layer (#4016, opened May 1, 2025 by hlu1)
- [fix] support llama + eagle head checkpoint conversion (#4013, opened May 1, 2025 by jhaotingc)
- bench: Port benchmark_serving.py (#4011, opened May 1, 2025 by kaiyux; draft)