Issues: NVIDIA/TensorRT-LLM
Pinned issues:

- [RFC] Feedback collection about TensorRT-LLM 1.0 Release Plann… (#3148, opened Mar 29, 2025 by juney-nvidia, Open, 1 comment)
- [RFC] Topics you want to discuss with TensorRT-LLM team in the… (#3124, opened Mar 27, 2025 by juney-nvidia, Open, 9 comments)
Issues list
- [Call for contributions] The development plan of large-scale EP support in TensorRT-LLM [Community Engagement] (#4127, opened May 7, 2025 by juney-nvidia)
- Qwen2-0.5B Inference Freezes with TensorRT-LLM on RTX 5000 (#4118, opened May 7, 2025 by ashkanzarkhah)
- Install from docs not working [bug, Installation, triaged] (#4099, opened May 6, 2025 by darraghdog, 4 tasks)
- apply_per_channel_scale may fail to launch when the input sequence length is extremely large [bug] (#4085, opened May 6, 2025 by StudyingShao, 1 of 4 tasks)
- deepseek-fp4 quality issue: bad generation output after the second batch run (#4072, opened May 5, 2025 by xwuShirley)
- Use report.json from trtllm-bench in perf integration tests instead of redirecting stdout [others] (#4071, opened May 5, 2025 by venkywonka)
- LLama 3.1 405B FP8 perf low on B200 [bug] (#4043, opened May 3, 2025 by andyluo7, 4 tasks)
- Deepseek R1 and V3, FP4 quant, output quality issues at batch size > 2 [bug, triaged] (#4037, opened May 2, 2025 by pankajroark, 2 of 4 tasks)
- Add FP8 Support for Llama-4-Maverick on B200 (Feature Request) (#4033, opened May 2, 2025 by indianspeedster)
- Is text-only inference supported for multi-modal vision models like Llama 3.2 11b/90b Vision Instruct? (#4012, opened May 1, 2025 by natehofmann)
- [bug] Lookahead spec-dec verifies w guesses instead of g [bug] (#4003, opened Apr 30, 2025 by mahmoudhas, 3 of 4 tasks)
- Qserve-w4a8 Shows Lower Computational Efficiency on H20 [not a bug] (#3977, opened Apr 30, 2025 by StaryDing, 2 of 4 tasks)
- Reasoning-related enhancements [feature request, roadmap] (#3965, opened Apr 29, 2025 by mk-nvidia)
- Disaggregated Prefill & Decode serving optimizations [Investigating, Performance, roadmap, triaged] (#3963, opened Apr 29, 2025 by mk-nvidia)
- MoE optimizations [Investigating, Performance, roadmap, triaged] (#3962, opened Apr 29, 2025 by mk-nvidia)
- Support versioned github.io doc to make it easy to map code with the corresponding doc version [Documentation, Investigating, roadmap, triaged] (#3961, opened Apr 29, 2025 by mk-nvidia)
- Re-organize the example directory into Feature-level examples and Model-level examples [Documentation, Investigating, roadmap, triaged] (#3960, opened Apr 29, 2025 by mk-nvidia)
- Intra-1.x-version backward compatibility for selected APIs [Investigating, roadmap, triaged] (#3958, opened Apr 29, 2025 by mk-nvidia)
- Plenty of regressions in trt-llm v0.20.0 [bug] (#3955, opened Apr 29, 2025 by michaelfeil, 4 tasks)
- Context node crash when using PD Disaggregation [bug] (#3937, opened Apr 29, 2025 by nsealati, 2 of 4 tasks)