Issues: ggml-org/llama.cpp
Compile bug: paths with spaces fail on Unix with Vulkan backend
bug-unconfirmed · #13288 · opened May 3, 2025 by kangalio

Eval bug: b5237 broke Llama Scout
bug (Something isn't working) · #13287 · opened May 3, 2025 by steampunque

Misc. bug: Completions hang after CUDA error, but health endpoint reports all OK
bug-unconfirmed · #13281 · opened May 3, 2025 by lee-b

Eval bug: grammar breaks sampling in Qwen3 MoE
bug-unconfirmed · #13280 · opened May 3, 2025 by matteoserva

Misc. bug: llama-server webui overriding command line parameters
bug-unconfirmed · #13277 · opened May 3, 2025 by merc4derp

Feature Request: Granite 4 Support
enhancement (New feature or request) · #13275 · opened May 2, 2025 by gabe-l-hart · 5 of 16 tasks

Feature Request: add to llama-bench device info reporting of "bf16:1", if built with VK_KHR_bfloat16 support and the driver also supports it
enhancement (New feature or request) · #13274 · opened May 2, 2025 by oscarbg · 4 tasks done

Feature Request: add per-request "reasoning" options in llama-server
enhancement (New feature or request) · #13272 · opened May 2, 2025 by ngxson

Compile bug: nvcc fatal : Unsupported gpu architecture 'compute_120'
bug-unconfirmed · #13271 · opened May 2, 2025 by jacekpoplawski

Misc. bug: Server does not always cancel requests for disconnected connections
bug-unconfirmed · #13262 · opened May 2, 2025 by CyberShadow

Misc. bug: the output file of llama-quantize is not gguf format
bug-unconfirmed · #13258 · opened May 2, 2025 by samsosu

Eval bug: sentencepiece tokenizer generates incorrect tokens
bug-unconfirmed · #13256 · opened May 2, 2025 by taylorchu

Misc. bug: terminate called after throwing an instance of 'vk::DeviceLostError'
bug-unconfirmed · #13248 · opened May 1, 2025 by Som-anon

Feature Request: Support multimodal LLMs such as Qwen2.5-VL as embedding models
enhancement (New feature or request) · #13247 · opened May 1, 2025 by cebtenzzre · 4 tasks done

Feature Request: s390x CI
enhancement (New feature or request) · #13243 · opened May 1, 2025 by taronaeo · 4 tasks done

Feature Request: Allow disabling offload_op for backends by user
enhancement (New feature or request) · #13241 · opened May 1, 2025 by hjc4869 · 4 tasks done

Eval bug: -sm row causes GGML_ASSERT fail in Llama 4 Scout
bug-unconfirmed · #13240 · opened May 1, 2025 by FullstackSensei

Feature Request: XiaomiMiMo/MiMo-7B-RL
enhancement (New feature or request) · #13218 · opened Apr 30, 2025 by Superluckyhu · 4 tasks done

Eval bug: Qwen3-30B-A3B-Q4_K_M: Vulkan ~10% slower than AVX2
bug-unconfirmed · #13217 · opened Apr 30, 2025 by ZUIcat