Misc. bug: the output file of llama-quantize is not gguf format #13258

Open
samsosu opened this issue May 2, 2025 · 0 comments

Name and Version

./llama-cli --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA A10, compute capability 8.6, VMM: yes
version: 5225 (a0f7016)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-quantize

Command line

cd build/bin
./llama-quantize /mnt/data/train_output/Qwen2.5-32B-f16.gguf /mnt/data/train_output/Qwen2.5-32B-Q4_K_M.gguf Q4_K_M

Problem description & steps to reproduce

The quantization process finished successfully, but the output file cannot be loaded by the following command:
./llama-cli -m /mnt/data/train_output/Qwen2.5-32B-Q4_K_M.gguf -n 128 --color -ngl 35

The error output looks like this:
gguf_init_from_file_impl: invalid magic characters: '', expected 'GGUF'
llama_model_load: error loading model: llama_model_loader: failed to load model from /mnt/data/train_output/Qwen2.5-32B-Q4_K_M.gguf

llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/mnt/data/train_output/Qwen2.5-32B-Q4_K_M.gguf'
main: error: unable to load model

I also inspected the header data of this GGUF file and found that there is no GGUF magic; the file begins with a long run of zero bytes.
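For reference, a valid GGUF file must begin with the four magic bytes b"GGUF". The check described above can be reproduced with a small generic script (this is not part of llama.cpp; the path is the one from this report):

```python
def has_gguf_magic(path: str) -> bool:
    """Return True if the file starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        # A well-formed GGUF file starts with the ASCII bytes 'G','G','U','F'.
        return f.read(4) == b"GGUF"

if __name__ == "__main__":
    # Path taken from the report above; adjust as needed.
    print(has_gguf_magic("/mnt/data/train_output/Qwen2.5-32B-Q4_K_M.gguf"))
```

A file that prints False here (for example, one beginning with zero bytes, as observed) will be rejected by gguf_init_from_file_impl with the "invalid magic characters" error shown above.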

I also checked the source code of quantize.cpp; I could not find any code there that writes the GGUF format header.

First Bad Commit

No response

Relevant log output
