Name and Version
./llama-cli --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA A10, compute capability 8.6, VMM: yes
version: 5225 (a0f7016)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-quantize
Command line
cd build/bin
./llama-quantize /mnt/data/train_output/Qwen2.5-32B-f16.gguf /mnt/data/train_output/Qwen2.5-32B-Q4_K_M.gguf Q4_K_M
Problem description & steps to reproduce
The quantize process finished successfully, but the output file cannot be loaded by the following command:
./llama-cli -m /mnt/data/train_output/Qwen2.5-32B-Q4_K_M.gguf -n 128 --color -ngl 35
The error output looks like this:
gguf_init_from_file_impl: invalid magic characters: '', expected 'GGUF'
llama_model_load: error loading model: llama_model_loader: failed to load model from /mnt/data/train_output/Qwen2.5-32B-Q4_K_M.gguf
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/mnt/data/train_output/Qwen2.5-32B-Q4_K_M.gguf'
main: error: unable to load model
I checked the header of this GGUF file and found that the GGUF magic is missing; the beginning of the file consists of zero bytes.
I also looked through the source of quantize.cpp and could not find any code that writes the GGUF header.
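For reference, the header check described above can be done with a short script: a valid GGUF file begins with the 4 ASCII bytes `GGUF` at offset 0, followed by a little-endian version number. This is a minimal sketch; the file names are illustrative, not the actual model paths.

```python
# Check whether a file carries the GGUF magic bytes at offset 0.
# A valid GGUF file starts with the 4 ASCII bytes b"GGUF".
import struct

def check_gguf_magic(path):
    with open(path, "rb") as f:
        magic = f.read(4)
    return magic == b"GGUF"

# Demo: write a tiny file with a valid magic + version field,
# and one with the all-zero prefix seen in the broken output.
with open("demo_good.gguf", "wb") as f:
    f.write(b"GGUF" + struct.pack("<I", 3))  # magic + GGUF version 3
with open("demo_bad.gguf", "wb") as f:
    f.write(b"\x00" * 16)                    # zero bytes, no magic

print(check_gguf_magic("demo_good.gguf"))  # → True
print(check_gguf_magic("demo_bad.gguf"))   # → False
```

A file that fails this check is exactly what triggers the `invalid magic characters` error in `gguf_init_from_file_impl`.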
First Bad Commit
No response
Relevant log output