
Commit 75f93d8

ngxson, julien-c, and Mishig authored
gguf: update README (#663)
Follow-up to #655 and #656 (comment). Added some examples of how to use a local file + strict typing.

Co-authored-by: Julien Chaumond <[email protected]>
Co-authored-by: Mishig <[email protected]>
1 parent ef79d5d commit 75f93d8

File tree

1 file changed: +40 -0 lines


packages/gguf/README.md (+40)
@@ -18,6 +18,8 @@ npm install @huggingface/gguf
 
 ## Usage
 
+### Basic usage
+
 ```ts
 import { GGMLQuantizationType, gguf } from "@huggingface/gguf";
 
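The diff elides the body of the basic-usage example (README lines 22-55). For reference, a minimal sketch of the call this section introduces, assuming a remote model URL (the URL and the quantization check below are illustrative, not part of this commit):

```ts
import { GGMLQuantizationType, gguf } from "@huggingface/gguf";

// Illustrative remote .gguf URL; any publicly reachable GGUF file works.
const URL_LLAMA =
  "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q2_K.gguf";

const { metadata, tensorInfos } = await gguf(URL_LLAMA);

console.log(metadata["general.architecture"]);

// Each entry in tensorInfos carries the tensor's name, shape and dtype
// (a GGMLQuantizationType), which is why that enum is imported above.
for (const info of tensorInfos) {
  if (info.dtype === GGMLQuantizationType.Q2_K) {
    console.log(info.name, info.shape);
  }
}
```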
@@ -56,6 +58,44 @@ console.log(tensorInfos);
 
 ```
 
+### Reading a local file
+
+```ts
+// Reading a local file (not supported in the browser)
+const { metadata, tensorInfos } = await gguf(
+  './my_model.gguf',
+  { allowLocalFile: true },
+);
+```
+
+### Strictly typed
+
+By default, known fields in `metadata` are typed. This includes various fields found in [llama.cpp](https://github.com/ggerganov/llama.cpp), [whisper.cpp](https://github.com/ggerganov/whisper.cpp) and [ggml](https://github.com/ggerganov/ggml).
+
+```ts
+const { metadata, tensorInfos } = await gguf(URL_MODEL);
+
+// Type check for model architecture at runtime
+if (metadata["general.architecture"] === "llama") {
+
+  // "llama.attention.head_count" is a valid key for the llama architecture, so it is typed as a number
+  console.log(metadata["llama.attention.head_count"]);
+
+  // "mamba.ssm.conv_kernel" is an invalid key, because it requires the model architecture to be mamba
+  console.log(metadata["mamba.ssm.conv_kernel"]); // error
+}
+```
+
+### Disable strictly typed
+
+Because the GGUF format can be used to store tensors, we can technically use it for other purposes, for example storing [control vectors](https://github.com/ggerganov/llama.cpp/pull/5970), [LoRA weights](https://github.com/ggerganov/llama.cpp/pull/2632), etc.
+
+If you want to use your own GGUF metadata structure, you can disable strict typing by casting the parse output to `GGUFParseOutput<{ strict: false }>`:
+
+```ts
+const { metadata, tensorInfos }: GGUFParseOutput<{ strict: false }> = await gguf(URL_LLAMA);
+```
+
 ## Hugging Face Hub
 
 The Hub supports all file formats and has built-in features for GGUF format.
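As a usage sketch for the non-strict mode shown in the "Disable strictly typed" section above (the URL and the custom metadata key below are hypothetical, echoing the control-vector use case):

```ts
import { gguf, type GGUFParseOutput } from "@huggingface/gguf";

// Hypothetical GGUF file that stores a control vector rather than a model.
const URL_CONTROL_VECTOR = "https://example.com/my_control_vector.gguf";

// Cast to the non-strict output type so custom keys type-check.
const { metadata }: GGUFParseOutput<{ strict: false }> = await gguf(URL_CONTROL_VECTOR);

// "control_vector.layer_count" is a made-up key for illustration.
console.log(metadata["control_vector.layer_count"]);
```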
