Commit 749093e

Update on "Summary: redoing 5bf70c1 in a way that doesn't get reverted"

Test Plan:

export MODEL_REPO=meta-llama/Llama-2-7b-chat-hf
python quantize.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth --mode int4-gptq --calibration_tasks wikitext --calibration_limit 5
python eval.py --checkpoint_path checkpoints/$MODEL_REPO/model_int4-gptq.g32.cuda.pth --tasks wikitext --limit 5

Reviewers:
Subscribers:
Tasks:
Tags:

[ghstack-poisoned]
1 parent 8704152 commit 749093e

File tree

3 files changed: +482 −5 lines changed

GPTQ.py (+1 −1)
@@ -150,7 +150,7 @@ def __init__(
         }

         # trace model for one input
-        one_input = [multi.values[0].cpu() for multi in inputs]
+        one_input = tuple([multi.values[0].cpu() for multi in inputs])
         exported_model = torch._dynamo.export(
             model.cpu(), aten_graph=True, pre_dispatch=True, tracing_mode="fake"
         )(*one_input)
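For context, a minimal pure-Python sketch (not from the commit, and the commit does not state its motivation) of why a tuple can be the more natural container for example inputs: positional arguments are canonically a tuple in Python, since `*args` capture always yields one, and tuples, unlike lists, are hashable, so they can serve as cache keys for traced inputs.

```python
# Hypothetical illustration; `capture` stands in for an export/trace entry point.
def capture(*example_inputs):
    # Whatever iterable was unpacked at the call site, *example_inputs
    # arrives here as a tuple.
    return example_inputs

one_input_list = [1, 2, 3]          # stand-in for the per-input tensors
one_input = tuple(one_input_list)   # mirrors the list -> tuple change above

captured = capture(*one_input)
assert isinstance(captured, tuple)
assert captured == (1, 2, 3)
assert hash(one_input) == hash((1, 2, 3))  # tuples hash; lists raise TypeError
```

Either container unpacks with `*` at the call site; the tuple form just matches what the callee receives and stays usable in hash-based lookups.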
