Large Model Quantization-rr

Published: 2026/7/2 4:58:23
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model_path = "./Qwen3.6-27B"  # replace with the path to the original model you downloaded
    quant_path = "./models/Qwen3.6-27B-AWQ-Local"

    # 1. Load the model and tokenizer
    model = AutoAWQForCausalLM.from_pretrained(model_path, trust_remote_code=True)
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

    # 2. Configure the AWQ quantization parameters
    quant_config = {
        "zero_point": True,
        "q_group_size": 128,
        "w_bit": 4,
        "version": "GEMM",
    }

    # 3. Run quantization. This step is extremely memory-hungry, so keep a close eye on system resources
    print("Starting local quantization, please be patient...")
    model.quantize(tokenizer, quant_config=quant_config)

    # 4. Save the quantized model
    print("Saving the quantized model...")
    model.save_quantized(quant_path)
    tokenizer.save_pretrained(quant_path)
    print("Local quantization complete")
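To make the `quant_config` values less abstract, here is a minimal pure-Python sketch of group-wise, zero-point (asymmetric) round-to-nearest quantization, which is roughly the storage scheme that `w_bit: 4`, `q_group_size: 128` and `zero_point: True` describe. It is an illustration only, not AutoAWQ's implementation: it leaves out AWQ's activation-aware scale search entirely, and the helper names (`quantize_group`, `dequantize_group`, `quantize_row`, `dequantize_row`) are hypothetical.

```python
import math


def quantize_group(weights, w_bit=4):
    """Quantize one group of floats to unsigned w_bit integers with a zero point."""
    qmax = 2 ** w_bit - 1  # 15 for 4-bit
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / qmax or 1.0  # fall back to 1.0 for a constant group
    zero = max(0, min(qmax, round(-w_min / scale)))  # integer that represents 0.0
    q = [max(0, min(qmax, round(w / scale) + zero)) for w in weights]
    return q, scale, zero


def dequantize_group(q, scale, zero):
    """Map the stored integers back to approximate float weights."""
    return [(v - zero) * scale for v in q]


def quantize_row(row, q_group_size=128, w_bit=4):
    """Split a weight row into groups; each group gets its own scale and zero point."""
    return [
        quantize_group(row[start:start + q_group_size], w_bit)
        for start in range(0, len(row), q_group_size)
    ]


def dequantize_row(groups):
    """Reassemble an approximate float row from the quantized groups."""
    out = []
    for q, scale, zero in groups:
        out.extend(dequantize_group(q, scale, zero))
    return out


# Toy "weight row" of 256 values, so a group size of 128 yields two groups.
row = [0.1 * math.sin(i) for i in range(256)]
groups = quantize_row(row, q_group_size=128, w_bit=4)
restored = dequantize_row(groups)
max_error = max(abs(a - b) for a, b in zip(row, restored))
print(f"groups: {len(groups)}, max abs error: {max_error:.6f}")
```

A smaller `q_group_size` gives each scale fewer weights to cover, which tends to reduce the rounding error at the cost of storing more scales and zero points.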