Overview
With the APIs provided by Hugging Face's transformers package, it is easy to fine-tune your own large language model on a consumer-grade GPU such as an RTX 3090. This post does exactly that, using SFT (Supervised Fine-Tuning) to produce a custom model.
Environment Setup
Before starting, prepare a machine with an RTX 3090 GPU and more than 64 GB of RAM. Once the machine is ready, install the required packages:
conda create -p ./env python=3.9
conda activate ./env   # activate the new environment before installing packages
pip install -U transformers peft datasets trl bitsandbytes sentencepiece protobuf
Data Preparation
This post uses the ultrachat dataset, a multi-turn English dialogue dataset. We won't describe it further here; see the links in the References section for details.
Note that before loading, the UltraChat records must be converted to the Mistral data format, whose fields are:
prompt, prompt_id, messages: [{content, role}, ...]
For example:
[{"prompt":"","prompt_id":"","messages":[{"role":"user","content":""},{"role":"assistant","content":""}]}]
Here role is either user or assistant, denoting the human user and the AI assistant respectively. Also note that within messages, user and assistant turns alternate.
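If your copy of UltraChat is still in its original release format, a small script can do the conversion. Below is a minimal sketch, assuming each raw record looks like {"id": ..., "data": [user turn, assistant turn, ...]} and an input file named ultrachat_release.jsonl (both the field names and the filename are assumptions; check them against your download):
import json

def convert_record(rec):
    # assign alternating user/assistant roles to the flat list of turns (assumed layout)
    roles = ["user", "assistant"]
    messages = [{"role": roles[i % 2], "content": text} for i, text in enumerate(rec["data"])]
    return {"prompt": rec["data"][0], "prompt_id": str(rec["id"]), "messages": messages}

with open("ultrachat_release.jsonl") as fin, open("ultrachat_train.jsonl", "w") as fout:
    for line in fin:
        fout.write(json.dumps(convert_record(json.loads(line)), ensure_ascii=False) + "\n")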
Once the data is ready, read it in directly:
import pandas as pd
import pyarrow as pa
from datasets import Dataset

# read the converted JSONL file (one conversation per line)
df = pd.read_json('ultrachat_train.jsonl', lines=True)
# convert to a datasets.Dataset object via a pyarrow Table
dataset = Dataset(pa.Table.from_pandas(df))
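A quick sanity check that the records came through as expected:
print(dataset)                     # schema and row count
print(dataset[0]['messages'][:2])  # first user/assistant turn pair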
Model Definition
Load the model:
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = '../Mistral-7B-Instruct-v0.3'
# 4-bit NF4 quantization config (QLoRA-style) so the 7B model fits on a 24 GB card
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
# Load base model
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
    local_files_only=True,
)
model = prepare_model_for_kbit_training(model)
peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.1,
    r=64,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj"],
)
model = get_peft_model(model, peft_config)
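get_peft_model wraps the quantized base model so that only the LoRA adapters receive gradients; peft's print_trainable_parameters shows how small that trainable fraction is:
model.print_trainable_parameters()  # prints trainable vs. total parameter counts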
Load the tokenizer:
from transformers import AutoTokenizer

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True, local_files_only=True)
tokenizer.padding_side = 'right'
tokenizer.pad_token = tokenizer.eos_token   # reuse EOS as the padding token
tokenizer.add_eos_token = True
print(tokenizer.bos_token, tokenizer.eos_token)  # sanity-check the special tokens
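To check how Mistral's chat template renders a conversation in the format above, transformers' apply_chat_template is useful (a quick inspection snippet, not part of the training script):
messages = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi, how can I help?"},
]
print(tokenizer.apply_chat_template(messages, tokenize=False))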
Training
Prepare for training:
from transformers import TrainingArguments
from trl import SFTTrainer

# Hyperparameters
training_arguments = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",
    save_steps=30,
    logging_steps=1,
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=False,
    bf16=False,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="constant",
)
# Setting SFT parameters
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    max_seq_length=None,
    # dataset_text_field="text",
    tokenizer=tokenizer,
    args=training_arguments,
    packing=False,
)
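Whether SFTTrainer can consume the messages column directly depends on your trl version; if it complains about the dataset format, passing a formatting function is one way out. A sketch, assuming the schema above (note that with packing=False the function receives a batch and must return a list of strings):
def formatting_func(batch):
    # render each multi-turn conversation with the tokenizer's chat template
    return [tokenizer.apply_chat_template(m, tokenize=False) for m in batch["messages"]]

# then pass formatting_func=formatting_func to SFTTrainer instead of relying on auto-detection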
Train, save, and evaluate:
# resume_from_checkpoint=True requires an existing checkpoint in output_dir;
# use trainer.train() for a fresh run
trainer.train(resume_from_checkpoint=True)
new_model = './NewModel'
# Save the fine-tuned model (the LoRA adapter weights)
trainer.model.save_pretrained(new_model, safe_serialization=False)
model.config.use_cache = True
model.eval()
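To deploy without the peft dependency, the adapter can be merged back into the base weights. A minimal sketch: './MergedModel' is a hypothetical output path, and the base model is reloaded unquantized because merging into 4-bit weights is generally not supported:
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# reload the base model in bfloat16 (no quantization), then merge the adapter into it
base = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16, device_map="auto")
merged = PeftModel.from_pretrained(base, new_model).merge_and_unload()
merged.save_pretrained('./MergedModel')
tokenizer.save_pretrained('./MergedModel')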
Inference
from transformers import pipeline

pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=800)

# Wrap a question in Mistral's instruction format
def build_prompt(question):
    prompt = f"<s>[INST] {question} [/INST]"
    return prompt

# The same conversation could also be expressed as chat messages, e.g.:
# [
#     {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
#     {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
# ]
question = "Write a 700-word essay about artificial intelligence."
prompt = build_prompt(question)
result = pipe(prompt)
print(result[0]['generated_text'])
References
1. RLHF
2. sft_trainer
3. UltraChat