Overview
With the APIs provided by Hugging Face's transformers package, it is easy to fine-tune your own large language model on a consumer-grade GPU such as an RTX 3090. This post does exactly that, using SFT (Supervised Fine-Tuning) to produce a custom model.
Environment Setup
Before starting, prepare a machine with an RTX 3090 GPU and more than 64 GB of RAM. Once the machine is ready, install the required packages:
conda create -p ./env python=3.9
conda activate ./env   # activate the new environment before installing packages
pip install -U transformers peft datasets trl bitsandbytes sentencepiece protobuf
Data Preparation
This post uses the ultrachat dataset, a multi-turn English dialogue dataset. We won't describe it further here; see the links in the References section for details.
Note that before loading, the UltraChat records must be converted to the Mistral data format, whose fields are:
prompt, prompt_id, messages: [{content, role}, ...]
For example:
[{"prompt":"","prompt_id":"","messages":[{"role":"user","content":""},{"role":"assistant","content":""}]}]
Here role is either user or assistant, denoting the human user and the AI assistant respectively. Also note that within messages, user and assistant turns alternate.
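If your copy of UltraChat is still in its original release format, a small script can do the conversion. Below is a minimal sketch, assuming each raw record looks like {"id": ..., "data": [user turn, assistant turn, ...]} and an input file named ultrachat_release.jsonl (both the field names and the filename are assumptions; check them against your download):
import json

def convert_record(rec):
    # assign alternating user/assistant roles to the flat list of turns (assumed layout)
    roles = ["user", "assistant"]
    messages = [{"role": roles[i % 2], "content": text} for i, text in enumerate(rec["data"])]
    return {"prompt": rec["data"][0], "prompt_id": str(rec["id"]), "messages": messages}

with open("ultrachat_release.jsonl") as fin, open("ultrachat_train.jsonl", "w") as fout:
    for line in fin:
        fout.write(json.dumps(convert_record(json.loads(line)), ensure_ascii=False) + "\n")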
Once the data is ready, read it in directly:
import pandas as pd
import pyarrow as pa
from datasets import Dataset

# read the converted JSONL file (one conversation per line)
df = pd.read_json('ultrachat_train.jsonl', lines=True)
# convert to a datasets.Dataset object via a pyarrow Table
dataset = Dataset(pa.Table.from_pandas(df))
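A quick sanity check that the records came through as expected:
print(dataset)                     # schema and row count
print(dataset[0]['messages'][:2])  # first user/assistant turn pair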
Model Definition
Load the model:
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = '../Mistral-7B-Instruct-v0.3'
# 4-bit NF4 quantization config (QLoRA-style) so the 7B model fits on a 24 GB card
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
# Load base model
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
    local_files_only=True,
)
model = prepare_model_for_kbit_training(model)
peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.1,
    r=64,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj"],
)
model = get_peft_model(model, peft_config)
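get_peft_model wraps the quantized base model so that only the LoRA adapters receive gradients; peft's print_trainable_parameters shows how small that trainable fraction is:
model.print_trainable_parameters()  # prints trainable vs. total parameter counts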
Load the tokenizer:
from transformers import AutoTokenizer

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True, local_files_only=True)
tokenizer.padding_side = 'right'
tokenizer.pad_token = tokenizer.eos_token   # reuse EOS as the padding token
tokenizer.add_eos_token = True
print(tokenizer.bos_token, tokenizer.eos_token)  # sanity-check the special tokens
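To check how Mistral's chat template renders a conversation in the format above, transformers' apply_chat_template is useful (a quick inspection snippet, not part of the training script):
messages = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi, how can I help?"},
]
print(tokenizer.apply_chat_template(messages, tokenize=False))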
Training
Prepare for training:
from transformers import TrainingArguments
from trl import SFTTrainer

# Hyperparameters
training_arguments = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",
    save_steps=30,
    logging_steps=1,
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=False,
    bf16=False,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="constant",
)
# Setting SFT parameters
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    max_seq_length=None,
    # dataset_text_field="text",
    tokenizer=tokenizer,
    args=training_arguments,
    packing=False,
)
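Whether SFTTrainer can consume the messages column directly depends on your trl version; if it complains about the dataset format, passing a formatting function is one way out. A sketch, assuming the schema above (note that with packing=False the function receives a batch and must return a list of strings):
def formatting_func(batch):
    # render each multi-turn conversation with the tokenizer's chat template
    return [tokenizer.apply_chat_template(m, tokenize=False) for m in batch["messages"]]

# then pass formatting_func=formatting_func to SFTTrainer instead of relying on auto-detection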
Train, save, and evaluate:
# resume_from_checkpoint=True requires an existing checkpoint in output_dir;
# use trainer.train() for a fresh run
trainer.train(resume_from_checkpoint=True)
new_model = './NewModel'
# Save the fine-tuned model (the LoRA adapter weights)
trainer.model.save_pretrained(new_model, safe_serialization=False)
model.config.use_cache = True
model.eval()
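To deploy without the peft dependency, the adapter can be merged back into the base weights. A minimal sketch: './MergedModel' is a hypothetical output path, and the base model is reloaded unquantized because merging into 4-bit weights is generally not supported:
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# reload the base model in bfloat16 (no quantization), then merge the adapter into it
base = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16, device_map="auto")
merged = PeftModel.from_pretrained(base, new_model).merge_and_unload()
merged.save_pretrained('./MergedModel')
tokenizer.save_pretrained('./MergedModel')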
Inference
from transformers import pipeline

pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=800)

# Wrap a question in Mistral's instruction format
def build_prompt(question):
    prompt = f"<s>[INST] {question} [/INST]"
    return prompt

# The same conversation could also be expressed as chat messages, e.g.:
# [
#     {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
#     {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
# ]
question = "Write a 700-word essay about artificial intelligence."
prompt = build_prompt(question)
result = pipe(prompt)
print(result[0]['generated_text'])
References
1. RLHF
2. sft_trainer
3. UltraChat