
LLaMA-Factory fine-tuning-3 (concepts and techniques explained)

Published: 2023-11-29 15:45:50 | Author: Daze_Lu

Training methods

supervised fine-tuning

Reward Modeling

PPO training

DPO training

full-parameter

partial-parameter

LoRA

QLoRA
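
To make the difference between full-parameter tuning and LoRA concrete, here is a minimal sketch that only counts trainable weights for a single linear layer. It assumes the usual LoRA formulation, in which the frozen weight matrix is augmented by a product of two low-rank matrices; the hidden size and rank below are hypothetical values chosen for illustration, not settings from this post. QLoRA trains the same kind of adapters, but on top of a quantized frozen base model, so the trainable count is the same.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for a LoRA adapter: A is rank x d_in, B is d_out x rank."""
    return rank * d_in + d_out * rank


if __name__ == "__main__":
    d_in = d_out = 4096  # hypothetical hidden size (assumption)
    rank = 8             # hypothetical LoRA rank (assumption)
    full = full_param_count(d_in, d_out)
    lora = lora_param_count(d_in, d_out, rank)
    print(f"full: {full}, lora: {lora}, ratio: {lora / full:.4%}")
```

With a small rank, the adapter holds well under one percent of the weights of the layer it adapts, which is the main reason LoRA needs far less optimizer memory than full-parameter training.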

Command parameters

fp16

gradient_accumulation_steps

lr_scheduler_type

lora_target

overwrite_cache

stage
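
The parameters listed above are passed as command-line flags to the training script. The sketch below assembles such a flag list and computes the effective batch size that results from gradient accumulation. The flag names come from the list above; the values (`sft`, `rm`, `ppo`, `dpo` for `stage`, `q_proj,v_proj` for `lora_target`, `cosine` for `lr_scheduler_type`) and the helper names `build_args` and `effective_batch_size` are assumptions for illustration, so check them against the LLaMA-Factory version you are running.

```python
from typing import List

# Assumed mapping from the training methods above to `stage` values.
STAGES = {
    "supervised fine-tuning": "sft",
    "reward modeling": "rm",
    "PPO training": "ppo",
    "DPO training": "dpo",
}


def build_args(method: str, lora_target: str, grad_accum: int,
               scheduler: str = "cosine", fp16: bool = True,
               overwrite_cache: bool = True) -> List[str]:
    """Build a flag list (hypothetical helper, not part of LLaMA-Factory)."""
    args = [
        "--stage", STAGES[method],
        "--lora_target", lora_target,
        "--gradient_accumulation_steps", str(grad_accum),
        "--lr_scheduler_type", scheduler,
    ]
    if fp16:
        args.append("--fp16")             # half-precision training
    if overwrite_cache:
        args.append("--overwrite_cache")  # re-tokenize instead of reusing the dataset cache
    return args


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer step."""
    return per_device * grad_accum * num_devices


if __name__ == "__main__":
    print(" ".join(build_args("supervised fine-tuning", "q_proj,v_proj", 4)))
    print(effective_batch_size(per_device=4, grad_accum=4, num_devices=2))
```

Gradient accumulation trades speed for memory: gradients from several small batches are summed before each optimizer step, so the update behaves like one computed on a larger batch without that batch having to fit on the device at once.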