Commit d503a53 — docs: add a fine tune samples

File changed: README.md (+59 −0)
@@ -851,6 +851,65 @@ if __name__ == "__main__":
![Finetune Model Choice](images/finetune-model-choice.jpg)
#### Dataset Information

Built from the Unit Eval + OSS Instruct datasets:

- 3,000 code-completion samples (Inline, InBlock, AfterBlock).
- 1,500 unit-test samples.
- 4,000 OSS Instruct samples.
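The fine-tune script consumes this data as JSON lines, one sample per line with `instruction`/`output` fields (these field names follow the DeepSeek-Coder finetune convention; verify against your checkout of the repo). A minimal sketch of assembling and round-tripping such a file, with hypothetical sample contents:

```python
import json

# Hypothetical sample records illustrating the instruction/output layout.
samples = [
    {"instruction": "Write a unit test for the add() function.",
     "output": "def test_add():\n    assert add(1, 2) == 3\n"},
    {"instruction": "Complete the function body (InBlock completion).",
     "output": "    return a + b\n"},
]

# Write one JSON object per line (JSONL).
with open("train.jsonl", "w", encoding="utf-8") as f:
    for s in samples:
        f.write(json.dumps(s, ensure_ascii=False) + "\n")

# Read it back to confirm the format round-trips.
with open("train.jsonl", encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
```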
#### Example Parameters

```bash
cd DeepSeek-Coder/finetune && deepspeed finetune_deepseekcoder.py \
    --model_name_or_path $MODEL_PATH \
    --data_path $DATA_PATH \
    --output_dir $OUTPUT_PATH \
    --num_train_epochs 1 \
    --model_max_length 1024 \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 2000 \
    --save_total_limit 10 \
    --learning_rate 1e-4 \
    --warmup_steps 10 \
    --logging_steps 1 \
    --lr_scheduler_type "cosine" \
    --gradient_checkpointing True \
    --report_to "tensorboard" \
    --deepspeed configs/ds_config_zero3.json \
    --bf16 True
```
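The dataset above totals 8,500 samples, and the training log reports 2,125 optimizer steps per epoch. With a per-device batch of 2 and no gradient accumulation, that step count implies two GPUs — a back-of-the-envelope check (the GPU count is an inference, not stated in the source):

```python
import math

dataset_size = 3000 + 1500 + 4000  # completion + unit-test + OSS Instruct samples
per_device_batch = 2               # --per_device_train_batch_size
grad_accum = 1                     # --gradient_accumulation_steps
num_gpus = 2                       # assumption; not stated in the source

# Effective global batch = per-device batch x accumulation x GPU count.
effective_batch = per_device_batch * grad_accum * num_gpus
steps_per_epoch = math.ceil(dataset_size / effective_batch)
print(steps_per_epoch)  # 2125, matching the 0/2125 progress bar in the log
```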
Training log:

```bash
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...
  0%|          | 0/2125 [00:00<?, ?it/s]`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...
{'loss': 3.9356, 'learning_rate': 0.0, 'epoch': 0.0}
{'loss': 0.8462, 'learning_rate': 3.0102999566398115e-05, 'epoch': 0.0}
{'loss': 0.909, 'learning_rate': 4.771212547196624e-05, 'epoch': 0.0}
{'loss': 0.3674, 'learning_rate': 6.020599913279623e-05, 'epoch': 0.0}
{'loss': 0.3959, 'learning_rate': 6.989700043360187e-05, 'epoch': 0.0}
{'loss': 0.7964, 'learning_rate': 7.781512503836436e-05, 'epoch': 0.0}
{'loss': 0.3542, 'learning_rate': 8.450980400142567e-05, 'epoch': 0.0}
{'loss': 1.7094, 'learning_rate': 9.030899869919434e-05, 'epoch': 0.0}
{'loss': 0.5968, 'learning_rate': 9.542425094393248e-05, 'epoch': 0.0}
{'loss': 0.6208, 'learning_rate': 9.999999999999999e-05, 'epoch': 0.0}
{'loss': 0.4074, 'learning_rate': 0.0001, 'epoch': 0.01}
{'loss': 0.3637, 'learning_rate': 0.0001, 'epoch': 0.01}
{'loss': 0.3459, 'learning_rate': 0.0001, 'epoch': 0.01}
{'loss': 0.6971, 'learning_rate': 0.0001, 'epoch': 0.01}
{'loss': 0.3917, 'learning_rate': 0.0001, 'epoch': 0.01}
{'loss': 0.5859, 'learning_rate': 0.0001, 'epoch': 0.01}
{'loss': 0.5923, 'learning_rate': 0.0001, 'epoch': 0.01}
  1%|▎         | 17/2125 [05:14<10:03:38, 17.18s/it]
```
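The learning-rate column shows the effect of `--warmup_steps 10` with `--lr_scheduler_type cosine`: the rate climbs to the 1e-4 peak over the first 10 logged steps, then decays so slowly across 2,125 total steps that subsequent lines still print 0.0001. A minimal sketch of the warmup-then-cosine-decay shape (the exact warmup curve inside the trainer may differ from the linear ramp assumed here):

```python
import math

def cosine_lr(step, peak=1e-4, warmup=10, total=2125):
    """Linear warmup to `peak`, then cosine decay toward zero."""
    if step < warmup:
        return peak * step / warmup
    progress = (step - warmup) / (total - warmup)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))

print(cosine_lr(0))     # 0.0 at the start of warmup
print(cosine_lr(10))    # peak (1e-4) once warmup completes
print(cosine_lr(2125))  # ~0 when fully decayed
```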
Other:

- See the detailed notebook: [code/finetune/finetune.ipynb](code/finetune/finetune.ipynb)
