LLM-quickstart-main.zip
Size: 5.95 MB
Price: 49 points
Downloads: 0
Rating: 5.0
Uploader: weixin_43620082
Updated: 2025-09-22

Getting Started with LLM Fine-Tuning: LLM-quickstart-main

File list (approximate)

| File name | Size |
| --- | --- |
| LLM-quickstart-main/ | - |
| LLM-quickstart-main/.gitignore | 3.08KB |
| LLM-quickstart-main/LICENSE | 11.09KB |
| LLM-quickstart-main/README-en.md | 6.32KB |
| LLM-quickstart-main/README.md | 6.22KB |
| LLM-quickstart-main/chatglm/ | - |
| LLM-quickstart-main/chatglm/chatbot_webui.py | 1.07KB |
| LLM-quickstart-main/chatglm/chatbot_with_memory.ipynb | 11.29KB |
| LLM-quickstart-main/chatglm/chatglm_inference.ipynb | 34.65KB |
| LLM-quickstart-main/chatglm/data/ | - |
| LLM-quickstart-main/chatglm/data/raw_data.txt | 18.9KB |
| LLM-quickstart-main/chatglm/data/zhouyi_dataset_20240118_152413.csv | 213.75KB |
| LLM-quickstart-main/chatglm/data/zhouyi_dataset_20240118_163659.csv | 147.08KB |
| LLM-quickstart-main/chatglm/data/zhouyi_dataset_handmade.csv | 7.53KB |
| LLM-quickstart-main/chatglm/gen_dataset.ipynb | 73.01KB |
| LLM-quickstart-main/chatglm/qlora_chatglm3.ipynb | 39.45KB |
| LLM-quickstart-main/chatglm/qlora_chatglm3_timestamp.ipynb | 36.9KB |
| LLM-quickstart-main/deepspeed/ | - |
| LLM-quickstart-main/deepspeed/README.md | 1.74KB |
| LLM-quickstart-main/deepspeed/config/ | - |
| LLM-quickstart-main/deepspeed/config/ds_config_zero2.json | 1.2KB |
| LLM-quickstart-main/deepspeed/config/ds_config_zero3.json | 1.46KB |
| LLM-quickstart-main/deepspeed/train_on_multi_nodes.sh | 2.12KB |
| LLM-quickstart-main/deepspeed/train_on_one_gpu.sh | 1.81KB |
| LLM-quickstart-main/deepspeed/translation/ | - |
| LLM-quickstart-main/deepspeed/translation/README.md | 7.88KB |
| LLM-quickstart-main/deepspeed/translation/requirements.txt | 119B |
| LLM-quickstart-main/deepspeed/translation/run_translation.py | 29.58KB |
| LLM-quickstart-main/docs/ | - |
| LLM-quickstart-main/docs/INSTALL.md | 4.46KB |
| LLM-quickstart-main/docs/cuda_installation.png | 136.28KB |
| LLM-quickstart-main/docs/version_check.py | 966B |
| LLM-quickstart-main/docs/version_info.txt | 382B |
| LLM-quickstart-main/langchain/ | - |
| LLM-quickstart-main/langchain/chains/ | - |
| LLM-quickstart-main/langchain/chains/router_chain.ipynb | 16.45KB |
| LLM-quickstart-main/langchain/chains/sequential_chain.ipynb | 22.3KB |
| LLM-quickstart-main/langchain/chains/transform_chain.ipynb | 317.34KB |
| LLM-quickstart-main/langchain/data_connection/ | - |
| LLM-quickstart-main/langchain/data_connection/document_loader.ipynb | 63.45KB |
| LLM-quickstart-main/langchain/data_connection/document_transformer.ipynb | 60.63KB |
| LLM-quickstart-main/langchain/data_connection/text_embedding.ipynb | 7.18KB |
| LLM-quickstart-main/langchain/data_connection/vector_stores.ipynb | 76.04KB |
| LLM-quickstart-main/langchain/images/ | - |
| LLM-quickstart-main/langchain/images/llm_chain.png | 1.94MB |
| LLM-quickstart-main/langchain/images/memory.png | 110.59KB |
| LLM-quickstart-main/langchain/images/model_io.jpeg | 643.33KB |
| LLM-quickstart-main/langchain/images/router_chain.png | 524.24KB |
| LLM-quickstart-main/langchain/images/sequential_chain_0.png | 502.17KB |
| LLM-quickstart-main/langchain/images/simple_sequential_chain_0.png | 479.32KB |
| LLM-quickstart-main/langchain/images/simple_sequential_chain_1.png | 616.04KB |
| LLM-quickstart-main/langchain/images/transform_chain.png | 498.14KB |
| LLM-quickstart-main/langchain/memory/ | - |
| LLM-quickstart-main/langchain/memory/memory.ipynb | 24.31KB |
| LLM-quickstart-main/langchain/model_io/ | - |
| LLM-quickstart-main/langchain/model_io/model.ipynb | 35.87KB |
| LLM-quickstart-main/langchain/model_io/output_parser.ipynb | 15.11KB |
| LLM-quickstart-main/langchain/model_io/prompt.ipynb | 60.29KB |
| LLM-quickstart-main/langchain/tests/ | - |
| LLM-quickstart-main/langchain/tests/state_of_the_union.txt | 38.11KB |
| LLM-quickstart-main/langchain/tests/the_old_man_and_the_sea.txt | 137.4KB |
| LLM-quickstart-main/llama/ | - |
| LLM-quickstart-main/llama/llama2_inference.ipynb | 5.37KB |
| LLM-quickstart-main/llama/llama2_instruction_tuning.ipynb | 22.68KB |
| LLM-quickstart-main/peft/ | - |
| LLM-quickstart-main/peft/chatglm3.ipynb | 23.52KB |
| LLM-quickstart-main/peft/data/ | - |
| LLM-quickstart-main/peft/data/audio/ | - |
| LLM-quickstart-main/peft/data/audio/test_zh.flac | 788.95KB |
| LLM-quickstart-main/peft/peft_chatglm_inference.ipynb | 6.86KB |
| LLM-quickstart-main/peft/peft_lora_opt-6.7b.ipynb | 268.51KB |
| LLM-quickstart-main/peft/peft_lora_whisper-large-v2.ipynb | 40.85KB |
| LLM-quickstart-main/peft/peft_qlora_chatglm.ipynb | 41.27KB |
| LLM-quickstart-main/peft/whisper_eval.ipynb | 23.18KB |
| LLM-quickstart-main/quantization/ | - |
| LLM-quickstart-main/quantization/AWQ-opt-125m.ipynb | 13.69KB |
| LLM-quickstart-main/quantization/AWQ_opt-2.7b.ipynb | 11.63KB |
| LLM-quickstart-main/quantization/AutoGPTQ_opt-2.7b.ipynb | 600.78KB |
| LLM-quickstart-main/quantization/bits_and_bytes.ipynb | 11.7KB |
| LLM-quickstart-main/quantization/docs/ | - |
| LLM-quickstart-main/quantization/docs/images/ | - |
| LLM-quickstart-main/quantization/docs/images/qlora.png | 140.99KB |
| LLM-quickstart-main/requirements.txt | 430B |
| LLM-quickstart-main/transformers/ | - |
| LLM-quickstart-main/transformers/data/ | - |
| LLM-quickstart-main/transformers/data/audio/ | - |
| LLM-quickstart-main/transformers/data/audio/mlk.flac | 374.46KB |
| LLM-quickstart-main/transformers/data/image/ | - |
| LLM-quickstart-main/transformers/data/image/cat-chonk.jpeg | 54.99KB |
| LLM-quickstart-main/transformers/data/image/cat_dog.jpg | 68.63KB |
| LLM-quickstart-main/transformers/data/image/panda.jpg | 600.53KB |
| LLM-quickstart-main/transformers/docs/ | - |
| LLM-quickstart-main/transformers/docs/images/ | - |
| LLM-quickstart-main/transformers/docs/images/bert-base-chinese.png | 47.29KB |
| LLM-quickstart-main/transformers/docs/images/bert.png | 222.39KB |
| LLM-quickstart-main/transformers/docs/images/bert_pretrain.png | 111.51KB |
| LLM-quickstart-main/transformers/docs/images/full_nlp_pipeline.png | 96.02KB |
| LLM-quickstart-main/transformers/docs/images/gpt2.png | 42.6KB |
| LLM-quickstart-main/transformers/docs/images/pipeline_advanced.png | 96.02KB |
| LLM-quickstart-main/transformers/docs/images/pipeline_func.png | 51.51KB |
| LLM-quickstart-main/transformers/docs/images/question_answering.png | 52.74KB |
| LLM-quickstart-main/transformers/fine-tune-QA.ipynb | 87.73KB |
| LLM-quickstart-main/transformers/fine-tune-quickstart.ipynb | 40.88KB |
| LLM-quickstart-main/transformers/pipelines.ipynb | 50.68KB |
| LLM-quickstart-main/transformers/pipelines_advanced.ipynb | 27.67KB |

Resource description

Getting Started with LLM Fine-Tuning
<!---
Copyright 2020 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

## Translation

This directory contains examples for fine-tuning and evaluating transformers on translation tasks.
Please tag @patil-suraj with any issues/unexpected behaviors, or send a PR!
For deprecated `bertabs` instructions, see [`bertabs/README.md`](https://github.com/huggingface/transformers/blob/main/examples/research_projects/bertabs/README.md).
For the old `finetune_trainer.py` and related utils, see [`examples/legacy/seq2seq`](https://github.com/huggingface/transformers/blob/main/examples/legacy/seq2seq).

### Supported Architectures

- `BartForConditionalGeneration`
- `FSMTForConditionalGeneration` (translation only)
- `MBartForConditionalGeneration`
- `MarianMTModel`
- `PegasusForConditionalGeneration`
- `T5ForConditionalGeneration`
- `MT5ForConditionalGeneration`

`run_translation.py` is a lightweight example of how to download and preprocess a dataset from the [🤗 Datasets](https://github.com/huggingface/datasets) library or use your own files (jsonlines or csv), then fine-tune one of the architectures above on it.

For custom datasets in `jsonlines` format please see: https://huggingface.co/docs/datasets/loading_datasets#json-files; you will also find examples of these below.

## With Trainer

Here is an example of a translation fine-tuning with a MarianMT model:

```bash
python examples/pytorch/translation/run_translation.py \
    --model_name_or_path Helsinki-NLP/opus-mt-en-ro \
    --do_train \
    --do_eval \
    --source_lang en \
    --target_lang ro \
    --dataset_name wmt16 \
    --dataset_config_name ro-en \
    --output_dir /tmp/tst-translation \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --overwrite_output_dir \
    --predict_with_generate
```

MBart and some T5 models require special handling.

T5 models `t5-small`, `t5-base`, `t5-large`, `t5-3b` and `t5-11b` must use an additional argument: `--source_prefix "translate {source_lang} to {target_lang}"`. For example:

```bash
python examples/pytorch/translation/run_translation.py \
    --model_name_or_path t5-small \
    --do_train \
    --do_eval \
    --source_lang en \
    --target_lang ro \
    --source_prefix "translate English to Romanian: " \
    --dataset_name wmt16 \
    --dataset_config_name ro-en \
    --output_dir /tmp/tst-translation \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --overwrite_output_dir \
    --predict_with_generate
```

If you get a terrible BLEU score, make sure that you didn't forget to use the `--source_prefix` argument.

For the aforementioned group of T5 models, remember that if you switch to a different language pair, you must adjust the source and target values in all three language-specific command line arguments: `--source_lang`, `--target_lang` and `--source_prefix`.
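To make the role of `--source_prefix` concrete, here is a minimal preprocessing sketch. It is not the actual code from `run_translation.py` (the function and variable names are illustrative); it only shows that the prefix is prepended to every source sentence before tokenization, which is what these T5 checkpoints were pre-trained to expect.

```python
# Illustrative sketch only: run_translation.py wires this up via --source_prefix.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
prefix = "translate English to Romanian: "  # must match the language pair being trained

def preprocess(examples):
    # examples["translation"] is a list of {"en": ..., "ro": ...} dicts,
    # the same structure as the JSONLINES records shown below.
    inputs = [prefix + pair["en"] for pair in examples["translation"]]
    targets = [pair["ro"] for pair in examples["translation"]]
    model_inputs = tokenizer(inputs, max_length=128, truncation=True)
    labels = tokenizer(text_target=targets, max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```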
MBart models require a different format for the `--source_lang` and `--target_lang` values: e.g. instead of `en` it expects `en_XX`, and for `ro` it expects `ro_RO`. The full MBart specification for language codes can be found [here](https://huggingface.co/facebook/mbart-large-cc25). For example:

```bash
python examples/pytorch/translation/run_translation.py \
    --model_name_or_path facebook/mbart-large-en-ro \
    --do_train \
    --do_eval \
    --dataset_name wmt16 \
    --dataset_config_name ro-en \
    --source_lang en_XX \
    --target_lang ro_RO \
    --output_dir /tmp/tst-translation \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --overwrite_output_dir \
    --predict_with_generate
```

And here is how you would use the translation fine-tuning on your own files, after adjusting the values of the `--train_file` and `--validation_file` arguments to match your setup:

```bash
python examples/pytorch/translation/run_translation.py \
    --model_name_or_path t5-small \
    --do_train \
    --do_eval \
    --source_lang en \
    --target_lang ro \
    --source_prefix "translate English to Romanian: " \
    --dataset_name wmt16 \
    --dataset_config_name ro-en \
    --train_file path_to_jsonlines_file \
    --validation_file path_to_jsonlines_file \
    --output_dir /tmp/tst-translation \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --overwrite_output_dir \
    --predict_with_generate
```

The translation task supports only custom JSONLINES files, with each line being a dictionary with a key `"translation"` whose value is another dictionary whose keys are the language pair. For example:

```json
{ "translation": { "en": "Others have dismissed him as a joke.", "ro": "Alții l-au numit o glumă." } }
{ "translation": { "en": "And some are holding out for an implosion.", "ro": "Iar alții așteaptă implozia." } }
```

Here the languages are Romanian (`ro`) and English (`en`).
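As a hedged illustration (the file names below are placeholders), files in exactly this format can also be loaded directly with the 🤗 Datasets library, which is a convenient way to inspect your data before training:

```python
# Sketch: load custom JSONLINES translation files with 🤗 Datasets.
# "train.json" and "validation.json" are placeholder paths for your own files.
from datasets import load_dataset

raw_datasets = load_dataset(
    "json",
    data_files={"train": "train.json", "validation": "validation.json"},
)

# Each record holds a single "translation" dict keyed by the language pair,
# e.g. {"translation": {"en": "...", "ro": "..."}}.
print(raw_datasets["train"][0])
```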
If you want to use a pre-processed dataset that leads to high BLEU scores, but for the `en-de` language pair, you can use `--dataset_name stas/wmt14-en-de-pre-processed`, as follows:

```bash
python examples/pytorch/translation/run_translation.py \
    --model_name_or_path t5-small \
    --do_train \
    --do_eval \
    --source_lang en \
    --target_lang de \
    --source_prefix "translate English to German: " \
    --dataset_name stas/wmt14-en-de-pre-processed \
    --output_dir /tmp/tst-translation \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --overwrite_output_dir \
    --predict_with_generate
```

## With Accelerate

Based on the script [`run_translation_no_trainer.py`](https://github.com/huggingface/transformers/blob/main/examples/pytorch/translation/run_translation_no_trainer.py).

Like `run_translation.py`, this script allows you to fine-tune any of the supported models on a translation task. The main difference is that this script exposes the bare training loop, so you can quickly experiment and add any customization you would like.

It offers fewer options than the `Trainer` script (though you can easily change the options for the optimizer or the dataloaders directly in the script), but it still runs in a distributed setup, on TPU, and supports mixed precision by means of the [🤗 `Accelerate`](https://github.com/huggingface/accelerate) library. You can use the script normally after installing it:

```bash
pip install git+https://github.com/huggingface/accelerate
```

then

```bash
python run_translation_no_trainer.py \
    --model_name_or_path Helsinki-NLP/opus-mt-en-ro \
    --source_lang en \
    --target_lang ro \
    --dataset_name wmt16 \
    --dataset_config_name ro-en \
    --output_dir ~/tmp/tst-translation
```

You can then use your usual launchers to run it in a distributed environment, but the easiest way is to run

```bash
accelerate config
```

and reply to the questions asked. Then run

```bash
accelerate test
```

which will check that everything is ready for training. Finally, you can launch training with

```bash
accelerate launch run_translation_no_trainer.py \
    --model_name_or_path Helsinki-NLP/opus-mt-en-ro \
    --source_lang en \
    --target_lang ro \
    --dataset_name wmt16 \
    --dataset_config_name ro-en \
    --output_dir ~/tmp/tst-translation
```

This command is the same and will work for:

- a CPU-only setup
- a setup with one GPU
- a distributed training with several GPUs (single or multi node)
- a training on TPUs

Note that this library is in alpha release, so your feedback is more than welcome if you encounter any problem.
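Once training has finished, a quick way to sanity-check the fine-tuned checkpoint is to load it in a `pipeline`. This is a sketch under the assumption that you fine-tuned the MarianMT model above and kept the default `--output_dir`; adjust the path and language pair to your own run.

```python
# Sketch: load the checkpoint written by the training commands above
# and translate a single sentence as a sanity check.
from transformers import pipeline

translator = pipeline("translation_en_to_ro", model="/tmp/tst-translation")
print(translator("The weather is nice today.", max_length=64))
# -> [{'translation_text': '...'}]  (exact output depends on your fine-tuned model)
```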
