gpt-neox-main.zip
Size: 51.27MB
Price: 21 points
Downloads: 0
Rating: 5.0
Uploader: m0_61006552
Updated: 2025-09-22


Resource file list (approximate)

File name
Size
gpt-neox-main/
-
gpt-neox-main/.idea/
-
gpt-neox-main/.idea/.gitignore
50B
gpt-neox-main/.idea/gpt-neox-main.iml
567B
gpt-neox-main/.idea/inspectionProfiles/
-
gpt-neox-main/.idea/inspectionProfiles/profiles_settings.xml
174B
gpt-neox-main/.idea/misc.xml
292B
gpt-neox-main/.idea/modules.xml
285B
gpt-neox-main/.idea/workspace.xml
2.06KB
gpt-neox-main/gpt-neox-main/
-
gpt-neox-main/gpt-neox-main/.clang-format
4.4KB
gpt-neox-main/gpt-neox-main/.dockerignore
17B
gpt-neox-main/gpt-neox-main/.github/
-
gpt-neox-main/gpt-neox-main/.github/CODEOWNERS
19B
gpt-neox-main/gpt-neox-main/.github/ISSUE_TEMPLATE/
-
gpt-neox-main/gpt-neox-main/.github/ISSUE_TEMPLATE/bug_report.md
712B
gpt-neox-main/gpt-neox-main/.github/ISSUE_TEMPLATE/feature_request.md
608B
gpt-neox-main/gpt-neox-main/.github/workflows/
-
gpt-neox-main/gpt-neox-main/.github/workflows/coverity_scan.yml
1.96KB
gpt-neox-main/gpt-neox-main/.github/workflows/cpu_ci.yml
1017B
gpt-neox-main/gpt-neox-main/.github/workflows/cpu_ci_dispatch.yml
438B
gpt-neox-main/gpt-neox-main/.github/workflows/cpu_ci_on_pr.yml
425B
gpt-neox-main/gpt-neox-main/.github/workflows/docker_build.yml
1.16KB
gpt-neox-main/gpt-neox-main/.github/workflows/pull_request.yml
1.36KB
gpt-neox-main/gpt-neox-main/.gitignore
2.05KB
gpt-neox-main/gpt-neox-main/.pre-commit-config.yaml
1.35KB
gpt-neox-main/gpt-neox-main/CITATION.cff
2.02KB
gpt-neox-main/gpt-neox-main/ckpts/
-
gpt-neox-main/gpt-neox-main/ckpts/20B_tokenizer.json
2.11MB
gpt-neox-main/gpt-neox-main/configs/
-
gpt-neox-main/gpt-neox-main/configs/1-3B.yml
2.32KB
gpt-neox-main/gpt-neox-main/configs/125M-dmoe.yml
2.47KB
gpt-neox-main/gpt-neox-main/configs/125M-json.yml
1.69KB
gpt-neox-main/gpt-neox-main/configs/125M-moe.yml
2.47KB
gpt-neox-main/gpt-neox-main/configs/125M.yml
2.35KB
gpt-neox-main/gpt-neox-main/configs/125M_my.yml
2.35KB
gpt-neox-main/gpt-neox-main/configs/13B.yml
2.32KB
gpt-neox-main/gpt-neox-main/configs/175B.yml
2.32KB
gpt-neox-main/gpt-neox-main/configs/19M.yml
2.14KB
gpt-neox-main/gpt-neox-main/configs/2-7B.yml
2.32KB
gpt-neox-main/gpt-neox-main/configs/20B.yml
3KB
gpt-neox-main/gpt-neox-main/configs/350M.yml
2.32KB
gpt-neox-main/gpt-neox-main/configs/49M.yml
2.15KB
gpt-neox-main/gpt-neox-main/configs/6-7B.yml
2.32KB
gpt-neox-main/gpt-neox-main/configs/760M.yml
2.32KB
gpt-neox-main/gpt-neox-main/configs/800M.yml
1.93KB
gpt-neox-main/gpt-neox-main/configs/autotuning_configs/
-
gpt-neox-main/gpt-neox-main/configs/autotuning_configs/small_tune.json
1.86KB
gpt-neox-main/gpt-neox-main/configs/autotuning_configs/tune.json
1.83KB
gpt-neox-main/gpt-neox-main/configs/autotuning_configs/tune_1-3B.json
2.01KB
gpt-neox-main/gpt-neox-main/configs/autotuning_configs/tune_6-7B.json
1.69KB
gpt-neox-main/gpt-neox-main/configs/bf16_125M.yml
2.11KB
gpt-neox-main/gpt-neox-main/configs/bnb_125M.yml
2.17KB
gpt-neox-main/gpt-neox-main/configs/cpu_mock_config.yml
186B
gpt-neox-main/gpt-neox-main/configs/docker/
-
gpt-neox-main/gpt-neox-main/configs/docker/pythia-paths.yml
496B
gpt-neox-main/gpt-neox-main/configs/eleutherai_cluster.yml
1.1KB
gpt-neox-main/gpt-neox-main/configs/finetuning_configs/
-
gpt-neox-main/gpt-neox-main/configs/finetuning_configs/6-9B.yml
1.96KB
gpt-neox-main/gpt-neox-main/configs/gen_docs.py
3.14KB
gpt-neox-main/gpt-neox-main/configs/gmlp_small.yml
1.74KB
gpt-neox-main/gpt-neox-main/configs/llama/
-
gpt-neox-main/gpt-neox-main/configs/llama/13B.yml
628B
gpt-neox-main/gpt-neox-main/configs/llama/30B.yml
628B
gpt-neox-main/gpt-neox-main/configs/llama/65B.yml
628B
gpt-neox-main/gpt-neox-main/configs/llama/7B.yml
628B
gpt-neox-main/gpt-neox-main/configs/llama/README.md
678B
gpt-neox-main/gpt-neox-main/configs/llama/train_config.yml
1.58KB
gpt-neox-main/gpt-neox-main/configs/llama2/
-
gpt-neox-main/gpt-neox-main/configs/llama2/13B.yml
628B
gpt-neox-main/gpt-neox-main/configs/llama2/70B.yml
751B
gpt-neox-main/gpt-neox-main/configs/llama2/7B.yml
628B
gpt-neox-main/gpt-neox-main/configs/llama2/codellama_34B.yml
829B
gpt-neox-main/gpt-neox-main/configs/llama2/codellama_7B.yml
808B
gpt-neox-main/gpt-neox-main/configs/llemma/
-
gpt-neox-main/gpt-neox-main/configs/llemma/34B.yml
2.61KB
gpt-neox-main/gpt-neox-main/configs/llemma/7B.yml
2.51KB
gpt-neox-main/gpt-neox-main/configs/local_setup.yml
1.2KB
gpt-neox-main/gpt-neox-main/configs/mamba/
-
gpt-neox-main/gpt-neox-main/configs/mamba/mamba-1.4B.yml
628B
gpt-neox-main/gpt-neox-main/configs/mamba/mamba-130M.yml
627B
gpt-neox-main/gpt-neox-main/configs/mamba/mamba-2.8B.yml
628B
gpt-neox-main/gpt-neox-main/configs/mamba/mamba-370M.yml
628B
gpt-neox-main/gpt-neox-main/configs/mamba/mamba-790M.yml
628B
gpt-neox-main/gpt-neox-main/configs/mistral/
-
gpt-neox-main/gpt-neox-main/configs/mistral/7B.yml
1.32KB
gpt-neox-main/gpt-neox-main/configs/neox_arguments.md
42.76KB
gpt-neox-main/gpt-neox-main/configs/pythia/
-
gpt-neox-main/gpt-neox-main/configs/pythia/1-4B.yml
1.79KB
gpt-neox-main/gpt-neox-main/configs/pythia/12B.yml
1.84KB
gpt-neox-main/gpt-neox-main/configs/pythia/14M.yml
2.26KB
gpt-neox-main/gpt-neox-main/configs/pythia/160M.yml
1.79KB
gpt-neox-main/gpt-neox-main/configs/pythia/1B.yml
1.84KB
gpt-neox-main/gpt-neox-main/configs/pythia/2-8B.yml
1.85KB
gpt-neox-main/gpt-neox-main/configs/pythia/31M.yml
2.25KB
gpt-neox-main/gpt-neox-main/configs/pythia/410M.yml
1.79KB
gpt-neox-main/gpt-neox-main/configs/pythia/6-9B.yml
1.82KB
gpt-neox-main/gpt-neox-main/configs/pythia/70M.yml
1.79KB
gpt-neox-main/gpt-neox-main/configs/README.md
12.15KB
gpt-neox-main/gpt-neox-main/configs/rwkv/
-
gpt-neox-main/gpt-neox-main/configs/rwkv/170M.yml
2.36KB
gpt-neox-main/gpt-neox-main/configs/slurm_125M.yml
1.63KB
gpt-neox-main/gpt-neox-main/configs/slurm_local.json
305B
gpt-neox-main/gpt-neox-main/configs/slurm_local.yml
356B
gpt-neox-main/gpt-neox-main/configs/sparse.yml
542B
gpt-neox-main/gpt-neox-main/configs/text_generation.yml
494B
gpt-neox-main/gpt-neox-main/CONTRIBUTING.md
4.62KB
gpt-neox-main/gpt-neox-main/data/
-
gpt-neox-main/gpt-neox-main/data/openwebtext2_sample.jsonl
125.51MB
gpt-neox-main/gpt-neox-main/deepy.py
1.31KB
gpt-neox-main/gpt-neox-main/docker-compose-dockerhub.yml
545B
gpt-neox-main/gpt-neox-main/docker-compose.yml
589B
gpt-neox-main/gpt-neox-main/Dockerfile
3.76KB
gpt-neox-main/gpt-neox-main/eval.py
2.6KB
gpt-neox-main/gpt-neox-main/eval_tasks/
-
gpt-neox-main/gpt-neox-main/eval_tasks/eval_adapter.py
19.82KB
gpt-neox-main/gpt-neox-main/eval_tasks/__init__.py
643B
gpt-neox-main/gpt-neox-main/generate.py
3.24KB
gpt-neox-main/gpt-neox-main/images/
-
gpt-neox-main/gpt-neox-main/images/memory_profiling.png
1.04MB
gpt-neox-main/gpt-neox-main/images/nsight_profiling.png
472.09KB
gpt-neox-main/gpt-neox-main/LICENSE
25.18KB
gpt-neox-main/gpt-neox-main/MANIFEST.in
65B
gpt-neox-main/gpt-neox-main/megatron/
-
gpt-neox-main/gpt-neox-main/megatron/checkpointing.py
17.14KB
gpt-neox-main/gpt-neox-main/megatron/data/
-
gpt-neox-main/gpt-neox-main/megatron/data/blendable_dataset.py
2.56KB
gpt-neox-main/gpt-neox-main/megatron/data/data_utils.py
17.63KB
gpt-neox-main/gpt-neox-main/megatron/data/gpt2_dataset.py
12.54KB
gpt-neox-main/gpt-neox-main/megatron/data/helpers.cpp
33.18KB
gpt-neox-main/gpt-neox-main/megatron/data/indexed_dataset.py
18.79KB
gpt-neox-main/gpt-neox-main/megatron/data/Makefile
279B
gpt-neox-main/gpt-neox-main/megatron/data/samplers.py
6.07KB
gpt-neox-main/gpt-neox-main/megatron/data/test.py
20B
gpt-neox-main/gpt-neox-main/megatron/data/__init__.py
16B
gpt-neox-main/gpt-neox-main/megatron/devutil.py
1.25KB
gpt-neox-main/gpt-neox-main/megatron/fused_kernels/
-
gpt-neox-main/gpt-neox-main/megatron/fused_kernels/compat.h
893B
gpt-neox-main/gpt-neox-main/megatron/fused_kernels/fused_rotary_positional_embedding.cpp
6.37KB
gpt-neox-main/gpt-neox-main/megatron/fused_kernels/fused_rotary_positional_embedding.h
18.63KB
gpt-neox-main/gpt-neox-main/megatron/fused_kernels/fused_rotary_positional_embedding_cuda.cu
15.36KB
gpt-neox-main/gpt-neox-main/megatron/fused_kernels/scaled_masked_softmax.cpp
3.13KB
gpt-neox-main/gpt-neox-main/megatron/fused_kernels/scaled_masked_softmax.h
23.44KB
gpt-neox-main/gpt-neox-main/megatron/fused_kernels/scaled_masked_softmax_cuda.cu
4.55KB
gpt-neox-main/gpt-neox-main/megatron/fused_kernels/scaled_upper_triang_masked_softmax.cpp
2.64KB
gpt-neox-main/gpt-neox-main/megatron/fused_kernels/scaled_upper_triang_masked_softmax.h
26.3KB
gpt-neox-main/gpt-neox-main/megatron/fused_kernels/scaled_upper_triang_masked_softmax_cuda.cu
3.37KB
gpt-neox-main/gpt-neox-main/megatron/fused_kernels/setup.py
2.92KB
gpt-neox-main/gpt-neox-main/megatron/fused_kernels/type_shim.h
21.61KB
gpt-neox-main/gpt-neox-main/megatron/fused_kernels/__init__.py
5.86KB
gpt-neox-main/gpt-neox-main/megatron/gradient_noise_scale/
-
gpt-neox-main/gpt-neox-main/megatron/gradient_noise_scale/gradient_noise_scale.py
7.96KB
gpt-neox-main/gpt-neox-main/megatron/gradient_noise_scale/__init__.py
53B
gpt-neox-main/gpt-neox-main/megatron/initialize.py
8.38KB
gpt-neox-main/gpt-neox-main/megatron/learning_rates.py
5.1KB
gpt-neox-main/gpt-neox-main/megatron/logging.py
13.65KB
gpt-neox-main/gpt-neox-main/megatron/model/
-
gpt-neox-main/gpt-neox-main/megatron/model/activations.py
4.28KB
gpt-neox-main/gpt-neox-main/megatron/model/fused_bias_dropout.py
1.83KB
gpt-neox-main/gpt-neox-main/megatron/model/fused_layer_norm.py
4.77KB
gpt-neox-main/gpt-neox-main/megatron/model/fused_rope.py
4.84KB
gpt-neox-main/gpt-neox-main/megatron/model/fused_softmax.py
6.83KB
gpt-neox-main/gpt-neox-main/megatron/model/gmlp.py
4.97KB
gpt-neox-main/gpt-neox-main/megatron/model/gpt2_model.py
16.06KB
gpt-neox-main/gpt-neox-main/megatron/model/init_functions.py
7.49KB
gpt-neox-main/gpt-neox-main/megatron/model/mamba/
-
gpt-neox-main/gpt-neox-main/megatron/model/mamba/mamba.py
14.32KB
gpt-neox-main/gpt-neox-main/megatron/model/mamba/__init__.py
91B
gpt-neox-main/gpt-neox-main/megatron/model/megablocks_utils.py
896B
gpt-neox-main/gpt-neox-main/megatron/model/norms.py
2.89KB
gpt-neox-main/gpt-neox-main/megatron/model/positional_embeddings.py
9.93KB
gpt-neox-main/gpt-neox-main/megatron/model/rwkv/
-
gpt-neox-main/gpt-neox-main/megatron/model/rwkv/v6/
-
gpt-neox-main/gpt-neox-main/megatron/model/rwkv/v6/cuda/
-
gpt-neox-main/gpt-neox-main/megatron/model/rwkv/v6/cuda/wkv6_cuda.cu
7.87KB
gpt-neox-main/gpt-neox-main/megatron/model/rwkv/v6/cuda/wkv6_op.cpp
2.5KB
gpt-neox-main/gpt-neox-main/megatron/model/rwkv/v6/rwkv.py
12.46KB
gpt-neox-main/gpt-neox-main/megatron/model/rwkv/v6/__init__.py
59B
gpt-neox-main/gpt-neox-main/megatron/model/rwkv/__init__.py
-
gpt-neox-main/gpt-neox-main/megatron/model/transformer.py
49.76KB
gpt-neox-main/gpt-neox-main/megatron/model/utils.py
14.12KB
gpt-neox-main/gpt-neox-main/megatron/model/word_embeddings.py
9.4KB
gpt-neox-main/gpt-neox-main/megatron/model/__init__.py
894B
gpt-neox-main/gpt-neox-main/megatron/mpu/
-
gpt-neox-main/gpt-neox-main/megatron/mpu/cross_entropy.py
4.69KB
gpt-neox-main/gpt-neox-main/megatron/mpu/data.py
3.79KB
gpt-neox-main/gpt-neox-main/megatron/mpu/initialize.py
10.87KB
gpt-neox-main/gpt-neox-main/megatron/mpu/layers.py
27.37KB
gpt-neox-main/gpt-neox-main/megatron/mpu/mappings.py
4.83KB
gpt-neox-main/gpt-neox-main/megatron/mpu/random.py
1.53KB
gpt-neox-main/gpt-neox-main/megatron/mpu/utils.py
2.71KB
gpt-neox-main/gpt-neox-main/megatron/mpu/__init__.py
2.31KB
gpt-neox-main/gpt-neox-main/megatron/mup_substitute.py
7.62KB
gpt-neox-main/gpt-neox-main/megatron/neox_arguments/
-
gpt-neox-main/gpt-neox-main/megatron/neox_arguments/arguments.py
54.58KB
gpt-neox-main/gpt-neox-main/megatron/neox_arguments/deepspeed_args.py
11.86KB
gpt-neox-main/gpt-neox-main/megatron/neox_arguments/neox_args.py
34.96KB
gpt-neox-main/gpt-neox-main/megatron/neox_arguments/template.py
1.63KB
gpt-neox-main/gpt-neox-main/megatron/neox_arguments/__init__.py
2.89KB
gpt-neox-main/gpt-neox-main/megatron/optimizers.py
17.69KB
gpt-neox-main/gpt-neox-main/megatron/text_generation_utils.py
33.38KB
gpt-neox-main/gpt-neox-main/megatron/tokenizer/
-
gpt-neox-main/gpt-neox-main/megatron/tokenizer/tokenizer.py
11.15KB
gpt-neox-main/gpt-neox-main/megatron/tokenizer/train_tokenizer.py
3.89KB
gpt-neox-main/gpt-neox-main/megatron/tokenizer/__init__.py
651B
gpt-neox-main/gpt-neox-main/megatron/training.py
42.11KB
gpt-neox-main/gpt-neox-main/megatron/utils.py
16.87KB
gpt-neox-main/gpt-neox-main/megatron/__init__.py
929B
gpt-neox-main/gpt-neox-main/prepare_data.py
2.28KB
gpt-neox-main/gpt-neox-main/preprocess_data.sh
310B
gpt-neox-main/gpt-neox-main/pretrain.sh
62B
gpt-neox-main/gpt-neox-main/README-MUP.md
1.53KB
gpt-neox-main/gpt-neox-main/README.md
52.62KB
gpt-neox-main/gpt-neox-main/requirements/
-
gpt-neox-main/gpt-neox-main/requirements/requirements-apex-pip.txt
12B
gpt-neox-main/gpt-neox-main/requirements/requirements-dev.txt
142B
gpt-neox-main/gpt-neox-main/requirements/requirements-flashattention.txt
18B
gpt-neox-main/gpt-neox-main/requirements/requirements-mamba.txt
104B
gpt-neox-main/gpt-neox-main/requirements/requirements-onebitadam.txt
20B
gpt-neox-main/gpt-neox-main/requirements/requirements-s3.txt
25B
gpt-neox-main/gpt-neox-main/requirements/requirements-sparseattention.txt
14B
gpt-neox-main/gpt-neox-main/requirements/requirements-tensorboard.txt
20B
gpt-neox-main/gpt-neox-main/requirements/requirements-wandb.txt
15B
gpt-neox-main/gpt-neox-main/requirements/requirements.txt
395B
gpt-neox-main/gpt-neox-main/tests/
-
gpt-neox-main/gpt-neox-main/tests/common.py
22.44KB
gpt-neox-main/gpt-neox-main/tests/config/
-
gpt-neox-main/gpt-neox-main/tests/config/test_setup.yml
1.97KB
gpt-neox-main/gpt-neox-main/tests/conftest.py
3.37KB
gpt-neox-main/gpt-neox-main/tests/cpu_tests/
-
gpt-neox-main/gpt-neox-main/tests/cpu_tests/action.yml
3.45KB
gpt-neox-main/gpt-neox-main/tests/cpu_tests/docker-compose.yml
506B
gpt-neox-main/gpt-neox-main/tests/data/
-
gpt-neox-main/gpt-neox-main/tests/data/enwik8_first100.txt
3.28KB
gpt-neox-main/gpt-neox-main/tests/data/hf_cache/
-
gpt-neox-main/gpt-neox-main/tests/data/hf_cache/tokenizer/
-
gpt-neox-main/gpt-neox-main/tests/data/hf_cache/tokenizer/gpt2.json
2.01MB
gpt-neox-main/gpt-neox-main/tests/data/sample_prompt.txt
28B
gpt-neox-main/gpt-neox-main/tests/model/
-
gpt-neox-main/gpt-neox-main/tests/model/test_fused_kernels.py
7.94KB
gpt-neox-main/gpt-neox-main/tests/model/test_model_checkpoint.py
4.06KB
gpt-neox-main/gpt-neox-main/tests/model/test_model_generation.py
3.78KB
gpt-neox-main/gpt-neox-main/tests/model/test_model_instantiation.py
3.85KB
gpt-neox-main/gpt-neox-main/tests/model/test_model_train.py
3.5KB
gpt-neox-main/gpt-neox-main/tests/model/__init__.py
579B
gpt-neox-main/gpt-neox-main/tests/neox_args/
-
gpt-neox-main/gpt-neox-main/tests/neox_args/test_neoxargs_commandline.py
5.52KB
gpt-neox-main/gpt-neox-main/tests/neox_args/test_neoxargs_implementation.py
914B
gpt-neox-main/gpt-neox-main/tests/neox_args/test_neoxargs_load.py
4.95KB
gpt-neox-main/gpt-neox-main/tests/neox_args/test_neoxargs_usage.py
2.61KB
gpt-neox-main/gpt-neox-main/tests/neox_args/__init__.py
89B
gpt-neox-main/gpt-neox-main/tests/pytest.ini
746B
gpt-neox-main/gpt-neox-main/tests/README.md
1.56KB
gpt-neox-main/gpt-neox-main/tests/test_configs/
-
gpt-neox-main/gpt-neox-main/tests/test_configs/test_train_base.yml
3.44KB
gpt-neox-main/gpt-neox-main/tests/unit/
-
gpt-neox-main/gpt-neox-main/tests/unit/test_arguments.py
1.53KB
gpt-neox-main/gpt-neox-main/tests/unit/test_dependencies.py
196B
gpt-neox-main/gpt-neox-main/tests/unit/test_format_conversion_scripts.py
930B
gpt-neox-main/gpt-neox-main/tests/unit/test_launcher_scripts.py
3.84KB
gpt-neox-main/gpt-neox-main/tests/unit/test_tokenizer.py
333B
gpt-neox-main/gpt-neox-main/tests/unit/test_url_accessibility.py
691B
gpt-neox-main/gpt-neox-main/tests/unit/__init__.py
-
gpt-neox-main/gpt-neox-main/tests/__init__.py
-
gpt-neox-main/gpt-neox-main/tools/
-
gpt-neox-main/gpt-neox-main/tools/bash/
-
gpt-neox-main/gpt-neox-main/tools/bash/kill.sh
16B
gpt-neox-main/gpt-neox-main/tools/bash/killall.sh
55B
gpt-neox-main/gpt-neox-main/tools/bash/README.md
512B
gpt-neox-main/gpt-neox-main/tools/bash/sync.sh
845B
gpt-neox-main/gpt-neox-main/tools/bash/syncdir.sh
905B
gpt-neox-main/gpt-neox-main/tools/bash/sync_cmd.sh
741B
gpt-neox-main/gpt-neox-main/tools/ckpts/
-
gpt-neox-main/gpt-neox-main/tools/ckpts/convert_hf_to_sequential.py
22.29KB
gpt-neox-main/gpt-neox-main/tools/ckpts/convert_neox_to_hf.py
26.27KB
gpt-neox-main/gpt-neox-main/tools/ckpts/convert_neox_to_mamba_ssm.py
11.63KB
gpt-neox-main/gpt-neox-main/tools/ckpts/convert_raw_llama_weights_to_neox.py
21.61KB
gpt-neox-main/gpt-neox-main/tools/ckpts/inspect_checkpoints.py
11.78KB
gpt-neox-main/gpt-neox-main/tools/ckpts/merge20b.py
9.23KB
gpt-neox-main/gpt-neox-main/tools/ckpts/README.md
5.29KB
gpt-neox-main/gpt-neox-main/tools/ckpts/upload.py
1.51KB
gpt-neox-main/gpt-neox-main/tools/datasets/
-
gpt-neox-main/gpt-neox-main/tools/datasets/corpora.py
10.54KB
gpt-neox-main/gpt-neox-main/tools/datasets/dataset_token_count.py
876B
gpt-neox-main/gpt-neox-main/tools/datasets/merge_datasets.py
2.26KB
gpt-neox-main/gpt-neox-main/tools/datasets/multinode_prepare_data.sh
2.26KB
gpt-neox-main/gpt-neox-main/tools/datasets/preprocess_data.py
7.56KB
gpt-neox-main/gpt-neox-main/tools/datasets/preprocess_data_with_mask.py
12.3KB
gpt-neox-main/gpt-neox-main/tools/datasets/README.md
5.48KB
gpt-neox-main/gpt-neox-main/tools/README.md
736B
gpt-neox-main/gpt-neox-main/tools/__init__.py
-
gpt-neox-main/gpt-neox-main/train.py
1.31KB

Resource description

[![GitHub issues](https://img.shields.io/github/issues/EleutherAI/gpt-neox)](https://github.com/EleutherAI/gpt-neox/issues)
[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Weights & Biases monitoring" height=20>](https://wandb.ai/eleutherai/neox)

# GPT-NeoX

This repository records [EleutherAI](https://www.eleuther.ai)'s library for training large-scale language models on GPUs. Our current framework is based on NVIDIA's [Megatron Language Model](https://github.com/NVIDIA/Megatron-LM) and has been augmented with techniques from [DeepSpeed](https://www.deepspeed.ai) as well as some novel optimizations. We aim to make this repo a centralized and accessible place to gather techniques for training large-scale autoregressive language models, and accelerate research into large-scale training.

This library is in widespread use in [academic, industry, and government labs](https://github.com/EleutherAI/gpt-neox#adoption-and-publications), including by researchers at Oak Ridge National Lab, CarperAI, Stability AI, Together.ai, Korea University, Carnegie Mellon University, and the University of Tokyo, among others. Uniquely among similar libraries, GPT-NeoX supports a wide variety of systems and hardware, including launching via Slurm, MPI, and the IBM Job Step Manager, and has been run at scale on [AWS](https://aws.amazon.com/), [CoreWeave](https://www.coreweave.com/), [ORNL Summit](https://www.olcf.ornl.gov/summit/), [ORNL Frontier](https://www.olcf.ornl.gov/frontier/), [LUMI](https://www.lumi-supercomputer.eu/), and others.

**If you are not looking to train models with billions of parameters from scratch, this is likely the wrong library to use. For generic inference needs, we recommend you use the Hugging Face `transformers` library instead, which supports GPT-NeoX models.**

## Why GPT-NeoX?

GPT-NeoX leverages many of the same features and technologies as the popular Megatron-DeepSpeed library but with substantially increased usability and novel optimizations. Major features include:

* Distributed training with ZeRO and 3D parallelism
* Support for a wide variety of systems and hardware, including launching via Slurm, MPI, and the IBM Job Step Manager. The library has been run at scale on [AWS](https://aws.amazon.com/), [CoreWeave](https://www.coreweave.com/), Oak Ridge's [Summit](https://www.olcf.ornl.gov/summit/) and [Frontier](https://www.olcf.ornl.gov/frontier/), [Pacific Northwest National Laboratory](https://hpc.pnl.gov/index.shtml), Argonne's [Polaris](https://docs.alcf.anl.gov/polaris/data-science-workflows/applications/gpt-neox/), [LUMI](https://www.lumi-supercomputer.eu/), and more.
* Cutting edge architectural innovations including rotary and alibi positional embeddings, parallel feedforward attention layers, and flash attention.
* Predefined configurations for popular architectures including Pythia, PaLM, Falcon, and LLaMA 1 & 2
* Curriculum Learning
* Easy connections with the open source ecosystem, including Hugging Face's [tokenizers](https://github.com/huggingface/tokenizers) and [transformers](https://github.com/huggingface/transformers/) libraries, logging via [WandB](https://wandb.ai/site), and evaluation via our [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness).

## News

**[8/10/2023]** We now support checkpointing with AWS S3! Activate with the `s3_path` config option (for more detail, see [the PR](https://github.com/EleutherAI/gpt-neox/pull/1010))

**[9/20/2023]** As of https://github.com/EleutherAI/gpt-neox/pull/1035, we have deprecated Flash Attention 0.x and 1.x, and migrated support to Flash Attention 2.x. We don't believe this will cause problems, but if you have a specific use-case that requires old flash support using the latest GPT-NeoX, please raise an issue.

**[8/10/2023]** We have experimental support for LLaMA 2 and Flash Attention v2 in our [math-lm](https://github.com/EleutherAI/math-lm) project that will be upstreamed later this month.

**[5/17/2023]** After fixing some miscellaneous bugs we now fully support bf16.

**[4/11/2023]** We have upgraded our Flash Attention implementation to now support Alibi positional embeddings.

**[3/9/2023]** We have released GPT-NeoX 2.0.0, an upgraded version built on the latest DeepSpeed, which it will be regularly synced with going forward.

## Versions

Prior to 3/9/2023, GPT-NeoX relied on [DeeperSpeed](https://github.com/EleutherAI/DeeperSpeed), which was based on an old version of DeepSpeed (0.3.15). In order to migrate to the latest upstream DeepSpeed version while allowing users to access the old versions of GPT-NeoX and DeeperSpeed, we have introduced two versioned releases for both libraries:

- Version 2.0 of [GPT-NeoX](https://github.com/EleutherAI/gpt-neox/releases/tag/v2.0) and [DeeperSpeed](https://github.com/EleutherAI/DeeperSpeed/releases/tag/v2.0) are the latest versions built on the latest DeepSpeed, and will be maintained going forward.
- Version 1.0 of [GPT-NeoX](https://github.com/EleutherAI/gpt-neox/releases/tag/v1.0) and [DeeperSpeed](https://github.com/EleutherAI/DeeperSpeed/releases/tag/v1.0) maintain snapshots of the old stable versions that [GPT-NeoX-20B](https://arxiv.org/abs/2204.06745) and the [Pythia Suite](https://github.com/EleutherAI/pythia) were trained on.

# Contents

- [GPT-NeoX](#gpt-neox)
  * [Why GPT-NeoX?](#why-gpt-neox)
  * [News](#news)
  * [Versions](#versions)
- [Contents](#contents)
- [Quick Start](#quick-start)
  * [Environment and Dependencies](#environment-and-dependencies)
    + [Host Setup](#host-setup)
    + [Flash Attention](#flash-attention)
    + [Multi-Node Launching](#multi-node-launching)
    + [Containerized Setup](#containerized-setup)
  * [Usage](#usage)
- [Configuration](#configuration)
  * [Mixture of Experts](#mixture-of-experts)
- [Datasets](#datasets)
  * [Preconfigured Datasets](#preconfigured-datasets)
  * [Using Custom Data](#using-custom-data)
- [Training and Finetuning](#training-and-finetuning)
  * [Pretrained Models](#pretrained-models)
    + [GPT-NeoX-20B](#gpt-neox-20b)
    + [Pythia](#pythia)
    + [Polyglot](#polyglot)
- [Inference](#inference)
- [Evaluation](#evaluation)
- [Exporting to Hugging Face](#exporting-to-hugging-face)
- [Monitoring](#monitoring)
  * [Weights and Biases](#weights-and-biases)
  * [TensorBoard](#tensorboard)
- [Running on multi-node](#running-on-multi-node)
- [Profiling](#profiling)
- [Adoption and Publications](#adoption-and-publications)
  * [Publications](#publications)
  * [Models](#models)
    + [English LLMs](#english-llms)
    + [Non-English LLMs](#non-english-llms)
    + [Code Models](#code-models)
    + [Other Modalities](#other-modalities)
- [Administrative Notes](#administrative-notes)
  * [Citing GPT-NeoX](#citing-gpt-neox)
  * [Contributing](#contributing)
  * [Licensing](#licensing)
  * [Acknowledgements](#acknowledgements)

# Quick Start

## Environment and Dependencies

### Host Setup

First make sure you are in an environment with Python 3.8 and an appropriate version of PyTorch 1.8 or later installed. **Note:** Some of the libraries that GPT-NeoX depends on have not been updated to be compatible with Python 3.10+. Python 3.9 appears to work, but this codebase has been developed and tested for Python 3.8.

To install the remaining basic dependencies, run:

```bash
pip install -r requirements/requirements.txt
pip install -r requirements/requirements-wandb.txt # optional, if logging using WandB
pip install -r requirements/requirements-tensorboard.txt # optional, if logging via tensorboard
python ./megatron/fused_kernels/setup.py install # optional, if using fused kernels
```

from the repository root.

> [!Warning]
> Our codebase relies on [DeeperSpeed](https://github.com/EleutherAI/DeeperSpeed), our fork of the [DeepSpeed](https://github.com/microsoft/DeepSpeed) library with some added changes. We strongly recommend using Anaconda, a virtual machine, or some other form of environment isol
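
The version requirements stated under Host Setup (developed and tested on Python 3.8, Python 3.10+ not yet supported by some dependencies, PyTorch 1.8 or later) can be checked before installing anything. The following is a minimal sketch and is not part of the GPT-NeoX repository; the helper names `parse_version` and `check_environment` and the warning messages are hypothetical, and the PyTorch version is passed in as a string so the check runs even where PyTorch is not installed.

```python
import sys


def parse_version(text):
    """Return the leading (major, minor) integers of a version string such as '1.13.1+cu117'."""
    parts = text.split("+")[0].split(".")
    return int(parts[0]), int(parts[1])


def check_environment(python_version, torch_version):
    """Return a list of warnings for a (major, minor) Python tuple and a PyTorch version string.

    Hypothetical helper: the thresholds mirror the README text
    (tested on Python 3.8, Python 3.10+ unsupported, PyTorch 1.8 or later).
    Pass torch_version=None when PyTorch is not installed.
    """
    warnings = []
    if tuple(python_version[:2]) >= (3, 10):
        warnings.append("Python 3.10+ may be incompatible with some dependencies")
    elif tuple(python_version[:2]) != (3, 8):
        warnings.append("codebase is developed and tested on Python 3.8")
    if torch_version is None:
        warnings.append("PyTorch is not installed")
    elif parse_version(torch_version) < (1, 8):
        warnings.append("PyTorch 1.8 or later is required")
    return warnings


# Example: check the running interpreter against an assumed PyTorch version string.
for message in check_environment(sys.version_info[:2], "1.13.1"):
    print("warning:", message)
```

In a real environment the second argument would typically be `torch.__version__`; an empty result means neither requirement from the README is violated.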
