llms-from-scratch-cn-main.zip
Size: 42.65MB
Price: 14 points
Downloads: 0
Rating: 5.0
Uploader: u013818406
Last updated: 2025-09-22

Hands-on large-model tutorial: hand-coding an LLM from scratch

Resource file list (approximate)

File name
Size
llms-from-scratch-cn-main/
-
__MACOSX/._llms-from-scratch-cn-main
212B
llms-from-scratch-cn-main/Translated_Book/
-
__MACOSX/llms-from-scratch-cn-main/._Translated_Book
212B
llms-from-scratch-cn-main/images/
-
__MACOSX/llms-from-scratch-cn-main/._images
212B
llms-from-scratch-cn-main/README.md
12.86KB
__MACOSX/llms-from-scratch-cn-main/._README.md
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/
-
__MACOSX/llms-from-scratch-cn-main/._Model_Architecture_Discussions
212B
llms-from-scratch-cn-main/.gitignore
3.22KB
__MACOSX/llms-from-scratch-cn-main/._.gitignore
212B
llms-from-scratch-cn-main/Book/
-
__MACOSX/llms-from-scratch-cn-main/._Book
212B
llms-from-scratch-cn-main/Codes/
-
__MACOSX/llms-from-scratch-cn-main/._Codes
212B
llms-from-scratch-cn-main/LICENSE.txt
1.02KB
__MACOSX/llms-from-scratch-cn-main/._LICENSE.txt
212B
llms-from-scratch-cn-main/Translated_Book/ch01/
-
__MACOSX/llms-from-scratch-cn-main/Translated_Book/._ch01
212B
llms-from-scratch-cn-main/Translated_Book/ch04/
-
__MACOSX/llms-from-scratch-cn-main/Translated_Book/._ch04
212B
llms-from-scratch-cn-main/Translated_Book/ch03/
-
__MACOSX/llms-from-scratch-cn-main/Translated_Book/._ch03
212B
llms-from-scratch-cn-main/Translated_Book/img/
-
__MACOSX/llms-from-scratch-cn-main/Translated_Book/._img
212B
llms-from-scratch-cn-main/Translated_Book/ch02/
-
__MACOSX/llms-from-scratch-cn-main/Translated_Book/._ch02
212B
llms-from-scratch-cn-main/Translated_Book/ch05/
-
__MACOSX/llms-from-scratch-cn-main/Translated_Book/._ch05
212B
llms-from-scratch-cn-main/images/mental-model.jpg
173.65KB
__MACOSX/llms-from-scratch-cn-main/images/._mental-model.jpg
212B
llms-from-scratch-cn-main/images/cover.jpg
47.2KB
__MACOSX/llms-from-scratch-cn-main/images/._cover.jpg
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._llama3
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi-3/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._phi-3
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/olmo/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._olmo
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._MiniCPM
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v1/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._rwkv-v1
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._rwkv-v6
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._pangu
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/mamba/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._mamba
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._rwkv-compare
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/.keep
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._.keep
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._ChatGLM4
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._ChatGLM3
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/img/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._img
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/openelm/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._openelm
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/gptj/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._gptj
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._rwkv-v3
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v4/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._rwkv-v4
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._rwkv-v5
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._rwkv-v2
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/._phi
212B
llms-from-scratch-cn-main/Book/ch06/
-
__MACOSX/llms-from-scratch-cn-main/Book/._ch06
212B
llms-from-scratch-cn-main/Book/ch01/
-
__MACOSX/llms-from-scratch-cn-main/Book/._ch01
212B
llms-from-scratch-cn-main/Book/ch04/
-
__MACOSX/llms-from-scratch-cn-main/Book/._ch04
212B
llms-from-scratch-cn-main/Book/ch03/
-
__MACOSX/llms-from-scratch-cn-main/Book/._ch03
212B
llms-from-scratch-cn-main/Book/ch02/
-
__MACOSX/llms-from-scratch-cn-main/Book/._ch02
212B
llms-from-scratch-cn-main/Book/ch05/
-
__MACOSX/llms-from-scratch-cn-main/Book/._ch05
212B
llms-from-scratch-cn-main/Codes/ch07/
-
__MACOSX/llms-from-scratch-cn-main/Codes/._ch07
212B
llms-from-scratch-cn-main/Codes/ch06/
-
__MACOSX/llms-from-scratch-cn-main/Codes/._ch06
212B
llms-from-scratch-cn-main/Codes/ch01/
-
__MACOSX/llms-from-scratch-cn-main/Codes/._ch01
212B
llms-from-scratch-cn-main/Codes/appendix-B/
-
__MACOSX/llms-from-scratch-cn-main/Codes/._appendix-B
212B
llms-from-scratch-cn-main/Codes/ch04/
-
__MACOSX/llms-from-scratch-cn-main/Codes/._ch04
212B
llms-from-scratch-cn-main/Codes/ch03/
-
__MACOSX/llms-from-scratch-cn-main/Codes/._ch03
212B
llms-from-scratch-cn-main/Codes/ch02/
-
__MACOSX/llms-from-scratch-cn-main/Codes/._ch02
212B
llms-from-scratch-cn-main/Codes/ch05/
-
__MACOSX/llms-from-scratch-cn-main/Codes/._ch05
212B
llms-from-scratch-cn-main/Codes/appendix-A/
-
__MACOSX/llms-from-scratch-cn-main/Codes/._appendix-A
212B
llms-from-scratch-cn-main/Translated_Book/ch01/1.1什么是LLM.md
20.69KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._1.1什么是LLM.md
212B
llms-from-scratch-cn-main/Translated_Book/ch01/1.0理解大型语言模型.md
14.14KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._1.0理解大型语言模型.md
212B
llms-from-scratch-cn-main/Translated_Book/ch01/1.8总结.ipynb
2.63KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._1.8总结.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch01/.keep
-
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._.keep
212B
llms-from-scratch-cn-main/Translated_Book/ch01/1.6深入剖析GPT架构.ipynb
5.44KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._1.6深入剖析GPT架构.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch01/1.7构建大语言模型.ipynb
2.47KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._1.7构建大语言模型.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch01/1.2LLMs的应用.md
5.15KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._1.2LLMs的应用.md
212B
llms-from-scratch-cn-main/Translated_Book/ch01/1.5利用大型数据集.ipynb
5.59KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._1.5利用大型数据集.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch01/welcome.ipynb
21.17KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch01/._welcome.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.7 生成文本.ipynb
6.47KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.7 生成文本.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.5 在transfomer模块中连接注意力层和线性层.ipynb
12.37KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.5 在transfomer模块中连接注意力层和线性层.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch04/.keep
-
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._.keep
212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.6 编码GPT模型.ipynb
15.42KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.6 编码GPT模型.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.1.ipynb
17.16KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.1.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.3 实现使用 GELU 激活函数的前馈网络.ipynb
56.12KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.3 实现使用 GELU 激活函数的前馈网络.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.2 使用层归一化对激活进行归一化.ipynb
15.38KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.2 使用层归一化对激活进行归一化.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.2.ipynb
15.38KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.2.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.4 增加快捷链接.ipynb
11.6KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.4 增加快捷链接.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.1 从头开始实现 GPT 模型以生成文本.ipynb
17.16KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.1 从头开始实现 GPT 模型以生成文本.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch04/4.6 编码GPT模型-Copy1.ipynb
15.42KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch04/._4.6 编码GPT模型-Copy1.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch03/3.1.ipynb
9.1KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._3.1.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch03/3.3.ipynb
25.13KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._3.3.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch03/3.7.ipynb
2.4KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._3.7.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch03/3.5.ipynb
27.44KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._3.5.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch03/3.2.ipynb
3.82KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._3.2.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch03/3.4.ipynb
25.56KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._3.4.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch03/.keep
-
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._.keep
212B
llms-from-scratch-cn-main/Translated_Book/ch03/3.6.ipynb
23.54KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch03/._3.6.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-1.jpg
94.35KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-1.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-12.jpg
138.23KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-12.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-13.jpg
167.87KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-13.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-26.jpg
115.67KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-26.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-9.jpg
129.5KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-9.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-2.jpg
131.44KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-2.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-24.jpg
128.16KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-24.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-18.jpg
196.89KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-18.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-11.jpg
93.37KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-11.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-19.jpg
126.72KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-19.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-10.jpg
167.49KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-10.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-25.jpg
83.88KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-25.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-3.jpg
104.26KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-3.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-8.jpg
61.73KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-8.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-7.jpg
47.44KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-7.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-14.jpg
87.46KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-14.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-21.jpg
43.3KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-21.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-20.jpg
45.71KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-20.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-15.jpg
168.83KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-15.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-6.jpg
70.01KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-6.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-4.jpg
74.29KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-4.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-17.jpg
112.79KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-17.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-22.jpg
172.46KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-22.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-9.jpg
81.89KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-9.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-8.jpg
109.97KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-8.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-23.jpg
61.8KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-23.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-16.jpg
155.7KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-16.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-5.jpg
79.79KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-5.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-10.jpg
150.51KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-10.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-9.jpg
213.86KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-9.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-8.jpg
95.4KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-8.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-11.jpg
90.18KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-11.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-13.jpg
107.2KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-13.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-12.jpg
133.53KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-12.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-9.jpg
66.19KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-9.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-16.jpg
115.53KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-16.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-17.jpg
125.99KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-17.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-D-1.jpg
50.15KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-D-1.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-8.jpg
79.49KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-8.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/.keep
1B
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._.keep
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-15.jpg
86.82KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-15.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-8.jpg
104.5KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-8.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-9.jpg
97.65KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-9.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-14.jpg
135.22KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-14.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-D-2.jpg
59.13KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-D-2.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-6.jpg
101.75KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-6.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-19.jpg
147.47KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-19.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-10.jpg
93.3KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-10.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-4.jpg
126.46KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-4.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-6.png
87.65KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-6.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-1.jpg
102.06KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-1.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-5.jpg
105.8KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-5.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-18.jpg
69.81KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-18.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-11.jpg
202.93KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-11.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-11.png
208.53KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-11.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-7.jpg
112.03KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-7.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-5.jpg
83.27KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-5.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-13.png
182.67KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-13.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-13.jpg
91.33KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-13.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-7.jpg
56.11KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-7.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-5.png
152.63KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-5.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-3.png
117.99KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-3.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-2.jpg
102KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-2.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-4.png
134.22KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-4.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-12.jpg
70.08KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-12.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-6.png
132.5KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-6.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-12.png
69.83KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-12.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-4.jpg
70.08KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-4.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-2.jpg
101.12KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-2.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-16.jpg
94.51KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-16.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-6.jpg
93.44KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-6.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-7.jpg
95.7KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-7.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/cover-1.jpg
83.35KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._cover-1.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-1.png
225.12KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-1.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-17.jpg
137.73KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-17.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-3.jpg
104.25KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-3.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-1.jpg
76.02KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-1.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-3.jpg
113.68KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-3.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-20.png
86.1KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-20.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1-1.jpg
68.49KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1-1.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-15.jpg
108.58KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-15.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-3.png
202.77KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-3.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-5.png
157.49KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-5.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-4.jpg
111.83KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-4.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/cover-2.jpg
83.55KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._cover-2.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-2.png
188KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-2.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-5-14.jpg
72.42KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-5-14.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-21.png
138.58KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-21.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-2.jpg
81.32KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-2.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-3.jpg
114.82KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-3.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/Figure 1.2.png
67.07KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._Figure 1.2.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-10.jpg
80.32KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-10.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-3.png
209.49KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-3.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-8.jpg
118.12KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-8.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-12.jpg
85.3KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-12.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-5.jpg
42.94KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-5.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-4.jpg
87.74KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-4.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-13.jpg
109.86KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-13.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-9.jpg
197.05KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-9.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-2.png
121.98KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-2.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-11.jpg
96.81KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-11.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/Figure 1.3.png
87.5KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._Figure 1.3.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-2.jpg
123.38KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-2.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/Figure 1.1.png
54.69KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._Figure 1.1.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-13.jpg
65.57KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-13.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-18.jpg
101.23KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-18.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-11.jpg
112.16KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-11.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-6.jpg
134.34KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-6.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-1.7-1.jpg
939.45KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-1.7-1.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-7.jpg
120.84KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-7.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-10.jpg
87.2KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-10.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-1.png
188.84KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-1.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-A-12.jpg
71.63KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-A-12.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-1.jpg
89.94KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-1.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-5.jpg
153.75KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-5.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/Figure 1.4.png
129.01KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._Figure 1.4.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-5.png
274.39KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-5.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-3.jpg
78.54KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-3.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-14.jpg
82.49KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-14.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-15.jpg
72.77KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-15.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-2.jpg
85.79KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-2.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-4.png
218.43KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-4.png
212B
llms-from-scratch-cn-main/Translated_Book/img/Figure 1.5.png
109.51KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._Figure 1.5.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-4.jpg
151KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-4.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-6.jpg
148.58KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-6.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-6.png
217.09KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-6.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-17.jpg
88.76KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-17.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-3-16.jpg
85.69KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-3-16.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-2-1.jpg
106.27KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-2-1.jpg
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-7.png
216.16KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-7.png
212B
llms-from-scratch-cn-main/Translated_Book/img/Figure 1.6.png
76.27KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._Figure 1.6.png
212B
llms-from-scratch-cn-main/Translated_Book/img/fig-4-7.jpg
92.46KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/img/._fig-4-7.jpg
212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.1理解词嵌入.ipynb
6.61KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.1理解词嵌入.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.5 字节对编码(BPE).ipynb
101.49KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.5 字节对编码(BPE).ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.8词位置编码.ipynb
8.11KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.8词位置编码.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.6使用滑动窗口进行数据采样.ipynb
20.38KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.6使用滑动窗口进行数据采样.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch02/.keep
-
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._.keep
212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.7 构建词符嵌入.ipynb
6.24KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.7 构建词符嵌入.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.文本数据处理.ipynb
3.25KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.文本数据处理.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.2文本分词(序列化).ipynb
13.23KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.2文本分词(序列化).ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.3将令牌转换为令牌 ID.ipynb
16.38KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.3将令牌转换为令牌 ID.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch02/2.4添加特殊上下文tokens.ipynb
13.69KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch02/._2.4添加特殊上下文tokens.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch05/5.1 在未标记的数据上进行预训练.ipynb
63.19KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch05/._5.1 在未标记的数据上进行预训练.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch05/.keep
-
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch05/._.keep
212B
llms-from-scratch-cn-main/Translated_Book/ch05/5.3.ipynb
23.63KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch05/._5.3.ipynb
212B
llms-from-scratch-cn-main/Translated_Book/ch05/5.2.ipynb
14.26KB
__MACOSX/llms-from-scratch-cn-main/Translated_Book/ch05/._5.2.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/llama3-from-scratch.ipynb
289.18KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._llama3-from-scratch.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/LICENSE
1.05KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._LICENSE
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/requirements.txt
48B
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._requirements.txt
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._images
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/params.txt
182B
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._params.txt
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/params.json
212B
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._params.json
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/tokenizer.model
2.08MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._tokenizer.model
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/README.md
44.81KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/._README.md
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi-3/modeling_phi3.py
70.42KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/phi-3/._modeling_phi3.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi-3/phi-3.ipynb
8.7KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/phi-3/._phi-3.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi-3/configuration_phi3.py
9.25KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/phi-3/._configuration_phi3.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/olmo/configuration_olmo.py
7.81KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/olmo/._configuration_olmo.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/olmo/olmo.ipynb
6.33KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/olmo/._olmo.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/olmo/modeling_olmo.py
57.71KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/olmo/._modeling_olmo.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/configuration_minicpm.py
2.4KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._configuration_minicpm.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/tokenizer_config.json
1.11KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._tokenizer_config.json
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/special_tokens_map.json
414B
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._special_tokens_map.json
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/MiniCPM.ipynb
56.76KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._MiniCPM.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/gitattributes
1.52KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._gitattributes
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/config.json
712B
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._config.json
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/tokenizer.json
5.92MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._tokenizer.json
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/MiniCPM.py
31.54KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._MiniCPM.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/generation_config.json
113B
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._generation_config.json
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/tokenizer.model
1.9MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._tokenizer.model
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/README.md
11.31KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._README.md
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/MiniCPMTest.ipynb
9.94KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/MiniCPM/._MiniCPMTest.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v1/model.py
21.5KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v1/._model.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v1/readme.md
8.83KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v1/._readme.md
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/RWKV_v6_demo.ipynb
15.14KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/._RWKV_v6_demo.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/rwkv_vocab_v20230424.txt
1.04MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/._rwkv_vocab_v20230424.txt
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/img/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/._img
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/RWKV-v6-guide.ipynb
21.63KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/._RWKV-v6-guide.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/tokenization_gptpangu_bak.py
4.58KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/._tokenization_gptpangu_bak.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/modeling_gptpangu.py
21.67KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/._modeling_gptpangu.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/pangu.ipynb
12.99KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/._pangu.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/tokenization_gptpangu.py
4.15KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/._tokenization_gptpangu.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/configuration_gptpangu.py
1.83KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/pangu/._configuration_gptpangu.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/mamba/demo.ipynb
10.42KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/mamba/._demo.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/mamba/model.py
12.17KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/mamba/._model.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/mamba/README.md
1.32KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/mamba/._README.md
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/model_v5.py
9.98KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/._model_v5.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/model_v1.py
21.5KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/._model_v1.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/model_v4.py
6.42KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/._model_v4.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/readme.md
11.02KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/._readme.md
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/model_v3.py
8.92KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/._model_v3.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/model_v6.py
9.43KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/._model_v6.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/model_v2.py
8.03KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-compare/._model_v2.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/chatglm4.ipynb
11.67KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/._chatglm4.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/configuration_chatglm.py
2.21KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/._configuration_chatglm.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/tokenization_chatglm.py
15.28KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/._tokenization_chatglm.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/chatglm4-guide.ipynb
188.91KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/._chatglm4-guide.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/modeling_chatglm.py
51.74KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM4/._modeling_chatglm.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/glm.py
46.9KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._glm.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/tokenizer_config.json
1.38KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._tokenizer_config.json
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/quantization.py
14.32KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._quantization.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/tokenization_chatglm.py
12.69KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._tokenization_chatglm.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/configuration_chatglm_full.py
1.07KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._configuration_chatglm_full.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/tokenizer.model
994.5KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._tokenizer.model
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/README.md
1.43KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._README.md
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/img/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._img
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/加载模型权重.ipynb
79.31KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/._加载模型权重.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/img/.keep
1B
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/img/._.keep
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/openelm/openelm.ipynb
11.18KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/openelm/._openelm.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/openelm/configuration_openelm.py
13.83KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/openelm/._configuration_openelm.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/openelm/modeling_openelm.py
38.32KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/openelm/._modeling_openelm.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/gptj/gptj.ipynb
8.33KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/gptj/._gptj.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/gptj/modeling_gptj.py
60.99KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/gptj/._modeling_gptj.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/gptj/configuration_gptj.py
7.99KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/gptj/._configuration_gptj.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/model_run.py
11.06KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/._model_run.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/20B_tokenizer.json
2.35MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/._20B_tokenizer.json
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/model.py
8.97KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/._model.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/rwkv-v3-guide.ipynb
27.67KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/._rwkv-v3-guide.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/rwkv-v3.ipynb
10.57KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/._rwkv-v3.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/utils.py
3.98KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v3/._utils.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v4/20B_tokenizer.json
2.35MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v4/._20B_tokenizer.json
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v4/rwkv-v4-guide.ipynb
21.21KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v4/._rwkv-v4-guide.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/rwkv_vocab_v20230424.txt
1.04MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/._rwkv_vocab_v20230424.txt
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/img/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/._img
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/RWKV_v5_demo.ipynb
21.4KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/._RWKV_v5_demo.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/RWKV-v5-guide.ipynb
42.41KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/._RWKV-v5-guide.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/20B_tokenizer.json
2.35MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/._20B_tokenizer.json
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/rwkv-v2-guide.ipynb
33.25KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/._rwkv-v2-guide.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/model.py
8.03KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/._model.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/img/
-
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/._img
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/rwkv-v2.ipynb
35.39KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/._rwkv-v2.ipynb
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi/modeling_phi.py
66.49KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/phi/._modeling_phi.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi/configuration_phi.py
8.26KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/phi/._configuration_phi.py
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/phi/phi.ipynb
13.82KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/phi/._phi.ipynb
212B
llms-from-scratch-cn-main/Book/ch06/.keep
-
__MACOSX/llms-from-scratch-cn-main/Book/ch06/._.keep
212B
llms-from-scratch-cn-main/Book/ch01/.keep
-
__MACOSX/llms-from-scratch-cn-main/Book/ch01/._.keep
212B
llms-from-scratch-cn-main/Book/ch04/.keep
-
__MACOSX/llms-from-scratch-cn-main/Book/ch04/._.keep
212B
llms-from-scratch-cn-main/Book/ch03/.keep
-
__MACOSX/llms-from-scratch-cn-main/Book/ch03/._.keep
212B
llms-from-scratch-cn-main/Book/ch02/.keep
-
__MACOSX/llms-from-scratch-cn-main/Book/ch02/._.keep
212B
llms-from-scratch-cn-main/Book/ch05/.keep
-
__MACOSX/llms-from-scratch-cn-main/Book/ch05/._.keep
212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/._01_main-chapter-code
212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/._03_model-evaluation
212B
llms-from-scratch-cn-main/Codes/ch07/05_dataset-generation/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/._05_dataset-generation
212B
llms-from-scratch-cn-main/Codes/ch07/README.md
740B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/._README.md
212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/._02_dataset-utilities
212B
llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/._04_preference-tuning-with-dpo
212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/._01_main-chapter-code
212B
llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/._02_bonus_additional-experiments
212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/._03_bonus_imdb-classification
212B
llms-from-scratch-cn-main/Codes/ch01/README.md
84B
__MACOSX/llms-from-scratch-cn-main/Codes/ch01/._README.md
212B
llms-from-scratch-cn-main/Codes/appendix-B/README.md
829B
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-B/._README.md
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/._01_main-chapter-code
212B
llms-from-scratch-cn-main/Codes/ch04/README.md
147B
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/._README.md
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/._01_main-chapter-code
212B
llms-from-scratch-cn-main/Codes/ch03/README.md
120B
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/._README.md
212B
llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/._02_bonus_bytepair-encoder
212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/._03_bonus_embedding-vs-matmul
212B
llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/._01_main-chapter-code
212B
llms-from-scratch-cn-main/Codes/ch02/README.md
500B
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/._README.md
212B
llms-from-scratch-cn-main/Codes/ch02/09_summary/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/._09_summary
212B
llms-from-scratch-cn-main/Codes/ch05/04_learning_rate_schedulers/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/._04_learning_rate_schedulers
212B
llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/._03_bonus_pretraining_on_gutenberg
212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/._01_main-chapter-code
212B
llms-from-scratch-cn-main/Codes/ch05/README.md
600B
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/._README.md
212B
llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/._05_bonus_hparam_tuning
212B
llms-from-scratch-cn-main/Codes/ch05/02_alternative_weight_loading/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/._02_alternative_weight_loading
212B
llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/
-
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/._03_main-chapter-code
212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/
-
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/._01_optional-python-setup-preferences
212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/
-
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/._02_installing-python-libraries
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/archi.png
845.81KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._archi.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/ropesplit.png
401.41KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._ropesplit.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/tokens.png
488.49KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._tokens.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/keys.png
430.16KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._keys.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/embeddings.png
470.5KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._embeddings.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/attention.png
202.27KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._attention.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/implllama3_39_0.png
26.96KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._implllama3_39_0.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/implllama3_41_0.png
25.82KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._implllama3_41_0.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/heads.png
799.73KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._heads.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/last_norm.png
1003.83KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._last_norm.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/god.png
1.21MB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._god.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/qkv.png
497.17KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._qkv.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/swiglu.png
604.83KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._swiglu.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/freq_cis.png
813.92KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._freq_cis.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/model.png
658.84KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._model.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/implllama3_42_0.png
27.37KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._implllama3_42_0.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/rms.png
340.74KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._rms.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/softmax.png
190.99KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._softmax.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/value.png
199.91KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._value.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/weightmatrix.png
379.86KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._weightmatrix.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/qsplit.png
551.01KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._qsplit.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/keys0.png
422.6KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._keys0.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/q_per_token.png
483.94KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._q_per_token.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/implllama3_30_0.png
48.6KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._implllama3_30_0.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/finallayer.png
799.14KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._finallayer.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/norm.png
308.67KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._norm.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/stacked.png
383.59KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._stacked.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/v0.png
188.19KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._v0.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/a10.png
633.97KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._a10.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/42.png
772.73KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._42.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/implllama3_54_0.png
27.37KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._implllama3_54_0.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/afterattention.png
289.26KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._afterattention.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/rope.png
516.22KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._rope.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/norm_after.png
297.39KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._norm_after.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/mask.png
471.46KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._mask.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/implllama3_52_0.png
25.81KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._implllama3_52_0.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/implllama3_50_0.png
26.94KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._implllama3_50_0.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/karpathyminbpe.png
787.45KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._karpathyminbpe.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/qkmatmul.png
189.33KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/llama3/images/._qkmatmul.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/img/01.png
100.34KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v6/img/._01.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/img/img.png
111.38KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/ChatGLM3/img/._img.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/img/01.png
100.34KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v5/img/._01.png
212B
llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/img/01.png
231.82KB
__MACOSX/llms-from-scratch-cn-main/Model_Architecture_Discussions/rwkv-v2/img/._01.png
212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/exercise-solutions.ipynb
36.83KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._exercise-solutions.ipynb
212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/ch07.ipynb
125.91KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._ch07.ipynb
212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/previous_chapters.py
17.6KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._previous_chapters.py
212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/instruction-data.json
198.75KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._instruction-data.json
212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/exercise_experiments.py
18.98KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._exercise_experiments.py
212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/ollama_evaluate.py
3.88KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._ollama_evaluate.py
212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/load-finetuned-model.ipynb
6.01KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._load-finetuned-model.ipynb
212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/README.md
3.36KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._README.md
212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/gpt_download.py
5.61KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._gpt_download.py
212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/instruction-data-with-response.json
28.59KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._instruction-data-with-response.json
212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/tests.py
597B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._tests.py
212B
llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/gpt_instruction_finetuning.py
11.14KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/01_main-chapter-code/._gpt_instruction_finetuning.py
212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/config.json
115B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/._config.json
212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/eval-example-data.json
36.01KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/._eval-example-data.json
212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/._scores
212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/README.md
1.02KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/._README.md
212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/llm-instruction-eval-openai.ipynb
20.12KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/._llm-instruction-eval-openai.ipynb
212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/llm-instruction-eval-ollama.ipynb
23.12KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/._llm-instruction-eval-ollama.ipynb
212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/requirements-extra.txt
28B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/._requirements-extra.txt
212B
llms-from-scratch-cn-main/Codes/ch07/05_dataset-generation/llama3-ollama.ipynb
29.48KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/05_dataset-generation/._llama3-ollama.ipynb
212B
llms-from-scratch-cn-main/Codes/ch07/05_dataset-generation/README.md
295B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/05_dataset-generation/._README.md
212B
llms-from-scratch-cn-main/Codes/ch07/05_dataset-generation/instruction-data-llama3-7b.json
10KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/05_dataset-generation/._instruction-data-llama3-7b.json
212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/config.json
115B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/._config.json
212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/instruction-examples-modified.json
53.55KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/._instruction-examples-modified.json
212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/README.md
2.16KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/._README.md
212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/create-passive-voice-entries.ipynb
11.94KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/._create-passive-voice-entries.ipynb
212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/instruction-examples.json
38.43KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/._instruction-examples.json
212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/requirements-extra.txt
47B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/._requirements-extra.txt
212B
llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/find-near-duplicates.py
5.08KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/02_dataset-utilities/._find-near-duplicates.py
212B
llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb
179.94KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/._dpo-from-scratch.ipynb
212B
llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/instruction-data-with-preference.json
377.9KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/._instruction-data-with-preference.json
212B
llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/previous_chapters.py
17.62KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/._previous_chapters.py
212B
llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/README.md
366B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/._README.md
212B
llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/create-preference-data-ollama.ipynb
21.23KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/04_preference-tuning-with-dpo/._create-preference-data-ollama.ipynb
212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/exercise-solutions.ipynb
5.1KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/._exercise-solutions.ipynb
212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/previous_chapters.py
11.75KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/._previous_chapters.py
212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/ch06.ipynb
137.77KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/._ch06.ipynb
212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/README.md
700B
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/._README.md
212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/gpt-class-finetune.py
15.34KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/._gpt-class-finetune.py
212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/gpt_download.py
3.76KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/._gpt_download.py
212B
llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/tests.py
597B
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/01_main-chapter-code/._tests.py
212B
llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/previous_chapters.py
13.21KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/._previous_chapters.py
212B
llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/additional-experiments.py
20.45KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/._additional-experiments.py
212B
llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/README.md
8.64KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/._README.md
212B
llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/gpt_download.py
3.76KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/02_bonus_additional-experiments/._gpt_download.py
212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/train-sklearn-logreg.py
2.83KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._train-sklearn-logreg.py
212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/previous_chapters.py
11.75KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._previous_chapters.py
212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/sklearn-baseline.ipynb
7.88KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._sklearn-baseline.ipynb
212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/download-prepare-dataset.py
3.07KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._download-prepare-dataset.py
212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/README.md
3.44KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._README.md
212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/gpt_download.py
3.76KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._gpt_download.py
212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/train-bert-hf.py
10.71KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._train-bert-hf.py
212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/train-gpt.py
12.97KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._train-gpt.py
212B
llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/requirements-extra.txt
40B
__MACOSX/llms-from-scratch-cn-main/Codes/ch06/03_bonus_imdb-classification/._requirements-extra.txt
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/exercise-solutions.ipynb
11.57KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/._exercise-solutions.ipynb
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/previous_chapters.py
3.86KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/._previous_chapters.py
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/ch04.ipynb
82.48KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/._ch04.ipynb
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/README.md
502B
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/._README.md
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/._figures
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/gpt.py
9.39KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/._gpt.py
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/ch03.ipynb
71.89KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/._ch03.ipynb
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/exercise-solutions.ipynb
7.86KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/._exercise-solutions.ipynb
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/small-text-sample.txt
1.92KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/._small-text-sample.txt
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/README.md
264B
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/._README.md
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/._figures
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/multihead-attention.ipynb
15.62KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/._multihead-attention.ipynb
212B
llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/gpt2_model/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/._gpt2_model
212B
llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/bpe_openai_gpt2.py
7.81KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/._bpe_openai_gpt2.py
212B
llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/README.md
233B
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/._README.md
212B
llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/compare-bpe-tiktoken.ipynb
10.8KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/._compare-bpe-tiktoken.ipynb
212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/._images
212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/README.md
218B
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/._README.md
212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/embeddings-and-linear-layers.ipynb
12.33KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/._embeddings-and-linear-layers.ipynb
212B
llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/dataloader.ipynb
4.67KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/._dataloader.ipynb
212B
llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/exercise-solutions.ipynb
7.27KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/._exercise-solutions.ipynb
212B
llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/ch02.ipynb
45.42KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/._ch02.ipynb
212B
llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/README.md
221B
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/._README.md
212B
llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/the-verdict.txt
20KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/01_main-chapter-code/._the-verdict.txt
212B
llms-from-scratch-cn-main/Codes/ch02/09_summary/09_summary.ipynb
2.07KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/09_summary/._09_summary.ipynb
212B
llms-from-scratch-cn-main/Codes/ch05/04_learning_rate_schedulers/README.md
506B
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/04_learning_rate_schedulers/._README.md
212B
llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/prepare_dataset.py
2.82KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/._prepare_dataset.py
212B
llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/previous_chapters.py
11.02KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/._previous_chapters.py
212B
llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/README.md
6.14KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/._README.md
212B
llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/pretraining_simple.py
8.29KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/03_bonus_pretraining_on_gutenberg/._pretraining_simple.py
212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/ch05.ipynb
143.95KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._ch05.ipynb
212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/images/
-
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._images
212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/previous_chapters.py
9.35KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._previous_chapters.py
212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/README.md
578B
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._README.md
212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/gpt_train.py
7.91KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._gpt_train.py
212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/gpt_download.py
3.49KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._gpt_download.py
212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/gpt_generate.py
9.68KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._gpt_generate.py
212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/tests.py
1.24KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/._tests.py
212B
llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/hparam_search.py
7.46KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/._hparam_search.py
212B
llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/previous_chapters.py
9.62KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/._previous_chapters.py
212B
llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/README.md
745B
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/._README.md
212B
llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/the-verdict.txt
20KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/05_bonus_hparam_tuning/._the-verdict.txt
212B
llms-from-scratch-cn-main/Codes/ch05/02_alternative_weight_loading/weight-loading-hf-transformers.ipynb
11.17KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/02_alternative_weight_loading/._weight-loading-hf-transformers.ipynb
212B
llms-from-scratch-cn-main/Codes/ch05/02_alternative_weight_loading/previous_chapters.py
9.88KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/02_alternative_weight_loading/._previous_chapters.py
212B
llms-from-scratch-cn-main/Codes/ch05/02_alternative_weight_loading/README.md
319B
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/02_alternative_weight_loading/._README.md
212B
llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/code-part2.ipynb
11.36KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/._code-part2.ipynb
212B
llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/exercise-solutions.ipynb
3.71KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/._exercise-solutions.ipynb
212B
llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/code-part1.ipynb
30.47KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/._code-part1.ipynb
212B
llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/DDP-script.py
5.09KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/03_main-chapter-code/._DDP-script.py
212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/README.md
3.48KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/._README.md
212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/
-
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/._figures
212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/requirements.txt
137B
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/._requirements.txt
212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/README.md
2.11KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/._README.md
212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/
-
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/._figures
212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/python_environment_check.ipynb
1.29KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/._python_environment_check.ipynb
212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/python_environment_check.py
2.22KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/._python_environment_check.py
212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/llama3-8b-model-2-response.json
393B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/._llama3-8b-model-2-response.json
212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/llama3-8b-model-1-response.json
402B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/._llama3-8b-model-1-response.json
212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/gpt4-model-1-response.json
445B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/._gpt4-model-1-response.json
212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/gpt4-model-2-response.json
408B
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/._gpt4-model-2-response.json
212B
llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/correlation-analysis.ipynb
33.8KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch07/03_model-evaluation/scores/._correlation-analysis.ipynb
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/overview-after-ln.webp
20.73KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._overview-after-ln.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/gpt.webp
29.85KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._gpt.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/mental-model-final.webp
20.98KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._mental-model-final.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/use-gpt.webp
14.97KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._use-gpt.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/shortcut-example.webp
32.1KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._shortcut-example.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/mental-model.webp
25.04KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._mental-model.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/iterative-generate.webp
23.72KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._iterative-generate.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/chapter-steps.webp
29.38KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._chapter-steps.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/layernorm2.webp
13.96KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._layernorm2.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/generate-text.webp
36.26KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._generate-text.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/transformer-block.webp
25.86KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._transformer-block.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/mental-model-2.webp
14.87KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._mental-model-2.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/mental-model-3.webp
21.01KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._mental-model-3.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/iterative-gen.webp
17.92KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._iterative-gen.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/gpt-in-out.webp
20.97KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._gpt-in-out.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/layernorm.webp
26.98KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._layernorm.webp
212B
llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/ffn.webp
24.34KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch04/01_main-chapter-code/figures/._ffn.webp
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/weight-selfattn-3.png
53.18KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._weight-selfattn-3.png
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/weight-selfattn-2.png
60.61KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._weight-selfattn-2.png
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/dot-product.png
93.4KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._dot-product.png
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/weight-selfattn-1.png
52.21KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._weight-selfattn-1.png
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/weight-selfattn-4.png
53.85KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._weight-selfattn-4.png
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/attention.png
66.62KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._attention.png
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/masked.png
59.09KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._masked.png
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/single-head.png
71.34KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._single-head.png
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/multi-head.png
59.77KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._multi-head.png
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/attention-matrix.png
136.29KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._attention-matrix.png
212B
llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/dropout.png
62.86KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch03/01_main-chapter-code/figures/._dropout.png
212B
llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/gpt2_model/encoder.json
1017.87KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/02_bonus_bytepair-encoder/gpt2_model/._encoder.json
212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/4.png
290.55KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/._4.png
212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/5.png
288.61KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/._5.png
212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/2.png
132.57KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/._2.png
212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/3.png
216.44KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/._3.png
212B
llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/1.png
133.33KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch02/03_bonus_embedding-vs-matmul/images/._1.png
212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/images/img-1.webp
86.94KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/images/._img-1.webp
212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/images/img-3.webp
58.68KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/images/._img-3.webp
212B
llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/images/img-2.webp
72.46KB
__MACOSX/llms-from-scratch-cn-main/Codes/ch05/01_main-chapter-code/images/._img-2.webp
212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/download.png
174.07KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/._download.png
212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/pytorch-installer.jpg
94.51KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/._pytorch-installer.jpg
212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/new-env.png
185.38KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/._new-env.png
212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/miniforge-install.png
258.47KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/._miniforge-install.png
212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/check-pip.png
219.68KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/._check-pip.png
212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/conda-install.png
186.52KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/._conda-install.png
212B
llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/activate-env.png
180KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/01_optional-python-setup-preferences/figures/._activate-env.png
212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/watermark.jpg
35.99KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/._watermark.jpg
212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/pytorch-installer.jpg
94.51KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/._pytorch-installer.jpg
212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/jupyter-issues.jpg
102.72KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/._jupyter-issues.jpg
212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/check_2.jpg
78.97KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/._check_2.jpg
212B
llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/check_1.jpg
107.24KB
__MACOSX/llms-from-scratch-cn-main/Codes/appendix-A/02_installing-python-libraries/figures/._check_1.jpg
212B

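Resource description

If you want to write the code yourself from scratch and build a large language model, this project is a good fit. "LLMs From Scratch" is a hands-on tutorial from Datawhale on building a ChatGPT-like large language model (LLM) from the ground up. Through detailed guidance, code examples, and deep learning resources, it aims to help developers and researchers master the core techniques for creating large language models and their architectures. The project includes step-by-step tutorials that build GLM4, Llama3, and RWKV6 from scratch, so you can understand in depth how large models work.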
# Implementing llama3 from scratch

In this file, I implement llama3 from scratch, one tensor and one matrix at a time. It can be run locally: llama3-from-scratch.ipynb<br>
In addition, I load the tensors directly from the model file that Meta provides for llama3, so you need to download the weights before running this file. This is the official link for downloading the weights: [click here to download the weights](https://llama.meta.com/llama-downloads/)

<div>
    <img src="images/archi.png"/>
</div>

https://hf-mirror.com/NousResearch/Meta-Llama-3-8B

https://gitee.com/hf-models/Meta-Llama-3-8B-Instruct/

## Tokenizer

I am not going to implement a BPE tokenizer (but Andrej Karpathy has a very clean implementation).<br>
Link to his implementation: [click here to see his implementation](https://github.com/karpathy/minbpe)

<div>
    <img src="images/karpathyminbpe.png" width="600"/>
</div>

```python
%env HF_ENDPOINT = "https://hf-mirror.com"
```

    env: HF_ENDPOINT="https://hf-mirror.com"

```python
%pip install blobfile -q
```

    Note: you may need to restart the kernel to use updated packages.

```python
from pathlib import Path
import tiktoken
from tiktoken.load import load_tiktoken_bpe
import torch
import json
import matplotlib.pyplot as plt

tokenizer_path = "./tokenizer.model"
special_tokens = [
    "<|begin_of_text|>",
    "<|end_of_text|>",
    "<|reserved_special_token_0|>",
    "<|reserved_special_token_1|>",
    "<|reserved_special_token_2|>",
    "<|reserved_special_token_3|>",
    "<|start_header_id|>",
    "<|end_header_id|>",
    "<|reserved_special_token_4|>",
    "<|eot_id|>",  # end of turn
] + [f"<|reserved_special_token_{i}|>" for i in range(5, 256 - 5)]
mergeable_ranks = load_tiktoken_bpe(tokenizer_path)
tokenizer = tiktoken.Encoding(
    name=Path(tokenizer_path).name,
    pat_str=r"(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\r\n\p{L}\p{N}]?\p{L}+|\p{N}{1,3}| ?[^\s\p{L}\p{N}]+[\r\n]*|\s*[\r\n]+|\s+(?!\S)|\s+",
    mergeable_ranks=mergeable_ranks,
    special_tokens={token: len(mergeable_ranks) + i for i, token in enumerate(special_tokens)},
)

tokenizer.decode(tokenizer.encode("hello world!"))
```

    'hello world!'

## Reading the model file

Normally, how you read a model file depends on how the model classes are written and on the variable names inside them.<br>
But since we are implementing llama3 from scratch, we will read the file one tensor at a time.

<div>
    <img src="images/model.png" width="600"/>
</div>

The model can be downloaded here: https://gitee.com/hf-models/Meta-Llama-3-8B-Instruct/blob/main/original/consolidated.00.pth

```python
!wget 'https://lfs.gitee.com/api/lfs/storage/projects/34266234/be52262c9289304f3e8240e0749bf257bc04264405a86cd4de38efb9068724ee?Expires=1716626632&Signature=xgDOu9JHNM6ECazR3nA4NQHwXs%2BiG%2BCtnzza6ekSuqs%3D&FileName=consolidated.00.pth'
```

    --2024-05-25 16:24:15--  https://lfs.gitee.com/api/lfs/storage/projects/34266234/be52262c9289304f3e8240e0749bf257bc04264405a86cd4de38efb9068724ee?Expires=1716626632&Signature=xgDOu9JHNM6ECazR3nA4NQHwXs%2BiG%2BCtnzza6ekSuqs%3D&FileName=consolidated.00.pth
    Resolving lfs.gitee.com (lfs.gitee.com)... 180.76.198.180
    Connecting to lfs.gitee.com (lfs.gitee.com)|180.76.198.180|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 16060617592 (15G) [application/octet-stream]
    Saving to: ‘be52262c9289304f3e8240e0749bf257bc04264405a86cd4de38efb9068724ee?Expires=1716626632&Signature=xgDOu9JHNM6ECazR3nA4NQHwXs+iG+Ctnzza6ekSuqs=&FileName=consolidated.00.pth’

    0% [ ] 105,193,134 453KB/s eta 11h 21m^C

On my machine the model loads in about 12 s. From here on, inference runs on the CPU only; 30 GB of RAM is enough on my side. CPU inference takes roughly 30 s per token, which is a bit slow, but our main goal is to understand the principles.

```python
model = torch.load("/data1/ckw/consolidated.00.pth")
print(json.dumps(list(model.keys())[:20], indent=4))
```

    [
        "tok_embeddings.weight",
        "layers.0.attention.wq.weight",
        "layers.0.attention.wk.weight",
        "layers.0.attention.wv.weight",
        "layers.0.attention.wo.weight",
        "layers.0.feed_forward.w1.weight",
        "layers.0.feed_forward.w3.weight",
        "layers.0.feed_forward.w2.weight",
        "layers.0.attention_norm.weight",
        "layers.0.ffn_norm.weight",
        "layers.1.attention.wq.weight",
        "layers.1.attention.wk.weight",
        "layers.1.attention.wv.weight",
        "layers.1.attention.wo.weight",
        "layers.1.feed_forward.w1.weight",
        "layers.1.feed_forward.w3.weight",
        "layers.1.feed_forward.w2.weight",
        "layers.1.attention_norm.weight",
        "layers.1.ffn_norm.weight",
        "layers.2.attention.wq.weight"
    ]

```python
with open("./params.json", "r") as f:
    config = json.load(f)
config
```

    {'dim': 4096,
     'n_layers': 32,
     'n_heads': 32,
     'n_kv_heads': 8,
     'vocab_size': 128256,
     'multiple_of': 1024,
     'ffn_dim_multiplier': 1.3,
     'norm_eps': 1e-05,
     'rope_theta': 500000.0}

## We use this config to infer details about the model, such as:

1. The model has 32 Transformer layers
2. Each multi-head attention block has 32 heads
3. The vocabulary size, and so on

```python
dim = config["dim"]
n_layers = config["n_layers"]
n_heads = config["n_heads"]
n_kv_heads = config["n_kv_heads"]
vocab_size = config["vocab_size"]
multiple_of = config["multiple_of"]
ffn_dim_multiplier = config["ffn_dim_multiplier"]
norm_eps = config["norm_eps"]
rope_theta = torch.tensor(config["rope_theta"])
```

## Converting text to tokens

Here we use tiktoken (an OpenAI library, I believe) as the tokenizer

<div>
    <img src="images/tokens.png" width="600"/>
</div>

```python
prompt = "the answer to the ultimate question of life, the universe, and everything is "
tokens = [128000] + tokenizer.encode(prompt)
print(tokens)
tokens = torch.tensor(tokens)
prompt_split_as_tokens = [tokenizer.decode([token.item()]) for token in tokens]
print(prompt_split_as_tokens)
```

    [128000, 1820, 4320, 311, 279, 17139, 3488, 315, 2324, 11, 279, 15861, 11, 323, 4395, 374, 220]
    ['<|begin_of_text|>', 'the', ' answer', ' to', ' the', ' ultimate', ' question', ' of', ' life', ',', ' the', ' universe', ',', ' and', ' everything', ' is', ' ']

## Converting tokens to their embeddings

This is the only part of the codebase where I use a built-in neural network module.<br>
Anyway, our [17x1] tokens are now [17x4096], i.e. 17 embeddings of length 4096 (one per token).<br>
<br>
Note: keep track of the shapes; it makes everything much easier to understand

<div>
    <img src="images/embeddings.png" width="600"/>
</div>

```python
embedding_layer = torch.nn.Embedding(vocab_size, dim)
embedding_layer.weight.data.copy_(model["tok_embeddings.weight"])
token_embeddings_unnormalized = embedding_layer(tokens).to(torch.bfloat16)
token_embeddings_unnormalized.shape
```

    torch.Size([17, 4096])

## We then normalize the embeddings using RMS normalization

Note that the shape does not change after this step; only the values are normalized.<br>
One thing to keep in mind: we need a norm_eps (from the config) because we do not want to accidentally end up with an RMS of 0 and divide by 0.<br>
Here is the formula:

<div>
    <img src="images/rms.png" width="600"/>
</div>

```python
# def rms_norm(tensor, norm_weights):
#     rms = (tensor.pow(2).mean(-1, keepdim=True) + norm_eps)**0.5
#     return tensor * (norm_weights / rms)
def rms_norm(tensor, norm_weights):
    return (tensor * torch.rsqrt(tensor.pow(2).mean(-1, keepdim=True) + norm_eps)) * norm_weights
```

# Building the first layer of the Transformer

### Normalization

You will see me accessing `layers.0` from the model dict (this is the first layer).<br>
Anyway, after normalizing, the shape is still [17x4096], the same as the embeddings, but normalized

<div>
    <img src="images/norm.png" width="600"/>
</div>

```python
token_embeddings = rms_norm(token_embeddings_unnormalized, model["layers.0.attention_norm.weight"])
to
```
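To make the RMS normalization step easier to check by hand, here is a minimal sketch in plain Python (no torch) that applies the same formula as `rms_norm` to a single toy vector. The helper name `rms_norm_list`, the two-element vector, and the all-ones weights are illustrative assumptions and not part of the original notebook; only the `norm_eps` value is taken from the config shown above.

```python
import math

NORM_EPS = 1e-5  # same value as 'norm_eps' in the config shown above


def rms_norm_list(values, weights, eps=NORM_EPS):
    """Plain-Python RMS norm of one vector (hypothetical helper, for illustration only)."""
    # Mean of the squared entries, i.e. tensor.pow(2).mean(-1) for a single row.
    mean_square = sum(v * v for v in values) / len(values)
    # eps keeps the denominator positive even for an all-zero vector.
    inv_rms = 1.0 / math.sqrt(mean_square + eps)
    # Scale every entry by 1/RMS, then by its learned per-dimension weight.
    return [v * inv_rms * w for v, w in zip(values, weights)]


toy_vector = [3.0, 4.0]   # toy stand-in for one 4096-dimensional embedding
toy_weights = [1.0, 1.0]  # toy stand-in for layers.0.attention_norm.weight
normalized = rms_norm_list(toy_vector, toy_weights)
print(normalized)

# An all-zero input stays at zero instead of raising a division error.
zeros = rms_norm_list([0.0, 0.0], toy_weights)
print(zeros)
```

With all-ones weights, the mean of the squared outputs should land very close to 1, which is what "normalized" means here; the length of the vector is unchanged, matching the note that the shape does not change.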
