Learning dive-into-llms

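Project overview

This project is Dive into LLMs (Chinese title: 《动手学大模型》), a series of hands-on programming tutorials on large language models (LLMs), started by the NLP / AI course team at Shanghai Jiao Tong University and open-sourced on GitHub.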

老马啸西风, 2025/11/3, about 10 min read

dive-into-llms-01-Python basics primer

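Python basics

To truly get started with large models, a foundation in Python programming is essential.

That is because almost all of the major large-model frameworks and tools (Transformers, PyTorch, TensorFlow, LangChain, the OpenAI API, and so on) are used primarily through Python.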

老马啸西风, 2025/11/3, about 4 min read

dive-into-llms-02-deeplearning Deep learning fundamentals primer

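Deep learning fundamentals primer

This step is the key turning point from "being able to use large models" to "understanding large models".

You want to quickly build a basic understanding of deep learning: not for academic research, but so that you can follow the logic behind large models.

So let's explain it in the most accessible way possible.

🧠 1. What exactly is deep learning?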

老马啸西风, 2025/11/3, about 4 min read

dive-into-llms-03-Background concepts in natural language processing and large models

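Get a first understanding of some background concepts in natural language processing and large models: what pretraining, fine-tuning, the Transformer architecture, model inference, quantization, deployment, and so on are.

Great 👍, you have now reached the most important step in learning about large models: understanding the core concepts of natural language processing (NLP) and large models.

I'll use an approach that is as accessible, systematic, and engineering-oriented as possible to help you quickly build a complete mental map that takes you "from zero to being able to follow how large models work" 👇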
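
Of the concepts listed above, quantization is perhaps the easiest to show in a few lines. The following is a minimal sketch of symmetric 8-bit quantization in plain Python, assuming a single scale factor per list of weights. The function names are hypothetical and chosen for illustration; real toolchains use more elaborate schemes.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale (illustrative)."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Assumption: symmetric quantization, so zero maps exactly to zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integers and the shared scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
for original, approx in zip(weights, restored):
    print(f"{original:+.4f} -> {approx:+.4f}")
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the precision traded away for storing one byte per weight instead of a full float.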

老马啸西风, 2025/11/3, about 5 min read

dive-into-llms-04-Introduction to Transformers

https://huggingface.co/docs/transformers/main/zh/index

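Introduction to Transformers

State-of-the-art machine learning tools built for PyTorch.

🤗 Transformers provides APIs and tools for easily downloading and training state-of-the-art pretrained models.

Using pretrained models can reduce compute cost and carbon emissions, and saves the time and resources needed to train from scratch.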

老马啸西风, 2025/11/3, about 2 min read

dive-into-llms-05-Introduction to PyTorch

https://pytorch.org/

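What is it?

Let me give you a systematic, easy-to-follow introduction to PyTorch, a core framework that almost every large language model (LLM) developer needs to know.

💡 1. What is PyTorch?

PyTorch is a Python-based deep learning framework, released in 2016 by the AI research lab of Facebook (now Meta).
It is mainly used for:

Building neural network models

Training with GPU acceleration

Flexible scientific computing and automatic differentiation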
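
To make "automatic differentiation" concrete, here is a toy sketch of reverse-mode differentiation in plain Python. It is not PyTorch's implementation, and the `Value` class is a hypothetical name used only for illustration. In PyTorch itself the same idea is exposed by creating a tensor with `requires_grad=True`, calling `.backward()` on a scalar result, and reading the gradient from `.grad`.

```python
class Value:
    """A scalar that remembers how it was computed (illustrative only)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # inputs this value was computed from
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self):
        # Order nodes so each one is processed after everything that depends on it.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node._parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        # Chain rule: push each node's gradient back to its parents.
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


x = Value(3.0)
y = x * x + Value(2.0) * x  # y = x^2 + 2x
y.backward()
print(y.data, x.grad)  # dy/dx = 2x + 2
```

At `x = 3` the derivative `2x + 2` is 8, which is the value accumulated in `x.grad`.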

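老马啸西风, 2025/11/3, about 3 min read

dive-into-llms-06-Transformers environment setup and quick start

Get started with 🤗 Transformers!

Whether you are a developer or an everyday user, this quick tour will help you get started and show you how to use pipeline() for inference, load a pretrained model and preprocessor with an AutoClass, and quickly train a model with PyTorch or TensorFlow.

If you are a beginner, we recommend checking out our tutorials or course next for a more in-depth look at the concepts introduced here.

Before you begin

Before you start, make sure you have installed all the necessary libraries: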

!pip install transformers datasets evaluate accelerate
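
After running the install command, it can help to confirm what actually landed in the active environment. The following is a small sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the package list simply mirrors the install command above.

```python
from importlib import metadata

# Packages named in the install command above.
REQUIRED = ["transformers", "datasets", "evaluate", "accelerate"]


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as not installed was either skipped by pip or installed into a different interpreter than the one running the check.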

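老马啸西风, 2025/11/3, about 3 min read

dive-into-llms-07-Transformers pipeline getting-started example

python

I tested this directly from the command line and, unfortunately, it failed on the very first import.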

PS C:\Users\Administrator> python
Python 3.13.0a5 (tags/v3.13.0a5:076d169, Mar 12 2024, 21:29:03) [MSC v.1938 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from transformers import pipeline
D:\Users\Administrator\AppData\Local\Programs\Python\Python313\Lib\site-packages\requests\__init__.py:86: RequestsDependencyWarning: Unable to find acceptable character detection dependency (chardet or charset_normalizer).
  warnings.warn(
Traceback (most recent call last):
  File "D:\Users\Administrator\AppData\Local\Programs\Python\Python313\Lib\site-packages\numpy\_core\__init__.py", line 22, in <module>
    from . import multiarray
  File "D:\Users\Administrator\AppData\Local\Programs\Python\Python313\Lib\site-packages\numpy\_core\multiarray.py", line 11, in <module>
    from . import _multiarray_umath, overrides
ImportError: DLL load failed while importing _multiarray_umath: 找不到指定的程序。

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\Users\Administrator\AppData\Local\Programs\Python\Python313\Lib\site-packages\numpy\__init__.py", line 125, in <module>
    from numpy.__config__ import show_config
  File "D:\Users\Administrator\AppData\Local\Programs\Python\Python313\Lib\site-packages\numpy\__config__.py", line 4, in <module>
    from numpy._core._multiarray_umath import (
    ...<3 lines>...
    )
  File "D:\Users\Administrator\AppData\Local\Programs\Python\Python313\Lib\site-packages\numpy\_core\__init__.py", line 48, in <module>
    raise ImportError(msg) from exc
ImportError:

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.

We have compiled some common reasons and troubleshooting tips at:

    https://numpy.org/devdocs/user/troubleshooting-importerror.html

Please note and check the following:

  * The Python version is: Python3.13 from "D:\Users\Administrator\AppData\Local\Programs\Python\Python313\python.exe"
  * The NumPy version is: "2.3.4"

and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.

Original error was: DLL load failed while importing _multiarray_umath: 找不到指定的程序。


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    from transformers import pipeline
  File "D:\Users\Administrator\AppData\Local\Programs\Python\Python313\Lib\site-packages\transformers\__init__.py", line 27, in <module>
    from . import dependency_versions_check
  File "D:\Users\Administrator\AppData\Local\Programs\Python\Python313\Lib\site-packages\transformers\dependency_versions_check.py", line 16, in <module>
    from .utils.versions import require_version, require_version_core
  File "D:\Users\Administrator\AppData\Local\Programs\Python\Python313\Lib\site-packages\transformers\utils\__init__.py", line 24, in <module>
    from .auto_docstring import (
    ...<10 lines>...
    )
  File "D:\Users\Administrator\AppData\Local\Programs\Python\Python313\Lib\site-packages\transformers\utils\auto_docstring.py", line 30, in <module>
    from .generic import ModelOutput
  File "D:\Users\Administrator\AppData\Local\Programs\Python\Python313\Lib\site-packages\transformers\utils\generic.py", line 31, in <module>
    import numpy as np
  File "D:\Users\Administrator\AppData\Local\Programs\Python\Python313\Lib\site-packages\numpy\__init__.py", line 130, in <module>
    raise ImportError(msg) from e
ImportError: Erro
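
The traceback above reports Python 3.13.0a5, an alpha pre-release, together with NumPy 2.3.4. One plausible cause, which is an assumption rather than something the log confirms, is that the installed NumPy binaries were built for a final 3.13 release and do not load on the alpha interpreter. The sketch below uses only the standard library to check whether the running interpreter is a pre-release; the helper name `describe_interpreter` is hypothetical.

```python
import sys

# Short suffixes used in Python version strings such as "3.13.0a5".
LEVEL_SUFFIX = {"alpha": "a", "beta": "b", "candidate": "rc"}


def describe_interpreter(version_info=sys.version_info):
    """Return (version string, is_prerelease) for a sys.version_info-like tuple."""
    major, minor, micro, releaselevel, serial = version_info
    version = f"{major}.{minor}.{micro}"
    is_prerelease = releaselevel != "final"
    if is_prerelease:
        version += f"{LEVEL_SUFFIX.get(releaselevel, releaselevel)}{serial}"
    return version, is_prerelease


version, is_prerelease = describe_interpreter()
print("Running Python", version)
if is_prerelease:
    print("This is a pre-release interpreter; binary packages may fail to import.")
```

If the check reports a pre-release, trying a final Python release and reinstalling the packages there is a reasonable first step before digging further into the DLL error.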

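老马啸西风, 2025/11/3, about 6 min read

dive-into-llms-06-Transformers pipeline overview

Supported tasks

Task	Description	Modality	Pipeline example
Text classification	Assign a label to a given sequence of text	NLP	`pipeline(task="sentiment-analysis")`
Text generation	Generate text from a given prompt	NLP	`pipeline(task="text-generation")`
Named entity recognition	Assign a label to each token in a sequence (person, organization, location, and so on)	NLP	`pipeline(task="ner")`
Question answering	Extract the answer from the text, given a context and a question	NLP	`pipeline(task="question-answering")`
Fill-mask	Predict the masked token in a sequence	NLP	`pipeline(task="fill-mask")`
Summarization	Generate a summary of a text sequence or document	NLP	`pipeline(task="summarization")`
Translation	Translate text from one language into another	NLP	`pipeline(task="translation")`
Image classification	Assign a label to an image	Computer vision	`pipeline(task="image-classification")`
Image segmentation	Assign a label to each pixel of an image (semantic, panoptic, and instance segmentation)	Computer vision	`pipeline(task="image-segmentation")`
Object detection	Predict the bounding boxes and classes of objects in an image	Computer vision	`pipeline(task="object-detection")`
Audio classification	Assign a label to an audio file	Audio	`pipeline(task="audio-classification")`
Automatic speech recognition	Transcribe the speech in an audio file into text	Audio	`pipeline(task="automatic-speech-recognition")`
Visual question answering	Answer a question about an image, given the image and the question	Multimodal	`pipeline(task="vqa")`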
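
The task identifiers in the table above are passed to `pipeline()` as plain strings. As a rough sketch, a few of them can be collected in a lookup table with a small helper around `pipeline(task=...)`. The names `TASK_IDS` and `make_pipeline` are hypothetical, the sketch assumes `transformers` and a backend such as PyTorch are installed, and which tasks are available may vary by library version.

```python
# A few task identifiers copied from the table above.
TASK_IDS = {
    "text classification": "sentiment-analysis",
    "text generation": "text-generation",
    "named entity recognition": "ner",
    "question answering": "question-answering",
    "fill-mask": "fill-mask",
    "image classification": "image-classification",
    "automatic speech recognition": "automatic-speech-recognition",
}


def make_pipeline(task_name):
    """Look up the identifier for task_name and build the matching pipeline."""
    try:
        task_id = TASK_IDS[task_name.lower()]
    except KeyError:
        raise ValueError(f"unknown task: {task_name!r}") from None
    # Imported lazily so the lookup table can be used without transformers installed.
    from transformers import pipeline
    return pipeline(task=task_id)


# Example usage (downloads a default model on first use):
# classifier = make_pipeline("text classification")
# print(classifier("We are very happy to show you the Transformers library."))
```

Because no model is named, `pipeline()` falls back to a default model for the task, so for anything beyond a quick test it is usually better to pass an explicit `model=` argument.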

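老马啸西风, 2025/11/3, about 8 min read

dive-into-llms-104-The full history of machine learning, deep learning, and LLMs from the beginning to today

History of AI

Understanding the progression from machine learning to deep learning to large language models (LLMs) helps you see, from a panoramic view, how the whole AI field has evolved.

Below I systematically trace the technical thread from the 1950s to 2025, including key people, important papers, landmark models, and turning points.

We can divide the whole journey into seven eras.

🧭 1. AI timeline at a glance (1950s–2025)

Era	Period	Representative stage	Core characteristics
🌱 1. Early period	1950–1980	Early AI / symbolism	"Reasoning-style AI" based on logic and rules
🧩 2. Statistical learning period	1980–2000	Rise of traditional machine learning	Statistical methods + feature engineering
🔥 3. Deep learning revival	2006–2012	Neural network revival	Breakthroughs in training multi-layer neural networks
🚀 4. Deep learning boom	2012–2018	Boom in image recognition / speech / NLP	CNN, RNN, LSTM, Seq2Seq, Attention
🧠 5. Transformer era	2017–2020	"Attention Is All You Need", BERT, T5	Paradigm shift in sequence modeling
🌍 6. LLM era	2020–2023	GPT series (GPT-3, ChatGPT)	Pretraining + instruction tuning
🤖 7. Agent & Multimodal era	2023–2025+	GPT-4, Gemini, Claude, Mistral	Agents, multimodality, reasoning and tool use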

老马啸西风, 2025/11/3, about 6 min read