
introduction

- Introduction

pretraining data (internet)

- LLM Pre-training

Around , you explain a really interesting notion: that models need to "think" before producing a complex response, because each layer in a neural network performs only a finite amount of computation. I feel like it's somewhat related to the notion of computational irreducibility that Stephen Wolfram talks about. This is also why we humans need to spend some time thinking about complex issues before coming up with a good response.

But what if the ultimate joke about pelicans is actually 'the the the the the the,' and we simply don't have enough intelligence to understand it, just like an unusual move in the game of Go? XD

Wow, amazing, so much covered in a few hours. Saved me hours of research and inspired me to do more. Great work; looking forward to more interesting videos like this.

At , he talks about eliminating racist sites during corpus preprocessing. This can introduce bias by eliminating candid discussion of, for example, average IQ test scores of racial subgroups. Claude refuses to answer this altogether, calling race a constructed concept. ChatGPT and Gemini, at the time I queried them, both produced valid, honest outputs, which aligned with the research. Those of you so enamored with Claude are still trapped in Dario's echo chamber. But society has moved on now (2025). Will you?

tokenization

neural network I/O

- Neural Net & Training

neural network internals

inference

GPT-2: training and inference

Somewhere around , you said something about training on 1 million tokens. Do you mean you train on chunks of 1 million tokens at a time, or on batches of shorter sequences whose tokens add up to a million?
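For what it's worth, in standard GPT-2-style pretraining the ~1 million figure usually means the total number of tokens processed per optimization step (batch size × context length), not one contiguous million-token chunk. A minimal sketch of how such a batch is formed, with all numbers hypothetical and not taken from the video:

```python
import numpy as np

# Stand-in for a tokenized corpus: one long stream of token ids.
tokens = np.random.randint(0, 50257, size=10_000_000)

context_length = 1024   # GPT-2's context window
batch_size = 1024       # 1024 sequences per step -> ~1M tokens per optimization step

def get_batch(tokens, batch_size, context_length):
    # Sample random offsets and slice out (input, next-token target) windows.
    ix = np.random.randint(0, len(tokens) - context_length - 1, size=batch_size)
    x = np.stack([tokens[i : i + context_length] for i in ix])
    y = np.stack([tokens[i + 1 : i + 1 + context_length] for i in ix])
    return x, y

x, y = get_batch(tokens, batch_size, context_length)
print(x.shape)  # (1024, 1024): about a million tokens consumed in one step
```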

- GPUs & Model Costs

Llama 3.1 base model inference

Parallel universes!!! Just loving these analogies. Awesome!

pretraining to post-training

post-training data (conversations)

- Build LLM Assistant

"something went wrong" 😂 lol I love that he left this in there!

His genuine laugh at the ChatGPT error is so pure and spontaneous. How can someone not love Karpathy!? Sir, you are pure gold for humanity.

hallucinations, tool use, knowledge/working memory

The chapter about hallucinations was so insightful. I'd never heard it framed as a dataset issue, i.e., the model simply wasn't trained to say "I don't know", nor had I seen how one can test the knowledge of the model. Thanks!

Observation: at approximately , Andrej tests the question "Who is Orson Kovacs?" using falcon-7b-instruct in the HF playground, and the temperature is still 1.0, which makes the model respond with a balance between randomness and determinism. It still makes things up (hallucinates) there, but it would be good to also test with temperature below and above 1.0 to see how the factuality of the output varies.
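For readers curious what temperature actually does: a minimal sketch (not code from the video) of temperature-scaled sampling, where logits are divided by T before the softmax, so T < 1 sharpens the distribution (more deterministic) and T > 1 flattens it (more random):

```python
import numpy as np

def sample(logits, temperature=1.0, rng=np.random.default_rng(0)):
    # Divide logits by T, then apply a numerically stable softmax.
    z = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(z - z.max())
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical next-token logits
for T in (0.2, 1.0, 2.0):
    counts = np.bincount([sample(logits, T) for _ in range(1000)], minlength=4)
    print(T, counts / 1000)  # low T concentrates mass on the top token
```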

Regarding what you mentioned around the mark: is the reason you allow the model to say "I don't know", instead of augmenting it with the new knowledge, that there's an infinite amount of knowledge to learn, so it's virtually impossible to learn it all, and thus better to train the model to know when to refuse? In other words, if the model somehow COULD learn ALL the knowledge in the world, would we no longer need to train it to stop hallucinating? Thanks.

Thanks for the informative video! I have a question about training language models for tool use, specifically regarding the process you described around

knowledge of self

models need tokens to think

@. Question: I was just reading a paper recently (I believe it was from Anthropic, but sadly I can't find it now) which found that when they looked at "thinking models", the final answer was generally already determined well before the reasoning process began; the model then just fills in the chain of thought to get from the question to where it wants to go. Isn't this exactly what you said is not the correct way to handle this? Can you comment on why, if this is the "wrong" approach, it seems to be what modern models are doing?

@ that is elucidating! This is the first time I’ve heard of this concept. Thank you Andrej.

This teacher is very good at giving cute examples. I appreciate it, and I agree.

tokenization revisited: models struggle with spelling

Wow, I love this explanation of why these models fail at character-related and counting-related tasks.
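To see the failure mode concretely, here is a minimal sketch using the tiktoken library (chosen here for illustration; not necessarily the tokenizer shown in the video). The model receives subword ids, not letters, so spelling and counting require reasoning across opaque chunks:

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("gpt2")
ids = enc.encode("strawberry")
print(ids)                             # a handful of subword token ids
print([enc.decode([i]) for i in ids])  # the chunks the model actually "sees"
# Counting the r's in "strawberry" means reasoning over these chunks,
# since individual characters are never directly visible to the model.
```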

jagged intelligence

supervised finetuning to reinforcement learning

- Model Training in Practice

reinforcement learning

DeepSeek-R1

DeepSeek says “$3 is a bit expensive for an apple, but maybe they’re organic or something” 😂

What a treat!!! At , when you say it's "very busy, very ugly" because Google wasn't able to nail it, that was epic, haha.

AlphaGo

Thank you for the video Andrej! One small note: at , the dashed line in the AlphaGo Zero plot is the Elo of the version of AlphaGo that *defeated* Lee in 2016 (not the Elo of Lee himself).

reinforcement learning from human feedback (RLHF)

Tiny typo: "let's add it to the dataset and give it an ordering that's extremely like a score of 5" -> should be "let's add it to the dataset and give it an ordering that's extremely like a score of 1".

preview of things to come

keeping track of LLMs

If you have come this far, then finish the video and go build something with LLMs. 😊

where to find LLMs

grand summary

In principle, these models are capable of analogies no human has ever had. Wow 😮
