
- Predict, sample, repeat

- GPT stands for Generative Pretrained Transformer, a core type of neural network model.

- - "Visual Exploration in Chapters"

Can I ask what you used for text-to-voice? Reading up on speech synthesizers myself, the one you used here sounds fairly good at its job, considering how natural it sounds.

@, there is a transliteration from English to Chinese, but the grammar is technically incorrect. As someone who speaks Chinese, as do the folks at the Mandarin Blueprint YT channel, it's important to note that machine translation still has a ways to go.

- - Title: "Predicting Next Passage: Technical Terms Removed". Prediction: The passage will discuss the use of machine learning algorithms in medical imaging.

- - "Predicting Next Word: A Different Goal"

- - "Mystifying Success with Added Technology"

- - "Generate Story with Seed Text"

more of the story

- Overview of data flow through a transformer

see underlying distribution

- Inside a transformer

- - "Transformer Data Flow Overview"

You: “let’s kick things off…” Me: “holy F I thought that WAS the deep dive.”

- - "Attention Block for Vector Sequence"

Referring: ", each other and pass information back and forth to update their values." I understood based on the matrix calculations shown later in the video that while inference the information only moves forward (because of masking)while in training only it goes back and forth...?


- GPT-3 works by predicting the next chunk of text based on a snippet

- - "Predicting Next Text Chunks with AI"

- - Predicting Next Text with Seed

- - Predictive Game of Sampling

- - "Repeating Appended Data"

Why, for example, at , is the word with the highest probability not chosen, but instead one with a far lower value?

Chile reference!! ❤

Santiago mentioned 🗣🗣🗣 📢📢📢📢📢📢📢🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥

- - "Exploring Chapter Details"

- Chapter layout

- - "Reviewing Background Knowledge for Second Nature"

- - "Skip to the Good Stuff with Background Knowledge"

- - "Heart of Deep Learning: Attention Blocks"

Mentioned at , but I don't see any Chapter 7 on your website. I assume you are still working on it.

- The premise of Deep Learning

- Deep learning models use data to determine model behavior through tunable parameters.

- - "Model Behavior Analysis"

- - Predicting Image Labels with AI Model

- - Predicting House Prices with Two Continuous Parameters
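
As a toy illustration of "two continuous parameters": a linear house-price predictor is just a weighted sum whose slope and bias are the tunable parameters. The numbers below are made up; training would fit them to data.

```python
# Toy linear model: the two tunable parameters are the slope and the bias.
w_area = 150.0     # price per square meter (made-up value)
bias = 20_000.0    # base price (made-up value)

def predict_price(square_meters):
    return w_area * square_meters + bias

print(predict_price(80.0))  # 150 * 80 + 20000 = 32000.0
```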

What's the middle image? The left one is linear regression and the rightmost one is deep learning, but the middle one didn't get any mention. Is it referring to decision trees?

Vector with 12,800 entries * vector with 12,800 entries = dot product between cat and a single word (important: this only concerns the embedding matrix!); a vector with 12,800 entries then goes into the unembedding matrix, the dot product is taken with each of the 50,000 tokens, and everything is converted into probability values (1)

- - "Continuing with Previous Context"

- - "Explaining Choices Through Format Knowledge"

Hold on, why do they have to be real? Why wouldn't complex numbers also work?

- Deep learning models use weighted sums and non-linear functions to process data
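
That "weighted sum followed by a non-linear function" can be written out directly. The sketch below uses made-up weights and ReLU as the non-linearity.

```python
import numpy as np

def neuron(x, w, b):
    # A weighted sum of the inputs followed by a non-linear function (ReLU).
    return np.maximum(0.0, np.dot(w, x) + b)

x = np.array([1.0, -2.0, 0.5])   # inputs
w = np.array([0.3, 0.8, -0.5])   # tunable weights (made-up values)
b = 0.1                          # tunable bias
print(neuron(x, w, b))           # 0.0, because 0.3 - 1.6 - 0.25 + 0.1 < 0
```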

- - "Probabilistic Modeling of Next Tokens"

, what is the source for this information? Can you please share it? I really need it.

- - "Charming Model Despite Size"

- Word embeddings

- - "Breaking Up Input Text"

- Words are converted into vectors so that the machine learning model can work with them.

- - "Tokens Include Word Pieces & Punctuation"

- - "Broken into Words"

embedding matrix

- - "Training AI Models with 50k Embeddings"

damn that was schnice

- - "New Foundation Unveiled"

: Use of gensim to get the closest words to tower
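
For anyone who wants to reproduce that demo, a similar lookup can be done with gensim's downloader. Which pretrained vector set the video actually used is an assumption; the small GloVe set below is simply one gensim provides out of the box.

```python
import gensim.downloader as api

# Assumed vector set; any pretrained word vectors from gensim's downloader work.
vectors = api.load("glove-wiki-gigaword-50")

# Words whose embedding vectors point in the most similar direction to "tower".
print(vectors.most_similar("tower", topn=5))
```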

The King - Queen analogy () — It's a classic, but it still gives me goosebumps. The way word embeddings capture relationships like gender and royalty? Chef's kiss. 👑

- Model learns to associate directions with specific concepts
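
The king/queen example can be checked with the same kind of pretrained vectors (again, the exact vector set is an assumption): the direction from "man" to "woman" points roughly the same way as the direction from "king" to "queen".

```python
import numpy as np
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # assumed vector set, as above

# The direction from "man" to "woman" resembles the one from "king" to "queen".
gender_direction = vectors["woman"] - vectors["man"]
royalty_direction = vectors["queen"] - vectors["king"]
cosine = np.dot(gender_direction, royalty_direction) / (
    np.linalg.norm(gender_direction) * np.linalg.norm(royalty_direction)
)
print(cosine)  # clearly positive: the two directions roughly agree

# Equivalently, king - man + woman lands near "queen".
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```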

- - "Female Monarch? Find It Here!"

@ Everything I've been studying for the past three months suddenly snapped into focus.

Living for the Kyne reference at

bro! 😂

- - "Dot Product: A Way to Measure Vector Angle"

a token vector with approx.

- - "Hypothesize Embedding Model"

- GPT-3 utilizes a large embedding matrix for word representations.

Getting increasingly higher results from the dot products of the plurality vector and increasing numbers is crazy!

Size of the embedding matrix

- - "617 Million Weights in Total"

note: words

- Embeddings beyond words

This is the clearest layman explanation of how attention works that I've ever seen. Amazing.

- - "Predictive AI Model Empowerment"

Vector with 12,800 entries * vector with 12,800 entries = dot product between cat and a single word (important: this only concerns the embedding matrix!). From 20:58!!! the last vector with 12,800 entries goes into the unembedding matrix, then the dot product is taken with each of the 50,000 tokens and everything is converted into probability values. 24:03 temperature (5)

- - Trained With Context Size for GPT-3

- GPT-3 is trained with a context size of 2048
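
A context size of 2048 means only the most recent 2048 tokens are fed into the model when it predicts the next one; in code that amounts to clipping the running token list, roughly like this:

```python
CONTEXT_SIZE = 2048  # GPT-3's context length

def clip_to_context(tokens):
    # Only the most recent CONTEXT_SIZE tokens are visible to the model
    # when it predicts the next token.
    return tokens[-CONTEXT_SIZE:]

print(len(clip_to_context(list(range(5000)))))  # 2048
```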

- - "Incorporating predictions for fluent conversation"

At I'm curious about what the original question was. I was so surprised to see restaurants in Santiago, and specifically Ñuñoa!!! Hahahaha that's very specific! 🇨🇱

- Unembedding

Unembedding explained () — This one gets overlooked a lot, but it's like the decoder ring of the whole system. I appreciate the spotlight.

- - "Predict Next Token Probability"

- - "Snape Highly Rated Among Harry Potter Fans"

Disagree with Snape. The correct answer is Umbridge. But I assume that since Snape has far more occurrences than Umbridge in the training set, far more association can be established between Snape and negative traits. This is something AI needs to learn in order to grasp the amplitude of a character's emotional impact on readers lol

!!! last vector with

At around , there was something I do not understand: why are the other vectors of the last layer not used, and only the last word's vector multiplied with the unembedding matrix?

- - "Unimbedding Matrix: A Key Player"

As an ML researcher this is an amazing video ❤. But please allow me to nitpick a little at

- Understanding the purpose and function of the softmax function in deep learning

- - Total Billions Ahead: Mini-Lesson Ends Chapter

- Softmax with temperature

I don't understand much when it comes to the mathematical calculations, and to understand I will need to rewatch this series of videos several times and read the accompanying literature. The softmax converts the range from minimum to maximum into a "percentage" ratio from 0 to 1, and the temperature essentially makes it possible to gradually invert these values (provided the range is not limited).

- - "Next Word Distribution"

- - "Smaller values near 0 with 1"

Minor correction about how softmax works. Exponents can get out of hand very quickly and overwhelm the computer's memory. So to prevent that, what we can do is simply subtract the maximum value in our vector from all the other values, effectively turning the new maximum value into 0 and thus e^0 =1, with all the other values being smaller than 1. Then, when you use softmax to turn this new vector into a probability distribution, it's identical to what you would've gotten originally, but without the massive computational problem. It's really ingenious if you think about it.Disclaimer: I'm not an AI expert. I'm just reiterating what I learned from a sentdex video.
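
Here is that subtract-the-max trick written out. Shifting every logit by the same constant cancels in the numerator and denominator of softmax, so the output distribution is unchanged, but the exponentials can no longer overflow.

```python
import numpy as np

def softmax(x):
    # Subtracting the maximum before exponentiating prevents overflow; the
    # shift cancels in numerator and denominator, so the result is identical.
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([1000.0, 1001.0, 1002.0])
print(softmax(x))   # well-behaved probabilities
print(np.exp(x))    # without the shift: overflow, all infinities
```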

- - "Sum Positive Values"

Shouldn't the x values start from 0 too since when we are expressing the aggregate, we are starting from x0 and going till n-1? Like shouldn't it be x0 to x6 for the 7 values?

- - "ChatchyPt Creates Next Word"

Softmax with temperature () — Loved how you broke this down. Watching a robot sweat between “cat” and “bratwurst” at high temps? Hilarious and accurate.

Temperature

At around the mark, I noticed a slight discrepancy in the indexing of the softmax. Currently, it starts from e^x1 instead of e^x0. To maintain consistency, you might consider adjusting the limits to either start from 1 to N or begin from e^x0. It's a minor detail, but I thought it might enhance the overall clarity of your presentation. Keep up the excellent work!

- Isn't it impossible to set the temperature to zero, because you can't divide by zero?


- GPT-3 uses temperature to control word generation.

, the temperature cannot be 0; the closer it gets to zero, the more the "softmax with temperature" sharpens the largest value

A little, humble remark: At , you're talking about the temperature in softmax. I think setting T to 0 does not work, as it would lead to "division by zero". Maybe I'm getting it wrong ...

*approximately 0 (last time I checked you couldn't divide by zero) (Even though I'm smartassing, I find this video freakin awesome!!)
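
To tie the temperature comments together, here is a sketch of softmax with temperature. T = 0 cannot be plugged in directly (division by zero, as noted above), so it is handled as the limiting case of always picking the largest logit.

```python
import numpy as np

def softmax_with_temperature(logits, T):
    if T == 0:
        # The T -> 0 limit: put all probability on the largest logit,
        # since literally dividing by zero is not allowed.
        probs = np.zeros_like(logits, dtype=float)
        probs[np.argmax(logits)] = 1.0
        return probs
    scaled = logits / T
    e = np.exp(scaled - scaled.max())
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.5])
print(softmax_with_temperature(logits, 1.0))  # moderately peaked
print(softmax_with_temperature(logits, 5.0))  # flatter, more random sampling
print(softmax_with_temperature(logits, 0.0))  # [1. 0. 0.]
```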

- - "Maximizing Next Token Predictions with GPT-3"

- - Next Word Prediction Logits

- - "Laying Foundations for Attention Understanding"

- Up next

- - "Next Chapter Awaits Smooth Ride"

- - "Next Chapter Available for Review"

- Dive into attention and support the channel
