Video count: 214

- Where facts in LLMs live

Starts at

Wait, we don't actually know how it works fully?

Hold on, they build the LLM and don't know how the facts are cataloged? This is gonna be a doozy.

- Quick refresher on transformers

My understanding is that an LLM does not in fact store facts, but through the process of predicting word associations through being trained on an absolutely astoundingly large set of examples, it "stores" the likelihood that the word "basketball" is historically the most likely next word in the series. It doesn't have any grasp of a concept of basketball in any sort of meaningful or even static way. This is exactly the problem I'm trying to solve, and honestly I think I found a solution. I just don't know yet how reliable it is on a large scale, or how economic it is in terms of the required computing power. We'll see.

to

What would be simplified is half a circumference with two half circumferences inside it, into infinity... that would show some precision... as to how the machine AI is doing this efficiently... while increasing the accuracy of predictions... the selection line on the graph... sequence of vector lines... attention lines

The tokens (words) convey context information to each other, making each embedding a richer, more nuanced version than the simple meaning of the word. When this animation is shown, the arrows are shown moving from a later token to an earlier token as well. Isn't this contradictory to the concept introduced during masking, where it is said that only earlier words are allowed to enrich the later words? (This is a common animation shown multiple times in this series.)

Only the joker would pick stranger over stronger

Live in a "high dimension"? Please expand.

I was unable to reproduce woman-man ~= aunt-uncle using either OpenAIEmbedding model 'text-embedding-3-small' or the older 'text-embedding-ada-002' model using LangChain. Cosine similarity of 0.29. I tried lots of pairings: aunt-uncle, woman-man, sister-brother, and queen-king. All had cosine similarities in the range 0.29 to 0.38. Happy to share my work if you're curious.
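A minimal sketch of the check being described, for anyone who wants to reproduce it. The embeddings below are random placeholders standing in for whatever model you test (e.g. 'text-embedding-3-small'); note that the analogy claim is about *difference* vectors, not the raw word embeddings.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two vectors.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Placeholders: swap in real embeddings fetched from the model you are testing.
e_woman, e_man, e_aunt, e_uncle = (np.random.randn(1536) for _ in range(4))

# Compare the difference vectors (woman - man) and (aunt - uncle).
print("cos(woman - man, aunt - uncle):", cosine(e_woman - e_man, e_aunt - e_uncle))
```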

this subtraction of vector makes me wonder if all of category theory can be described using linear algebra

If we consider this higher-dimensional embedding space, in which each direction encodes a specific meaning, each vector in this space represents a certain distinct concept, right? (a vector 'meaning' man, woman, uncle, or aunt, as per the example at ).

Kinda like how neural synapses work...when neurons wire together, they fire together and then it tickles one of the adjacent "dormant" neurons and it lights up with a memory like, "Oh yeah! Totally forgot about that until you just mentioned it again to me...." right?

I wonder, what word sits in the center? What is [0, 0, 0, ..., 0] ?

- Assumptions for our toy example

This is unironically how I understand "the spectrum" of autism, for example.

Are you implying that the vectors are not normalized, and therefore a dot product of 1 does not mean they are parallel? So what we call semantic similarity is not a measure of pointing in the same direction? So can the dot product be 1 in several directions at the same time?

I don't understand the assumptions made about dot products. Why is a dot product of 1 used to mean that the vector encodes that particular direction/concept? I would have thought that the vector needs to be parallel to that concept vector to assume it encodes that concept. But then a vector would only be able to encode one concept. Is this why dot product = 1 is just sort of conventionally chosen?

How can dot product of a vector with both "Michael" and "Jordan" be 1 when earlier it was said that "Michael" and "Jordan" are nearly orthogonal to each other?

- Inside a multilayer perceptron

Does that sequence of high-dimension vectors (let’s call it a 1D array) in the MLP behave as its own tensor in the LLM?

You are telling me that the AI is Akinator

Who determines "bias" or is it a "vector" with a "code" as well?

Otherwise all borders go through (0, 0).

, "...continuing with the deep learning tradition of overly fancy names..." 😂🤣😂

So this is just an "if, then" function?

The bias exists to move the border between yes and no. It is literally the b in y = mx + b. Without it, all lines y = mx go through (0, 0).

So the weights are simultaneously nudged to form vector encodings for output words as columns, and patterns in rows to get the values of how much each column should be used based on multiplication by input?

As for the question of what the bias does - it's just a control for the height at which you put the threshold of the ReLU. This way you can clip the data at different values depending on the context.
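A tiny numpy sketch of that reading of the bias (illustrative numbers only, not the video's weights): shifting b slides the input value at which the ReLU starts letting anything through.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

x = np.linspace(-2, 2, 9)   # some input values
w = 1.0                     # a single weight, chosen arbitrarily

for b in (-1.0, 0.0, 1.0):  # three different biases
    # The neuron only activates where w*x + b > 0, i.e. where x > -b/w,
    # so the bias moves the ReLU's threshold left or right.
    print(f"b = {b:+.1f}:", relu(w * x + b))
```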

I think this might be a misinterpretation. The MLP block uses the same 50,000 neurons for all tokens in the sequence, not 50,000 neurons per token. @3blue1brown is that correct?

I'm wondering if the phrasing here is a bit misleading. Unless I'm missing something, the block has 50,000 neurons but the sequence of tokens is passed through it, meaning you get the number of activations multiplied by the number of tokens, not that many neurons per se. This part might lead someone to think that those neurons are different for each token, but they are not - only the activations are. Regardless, this is an excellent video.

Are "bias" parts of speech "adjectives" and "adverbs?"

the one piece is real

ITS REAL

- Counting parameters

The parameters for the FF network are counted here. Are these the parameters for the FF network of one token? If so, does this mean that the total number of parameters, including shared parameters, is much higher?

Great video! In case anybody is wondering how to count the parameters of the Llama models, use the same math as in the video, but keep in mind that Llama has a third projection in its MLP, the 'gate projection', of the same size as the up- or down-projections.
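As a rough worked example of that counting, assuming Llama-2 7B's commonly cited sizes (hidden dimension 4096, MLP intermediate dimension 11008, 32 layers); the numbers are an assumption, not taken from the video:

```python
d_model = 4096        # hidden size (assumed, Llama-2 7B)
d_ff    = 11008       # MLP intermediate size (assumed, Llama-2 7B)

up_proj   = d_model * d_ff   # the "up" matrix from the video
down_proj = d_ff * d_model   # the "down" matrix
gate_proj = d_model * d_ff   # Llama's extra gate projection, same shape as up

per_layer_mlp = up_proj + down_proj + gate_proj
print(per_layer_mlp)         # 135,266,304 MLP parameters per layer
print(per_layer_mlp * 32)    # ~4.3 billion across 32 layers, before attention etc.
```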

- Superposition

the superposition chapter is great... Watch it guys n girls

Re: Superposition - dimensions not being completely independent but rather related. Here's a way to understand superposition that IMO was not really clear in the video.

Where could someone find the source material or "footnotes/bibliography" for each LLM's main base of facts and standardized information deemed "valid" by independent, accredited international sources or bodies of information?

Is there a way to estimate the amount of additional "dimensions" you get by having 89-91 degrees versus 90 degrees?

this part is really cool!

...such that all the vectors are orthogonal is illuminating. It suggests the surface area of the n-dimensional sphere being partitioned into a vast quantity of locally flat 'Gaussians' (central limit ;-) of similarity directions. Once you have that, plus the layer depth to discriminate conceptual level, one gets to see how it works, though it doesn't have any explanatory capability, because its vocabulary (numeric vectors) does not bake in the human explanatory phrasings we use (all very 'physician heal thyself' given it's an LLM!)

can't i make it in JavaScript? ^^

Important correction: There's an error in the scrappy code I was demoing around that point, such that in fact not all pairs of vectors end up in that (89°, 91°) range. A few pairs get shot out to have dot products near ±1, hiding in the wings of the plot. I was using a bad cost function that didn't appreciably punish those cases. On closer inspection, it appears not to be possible to get 100k vectors in 100d to be as "nearly orthogonal" as this. 100 dimensions seems to be too low, at least for the (89°, 91°) range, for the Johnson-Lindenstrauss lemma to really kick in.
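A quick sketch of the underlying phenomenon (not the optimization from the video, just random sampling, and just for intuition): independently drawn unit vectors in high dimensions already tend to be nearly orthogonal, and the concentration around 90° tightens as the dimension grows.

```python
import numpy as np

rng = np.random.default_rng(0)

for dim in (10, 100, 1000):
    # Sample a batch of random unit vectors in `dim` dimensions.
    vecs = rng.standard_normal((500, dim))
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)

    # Pairwise dot products, excluding each vector with itself.
    dots = vecs @ vecs.T
    off_diag = dots[~np.eye(len(vecs), dtype=bool)]

    angles = np.degrees(np.arccos(np.clip(off_diag, -1, 1)))
    within = np.mean((angles > 89) & (angles < 91))
    print(f"{dim:5d} dims: {100*within:5.1f}% of pairs fall within (89°, 91°)")
```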

It's a bit unclear how adding noise to the vectors' perpendicularity can create space for additional features... can somebody help me understand that?

Another way to imagine it: shooting an arrow in space, and shooting a second arrow in a 0.001° different direction. The first inch is nothing, nor is the first 20. But as they go feet and miles out, they'll eventually be so far apart that it's hard to believe they came from the same bow. Also chaotic pendulums, such as a pendulum on the end of a pendulum: slight changes end up with completely different movement.

More and more it feels as if current networks are mainly our first bookshelves.

times as many independent ideas." 💥

- Bell Laboratories

I am currently interning at Bell Laboratories :) A fun fact: Yann LeCun created CNNs as part of his internship at Bell Labs

This reminds me of bloom filters

hey ya'll! 🇺🇿

it seems obvious to me that a superposition would store more data, not because of nearly perpendicular vectors, but because you're effectively moving from a unary system to a higher base. Same reason you can count to 10, or to 1023, on the same amount of fingers

Skip to there to understand the issue... Training GPT-4 in 2022 should have taken around a cool thousand years. Then Huang says something silly: he says "Well, they used a stack of 8,000 H100 GPUs and it only took three months" - forgetting that the H100 was only on the drawing board back in 2022 when GPT-4 was trained. Now read a little about the latest discoveries in brain science, and I mean especially focus on N400 and P600. And you tell me how to explain Dan, Rob, Max and Dennis. I'm gonna leave this up to you, as I'm sure you understand what I'm getting at.

- Up next

Also important in the training process is the concept of self-supervised learning to harness the mass of unlabelled data (books, in NLP).

Holograms coming! :D

Uhhh, holograms, I'm so excited, and that is on top of an excellent video. I'm amazed how you manage to consistently keep such a high standard :D

I may have learnt something new here. Grant is saying that the skip connection in the MLP is actually enabling the transformation of the original vector into another vector with enriched contextual meaning. Specifically, he is saying that via the summation of the skip connection, the MLP has somehow learnt the directional vector to be added onto the original vector "Michael Jordan" to produce a new output vector that adds "basketball" information. I was originally of the impression that skip connections are only there to combat vanishing gradients and expedite learning. But now Grant is emphasizing that they do much more!

- End of Harriet Nembhard's introduction

- The cliché

Putting a face with a voice: first, it is not what I imagined when listening to/watching your videos! Second, from here on out when watching, I will always see 3B1B narrating his videos in a cap and gown!

Following your dreams requires more than just passion.

I love how, after Grant points out the 'nerdiness' of the audience and continues to say (jokingly) "... in the vector space of all possible advice", the camera centers on two lovely nerds sitting, without batting an eyelid, thinking: "yeees? ...".

- The shifting goal

Following your dreams requires pragmatic concerns beyond inspiration.

A guy in the audience sighs after hearing that the goal of their life changes today. You can see the stress growing on the faces of the audience.

Transition from personal growth to adding value to others

- Action precedes motivation

wow. Well that hits hard. I hope I'll remember it.

- Timing

Action precedes motivation in finding a career you love

Survivorship bias affects the advice of pursuing high-risk, high-reward paths.

- Know your influence

"Cut Defence Ties". Glad to see the Student Body Politic has some positive values... 😀 And a cool idea, btw, I've not seen that in the UK, a slogan top of the mortar-board. But speaking with a tassle??? I'm not sure they do that here, either... 🤔

Success is a function of the value you bring to others

lol my calculus teacher told me I should consider a double major or at least a minor in math. I took his advice as a compliment, and completely ignored it. Now I am unemployed training to become a data scientist! Trust those who have more experience than you!

that's such an amazing mentality.

- Anticipate change

"CUT DEFENSE TIES" in the audience hell yeah

Influence the dreams of others and be adaptable to change

This is the best commencement speech I've ever heard. Also, the audience reaction at cracked me up.

Like this so that I can listen to this part again

"following not the dreams but the opportunities"

The dude who greets you and makes you comfortable at a party full of strangers

Love the dude at showing his enthusiasm for the class of 2024

- Recap on embeddings

* - Transformers are key components of large language models, introduced in the 2017 paper "Attention is All You Need".

*🔍 Understanding the Attention Mechanism in Transformers*- Introduction to the attention mechanism and its significance in large language models.- Overview of the goal of transformer models to predict the next word in a piece of text.- Explanation of breaking text into tokens, associating tokens with vectors, and the use of high-dimensional embeddings to encode semantic meaning.

- Transformers use attention mechanisms to process and associate tokens with semantic meaning.

* - The model aims to predict the next word in a sequence by processing input text broken down into tokens (often words or parts of words).

* - Each token is associated with a high-dimensional vector called an embedding, where directions in this space correspond to semantic meaning.

It is late in Sydney right now and I'm up watching your video from my bed. I should probably get some sleep, I have morning classes; it's just that your content is too God damned interesting. Plus, I'm a teenager. I can't be separated from my phone except by a 16th-century French-style beheading. POST MORE VIDEOS! If I can't sleep you shouldn't get the luxury!

* - Transformers progressively adjust embeddings to encode rich contextual meaning beyond individual words.

- Motivating examples

* - The attention mechanism can be challenging to grasp initially.

* - Examples like "mole" in different contexts highlight the need for context-aware embeddings, as the initial embedding is the same regardless of context.

*🧠 Contextual meaning refinement in Transformers*- Illustration of how attention mechanisms refine embeddings to encode rich contextual meaning.- Examples showcasing the updating of word embeddings based on context.- Importance of attention blocks in enriching word embeddings with contextual information.

- Attention blocks refine word meanings based on context

* - Attention refines embeddings based on surrounding words; for instance, "tower" becomes more specific when preceded by "Eiffel".

Slightly disappointed you chose not to describe this update as moving the vector to be more "French-wards"

It's actually wrought iron, not steel.

Thank you for the information given at , it cleared my doubt from the previous video

* - A simplified example with the phrase "a fluffy blue creature roamed the verdant forest" demonstrates how adjectives update nouns through attention.

- The attention pattern

* - Each word's initial embedding encodes its meaning and position.

Thank you very much for this very informative series on LLMs. I have a small question regarding the matrix dimensions though. We have that N_E = 12,288 is the embedding dimension, and (@6:40) that N_Q = 128 is the query embedding dimension; so is N_K = 128, the key embedding dimension. If the context size is N_C (= 2048 in GPT-3, as you indicate), then the matrices Q = [Q_1 ...] and K = [K_1 ...] would each have size N_Q x N_C. Whatever the size of N_C, the size of Q x K^t would be N_Q x N_Q, i.e., 128 x 128. But...

- Transforming embeddings through matrix-vector products and tunable weights in deep learning.

*⚙️ Matrix operations and weighted sum in Attention*- Explanation of matrix-vector products and tunable weights in matrix operations.- Introduction to the concept of masked attention for preventing later tokens from influencing earlier ones.- Overview of attention patterns, softmax computations, and relevance weighting in attention mechanisms.

* - Nouns generate "query" vectors to seek relevant adjectives.

I don't get where the Q and K values come from... Are they from the embeddings? It is said in the video that Q is like a question about the adjectives, but where does it come from mathematically? Is it made up? I failed to understand. The question the noun is asking is "are there any adjectives sitting in front of me", while there are none in front - they're BEHIND it, not in front, so what is it? We are reading from left to right in 2024 still, right? It's in the small details that this falls apart for me. Then it is said that the question is "SOMEHOW" encoded as another vector... yeah, so it just magically popped into existence?

Can someone explain to me how those questions are generated, and how the keys respond to them? I'm a bit confused there. Are the questions predefined? And how are the keys created?

The matrix W_Q must have size N_Q x N_E; and so would the matrix W_K. So, each Q_i = W_Q x E_i and K_j = W_K x E_j would have dimension N_Q x 1.

I love to see that you make column vectors the embeddings! Machine learning people love designating *row* vectors as embeddings/queries/keys/etc. (including the Attention paper), and this makes all the equations flipped from how we expect in math: Q = EW instead of Q = WE, etc.
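A tiny numpy check of the convention difference being described (arbitrary toy sizes, random data): treating embeddings as columns and writing Q = W_Q E gives exactly the transpose of the row-vector convention Q = E W_Q^T used in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n_e, n_q, n_tokens = 6, 3, 4                     # toy sizes, not GPT-3's

E_cols = rng.standard_normal((n_e, n_tokens))    # embeddings as columns (video's convention)
W_q    = rng.standard_normal((n_q, n_e))         # query map: d_model -> d_query

Q_cols = W_q @ E_cols                            # column convention: Q = W_Q E
Q_rows = E_cols.T @ W_q.T                        # row convention, as in the paper

print(np.allclose(Q_cols, Q_rows.T))             # True: the two are transposes of each other
```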

Can the query ask forwards also? Like, let's say we have "I saw a creature, it was huge and foul, it was eating grass"; that should on some level produce a similar result to "The huge and foul creature I saw was eating grass", and the only way they'd seem similar is if "creature" can query both forwards and backwards.

Are the positions of the left and right tensors of the multiplications somehow swapped? Also in many other places, like the rows and columns of the mask matrix.

Interesting, so an LLM could also use the following words to fill in information in the middle of a text?

- Also, the earlier dimensional size of 128 for the Q, K spaces is only for the individual heads (implicitly 96 heads in this example), whereas later you correctly switch back to 12,288 dimensions

So when you hear that developers of AI don't know what AI is doing internally is that referring to how the attention layers are placing the vectors in a tensor? Is there more to it? The media makes it sound mysterious and potentially dangerous, but really it's just the method used to assign a high dimensional coordinate to a token within the context of English language.

- Transformers use key matrix to match queries and measure relevance.

* - "Key" vectors are created for each word and compared with queries using dot products to assess relevance.

* - The resulting grid of dot products, after softmax normalization, represents the attention pattern, indicating how each word relates to others.
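A minimal numpy sketch of that grid, with toy sizes and random weights standing in for trained ones, following the video's column-vector convention (softmax applied down each column):

```python
import numpy as np

rng = np.random.default_rng(2)
d_model, d_qk, n_tokens = 8, 4, 5               # toy sizes instead of 12,288 / 128 / 2048

E  = rng.standard_normal((d_model, n_tokens))   # one embedding per column
Wq = rng.standard_normal((d_qk, d_model))       # query map
Wk = rng.standard_normal((d_qk, d_model))       # key map

Q = Wq @ E                                      # queries, one per column
K = Wk @ E                                      # keys, one per column

scores = (K.T @ Q) / np.sqrt(d_qk)              # dot product of every key with every query

# Softmax down each column turns the scores into weights that sum to 1,
# i.e. how relevant each token is to the token owning that column.
pattern = np.exp(scores) / np.exp(scores).sum(axis=0, keepdims=True)
print(pattern.round(2))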

Great video as always! Minor quibble at , I have always heard and understood “attend to” as being from the perspective of the query (the video uses the key’s perspective) so it would be “the embedding of creature attends to fluffy and blue” instead. It doesn’t really matter since the dot product is symmetric, I just haven’t heard it used colloquially that direction (maybe due to the axis that the softmax is applied on?)

Softmax. Eye-opening.

Amazing work; thank you for doing it. Now, am I misunderstanding something, or is there possibly a mistake at in the "roamed" column? The weight for the word "the" is 0.99 even though it appears _after_ "roamed" in the context. This frightens me, as math can't ever be wrong.

Possible error at (): *Q_i* and *K_j* should be _row vectors_ so that *{QKᵀ}_{i,j} = Q_i ⋅ K_j* is their dot product.

Very good video, just a small question: - If you’re treating vectors as column vectors from a math perspective, shouldn’t it be Vsoftmax(KᵗQ)?? The original paper puts V on right side and uses softmax(QKᵗ)V because i think it assumes row vectors by default which makes more sense from a computing perspective due to memory efficiency.

- Attention mechanism ensures no later words influence earlier words

- Masking

* - During training, the model predicts the next token for various subsequences, requiring masking to prevent future tokens from influencing past predictions.

* - Masking sets irrelevant attention pattern entries to negative infinity before softmax, resulting in zeros after normalization.
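A sketch of that masking step on a toy score matrix, continuing the column convention above (keys as rows, queries as columns), so the masked region is the lower-left triangle; in the more common row convention of the paper, the same mask sits in the upper-right corner, as the comment below notes.

```python
import numpy as np

n_tokens = 4
rng = np.random.default_rng(3)
scores = rng.standard_normal((n_tokens, n_tokens))   # rows = keys, columns = queries

# Mask out "key comes after query": set those scores to -inf before the softmax.
key_idx   = np.arange(n_tokens)[:, None]
query_idx = np.arange(n_tokens)[None, :]
scores[key_idx > query_idx] = -np.inf

pattern = np.exp(scores) / np.exp(scores).sum(axis=0, keepdims=True)
print(pattern.round(2))   # masked entries come out as exact zeros after the softmax
```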

(usually set upper right corner to -inf)

* - Attention pattern size scales with the square of the context size, making larger contexts computationally expensive.

- Context size

The motivation for masking is not entirely clear; I will need to rewatch it to understand it better.

You say that the size of the Q x K^t matrix is N_C x N_C. Can you please explain this discrepancy? This also leads to another problem: we need to multiply Q x K^t by V. So, what would the size of V be? Thank you very much.

- Attention mechanism variations aim at making context more scalable.

* - A "value" matrix determines how embeddings should be updated based on relevance.

- Values

Can you explain to me: when you added the matrix W, what are the values in it? The video only says that you need to multiply by these values, but what are the values initially?

How does the attention mechanism avoid getting caught in a sort of loop? For example, in the expression "fluffy creature", "fluffy" clearly modifies "creature", i.e. "creature" as in "fluffy creature" as opposed to "spiky creature". However, the specific noun in question also modifies the meaning of the adjective. For example, "fluffy" as in "fluffy creature" is not the same as "fluffy" as in "fluffy argument". In a sense, humans evaluate these things quite atomically. Is there a sort of back-and-forth iteration that exits after a certain point? If so, on what criteria?

* - Value vectors are added to embeddings based on the attention pattern weights, refining the meaning of words based on context.
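Continuing the same toy sketch: each column of the attention pattern weights a sum of value vectors, and that weighted sum is the ΔE added onto the original embedding. For simplicity this uses a single full-rank value matrix rather than the value-down/value-up factoring discussed later, and a placeholder pattern stands in for the one computed above.

```python
import numpy as np

rng = np.random.default_rng(4)
d_model, n_tokens = 8, 5
E  = rng.standard_normal((d_model, n_tokens))           # embeddings as columns
Wv = rng.standard_normal((d_model, d_model))            # value map (unfactored, for simplicity)

# Placeholder attention pattern; in practice this comes from the masked softmax above,
# an (n_tokens, n_tokens) matrix whose columns each sum to 1.
pattern = np.full((n_tokens, n_tokens), 1.0 / n_tokens)

V       = Wv @ E              # one value vector per column
delta_E = V @ pattern         # column i = weighted sum of value vectors for token i
E_new   = E + delta_E         # the refined, context-aware embeddings
print(E_new.shape)            # (d_model, n_tokens)
```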

*if* this word is relevant to adjusting the meaning of something else...

@ Shouldn't the main diagonal in the attention pattern matrix (query-key dot product) also be zero, i.e. a word cannot give additional context to update its own embedding?

For the video content, E4 is already the output after undergoing the self-attention mechanism. From the matrix, it can also be seen that the attention weights at most diagonal positions are 1 or close to 1. So why do we still need E4 + ΔE4?

? I personally believe that Δ E

When describing the updating of a given embedding vector with the preceding embeddings selected by the attention mechanism, I'm not understanding the need for transforming them into value vectors. What does this E_i W_v = V_i transformation provide that simply taking the attention-weighted sum of the E_i's and updating your embedding directly doesn't?

- Transformers use weighted sums to produce refined embeddings from attention

- I think there is an error at , where E5 is shown attending to E6 (value 0.99 shown) which is a forward (future) dependency and should be masked (i.e., set to zero).

Is this just a matrix multiplication? How do you go from the value matrices V and the attention scores K^T Q to delta E?

For the content here, it seems that ΔE5 should not receive information about V6, as ΔE5 can only receive information about V1-V5 at most. Why is it ΔE5 = 0.9 * V6 in the video? Thank you very much!

Just something I didn't fully understand: it says the deltas (computed by the attention) are added to the context-free word embeddings to create an in-context embedding. Where is this addition taking place? I did not manage to see where it is located in the "Attention Is All You Need" paper.

At , I think it is possible to compact the operation into matrix multiplication, then add the columns to the original word vectors.

* - A single attention head involves key, query, and value matrices, with GPT-3 using a 128-dimensional key/query space and a 12,288-dimensional embedding space.

- Counting parameters

* - Value matrices are factored into "value down" and "value up" matrices to improve efficiency, resulting in approximately 6.3 million parameters per head.
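The arithmetic behind that ≈6.3 million figure, worked out with GPT-3's stated sizes:

```python
d_model = 12_288   # embedding dimension
d_qk    = 128      # key/query (and value-down) dimension

query      = d_qk * d_model       # W_Q
key        = d_qk * d_model       # W_K
value_down = d_qk * d_model       # projects 12,288 -> 128
value_up   = d_model * d_qk       # projects 128 -> 12,288

per_head = query + key + value_down + value_up
print(per_head)   # 6,291,456 ≈ 6.3 million parameters per head
```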

Is it only due to efficiency, as you said, or is there also an intuitive reason that the rank (degrees of freedom) of the value map should not be more than the rank of the query and key matrices?

But it seems there is a minor bug in the video where the "value down" matrix is explained - shouldn't the intermediate result vector at this point be only 128 elements, and not 12,288 as shown? The narration does explain we are mapping to a lower-dimensional space (and correspondingly, the input to the "value up" matrix would be this 128-element vector, generating a 12,288-element result).

-- Love the 3b1b humblebrag here. essentially "Those paper writers make things confusing, and I am here to lead you with knowledge". Thank you Grant for bringing this to all of us!

- Self-attention mechanism explained with parameter count and cross-attention differentiation.

- doesn't that also mean we're reducing the information in the embedded vectors to the smaller amount of dimensions in the key/query space?

- Cross-attention

* - Cross-attention is a variation used in models processing different data types (e.g., translation), where keys and queries come from separate datasets.

At not necessarily - cross attention can work between two sequences of the same modality, like T5. It's just that one sequence is seen as the input or information the model should attend to, and the second sequence is the output.

- Multiple heads

* - Multi-headed attention runs multiple attention heads in parallel to capture various contextual relationships.

GPT-3 Engineers: "So looking at it bro we gotta go ahead and get at least 10,000"

Thank you for this explanation! Not to quibble, but "brakes" spelled incorrectly at .

- Transformers use multi-headed attention to capture different attention patterns

In the example at , in the “John hits the breaks sharply” the word “break” means to separate into pieces, whereas “brake” refers to a device used for slowing motion. Clearly the word “brake” is appropriate. This in itself presents an interesting problem for the model to address. The context of the inappropriate use of the word “break” must cause the model to effectively “correct” for this error. Can anyone expand on this concept since the use of language by humans is inherently imperfect. Very interesting and informative series of videos.

* - GPT-3 uses 96 heads, each with distinct key, query, and value maps, enabling the model to learn diverse ways context affects meaning.

animation at 🔥

I have a question: I have read something in which, let's say, the original embedding has dimension C, and in the multi-head attention block the output of each head has dimension A, which is C divided by the number of heads. For example, if C is 48 and we have 3 heads in the attention block, each head's output would have dimension 16. Now we cannot possibly add a 48-dimensional vector to a 16-dimensional one.

Why don't we normalize the variations produced and added by the multiple attention heads by dividing the whole sum by the number of heads (96 right here)? In the current situation, I have the feeling that we are adding the variation 96 times more than we need to to the previous embedding.

Is there any paper reflecting on how many of these attention heads are redundant? E.g., logging during training the percentage of attention heads that actually contribute to the change of the embedding, and possibly dropping some of them.

*🧠 Multi-Headed Attention Mechanism in Transformers*- Explanation of how each attention head has distinct value matrices for producing value vectors.- Introduction to the process of summing proposed changes from different heads to refine embeddings in each position.- Importance of running multiple heads in parallel to capture diverse contextual meanings efficiently.

You represented one output of the attention layer as E' = deltaE + E. I am wondering where the deltaE comes from. The matrix multiplication already represents a weighted sum: V' = atten(Q,K,V) = softmax(.)V. That is, each output vector in V' is the weighted sum of all vectors in V.

* - The proposed changes from each head are summed and added to the original embedding, resulting in a refined embedding.

If I am not mistaken, the results from the different heads are concatenated into a higher-dimensional matrix and projected back to the original one, instead of simply being added together.

I have a question: why don't you take the average of all those proposed changes? If you had a lot of attention heads, wouldn't they all together overestimate the change that should be made to the original embedding of a token? Or is this problem automatically fixed by the backpropagation algorithm, so that each change calculated by an attention head is smaller than it would have been if there were only one attention head in the attention block?

- The output matrix

* - In practice, "value up" matrices for all heads are combined into a single "output matrix" for efficiency.
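A sketch of why those two descriptions agree (toy sizes, random data): summing each head's value-up matrix applied to its low-dimensional output is the same as concatenating the heads' outputs and multiplying once by the combined output matrix.

```python
import numpy as np

rng = np.random.default_rng(5)
d_model, d_head, n_heads, n_tokens = 8, 2, 3, 5   # toy sizes

# Each head produces a small (d_head x n_tokens) result and has its own value-up matrix.
head_outputs = [rng.standard_normal((d_head, n_tokens)) for _ in range(n_heads)]
value_up     = [rng.standard_normal((d_model, d_head)) for _ in range(n_heads)]

# Description 1: apply each head's value-up matrix, then sum the proposed changes.
summed = sum(Wu @ h for Wu, h in zip(value_up, head_outputs))

# Description 2: concatenate head outputs and multiply by one big output matrix,
# whose blocks are the value-up matrices stacked side by side.
W_output = np.hstack(value_up)                  # (d_model, n_heads * d_head)
concat   = np.vstack(head_outputs)              # (n_heads * d_head, n_tokens)
combined = W_output @ concat

print(np.allclose(summed, combined))            # True
```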

*🛠️ Technical Details in Implementing Value Matrices*- Description of the implementation difference in the value matrices as a single output matrix.- Clarification regarding technical nuances in how value matrices are structured in practice.- Noting the distinction between value down and value up matrices commonly seen in papers and implementations.

- Implementation of attention differs in practice

Great video, as usual! I'm stuck at the explanation here. The visualization shows that the projection-up matrices are concatenated into the output matrix. The explanation says that the concatenation is then multiplied by the output matrix (itself?). If this is a typo and he means "multiplied by the projection-down matrices", how does this work? I remember matrix multiplication only working if the dimensions match, like (n x m) * (m x k), where m has to be the same dimension. Thanks!

- Going deeper

* - Data flows through multiple attention blocks and other operations, allowing for increasingly nuanced and abstract encoding of information.

Overall a very good explanation, just one question: I saw this animation many times in chapters 5 and 6, and it shows later words updating earlier words. But since you explicitly mentioned the masking in the video and the pinned comment, I am confused. I am leaning towards it being a typo. Same as the E6-to-E5 entry, which should be masked to 0.00 but shows as 0.99.

Thanks for the video! Does anybody know why the glowing attention lines were drawn going both ways (e.g., here), when we chop off the lower part of the attention matrix? Shouldn't this mean that the lines should only go forward (to the right)?

*💡 Embedding Nuances and Capacity for Higher-Level Encoding*- Discussion on how embeddings become more nuanced as data flows through multiple transformers and layers.- Exploration of the capacity of transformers to encode complex concepts beyond surface-level descriptors.- Overview of the network parameters associated with attention heads and the total parameters devoted to the entire transformer model.

One question concerning this: does every new vector added to the initial meaning of "one" represent the new, more refined meaning learned by each attention head, or by each attention layer? I think it is each layer, but on the other hand, every attention head seems to learn a different way that context changes meaning, so it could be both...

* - GPT-3's 96 layers contain about 58 billion parameters devoted to attention heads, representing a significant portion of the total model parameters.
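The rough arithmetic behind that figure, using the per-head count from above:

```python
per_head        = 6_291_456   # ≈ 6.3 M parameters per attention head (from above)
heads_per_layer = 96
layers          = 96

total_attention = per_head * heads_per_layer * layers
print(f"{total_attention:,}")  # 57,982,058,496 ≈ 58 billion parameters
```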

- Attention mechanism's success lies in parallelizability for fast computations.

- Ending

* - The success of attention is partly due to its parallelizability, enabling efficient computation with GPUs.

Love this. If you ever edit this again, at , “brakes” is misspelled as “breaks”.
