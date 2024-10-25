



Description: In this edition of QuEST, Michael Robinson discusses topological features in large language models.



Key Moments and Questions in the video include:

Acknowledgement of colleagues from DARPA and Galois

Manifolds in machine learning

LLM token space is higher dimensional

Manifold spaces tend to be negatively curved

LLMs turn text into vectors

Transformers turn vectors into new text

How do we turn the text into vectors?

We think of LLMs as being trained on all human language, but they have not been

GPT2, an open-source LLM, as the source model

GPT2 used as the example

Tokens have topology and geometry

Words are a categorical variable

Vectors are a numerical variable

Mixing data types can lead to some problems

Why care about the token space?

Not all tokens correspond to a valid vector

Estimating dimensions

Volume of a sphere

Log of Volume vs log of radius curves

Ricci scalar curvature

Stratifications are visible

GPT2 uses a state space that is not a manifold

The dollar sign appears differently in GPT2 because $ is used in code, whereas other currency symbols are not

GPT2’s 768 dimensions unwrapped using t-SNE

Tokens with leading spaces

Beginnings of words show up in a separate, low-dimensional piece

Visual similarity to hyperbolic plane

Llemma 7B dimensions

Plotting dimension

Dark regions are non-printing characters

Thinking about how neural activation patterns work

We have been thinking about manifold learning out of mathematical convenience

State spaces are not manifolds

Presentation opened up to conversation
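
For readers who want to try the dimension-estimation idea named in the key moments (volume of a sphere, log of volume vs log of radius), here is a minimal sketch. It is not the speaker's code. It assumes that the number of points inside a ball of radius r can stand in for volume, so the slope of log(count) against log(r) approximates the local dimension. The function name `local_dimension`, the radii, and the synthetic flat sheet used in place of real token embeddings are all illustrative assumptions.

```python
import numpy as np


def local_dimension(points, center, radii):
    """Estimate local dimension as the slope of log(count) vs log(radius).

    The count of points within distance r of `center` is used as a proxy
    for the volume of a ball of radius r (an assumption of this sketch).
    """
    dists = np.linalg.norm(points - center, axis=1)
    counts = np.array([(dists <= r).sum() for r in radii])
    # Least-squares line through the log-log curve; the slope is the estimate.
    slope, _intercept = np.polyfit(np.log(radii), np.log(counts), 1)
    return slope


rng = np.random.default_rng(0)

# Hypothetical stand-in for token embeddings: a flat 2-D sheet in 5-D space.
sheet = np.zeros((20000, 5))
sheet[:, :2] = rng.uniform(-1.0, 1.0, size=(20000, 2))

center = np.zeros(5)
radii = np.linspace(0.1, 0.5, 9)

estimate = local_dimension(sheet, center, radii)
print(f"estimated dimension: {estimate:.2f}")
```

On a flat sheet the log-log curve is close to a straight line with slope near 2. Under the same assumption, bends or breaks in that curve at different radii would be the kind of signal the talk associates with stratification and curvature.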