You Probably Need Distraction Too
Attention-based LLMs learn to weight relationships between tokens regardless of their distance in the context, meaning they infer patterns, and then connections between concepts, as they train. As I understand it, this is the thesis of the paper “Attention Is All You Need” and its new machine learning architecture, the transformer. ChatGPT is a generative (it writes things, makes content) pre-trained (they’re not training it on the fly; everything has to fit in the context window and it’s locked to inference) transformer (the attention thingy majig).
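To make that “attention” idea concrete, here’s a minimal sketch of scaled dot-product attention, the core operation from the paper, using toy numpy arrays. The token count, dimensions, and random embeddings are made up for illustration, not pulled from any real model.

```python
# A minimal sketch of scaled dot-product attention (the heart of the transformer),
# using toy random embeddings rather than anything from a trained model.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Every token attends to every other token, regardless of distance."""
    d_k = Q.shape[-1]
    # Similarity between each query token and every key token.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax turns the scores into attention weights that sum to 1 per token.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each token's output is a weighted mix of every token's value vector.
    return weights @ V, weights

# Toy example: 5 tokens with 8-dimensional embeddings (made-up numbers).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 8))
output, weights = scaled_dot_product_attention(tokens, tokens, tokens)
print(weights.round(2))  # token 0 can weight token 4 just as easily as token 1
```

The interesting part is the weights matrix: every token gets a weight for every other token, so distance in the context only matters as much as the learned weights make it matter.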
The result is pure, refined crystallized intelligence, on the scale of one person knowing nearly everything that’s ever been written on the internet. But that’s not all of intelligence, is it? You can’t have that ‘pre-trained’ bit and still consider the thing to have any fluid intelligence at all.
We’ll have to figure out how to build both parts of the brain, and when we do, even if whatever training it derives takes effect very quickly, it’ll still face a long ramp-up time if it relies on talking with and learning from actual people.
Let’s hope it doesn’t happen; if we replace our thinking faculties altogether we will all be doomed! /s (but maybe)