Emergent abilities in LLMs

There are two extremes these days: one camp claims that LLMs have magical emergent abilities, the other claims that AI is overhyped and the hype will end soon.

The real situation is actually quite simple. I have said this in several talks, but I have never seen this simple explanation written down anywhere. Emergent abilities exist, but they are not magical. LLMs are a real thing, certainly not hype.

It is actually pretty straightforward why LLMs “reason” or, to be more exact, can operate on complex concepts. By processing huge amounts of text with a variety of cost functions, they build an internal representation in which those concepts are encoded as simple nodes (neurons or groups of neurons). So LLMs really do distill knowledge and build a semantic graph. Alternatively, you can think of them as a very good principal component analysis that extracts many important factors and the relations between them. As I have said before, the multi-objective aspect is quite important here: it helps to separate unrelated concepts faster, and Whisper is a good example of this.
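To make the PCA analogy concrete, here is a toy sketch in plain numpy. Everything in it is invented for illustration (the words, the two hidden factors, the 16-dimensional mixing): the point is only that PCA recovers underlying concept directions from mixed-together vectors, which is the analogy for what training does at a vastly larger, nonlinear scale.

```python
import numpy as np

# Toy "embeddings": each row is a word vector. In a real LLM these would be
# learned representations; here they are hand-built so the structure is obvious.
# Two hidden concepts ("size" and "animacy") are mixed into a 16-dim space.
rng = np.random.default_rng(0)
words   = ["ant", "mouse", "elephant", "pebble", "boulder", "mountain"]
size    = np.array([0.1, 0.3, 0.9, 0.1, 0.6, 1.0])  # made-up "size" factor
animacy = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])  # made-up "animacy" factor
X = np.column_stack([size, animacy]) @ rng.normal(size=(2, 16))
X += 0.01 * rng.normal(size=X.shape)  # a little noise

# PCA via SVD: the top principal components recover the two concept directions.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
proj = Xc @ Vt[:2].T  # project each word onto the top-2 components

for w, (c1, c2) in zip(words, proj):
    print(f"{w:>8}: pc1={c1:+.2f} pc2={c2:+.2f}")
# The words separate along two axes corresponding to size and animacy.
```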

Once the knowledge is distilled, you can build on top of it.
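One common way to build on top of a distilled representation is a linear probe: freeze the encoder and train only a small classifier on its outputs. A minimal sketch follows; the `embed` function is a hypothetical stand-in for any frozen encoder (here a toy hashed bag-of-words, just so the example runs offline):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def embed(text: str) -> np.ndarray:
    # Hypothetical stand-in for a frozen encoder: a toy hashed bag-of-words.
    # Swap in any model that maps text to a fixed-size vector.
    vec = np.zeros(256)
    for tok in text.lower().split():
        vec[hash(tok) % 256] += 1.0
    return vec

# Tiny labeled dataset for a downstream task (sentiment) built "on top".
texts  = ["great movie", "awful movie", "great plot", "awful plot",
          "loved it", "hated it"]
labels = [1, 0, 1, 0, 1, 0]

X = np.stack([embed(t) for t in texts])
probe = LogisticRegression().fit(X, labels)  # only the probe is trained
print(probe.predict(np.stack([embed("great acting"), embed("awful acting")])))
# Typically prints [1 0]: the representation carries the signal, the probe is tiny.
```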

There were many attempts to build semantic graphs before, but manual efforts never succeeded because of scale. The truly huge advancement is that the automated process works.
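As a toy illustration of why the automated route scales where hand curation did not: graph edges can be read straight off the representation, for example by thresholding cosine similarity between concept vectors. The vectors and the threshold below are made up for the example:

```python
import numpy as np

# Hypothetical concept vectors; in practice these would come from a trained model.
concepts = {
    "dog":   np.array([0.9, 0.8, 0.1]),
    "cat":   np.array([0.8, 0.9, 0.1]),
    "car":   np.array([0.1, 0.1, 0.9]),
    "truck": np.array([0.2, 0.1, 0.8]),
}

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Add an edge whenever similarity clears a hand-picked threshold.
names = list(concepts)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        s = cos(concepts[a], concepts[b])
        if s > 0.9:
            print(f"{a} -- {b}  (sim={s:.2f})")
# Prints dog -- cat and car -- truck: the graph falls out of the vectors,
# with no manual curation step anywhere in the loop.
```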

Many blame recent video generation models for misunderstanding physics. It's a temporary thing; soon they will understand physics very well.