Boosting and Stable Diffusion

“The wind blows to the south and turns to the north; round and round it goes, ever returning on its course.” - Ecclesiastes 1:6

“For a hundred years, day and night, the carousel of Earth spins. In a hundred years all the winds return to their own circle.” - the Russian Mary Poppins.

Hearing this song on TV on Christmas afternoon reminded me of an interesting case of a return to the roots. Long ago, when deep learning was an emerging thing, I remember a discussion on Hacker News about it, and one not very popular comment claiming that deep learning is actually boosting, well known to the machine learning community:

Since the fall of AI, there are two groups of people in this topic – one trying to make some reproducible, robust results with well defined algorithms and second importing random ideas from the first group onto some questionably defined ANN model and getting all the hype because of the “neural” buzzword. “Deep learning” is actually called boosting and has been around for years. (2012)

This topic has been raised a few times over the years without gaining much popularity. Meanwhile there was a chase to build ever more layers, although the community stopped around a hundred; it is harder to train more than that.
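To make the analogy concrete, here is a minimal sketch of gradient boosting as a stage-wise additive model: each weak learner (a decision stump here) fits the residual left by the ensemble so far, much like stacking layers that each refine the previous output. The data, helper names, and hyperparameters are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=200)
y = np.sin(x) + rng.normal(scale=0.1, size=200)

def fit_stump(x, residual):
    """Find the single threshold split that best reduces squared error."""
    best = None
    for t in np.linspace(x.min(), x.max(), 50):
        left, right = residual[x <= t], residual[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    _, t, lv, rv = best
    return lambda x: np.where(x <= t, lv, rv)

prediction = np.zeros_like(y)
learning_rate = 0.3
for step in range(50):                      # 50 weak learners stacked in sequence
    stump = fit_stump(x, y - prediction)    # fit the current residual
    prediction += learning_rate * stump(x)  # add a small correction

print(np.mean((y - prediction) ** 2))       # training error shrinks step by step
```

Each pass adds one small correction on top of everything built so far, which is the structural resemblance the comment is pointing at.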

And then, trying to build more layers, people arrived at denoising diffusion probabilistic models. The idea is that we start from a very noisy image and gradually shape it into a detailed picture. The process can take a thousand steps, but essentially we start with a blank page, then introduce the main shapes, then fill in every single pixel of the image. It looks like this:

Stable Diffusion Process

There are many good tutorials around explaining the details; one of them is by Yannic Kilcher, but there are many others.
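The gradual denoising loop can be caricatured in a few lines. This is a toy sketch, not the real DDPM math: in an actual model a trained neural network predicts the noise at each step under a specific variance schedule, whereas here an idealized blend toward a known target stands in for the denoiser, just to show the coarse-to-fine progression.

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.linspace(-1, 1, 64) ** 2    # a made-up "clean image" (1-D for simplicity)

x = rng.normal(size=64)                 # step 0: pure noise
steps = 1000
for t in range(steps):
    # A real model would call a trained noise-prediction network here;
    # this idealized blend only illustrates the step-by-step refinement.
    alpha = 1.0 / (steps - t)           # trust the "denoiser" more as t grows
    x = (1 - alpha) * x + alpha * target

print(np.abs(x - target).max())         # ends very close to the target
```

Early steps move the sample a lot (overall shape); late steps make tiny corrections (fine detail), which is the top-down progression described above.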

The statistical part of the diffusion process doesn’t seem very important; what matters is the top-down model of reality, where we start with the overall shape and work down to smaller and smaller details. This process is very efficient and makes it possible to learn the very complex world of images. It is exciting to me how similar the processes of boosting and diffusion are: both build a result incrementally, each step refining what the previous steps produced. Top-down feature engineering seems like a really promising direction, and we might see more of it in modern, efficient deep learning.