Written by
Nickolay Shmyrev
on
Boosting and Stable Diffusion
“The wind blows to the south and turns to the north; round and round it goes, ever returning on its course.” - Ecclesiastes 1:6
“Hundred years day and night spins Carousel Earth. Hundred years all the winds return to their own circle.” - Russian Mary Poppins.
Hearing this song on TV on Christmas afternoon brought to my mind an
interesting case of returning to the roots. Long ago, when deep learning
was an emerging thing, I remember a discussion on Hacker News about deep
learning and one not very popular comment claiming that deep learning is
actually boosting, already well known to the machine learning community:
Since the fall of AI, there are two groups of people in this topic – one trying to make some reproducible, robust results with well defined algorithms and second importing random ideas from the first group onto some questionably defined ANN model and getting all the hype because of the “neural” buzzword. “Deep learning” is actually called boosting and has been around for years.
https://news.ycombinator.com/item?id=4826267 (2012)
This topic has been raised a few times over the years without gaining much popularity. There was a chase to build networks with many layers, although the
community stopped at around a hundred, since it is harder to train more than that.
Then, trying to build more layers, people came to denoising diffusion
probabilistic models. The idea is that we start from a very noisy image
and gradually shape it into a very detailed picture. The process can take
a thousand steps, but essentially we start with a blank page, then introduce
the main shape, then figure out every single pixel of the image. It looks
like this:
There are many good tutorials explaining the details, for example the one from Yannic Kilcher.
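The coarse-to-fine sampling loop described above can be sketched in a few lines. This is only a toy under loud assumptions: the "image" is a short list of pixel values, the noise schedule is a linear one with assumed endpoints, and the hypothetical `oracle_noise` function cheats by knowing the target picture, standing in for the trained network that a real model would use. All function names here are made up for illustration.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (assumed values); needs num_steps >= 2."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta          # cumulative product of (1 - beta)
        alpha_bars.append(running)
    return betas, alpha_bars


def oracle_noise(x_t, x0, alpha_bar):
    """Hypothetical stand-in for a trained noise predictor.

    It knows the clean target x0 and returns exactly the noise that
    separates x_t from it. A real model has to learn this from data.
    """
    scale = math.sqrt(1.0 - alpha_bar)
    return [(xt - math.sqrt(alpha_bar) * target) / scale
            for xt, target in zip(x_t, x0)]


def sample(x0, num_steps=200, seed=0):
    """Run the reverse (denoising) loop from pure noise down to step 0."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in x0]      # start from pure noise
    errors = []
    for t in range(num_steps - 1, -1, -1):
        eps = oracle_noise(x, x0, alpha_bars[t])
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        inv = 1.0 / math.sqrt(1.0 - betas[t])
        # Fresh noise is added at every step except the last one.
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0
        x = [inv * (xi - coef * ei) + sigma * rng.gauss(0.0, 1.0)
             for xi, ei in zip(x, eps)]
        # Mean absolute distance to the target after this step.
        errors.append(sum(abs(a - b) for a, b in zip(x, x0)) / len(x0))
    return x, errors


target = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]   # a tiny one-row "image"
result, errors = sample(target)
print("error after first step:", errors[0])
print("error after last step: ", errors[-1])
```

The recorded errors let you watch how the distance to the target changes over the steps, so the sample moves from shapeless noise to the final picture one small correction at a time.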
The statistical part of the diffusion process doesn’t seem very
important. What is important is the top-down model of reality, where we
start with the overall shape and go down to smaller and smaller things. This
process is very efficient and makes it possible to learn the very complex
world of images. It is exciting to me how similar the processes of boosting
and diffusion are. Top-down feature engineering seems like a really promising
approach, and we might see more of it in modern efficient deep learning.
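For comparison, here is a minimal sketch of boosting in the same spirit. It assumes squared error, one-dimensional inputs, and decision stumps as weak learners; those choices and the names `fit_stump` and `boost` are illustrative, not taken from any particular library. Each stage fits a small model to whatever the previous stages got wrong, which is the same "correct the current estimate a little at every step" loop as in diffusion sampling.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals (least squares)."""
    best = None
    # Skip the largest x so the right-hand side of the split is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        loss = (sum((r - left_mean) ** 2 for r in left)
                + sum((r - right_mean) ** 2 for r in right))
        if best is None or loss < best[0]:
            best = (loss, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, num_stages=50, learning_rate=0.5):
    """Stagewise boosting: every stage corrects the current residuals."""
    prediction = [sum(ys) / len(ys)] * len(xs)   # coarse start: the mean
    losses = []
    for _ in range(num_stages):
        residuals = [y - p for y, p in zip(ys, prediction)]
        stump = fit_stump(xs, residuals)
        prediction = [p + learning_rate * stump(x)
                      for p, x in zip(prediction, xs)]
        # Mean squared error on the training points after this stage.
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, prediction)) / len(ys))
    return prediction, losses


xs = [i / 10 for i in range(20)]
ys = [x * x for x in xs]
prediction, losses = boost(xs, ys)
print("loss after first stage:", losses[0])
print("loss after last stage: ", losses[-1])
```

The first stages capture the overall trend of the curve and the later ones only adjust small details, which is the top-down behaviour discussed above.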