
How would you generate a picture of Noun + Noun in the first place in order to train the LLM on what it would look like? What's happening during that estimated one second?


Use any of the image generation models (e.g. Nano Banana, Midjourney, or ChatGPT) to generate a picture of a noun on a noun. Simonw's test is to have a language (text) model generate a Scalable Vector Graphics (SVG) file, which the language model has to do by writing out curves and colors, like "draw a cubic spline from point 150,100 to 200,300, using width 20, color orange."
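A rough sketch of what such model output looks like when rendered into an SVG document: the curve endpoints, width, and color match the example above, while the two Bezier control points (160,170 and 190,230) are arbitrary values chosen here for illustration.

```python
# Build a minimal SVG containing one cubic Bezier curve, the kind of
# drawing command a text model would emit in this test.
svg = """<svg xmlns="http://www.w3.org/2000/svg" width="400" height="400">
  <path d="M 150 100 C 160 170, 190 230, 200 300"
        fill="none" stroke="orange" stroke-width="20"/>
</svg>"""

# Write it out so any browser or image viewer can render the curve.
with open("spline.svg", "w") as f:
    f.write(svg)
```

The model never sees pixels; it only ever emits text like this, which is why the results are such an interesting probe of what the model "knows" about shapes.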

What happens in that hypothetical second is freaking fascinating. It's a denoising algorithm, and then a bunch of linear algebra, and out pops a picture of a pelican on a bicycle. Stable Diffusion does this quite handily. https://stablediffusionweb.com/image/6520628-pelican-bicycle...
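The core loop can be caricatured in a few lines. This is a toy illustration only, not Stable Diffusion's actual architecture: the "noise predictor" here is a stand-in function, where a real diffusion model would be a trained neural network conditioned on the text prompt.

```python
import numpy as np

# Toy denoising loop: start from pure Gaussian noise and repeatedly
# subtract a predicted-noise estimate, pulling the array toward a
# "clean image" (here just zeros, as a placeholder target).
rng = np.random.default_rng(0)
target = np.zeros((8, 8))           # stand-in for the clean image
x = rng.standard_normal((8, 8))     # start from pure noise

for step in range(50):
    predicted_noise = x - target    # a real model is a trained network
    x = x - 0.1 * predicted_noise   # one small denoising step
```

Each iteration is just linear algebra on the array; after enough steps the noise is essentially gone, which is the shape of the process that "pops out" the pelican.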


it's pelicans all the way down


This is why everyone trains their LLM on another LLM. It's all about the pelicans.



