
If you squint your eyes, it's a fixed-iteration ODE solver. I'd love to see a generalization of this, and of the Universal Transformer mentioned there, re-envisioned as flow-matching/optimal-transport models.
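
To make the squint concrete: a weight-tied residual block applied N times is literally fixed-step Euler integration of dx/dt = f(x), with the same weights reused at every step, Universal Transformer style. A minimal PyTorch sketch (the block shape, step count, and names are my own illustration, not anything from the paper):

  import torch
  import torch.nn as nn

  class WeightTiedBlock(nn.Module):
      def __init__(self, dim: int, num_steps: int):
          super().__init__()
          # One shared residual function f, reused at every iteration.
          self.f = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
          self.num_steps = num_steps

      def forward(self, x: torch.Tensor) -> torch.Tensor:
          h = 1.0 / self.num_steps          # step size: integrate over "time" [0, 1]
          for _ in range(self.num_steps):   # same weights each step
              x = x + h * self.f(x)         # Euler update: x_{t+h} = x_t + h * f(x_t)
          return x

  x = torch.randn(2, 64)
  print(WeightTiedBlock(64, num_steps=12)(x).shape)  # torch.Size([2, 64])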


How would flow matching work here? In language we have inputs and outputs, but it's not clear what the intermediate points would be, since the space is discrete.


One of the core ideas behind LLMs is that language is not a discrete space but a high-dimensional continuous vector space, where you can interpolate as needed. It's one of the reasons LLMs readily make up words that don't exist when translating text, for example.
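
A toy illustration of that continuity (the vectors here are random stand-ins, not real learned embeddings): interpolate between two token embeddings and nearest-neighbor decode the midpoint back to a vocabulary item.

  import torch

  vocab = ["cat", "dog", "house", "tree"]
  emb = torch.randn(len(vocab), 8)        # hypothetical embedding table

  a, b = emb[vocab.index("cat")], emb[vocab.index("dog")]
  mid = 0.5 * a + 0.5 * b                 # a point "between" two words

  # Nearest-neighbor decode: cosine similarity against the whole table.
  sims = torch.nn.functional.cosine_similarity(mid.unsqueeze(0), emb)
  print(vocab[sims.argmax().item()])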


Not the inputs and outputs though, which are the important part for flow-matching modeling. Unless you're proposing flow matching over the latent space?
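
Flow matching over a latent space is at least easy to sketch: take paired latents z0 and z1 (say, from an encoder), interpolate linearly, and regress the straight-line velocity z1 - z0 at a random time t. Everything below (the tiny MLP, shapes, the stand-in latent pairs) is illustrative, assuming such pairs exist:

  import torch
  import torch.nn as nn

  dim = 32
  # Velocity field v_theta(z, t), conditioned on time via concatenation.
  v_theta = nn.Sequential(nn.Linear(dim + 1, 128), nn.SiLU(), nn.Linear(128, dim))
  opt = torch.optim.Adam(v_theta.parameters(), lr=1e-3)

  for _ in range(100):
      z0, z1 = torch.randn(64, dim), torch.randn(64, dim)  # stand-in latent pairs
      t = torch.rand(64, 1)
      zt = (1 - t) * z0 + t * z1        # point on the straight path
      target = z1 - z0                  # velocity of that path
      pred = v_theta(torch.cat([zt, t], dim=-1))
      loss = ((pred - target) ** 2).mean()
      opt.zero_grad()
      loss.backward()
      opt.step()

  # Sampling: integrate dz/dt = v_theta(z, t) forward from a source latent.
  z = torch.randn(1, dim)
  for k in range(10):
      t = torch.full((1, 1), k / 10)
      z = z + 0.1 * v_theta(torch.cat([z, t], dim=-1))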


This makes me think it would be nice to see some kind of child of the modern transformer architecture and neural ODEs. There was such interesting work a few years ago on how neural ODEs/PDEs could be seen as a sort of continuous limit of layer depth. Maybe models could learn cool stuff if the embeddings were somehow solutions of a dynamical system.
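
For a sense of what the continuous-depth idea looks like: the residual stack becomes an ODE dx/dt = f(x, t), and the forward pass becomes numerical integration over "depth" t. A self-contained RK4 sketch (libraries like torchdiffeq add adaptive solvers and adjoint-method gradients; all names and sizes here are just illustrative):

  import torch
  import torch.nn as nn

  class ODEFunc(nn.Module):
      """Depth-dependent dynamics f(x, t); t plays the role of layer index."""
      def __init__(self, dim: int):
          super().__init__()
          self.net = nn.Sequential(nn.Linear(dim + 1, dim), nn.Tanh(), nn.Linear(dim, dim))

      def forward(self, x, t):
          t_col = t.expand(x.shape[0], 1)   # broadcast scalar time to the batch
          return self.net(torch.cat([x, t_col], dim=-1))

  def rk4_integrate(f, x, t0=0.0, t1=1.0, steps=10):
      # Classic fixed-step Runge-Kutta 4; each step is the "layer".
      h = (t1 - t0) / steps
      t = torch.tensor([[t0]])
      for _ in range(steps):
          k1 = f(x, t)
          k2 = f(x + 0.5 * h * k1, t + 0.5 * h)
          k3 = f(x + 0.5 * h * k2, t + 0.5 * h)
          k4 = f(x + h * k3, t + h)
          x = x + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
          t = t + h
      return x

  f = ODEFunc(16)
  print(rk4_integrate(f, torch.randn(4, 16)).shape)  # torch.Size([4, 16])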



