Will diffusion models disrupt LLMs?

We might be on the verge of having even more efficient state-of-the-art large language models.

Autoregressive transformer models have dominated large language model leaderboards for years. They produce text left to right, one token at a time, feeding the text generated so far back into the model to predict the next token.

Diffusion models, which dominate image and video generation, produce the full output at each iteration and refine it step by step. This coarse-to-fine approach, which starts from noise and progressively denoises it, is state of the art for images and video, but for text it was lagging behind autoregressive transformer approaches, until now…
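To make the contrast concrete, here is a toy sketch of the two decoding loops. It is not Mercury's algorithm and involves no real model: `toy_predict` is a hypothetical stand-in that simply looks up the right token, and the "diffusion-style" loop is an assumed masked-denoising variant in which each pass fills in a batch of positions. It only illustrates how the number of sequential steps differs.

```python
import random

MASK = "_"  # placeholder for a position that has not been decided yet (the "noise")


def toy_predict(target, position):
    """Hypothetical stand-in for a trained network: returns the token at `position`."""
    return target[position]


def autoregressive_decode(target):
    """Left to right: one sequential model call per generated token."""
    output, calls = [], 0
    for position in range(len(target)):
        output.append(toy_predict(target, position))
        calls += 1
    return output, calls


def diffusion_style_decode(target, steps, seed=0):
    """Start fully masked; each pass covers the whole sequence and fills in a batch of positions."""
    rng = random.Random(seed)
    output = [MASK] * len(target)
    order = list(range(len(target)))
    rng.shuffle(order)  # positions are resolved in an arbitrary order, not left to right
    per_step = -(-len(target) // steps)  # ceiling division
    calls = 0
    for step in range(steps):
        batch = order[step * per_step:(step + 1) * per_step]
        if not batch:
            break
        for position in batch:
            output[position] = toy_predict(target, position)
        calls += 1  # counted as one parallel pass over the full sequence
    return output, calls


if __name__ == "__main__":
    sentence = "diffusion models refine the whole sequence at once".split()
    print(autoregressive_decode(sentence))
    print(diffusion_style_decode(sentence, steps=3))
```

For this 8-token sentence the first loop makes 8 sequential calls while the second makes 3 passes. In a real system each pass over the full sequence costs more than a single-token step, so any actual speedup depends on how few passes are needed and how well they parallelize.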

Inception released Mercury, a family of diffusion large language models that exhibit very strong performance while running many times faster than autoregressive models.

It is very encouraging to see an alternative kind of LLM perform this well and run this fast. Stay tuned.
