Will diffusion models disrupt LLMs?

We might be on the verge of having even more efficient state-of-the-art large language models.

Autoregressive transformer models have dominated large language model leaderboards for years. They produce text left to right, one token at a time, feeding the text generated so far back in to predict the next token.

Diffusion models, which dominate image and video generation, produce the full output at each iteration and refine it step by step. This coarse-to-fine approach, which starts from noise and progressively denoises it, is state of the art for image and video generation, but for text it lagged behind autoregressive transformer approaches, until now…

Inception released Mercury, a family of diffusion large language models that exhibit very strong performance while being orders of magnitude faster to run than autoregressive models.
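As a rough illustration of why refining a whole sequence in parallel can need fewer sequential model calls than generating one token at a time, here is a minimal Python sketch. It is a toy under stated assumptions, not Mercury's actual method: `toy_next_token` and `toy_denoise` are hypothetical stand-ins that simply read from a fixed target sentence, and the masked-token scheme is just one simple way to mimic coarse-to-fine refinement for text.

```python
import random

# Toy "ground truth" that both hypothetical models read from.
TARGET = "diffusion models refine the whole sequence in parallel".split()
MASK = "<mask>"


def toy_next_token(prefix):
    """Hypothetical autoregressive model: returns the token that follows the prefix."""
    return TARGET[len(prefix)]


def toy_denoise(sequence, n_reveal, rng):
    """Hypothetical denoiser: fills in up to n_reveal masked positions in one call."""
    masked = [i for i, tok in enumerate(sequence) if tok == MASK]
    chosen = rng.sample(masked, min(n_reveal, len(masked)))
    out = list(sequence)
    for i in chosen:
        out[i] = TARGET[i]
    return out


def generate_autoregressive(length):
    """Left to right: one model call per generated token."""
    tokens, calls = [], 0
    while len(tokens) < length:
        tokens.append(toy_next_token(tokens))
        calls += 1
    return tokens, calls


def generate_diffusion_style(length, steps, seed=0):
    """Start fully masked ("noise") and refine the whole sequence at each step."""
    rng = random.Random(seed)
    sequence, calls = [MASK] * length, 0
    per_step = -(-length // steps)  # ceiling division
    while MASK in sequence:
        sequence = toy_denoise(sequence, per_step, rng)
        calls += 1
    return sequence, calls


if __name__ == "__main__":
    ar_tokens, ar_calls = generate_autoregressive(len(TARGET))
    df_tokens, df_calls = generate_diffusion_style(len(TARGET), steps=3)
    print("autoregressive calls:", ar_calls)
    print("diffusion-style calls:", df_calls)
```

On this 8-token toy sentence, the autoregressive loop makes 8 sequential calls and the diffusion-style loop makes 3. Real speedups depend on how expensive each call is and on how many refinement steps a model needs.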

It is very encouraging to see alternative LLM architectures perform this well and run this fast. Stay tuned.

Emmanuel Hauptmann

Emmanuel Hauptmann is CIO and Head of Systematic Equities at RAM AI. He co-founded the company in 2007 and has led the development of the firm’s systematic investment and AI platform since.
