Diffusion models generate images by gradually refining random noise through a series of denoising steps. They learn to reverse a fixed Markovian noising process: a forward process corrupts training images with noise, and a neural network is trained to undo that corruption one step at a time.
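Concretely, in the standard DDPM formulation (Ho et al., 2020), both processes are chains of Gaussian transitions, with the noise variances $\beta_t$ fixed by a schedule and the reverse mean and covariance produced by the trained network:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\!\bigl(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\bigr), \qquad p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\!\bigl(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\bigr)$$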
Forward Process: Noise is systematically added to an image over many small steps until it is indistinguishable from pure Gaussian noise (see the first sketch below).
Reverse Process: A neural network learns to predict and remove the noise step by step, so that sampling can start from pure noise and gradually denoise it into a new image (see the second sketch below).
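To make the forward process concrete, here is a minimal NumPy sketch of DDPM-style noising. The step count, the linear beta schedule, and the helper name `q_sample` are illustrative assumptions rather than anything fixed by the text; the sketch uses the closed-form result that x_t can be sampled directly from x_0 without simulating every intermediate step.

```python
import numpy as np

T = 1000                            # number of diffusion steps (illustrative)
betas = np.linspace(1e-4, 0.02, T)  # linear noise schedule from Ho et al. (2020)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)     # cumulative products: alpha_bar_t

def q_sample(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, I)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

# Usage: noise a toy 32x32 "image" (values in [-1, 1]) up to step t = 500.
rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(32, 32))
x_noisy = q_sample(x0, t=500, rng=rng)
```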
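And a matching sketch of the reverse process, reusing the schedule arrays defined above. Here `model(x_t, t)` is a stand-in for any trained noise-prediction network (typically a U-Net), and setting the reverse variance to beta_t is one of the two choices discussed in the DDPM paper:

```python
def p_sample(model, x_t, t, rng):
    """One reverse step x_t -> x_{t-1}, assuming model(x_t, t) predicts
    the noise eps that was mixed in at step t (the DDPM parameterization)."""
    eps_hat = model(x_t, t)
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
    if t == 0:
        return mean                        # final step is returned noise-free
    z = rng.standard_normal(x_t.shape)
    return mean + np.sqrt(betas[t]) * z    # sigma_t^2 = beta_t variance choice

def sample(model, shape, rng):
    """Full reverse chain: start from pure Gaussian noise, denoise step by step."""
    x = rng.standard_normal(shape)
    for t in reversed(range(T)):
        x = p_sample(model, x, t, rng)
    return x
```

Generating an image is then just `sample(model, (32, 32), rng)`; each new image is a fresh draw from the learned distribution, not a reconstruction of any particular training image.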
Unlike GANs, which are prone to mode collapse, diffusion models tend to produce diverse, high-fidelity images, and they power widely used systems such as DALL·E 2, Stable Diffusion, and Imagen.