Stable Diffusion XL Turbo can create AI images in real time as you type

Alfonso Maruccia

What just happened? Stability AI is accelerating the release pace for its uncanny generative services. The company is introducing a new image generation model that can seemingly display and modify AI images as quickly as the user can type – on the right hardware platform, that is.

SDXL Turbo is the latest text-to-image AI model developed by Stability AI. The work-in-progress model employs a novel distillation technique called Adversarial Diffusion Distillation (ADD) that lets it generate an image in a single step, down from the 20-50 steps required by the previous model.
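For readers who want to see what single-step sampling looks like in practice, here is a minimal sketch. It rests on assumptions the article does not cover: that the weights are published on Hugging Face under the ID stabilityai/sdxl-turbo, that the diffusers and torch packages are installed, and that a CUDA GPU is available. The helper names are hypothetical. Guidance is switched off because the distilled model is meant to be sampled without classifier-free guidance.

```python
def single_step_kwargs(prompt: str) -> dict:
    """Call arguments for one-step sampling with guidance disabled."""
    return {
        "prompt": prompt,
        "num_inference_steps": 1,  # one denoising step instead of 20-50
        "guidance_scale": 0.0,     # no classifier-free guidance
    }


def generate(prompt: str):
    """Hypothetical helper: load the pipeline and return one PIL image."""
    # Imported lazily so the helper above can be inspected without a GPU stack.
    import torch
    from diffusers import AutoPipelineForText2Image

    # Assumed model ID; half-precision weights keep memory use down.
    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sdxl-turbo",
        torch_dtype=torch.float16,
        variant="fp16",
    )
    pipe.to("cuda")
    return pipe(**single_step_kwargs(prompt)).images[0]


if __name__ == "__main__":
    image = generate("a lighthouse on a cliff at sunset, oil painting")
    image.save("sdxl_turbo_sample.png")
```

Reloading the pipeline on every call would cancel out the speed gain, so an interactive tool would load it once and reuse it for each new prompt.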

Stability AI said SDXL Turbo can generate visual outputs in "real-time" while maintaining high sample fidelity. It's important to note that the service is not yet intended for commercial use, and there's a research paper available that provides detailed information about the new ADD technique.

By incorporating ADD technology, SDXL Turbo gains several advantages shared with Generative Adversarial Networks (GANs) while avoiding the artifacts or blurriness often observed in other distillation methods. Stability AI compared different model variants by generating outputs from the same prompt. Human evaluators then had to choose the output that most closely followed the prompt.

Additional tests were later conducted to evaluate image quality. These blind tests showed that a single step of SDXL Turbo could beat a four-step configuration of the LCM-XL model, and that four steps of SDXL Turbo could beat a 50-step configuration of SDXL. Based on these results, Stability AI asserts that SDXL Turbo surpasses state-of-the-art multi-step models with "substantially" lower computational requirements.

SDXL Turbo not only maintains image quality but also provides significant improvements to inference speed. On an Nvidia A100 AI GPU accelerator, the model can generate a 512 x 512 image in just 207 ms at FP16 precision, a figure that covers prompt encoding, a single denoising step, and decoding.
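To put the 207 ms figure in perspective, the short calculation below converts a per-image latency into a sustained image rate. Only the 207 ms value comes from the benchmark above. The function name is illustrative, and the result assumes images are generated back to back with no other overhead.

```python
A100_LATENCY_MS = 207.0  # reported end-to-end time for one 512 x 512 image


def images_per_second(latency_ms: float) -> float:
    """Convert a per-image latency in milliseconds to images per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


if __name__ == "__main__":
    rate = images_per_second(A100_LATENCY_MS)
    print(f"{rate:.1f} images per second")  # prints "4.8 images per second"
```

At just under five refreshes per second, the output can plausibly keep up with a prompt that changes as the user types.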

AI enthusiasts can now explore the capabilities of the new generative model on Stability AI's image editing platform, Clipdrop. The service is compatible with most modern browsers, the company states, and is currently available for free during its beta phase. While Stability AI is open to potential commercial applications of the new model, interested parties will need to contact the company directly for further details.
