Stable Diffusion 3.0 update (almost) perfects typography in generated images

zohaibahd · Feb 22, 2024

Why it matters: AI image generation is leaving the uncanny valley behind. Stability AI is rapidly advancing, making fake visuals truly indistinguishable from reality with its latest project. However, as rivals like Dall-E and Midjourney also enhance their capabilities, it's evident that this isn't just about achieving the clearest text; it's about leading the next wave of AI innovation.

Stability AI is tantalizing AI art enthusiasts with an early preview of its next-generation text-to-image model, Stable Diffusion 3.0. The startup has opened a waitlist for early access to the upgraded AI system, which promises crisper images, improved multi-subject handling, and significantly enhanced text rendering.

Typography has long been an Achilles' heel for AI image generation models like Stable Diffusion, even as their output has become nearly indistinguishable from reality in other respects. However, Stability AI asserts that the new 3.0 edition will offer a substantial improvement in rendering legible text and ensuring accurate spelling within generated visuals.

One example highlighted in the press release particularly caught our eye: an image of a city bus that looks virtually impossible to distinguish from an actual photograph, complete with impeccable text rendering on the road sign and the vehicle's side. While there are still minor imperfections (the license plate appears distorted), the overall quality represents a quantum leap from the model's predecessors.

That is less surprising given that, under the hood, Stable Diffusion 3.0 is a major architectural overhaul of its predecessors. It employs a new "diffusion transformer" approach, similar to OpenAI's recent Sora model – a stark departure from the original Stable Diffusion architecture, according to Stability AI CEO Emad Mostaque, who spoke with VentureBeat.

Stable Diffusion 3.0 also integrates other cutting-edge techniques like "flow matching" – a novel method for training AI systems to better model complex data distributions. The researchers behind flow matching claim it enables faster training, more efficient sampling, and improved overall performance compared to traditional diffusion methods.
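As a rough illustration of the flow matching idea, here is a minimal pure-Python sketch. It assumes the simplest straight-line variant, in which a noise sample `x0` is interpolated toward a data sample `x1` and the model is trained to predict the constant velocity `x1 - x0`. All function names and the toy velocity function are hypothetical, and nothing here is taken from Stability AI's actual implementation.

```python
def flow_matching_pair(x0, x1, t):
    """Build one training example for time t in [0, 1].

    x0 is a noise sample, x1 is a data sample (both plain lists of floats).
    Returns the point on the straight line between them and the velocity
    the model should predict at that point.
    """
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    target = [b - a for a, b in zip(x0, x1)]
    return x_t, target


def flow_matching_loss(velocity_fn, x0, x1, t):
    """Mean squared error between predicted and target velocity."""
    x_t, target = flow_matching_pair(x0, x1, t)
    pred = velocity_fn(x_t, t)
    return sum((p - u) ** 2 for p, u in zip(pred, target)) / len(target)


def euler_sample(velocity_fn, x0, steps=10):
    """Generate a sample by integrating the velocity field from t=0 to t=1."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        v = velocity_fn(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy stand-in for a trained network: it always predicts the velocity [2, 4].
toy_velocity = lambda x, t: [2.0, 4.0]
loss = flow_matching_loss(toy_velocity, [0.0, 0.0], [2.0, 4.0], t=0.5)
sample = euler_sample(toy_velocity, [0.0, 0.0], steps=10)
```

In a real system `velocity_fn` would be a neural network trained over many random `(x0, x1, t)` triples. The straight paths are one reason sampling can take fewer steps than with traditional diffusion methods.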

The revamped model suite will span a range of 800 million to 8 billion parameters when it eventually sees a full release. But before that public launch, Stability AI is putting the model through its paces with a closed preview to gather feedback and strengthen safety guardrails. The startup has implemented numerous safeguards for this preview release, with more in development through collaboration with researchers, experts, and, of course, its own community.

Stability AI's ambitions don't stop here, though. Mostaque has hinted that the new Stable Diffusion model will underpin the company's forthcoming work in 3D modeling, video synthesis, and other novel AI visual capabilities.

Interested parties can sign up for the waitlist.

Permalink to story.

https://www.techspot.com/news/101986-stable-diffusion-30-update-almost-perfects-typography-images.html

Dr Roboto · Feb 22, 2024

This is one of those rare times where I wish the government would be ahead of the curve and have a law requiring some type of watermarking or other technique to verify that an image is real or AI. The speed at which these technologies are advancing is astonishing.

I say this because I am continually baffled how easily humans are faked out with this stuff. As in the general population seems to have zero critical thinking skills. They see something on social media and instantly it must be true, even though we are continually made aware of AI's ability to fake things. The upcoming US presidential election is guaranteed to be full of fake AI images and propaganda.

MasterAce · Feb 22, 2024

All the windows on the side of the bus are different, one side has two windshield wiper arms, other one. And what is the unsymmetric contraption on top of the bumper? And who would place the text to overlap with the wheel?

Photorealistic maybe, realistic not really.

GodisanAtheist · Feb 22, 2024

Dr Roboto said:
This is one of those rare times where I wish the government would be ahead of the curve and have a law requiring some type of watermarking or other technique to verify that an image is real or AI. The speed at which these technologies are advancing is astonishing.

I say this because I am continually baffled how easily humans are faked out with this stuff. As in the general population seems to have zero critical thinking skills. They see something on social media and instantly it must be true, even though we are continually made aware of AI's ability to fake things. The upcoming US presidential election is guaranteed to be full of fake AI images and propaganda.

- Yeah, Social Media was like the primer, and AI generated content will be the fissile material, we're primed for a societal nuke the likes of which we've never seen.

I've long advocated that the post WW2 "golden years" all the old timers look back to were underpinned strongly by the fact that everyone got their information from only a handful of sources, and those sources all aligned on the facts. CBS/NBC/ABC, didn't matter who you watched, you got the same set of facts and you were free to reach wildly different conclusions.

The proliferation of alternative/cable news outlets in the late 80's and 90's started the ball rolling on different segments of the population not just arriving at different conclusions, but literally operating off an alternate slate of facts than other parts of the population. Social media democratized this, and AI will basically rip any semblance of a shared reality to pieces.

From the 40's thru the 2020's we saw a world getting smaller and more interconnected. I think from the 2020's thru ??? we will see the world get big, isolated, and hyper localized again as all faith in our information sources slowly evaporates.

Lew Zealand · Feb 22, 2024

GodisanAtheist said:
I've long advocated that the post WW2 "golden years" all the old timers look back to were underpinned strongly by the fact that everyone got their information from only a handful of sources, and those sources all aligned on the facts. CBS/NBC/ABC, didn't matter who you watched, you got the same set of facts and you were free to reach wildly different conclusions.

The proliferation of alternative/cable news outlets in the late 80's and 90's started the ball rolling on different segments of the population not just arriving at different conclusions, but literally operating off an alternate slate of facts than other parts of the population. Social media democratized this, and AI will basically rip any semblance of a shared reality to pieces.

Have you looked at why that's the case? Search up the Fairness Doctrine of the FCC.

Established in 1949
Abolished in 1987

Pretty much explains all of it.

Abaidor · Feb 23, 2024

MasterAce said:
All the windows on the side of the bus are different, one side has two windshield wiper arms, other one. And what is the unsymmetric contraption on top of the bumper? And who would place the text to overlap with the wheel?

Photorealistic maybe, realistic not really.

IMHO, at the moment, such images can only be used in fast moving material where the viewer won't notice.

pcnthuziast · Feb 23, 2024

Potentially a very powerful weapon for political disinformation.

gdavid65 · Feb 23, 2024

Dr Roboto said:
This is one of those rare times where I wish the government would be ahead of the curve and have a law requiring some type of watermarking or other technique to verify that an image is real or AI. The speed at which these technologies are advancing is astonishing.

I say this because I am continually baffled how easily humans are faked out with this stuff. As in the general population seems to have zero critical thinking skills. They see something on social media and instantly it must be true, even though we are continually made aware of AI's ability to fake things. The upcoming US presidential election is guaranteed to be full of fake AI images and propaganda.

The general human is ****ing stupid.. the general computer is not.

Stingy McDuck · Feb 23, 2024

Dr Roboto said:
I say this because I am continually baffled how easily humans are faked out with this stuff. As in the general population seems to have zero critical thinking skills. They see something on social media and instantly it must be true, even though we are continually made aware of AI's ability to fake things. The upcoming US presidential election is guaranteed to be full of fake AI images and propaganda.

The sad thing is that many people don't want to confirm or learn if something is fake, they just want a "fact" that aligns with their own personal views. It's really sad.

Puiu · Feb 23, 2024

Dr Roboto said:
This is one of those rare times where I wish the government would be ahead of the curve and have a law requiring some type of watermarking or other technique to verify that an image is real or AI. The speed at which these technologies are advancing is astonishing.

I say this because I am continually baffled how easily humans are faked out with this stuff. As in the general population seems to have zero critical thinking skills. They see something on social media and instantly it must be true, even though we are continually made aware of AI's ability to fake things. The upcoming US presidential election is guaranteed to be full of fake AI images and propaganda.

When you can just take a screenshot... watermarking will do nothing but catch a few less tech-savvy dum-dums. I seriously have no idea what can be done to prevent fakes.

The only thing I can think of is for the generator to insert a secret pixel pattern that covers the entire image, undetectable by humans, that can be read akin to a QR code by specialised software/hardware. (and even this can just be destroyed by lowering the quality of the original AI image which happens all the time online)
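To make that idea concrete, here is a toy sketch in plain Python. It assumes a grayscale image stored as a list of rows of 0-255 integers and hides a keyed pseudo-random bit pattern in the least significant bit of every pixel. The function names, the key, and the crude `degrade` step are all hypothetical stand-ins, and real watermarking schemes are far more sophisticated. The point is only to show why a naive pattern does not survive quality loss.

```python
import random


def pattern_bits(width, height, key):
    """Keyed pseudo-random bit pattern covering the whole image."""
    rng = random.Random(key)
    return [[rng.getrandbits(1) for _ in range(width)] for _ in range(height)]


def embed(image, key):
    """Overwrite each pixel's least significant bit with the pattern bit."""
    bits = pattern_bits(len(image[0]), len(image), key)
    return [[(px & ~1) | bits[y][x] for x, px in enumerate(row)]
            for y, row in enumerate(image)]


def match_rate(image, key):
    """Fraction of pixels whose lowest bit agrees with the keyed pattern."""
    h, w = len(image), len(image[0])
    bits = pattern_bits(w, h, key)
    hits = sum((image[y][x] & 1) == bits[y][x]
               for y in range(h) for x in range(w))
    return hits / (w * h)


def degrade(image, step=8):
    """Crude stand-in for lossy re-encoding: snap values to multiples of step."""
    return [[(px // step) * step for px in row] for row in image]


# Build a random 32x32 test image, watermark it, then lower its quality.
rng = random.Random(0)
original = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
marked = embed(original, key=1234)

rate_marked = match_rate(marked, key=1234)             # every bit matches
rate_degraded = match_rate(degrade(marked), key=1234)  # close to coin-flip
```

Each pixel changes by at most 1 out of 255, so the mark is invisible, yet after the quality drop the detector sees roughly the 50% agreement it would get on any unmarked image.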
