Nvidia's latest AI technology transforms words into realistic images

jsilva

In context: Nvidia's GauGAN technology has already shown what it's capable of, turning simple sketches into photorealistic images. It has since been put to use in Nvidia Canvas, but the GPU giant seems to be aiming higher with its AI, launching a new version capable of turning words into images.

Nvidia first showed its GauGAN technology in 2019, but only recently did it appear in a product available to the general public. Named Canvas, this piece of software can be very fun to use, letting users create photo-like images from basic sketches.

A few months have passed since Canvas' announcement, but work on GauGAN has continued and it's now hitting version 2.0. The technology has become even more impressive, as it's now capable of turning words into photorealistic images, producing results similar to those of the draw-to-image feature.

As the demo video shows, type a phrase into the text box and an image is generated immediately from your words. Add an adjective or replace a noun in the phrase, and the image changes accordingly.

For added personalization, users can combine the text and draw-to-image features. By using written words to generate the base image and drawings to refine the details, you can change the shape, size and texture of any object within the image.

To achieve these results, Nvidia's GauGAN 2 text-to-image feature uses an AI model based on a generative adversarial network that "combines segmentation mapping, inpainting and text-to-image generation." This model was trained on 10 million landscape images, so it should be well prepared for whatever you throw at it (or not).

You can try it in your web browser through Nvidia's interactive AI demo for GauGAN 2. To play with it, you'll first have to agree to Nvidia's terms and conditions (check the box at the bottom of the page).

 
Seems like a lot of work and processing power for just tagging some jpegs and mp4 files with some metadata and throwing something together in mongodb in an afternoon.
 
I tried the link and it just drew a little square pixelated face with a frown and said the website was down :(
 