AI creates realistic pictures from pure text

The system makes it faster and easier to create photorealistic AI art.

Graphics processing unit maker NVIDIA has debuted a new way to create AI art. The program, called GauGAN2, can create photorealistic images using a text interface — in other words, type what you want to see and the software generates a picture of it.

“The deep learning model behind GauGAN allows anyone to channel their imagination into photorealistic masterpieces — and it’s easier than ever,” NVIDIA’s Isha Salian wrote in a blog post.

Generating AI art: The system relies on deep learning to produce its images.

Deep learning is a specific form of machine learning (an approach in which an AI "learns" from large amounts of data) that is loosely modeled on the human brain.


Much like your brain uses groups of neurons working in unison to puzzle through problems and generate thoughts, a deep learning AI uses what are called "neural networks" to perform a specific function. Deep learning is especially good at recognizing images, and at creating them.
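
To make the "neural network" idea concrete, here is a minimal sketch in PyTorch. It is purely illustrative and is not NVIDIA's GauGAN2 architecture: a few layers of artificial "neurons" map small images to labels, the image-recognition side of deep learning mentioned above. The class name and layer sizes are arbitrary choices for the example.

```python
# Illustrative only: a tiny neural network, not NVIDIA's GauGAN2 model.
# Layers of artificial "neurons" learn a mapping from inputs (small images)
# to outputs (class scores).
import torch
import torch.nn as nn

class TinyImageNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Flatten(),                 # turn a 28x28 image into a flat vector
            nn.Linear(28 * 28, 128),      # first layer of "neurons"
            nn.ReLU(),
            nn.Linear(128, num_classes),  # output: one score per class
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layers(x)

model = TinyImageNet()
fake_batch = torch.rand(4, 1, 28, 28)  # four random grayscale "images"
print(model(fake_batch).shape)         # torch.Size([4, 10])
```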

Text to art: NVIDIA's AI can turn ordinary text into images, which can then be edited or fleshed out with more detail.

“Simply type a phrase like ‘sunset at a beach’ and AI generates the scene in real time,” Salian wrote. Adding adjectives like “rocky” and “rainy” will cause GauGAN2 to modify the AI art instantly.

GauGAN2 creates a map of the elements in the scene (rocks, sun, clouds, sand, water), each of which you can then modify and edit, either with further text or with a hands-on, Photoshop-like editor. This could let you take a realistic desert scene and, by popping an extra sun up in the sky, create a landscape shot of Tatooine (Salian's example).
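
To illustrate what such a "map of the scene" can look like, here is a conceptual sketch using a plain NumPy label array. The labels, the tiny grid, and the way the map is edited are simplified assumptions for illustration; this is not GauGAN2's actual data format or interface.

```python
# Conceptual sketch of a segmentation map, using made-up labels;
# not GauGAN2's real data format or API.
import numpy as np

# Each cell holds a label describing what should appear there.
SKY, SUN, SAND, ROCK = 0, 1, 2, 3

# A tiny 6x8 "desert scene": sky on top, sand below, one sun, one rock.
scene = np.full((6, 8), SKY)
scene[3:, :] = SAND
scene[1, 1] = SUN
scene[4, 5] = ROCK

# "Popping an extra sun up in the sky" is just another edit to the label map;
# the generator would then render photorealistic pixels for each region.
scene[1, 6] = SUN

print(scene)
```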

Credit: Annelisa Leinbach

The frontiers of AI art: As The Next Web notes, GauGAN2 currently works best with simple descriptions of nature. 

Put in something a bit more complicated, as Tiernan Ray of ZDNet did, and the end results are abstract fever dreamscapes filled with Dalí-esque amoebas (more a feature than a bug for AI art, in my opinion).

GauGAN2 is the second iteration of an AI originally released in 2019. The first GauGAN used segmentation mapping to help users create AI art: you could build a landscape piecemeal by sketching it in simple shapes, much like drawing in MS Paint, and GauGAN would fill in your segments with photorealistic imagery, Ray explains.

NVIDIA says GauGAN2 is the first AI of its kind to be able to interpret commands using multiple methods, or modalities. 

“This makes it faster and easier to turn an artist’s vision into a high-quality AI-generated image,” Salian wrote.

