
ChatGPT's new Images 2.0 model is surprisingly good at generating text

Source: TechCrunch
Technology | April 21, 2026

It used to be easy enough to distinguish between human-made and AI-generated imagery — just two years ago, you couldn’t use image models to create a menu for a Mexican restaurant without inventing new culinary delights like “enchuita,” “churiros,” “burrto,” and “margartas.”

Now, when I ask the brand new ChatGPT Images 2.0 model for a menu of Mexican food, it creates something that could immediately be used in a restaurant without customers noticing that something’s off. (However, ceviche priced at $13.50 might make me question the quality of the fish.)

Image Credits: ChatGPT Images 2.0

For comparison, here’s the result I got from DALL-E 3 two years ago (at the time, ChatGPT did not generate images):

Image Credits: Microsoft Designer (DALL-E 3)

AI image generators have historically struggled to spell because they generally used diffusion models, which work by reconstructing images from noise.

“The diffusion models […] are reconstructing a given input,” Asmelash Teka Hadgu, founder and CEO of Lesan AI, told TechCrunch in 2024. “We can assume writings on an image are a very, very tiny part, so the image generator learns the patterns that cover more of these pixels.”

Researchers have since explored other mechanisms for image generation, like autoregressive models, which make predictions about what an image should look like and function more like an LLM.
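The distinction can be caricatured in a few lines of code. This is a deliberately toy sketch, not a real model: the point is only that a diffusion-style generator refines every pixel of the canvas at once (so a handful of text pixels get no special treatment), while an autoregressive generator commits to one element at a time, conditioned on everything before it, the way an LLM emits tokens.

```python
import random

# Toy "image": a short list of pixel values standing in for the clean target.
TARGET = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]

def diffusion_style(steps=10):
    """Caricature of diffusion: start from noise and denoise the WHOLE
    canvas a little at every step. No pixel is treated as more important
    than any other -- which is roughly why small text used to suffer."""
    canvas = [random.random() for _ in TARGET]
    for _ in range(steps):
        # nudge every pixel halfway toward the target simultaneously
        canvas = [c + 0.5 * (t - c) for c, t in zip(canvas, TARGET)]
    return canvas

def autoregressive_style():
    """Caricature of autoregression: emit one element at a time, each
    conditioned on the prefix generated so far (a real model would
    sample from p(next | prefix) instead of copying the target)."""
    canvas = []
    for t in TARGET:
        canvas.append(t)
    return canvas
```

After enough denoising steps the diffusion sketch converges on the target everywhere at once, whereas the autoregressive sketch is exact at every position from the first pass; real models trade these properties off in far more complicated ways.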

Unfortunately, OpenAI declined to answer a question in a press briefing this week about what kind of model is powering ChatGPT Images 2.0.


The company did, however, explain that the new model has “thinking capabilities,” which give it the ability to search the web, make multiple images from one prompt, and double-check its creations — this allows Images 2.0 to create marketing assets in various sizes, as well as multi-paneled comic strips.

OpenAI also says that Images 2.0 has a stronger grasp of non-Latin text rendering in languages like Japanese, Korean, Hindi, and Bengali. The model's knowledge cutoff is December 2025, which could affect how accurately it handles prompts involving more recent news.

“Images 2.0 brings an unprecedented level of specificity and fidelity to image creation. It can not only conceptualize more sophisticated images, but it actually brings that vision to life effectively, able to follow instructions, preserve requested details, and render the fine-grained elements that often break image models: small text, iconography, UI elements, dense compositions, and subtle stylistic constraints, all at up to 2K resolution,” OpenAI said in a press release.

These capabilities mean that image generation isn’t as rapid as typing a question to ChatGPT, but generating something complex like a multi-paneled comic still takes just a few minutes.

All ChatGPT and Codex users will be able to access Images 2.0 starting Tuesday; paid users will be able to generate more advanced outputs. The company will also make the gpt-image-2 API available, with pricing dependent on the quality and resolution of outputs.
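For developers curious what calling the new API might look like, here is a hypothetical sketch of assembling a request for the gpt-image-2 API named above. The model name comes from the article; the endpoint path, parameter names, and values are assumptions modeled on OpenAI's existing Images API, so check the official documentation before relying on any of them.

```python
def build_image_request(prompt: str,
                        size: str = "1024x1024",
                        quality: str = "high") -> dict:
    """Assemble a JSON body for a POST to /v1/images/generations
    (endpoint path assumed; not confirmed by the article)."""
    return {
        "model": "gpt-image-2",  # model name per the article; API naming assumed
        "prompt": prompt,
        "size": size,            # the article says outputs reach up to 2K resolution
        "quality": quality,      # the article says pricing varies by quality/resolution
    }

request = build_image_request(
    "A Mexican restaurant menu with correctly spelled dish names"
)
```

Since pricing is said to depend on quality and resolution, a production caller would presumably pick `size` and `quality` per asset rather than defaulting to the maximum.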

Topics

AI, ChatGPT, image generation, OpenAI


Amanda Silberling

Senior Writer

Amanda Silberling is a senior writer at TechCrunch covering the intersection of technology and culture. She has also written for publications like Polygon, MTV, the Kenyon Review, NPR, and Business Insider. She is the co-host of Wow If True, a podcast about internet culture, with science fiction author Isabel J. Kim. Prior to joining TechCrunch, she worked as a grassroots organizer, museum educator, and film festival coordinator. She holds a B.A. in English from the University of Pennsylvania and served as a Princeton in Asia Fellow in Laos.
