Bottom line: New data from Appfigures shows image-generation features are driving growth in mobile AI apps, outpacing traditional model upgrades. According to the firm, releases centered on image models generate 6.5 times more downloads than standard updates focused on language or reasoning improvements.
This reflects a shift from earlier phases of the AI boom, when advances in conversational models and features like voice drove adoption. Those improvements still matter, but they no longer spark the same immediate interest as tools that generate visual content.
Recent launches from major platforms illustrate the change. Google's Gemini app saw a sharp rise in installs after introducing its image model, Nano Banana. In the 28 days following the release of the model, officially named Gemini 2.5 Flash Image, downloads increased by more than 22 million, over four times the app's typical rate for a comparable period.
OpenAI's ChatGPT experienced a similar boost after rolling out its GPT-4o image-generation capabilities. The app added more than 12 million installs in the first 28 days after launch. Appfigures found the spike was roughly 4.5 times larger than the increases tied to GPT-4o, GPT-4.5, and GPT-5 model updates. The data suggests visual features are more effective at attracting new users than incremental improvements in text performance.
The pattern extends beyond still images. Vibes, Meta's product for short-form AI-generated video, brought in an estimated 2.6 million additional downloads within a month of its September 2025 debut. Although it is a video feature, it belongs to the same category of visual AI tools built around quick, shareable content.
The data also points to a gap between growth and monetization. A spike in downloads does not automatically translate into revenue. Gemini's Nano Banana release, despite its strong adoption numbers, generated only about $181,000 in estimated consumer spending during its first 28 days. Meta's Vibes also drove installs but showed no meaningful increase in revenue.
ChatGPT stands out as the exception. Its GPT-4o image model not only attracted users but also drove spending, generating an estimated $70 million above its baseline in the same 28-day window. The gap suggests that while image features bring users in, converting them to paying customers depends on how those features are built into the product.
Not every spike follows this pattern. DeepSeek's R1 model, released in January 2025, led to 28 million downloads, but the surge was driven by industry attention rather than a specific feature like image generation. The model gained traction largely due to its lower-cost training approach, which drew broad interest across the tech community.
Even so, the trend is clear. Visual AI features are becoming a main entry point for users, especially on mobile, where speed and shareability matter most. Improvements to underlying models still count, but they increasingly happen behind the scenes, while image and video features are what grab users' attention.


