Here's why Nano Banana Pro wants to turn everyone into a visual thinker

by Mr Moonlight
Photo by Matthew Feeney / Unsplash

According to tech influencer Nate B Jones, we are done with sad AI diagrams and blurry AI charts. Google’s new model, Nano Banana Pro, is his latest “jaw on the floor” moment. His claim is simple: all your old assumptions about AI visuals, from “it cannot do text” to “it cannot do diagrams” to “it cannot handle long prompts”, are now “wrong”.

Nano Banana Pro is billed as a visual reasoning model, not a traditional diffusion image generator. In Jones’s words, it “understands layout” and “understands diagrams”, along with typography, data, brand grammar and “style universes”. He describes it as “a layout engine with a diagram engine with a data visualization engine and a style engine all inside one model.”

In practice, that means it is supposed to spit out finished visual artefacts in a single shot. Think dashboards, diagrams, editorial spreads and blueprints, where text, images and charts are all treated as “co-equal and composable elements”. Jones says it can “separate really dense multi-constraint prompts into an orderly fashion and execute on them without collapse”. In other words, you can dump a messy design brief into the prompt box and the model will not immediately panic.

He talks about Nano Banana Pro as if Tableau, InDesign and Figma “all had a baby”. Inside that metaphorical child sit a few “engines” that he thinks are the real breakthroughs.

First is the layout engine. The model appears to understand “grids, gutters, margins, columns”. It “maintains alignment and spacing and type hierarchy” and can crank out structured one-pagers without mangling everything into a moodboard. Jones calls it “magic”, although he immediately walks that back and suggests the more boring answer is likely “good old pre-training” and “classic reinforcement learning techniques” taken to scale.

Next is the diagram engine. Jones says Nano Banana Pro can convert structured text into diagram form in one go. He claims he fed it an academic AI paper from arXiv about “adversarial prompting in poetry” and got a “nice little visual of what the paper called out” in a single shot. Silly topic, he concedes, but a useful demonstration of “structured text into clean diagrams”.

The text and typography engine is where things start to sound like heresy to anyone who has watched image models massacre words. Jones insists it can do “sharp text at small sizes” and “multi-line paragraphs”. It apparently handles charts, handwriting and even weird prompts like “backwards and upside down in perspective as Shakespeare was writing something facing you on the desk”. “I do not know how they did that,” he admits, but the result impressed him.

There is also a data visualisation engine. Nano Banana Pro can apparently take numbers from, say, an earnings report and turn them into accurate charts. Jones notes that “we do that all the time” and that it “has been painful for a long time. Not anymore.” His most corporate-flex example: he pasted an entire Google earnings 10-Q into the tool and “it turned the entire earning statement into a usable infographic” in one shot.

On top of that sits a style engine. Ask for Lego, blueprint or retro sci-fi and it will hold that style across multiple iterations, Jones says. In his tests it handled a “corkboard style” with handwritten notes on top, and it “understands and applies brand palettes and logos”. His conclusion on this point: “This is going to be huge for marketers.”

The final piece is what he calls a representation transformer. You can describe a concept once, then ask for it as “a blueprint or an infographic or a magazine spread or a storyboard or yes a Lego scene” and Nano Banana Pro will maintain “semantic integrity across all of those representations”. In his view, “surfaces are really becoming interchangeable” and the choice of format “almost becomes a parameter”.

If this all sounds slightly too powerful, there is at least one familiar catch. Access is awkward. Nano Banana Pro currently lives inside Google AI Studio, and, as Jones puts it, “they helpfully ask you to provide an API key to use the tool”. He is diplomatic but clearly amused when he says he wishes he could tell you that Google had finally made this “as easy to access as ChatGPT. They have not.” His workaround is a Substack note on “how to get a Google API key”, which he insists is “not scary”.

There is a reason for the friction, he suggests. Nano Banana Pro is “a sort of token spendy model” that can generate full 4K images. Jones contrasts this with earlier Nano Banana outputs that were “like a 500 pixel image” that “does not stand up” once you zoom in. That limitation, he argues, is “increasingly going away”.

For Jones, the important shift is not just prettier images. It is about shortcuts to finished artefacts, not drafts. “AI is jumping from helpful assistant to finished output generator here,” he says, because the visuals now reach the fidelity required “for executives, for clients, for onboarding, for teaching”. He predicts that workflows will “collapse” because you can go straight from prompt to diagram, dashboard, concept art or editorial layout, with no intermediate sketching and redlining.

He is bullish on the impact inside companies. In his view this “is going to eliminate design bottlenecks like crazy” because “anyone can now produce pro grade visuals” and it “reduces a lot of dependency on design bandwidth”. He is quick to concede that “an excellent senior designer is going to run circles around anything that AI can generate”, but points out that there are “so few excellent senior designers” and a lot of everyday work that is “not super meaningful” yet still has to get done before a client meeting.

His Google earnings example is the purest form of this argument. “I pasted the PDF in,” he says, and got “a usable infographic” that summarised the quarter. “One shot.” It is not glamorous, but it is the kind of thing that burns many human hours today.

Because Nano Banana Pro is available via API, Jones immediately jumps to agent scenarios. If agents can call this model, they can “generate diagrams”, “generate dashboards”, “summarize PDFs visually” and “update onboarding assets”. In his words, “there is an entire class of visual communication that just became machine native.”

The bigger theme he keeps returning to is democratised visual thinking. “Previously you had to kind of be good at visuals to do visual thinking or else you were a consumer of visual thinking,” he says. Now “everybody can communicate in a sophisticated visual mode.” That means “cheap disposable surfaces” that you can iterate on, storyboards for complex concepts, and an explosion of “mechanical cutaways, architectural blueprints” and “sophisticated UX flows” in slides that no longer feature the infamous six-fingered hand.

If this starts to sound like design utopia, he grounds it with a more modest promise. “We are not going to have to suffer through so many bad PowerPoints,” he says. “The client presentations are going to suck less.” It is a very specific vision of progress.

He also spends a surprising amount of time on prompting tactics, which is a polite way of saying you will still need to think before you type. His recommendations:

  • Use complex block structured prompts with clear sections for task, style and layout.
  • Always define your work surface. Instead of “make a diagram”, say “create a left to right architecture diagram” and specify clusters, swimlanes and labels.
  • Provide component lists for dashboards and similar layouts. For example: “KPI blocks, mini pie charts, icons, summary panel.”
  • Add constraints such as “do not overlap labels”, “AI text must be sharp at small sizes” and “keep even spacing between nodes” if you care about consistency.
  • Feed it structured input like lists, tables, hierarchies and metrics since “Nano Banana loves structured input” and can translate that structure visually.
  • Be explicit about style, where designers are already much better than the rest of us at naming and describing what they want.

If you want a prompt template, his advice is to “separate the what” (the task), “the how” (style, layout, components) and “the why” (interpretation). You can “attach a few images” as references, tell Nano Banana Pro whether to use them verbatim or as inspiration, then “let it go to town”.

Perhaps the most grounded part of the whole monologue is his admission that you do not need to become a prompt savant. “You do need more sophisticated prompts for more sophisticated work,” he says, but “just a simple prompt will still produce good work in this model.” For him, that is “always a mark of a good model, a useful model. It does not take a PhD to prompt it to get useful results.”

There is still some classic influencer hype here. “We have solved visual reasoning,” he declares at one point. That is a large claim for a single model behind an API key that most people have not touched yet. But beneath the breathless tone, there is a clear shift: a model that can read structured text, understand layout and output clean 4K visuals with real, legible typography is not just another “pretty picture of a dragon” generator.

If Nano Banana Pro does even half of what Jones shows off, the future of bad corporate diagrams may finally be in trouble.
