Cowart turns AI image iteration into a project-local canvas workflow instead of another chat attachment loop

Cowart brings a tldraw-powered canvas into Codex, keeps the canvas state inside the active project, and makes image generation plus annotation-driven edits feel like a durable workspace instead of scattered chat context.

Most AI image workflows still inherit the same limitation from chat interfaces: the visual state of the work is fragile. You describe an idea, upload a screenshot, get an output, then lose the surrounding context in a scrollback full of attachments and prompts. That is workable for one-off generations, but it is not a great product surface for iterative design. Cowart caught my eye because it treats that gap as a workspace problem instead of a prompting problem.

At a glance, Cowart is a local infinite-canvas plugin for Codex built on top of tldraw. That description is accurate, but it undersells the more interesting idea. The project is not just adding a whiteboard beside an agent. It is trying to make visual thinking, image generation, and annotation-driven revision live inside the same project surface as the rest of the work.

That shift matters. Once a canvas becomes part of the project instead of a separate SaaS tool or throwaway artifact, visual iteration starts to feel closer to source material than to chat residue.

What the project actually ships

Cowart runs a local web service, opens a tldraw canvas for the active project, and stores the canvas state under the project's own canvas/ directory. The README and startup script are very explicit about that storage model: page state lives in project-local JSON, and page assets sit beside it inside page-level asset folders.
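To make the storage model concrete, here is a minimal sketch of how you might inspect that directory from the outside. Only the canvas/ directory name comes from the repo description. The function name summarize_canvas and the rule "anything ending in .json is page state, everything else is an asset" are my assumptions for illustration, not Cowart's documented schema.

```python
from pathlib import Path


def summarize_canvas(project_root: str) -> dict:
    """List JSON state files and other files found under <project_root>/canvas.

    Assumption: JSON files hold page state and every other file is an asset.
    The real layout may use different names or nesting.
    """
    canvas_dir = Path(project_root) / "canvas"
    if not canvas_dir.is_dir():
        return {"exists": False, "state_files": [], "asset_files": []}

    state_files = []
    asset_files = []
    # Walk the whole tree so page-level asset folders are included.
    for path in sorted(canvas_dir.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(canvas_dir).as_posix()
        if path.suffix.lower() == ".json":
            state_files.append(relative)
        else:
            asset_files.append(relative)
    return {"exists": True, "state_files": state_files, "asset_files": asset_files}


if __name__ == "__main__":
    import json

    # Summarize the canvas folder of the current working directory, if any.
    print(json.dumps(summarize_canvas("."), indent=2))
```

The script only reads the folder. The practical point is that an ordinary directory walk can see the visual state, which is not true of canvases held in an app's private storage.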

That is already a meaningful product choice. Many agent-adjacent design tools either centralize state in their own app-specific storage or make generated images feel detached from the workspace that produced them. Cowart instead anchors the visual layer to the active project directory.

The plugin also exposes a small set of MCP-backed operations, and they are practical ones. It can read canvas selection state, insert generated images into the selected holder, and support a workflow where an annotated screenshot becomes the input for a clean revised image placed beside the original. The interface described in the repo is narrow, but that narrowness is a strength: it focuses on a few concrete loops that users repeat.

In practice, the workflow looks like this:

  • open a local canvas from Codex
  • create or select an AI image holder on the canvas
  • ask Codex to generate directly into that selected slot
  • annotate an existing image on the canvas
  • send the annotation screenshot back through the agent to produce a cleaned-up revision
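The select-then-generate half of that loop can be modeled in a few lines. This is a toy in-memory sketch, not Cowart's code: the Holder and Canvas classes, their method names, and the example image path are all hypothetical stand-ins for whatever the MCP tools do against the real tldraw state.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Holder:
    """Hypothetical stand-in for an AI image holder shape on the canvas."""
    holder_id: str
    image_path: Optional[str] = None


@dataclass
class Canvas:
    """Hypothetical in-memory canvas; the real state lives in project-local files."""
    holders: Dict[str, Holder] = field(default_factory=dict)
    selected_id: Optional[str] = None

    def add_holder(self, holder_id: str) -> Holder:
        holder = Holder(holder_id)
        self.holders[holder_id] = holder
        return holder

    def select(self, holder_id: str) -> None:
        if holder_id not in self.holders:
            raise KeyError(f"unknown holder: {holder_id}")
        self.selected_id = holder_id

    def read_selection(self) -> Optional[Holder]:
        """Return the currently selected holder, or None if nothing is selected."""
        if self.selected_id is None:
            return None
        return self.holders[self.selected_id]

    def insert_into_selected(self, image_path: str) -> Holder:
        """Attach a generated image to the selected holder; fail if none is selected."""
        holder = self.read_selection()
        if holder is None:
            raise ValueError("no holder selected")
        holder.image_path = image_path
        return holder


if __name__ == "__main__":
    canvas = Canvas()
    canvas.add_holder("hero")
    canvas.select("hero")
    # Placeholder path; the real asset location is decided by the plugin.
    canvas.insert_into_selected("generated/hero.png")
    print(canvas.read_selection())
```

The design point this illustrates is that the agent never has to guess where an image should go. The human's selection is the target, and an insert with no selection is an error rather than a silent placement somewhere on the canvas.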

That is not a giant platform. It is a tighter and more opinionated loop around ideation, review, and revision.

Why project-local persistence is the real insight

The strongest part of Cowart is not that it uses tldraw, and not even that it can trigger image generation. The strongest part is that it keeps the visual workspace in the active project.

That sounds small, but it changes the feel of the tool dramatically.

When visual artifacts live inside the project, they become easier to reason about as durable working state. A canvas is no longer "the thing I opened in another browser tab" or "the image I uploaded three prompts ago." It becomes a traceable part of the project workspace, with page files, assets, and structure that belong to the work itself.

For builders, this matters because a lot of AI-assisted creative work still breaks down at the handoff points. Planning happens in one place. Generated images appear in another. Notes and arrows live in screenshots. Final assets get copied into yet another folder. The workflow functions, but the state is fragmented.

Cowart is interesting because it reduces that fragmentation. It makes the canvas, the generated images, and the page-local asset history feel more like one working surface. Even if the files are not meant to be hand-edited directly, their location inside the project creates a stronger sense that the visual system belongs to the work rather than to the tool.

Why this feels more useful than a generic whiteboard

There are plenty of canvas tools already. The reason Cowart stands out is that it is not selling an infinite canvas as a blank collaboration metaphor. It is packaging a canvas as an agent-operable interface.

That difference shows up in the repo's primitives. The canvas is local. Selection state can be read. Images can be inserted into selected holders. Annotation screenshots can drive cleaned revisions. Those are not abstract promises about creativity. They are concrete affordances that let an agent interact with visual state in a repeatable way.
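The revision step places the cleaned image beside the original rather than overwriting it. A minimal sketch of that placement follows. The Box type, the place_beside name, the right-hand side, and the 40-unit gap are all assumptions of mine; the source only says the revision lands beside the original.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounds of an image on the canvas, in canvas units."""
    x: float
    y: float
    width: float
    height: float


def place_beside(original: Box, gap: float = 40.0) -> Box:
    """Return same-size bounds directly to the right of the original.

    Assumption: the revision sits to the right, top-aligned, with a fixed gap.
    """
    if gap < 0:
        raise ValueError("gap must be non-negative")
    return Box(
        x=original.x + original.width + gap,
        y=original.y,
        width=original.width,
        height=original.height,
    )


if __name__ == "__main__":
    annotated = Box(x=100, y=50, width=512, height=512)
    print(place_beside(annotated))
```

Keeping the original in place and adding the revision next to it preserves the comparison, which is what makes the canvas read as a history of the work instead of a single latest output.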

That makes Cowart feel closer to a visual control surface than to a note-taking board. It gives the agent a bounded place to act, and it gives the human a clearer way to express intent without forcing every edit request into prose alone.

This is the product lesson I like most here: sometimes the right upgrade for AI workflows is not a bigger model or a cleverer prompt, but a better stateful surface between the human and the agent.

Where the Codex plugin framing matters

Cowart is explicitly packaged as a Codex plugin, with skills, MCP wiring, and a default prompt flow for opening the canvas, generating into a holder, and revising from annotations. That packaging matters because it avoids turning the project into a vague "works with AI somehow" claim.

Instead, the repo defines a fairly opinionated environment:

  • Codex is the host
  • a local Vite app is the visual surface
  • MCP tools bridge the canvas state back into the agent loop
  • project-local folders hold the resulting state and assets

That stack is readable. Builders can understand what part is UI, what part is orchestration, and what part is storage. I tend to trust projects more when the boundary lines are that clear.

There is also a subtle adoption advantage here. Because the canvas runs locally and stores data in the current project, you can try the core idea without the usual setup friction of cloud whiteboards, accounts, or remote synchronization. For experimentation-heavy workflows, lower setup friction often matters more than feature depth.

What kind of work this could be good for

Cowart feels especially relevant for workflows where words are necessary but not sufficient.

That could mean:

  • product mockup exploration where a human wants to point, circle, and compare variants quickly
  • creative direction loops where images need structured revision instead of one-shot prompt roulette
  • documentation or tutorial authoring where visual artifacts should stay close to the project that produced them
  • agent-assisted UI ideation where placement and composition matter enough that prose alone becomes clumsy

The repo is still small, but the underlying pattern is bigger than this first use case. A lot of agent products struggle because the only shared medium is chat. Cowart suggests a more durable alternative: keep a visual working surface local, inspectable, and tied to the project, then let the agent operate on that state.

That is a useful direction whether the output is concept art, interface layout ideas, or simply cleaner communication during iterative review.

The boundaries are clear, and that helps

Cowart does not pretend to be a full design suite. The repo focuses on opening the canvas, managing project-local state, generating into image holders, and revising from annotation screenshots. That is enough to make the core idea legible without overpromising.

The project is also early. The plugin manifest is still at 0.1.x, the scope is intentionally narrow, and the README is much more about demonstrating the loop than claiming mature ecosystem coverage. That should keep expectations grounded.

But early does not mean unimportant. Some of the most valuable open-source projects start by making one workflow feel materially better rather than by trying to absorb an entire category on day one. Cowart has that shape. It is small, opinionated, and pointed at a real friction point in agent-assisted creative work.

Why builders should care

What makes Cowart worth watching is not just that it adds a canvas to Codex. It is that it treats visual iteration as durable project state.

That is a better mental model than the usual attachment-driven loop. Instead of scattering intent across screenshots, prompts, and generated outputs, the repo tries to pull those pieces back into one local surface the agent can read from and write to.

For builders working on AI-native tools, that is the interesting takeaway. The next layer of agent UX is probably not only about better model capability. It is also about better surfaces for shared state. Cowart shows one promising version of that idea: a project-local canvas that turns visual collaboration with an agent into something more structured, repeatable, and ownable.

Repo

GitHub: https://github.com/zhongerxin/Cowart