
Can a marketer replace a developer with AI coding tools?
We all saw the Super Bowl ads. Type a sentence, here's your app. Describe your idea, watch it build itself. The narrative being pushed right now is that the technical barrier is gone, that anyone with a clear enough vision can build whatever they need without a developer in the room.
These are bold claims, to say the least. We wanted to see what AI development is like for ourselves, so we tested a few tools.
The setup
On a recent episode of Discussing Stupid, Virgil put three tools to the same task: Claude, ChatGPT, and GenSpark.
The prompt was straightforward: build a visually appealing, fully accessible accordion web component using a real article as the source content.
An accordion is one of the most common patterns on the web: those expandable drawers you see on FAQ pages everywhere. It is nothing exotic; it is exactly the kind of thing a marketer might try to build on their own.
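For anyone curious what "fully accessible" actually means here, the core of the pattern is small. This is a minimal sketch of our own (not code from any of the three tools), following the WAI-ARIA disclosure convention: each header is a button that reports `aria-expanded`, and a collapsed panel carries `hidden` so screen readers skip it.

```javascript
// Minimal sketch (ours, not from any generated output) of the state logic
// behind an accessible accordion: each header is a <button aria-expanded>
// controlling a panel, and the panel is `hidden` when collapsed so
// assistive technology cannot reach its content.
function createAccordion() {
  const open = new Set(); // ids of currently expanded panels

  return {
    toggle(id) {
      if (open.has(id)) {
        open.delete(id);
      } else {
        open.add(id);
      }
    },
    // The attributes a renderer would apply to the header button and panel.
    attrs(id) {
      const isOpen = open.has(id);
      return {
        button: { "aria-expanded": String(isOpen) },
        panel: { hidden: !isOpen },
      };
    },
  };
}

const faq = createAccordion();
faq.toggle("shipping"); // user clicks the "shipping" header
console.log(faq.attrs("shipping").button["aria-expanded"]); // "true"
console.log(faq.attrs("returns").panel.hidden); // true: still collapsed
```

The accessibility details the tools got wrong or right in this test (panel numbering, focus, hidden content) all live in exactly this kind of wiring.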
What actually happened
All three tools produced something. That is worth acknowledging, because it is not always the case.
In earlier episodes of this series we tested AI on content creation, image generation, wireframing, and video, and the failure rate was pretty high. Maybe the models have simply gotten better over the course of the season, but with AI code the floor seems higher. No tool produced something completely unusable. All three outputs could have been handed to a developer and worked with, which is pretty impressive.
However, as alluded to above, none of them were ready to ship, and the ways they fell short are exactly the kinds of things a marketer would not catch.
Claude's results
Claude built something visually strong. It had a good layout, and the best screen reader performance of the three. Panels were numbered correctly, focus behavior was handled well, and if a drawer was closed a screen reader could not access the content inside it, which is exactly how it should work.
But the drawers would not stay open - there was a JavaScript bug causing them to close the moment you clicked them. And the color scheme, three shades of green (Claude really seems to like greens), failed contrast checks entirely.
If you do not know what you are looking at in the code, you would have no idea where to start with the bug. You might not even realize it was a JavaScript issue.
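We did not dig into the exact cause of Claude's bug, but one classic way a drawer "closes the moment you click it" is the same toggle firing twice per click - for example, a listener attached directly to the button plus a delegated listener on the container both handling the same event. A hypothetical repro, not Claude's actual code:

```javascript
// Hypothetical illustration, not Claude's actual code: the same toggle
// wired up twice means every click opens the drawer and immediately
// closes it again.
let isOpen = false;
const toggle = () => { isOpen = !isOpen; };

function simulateClick() {
  toggle(); // listener attached directly to the button
  toggle(); // delegated listener on the container sees the same click
}

simulateClick();
console.log(isOpen); // false: the drawer never stays open
```

A developer spots this in minutes; to anyone else, the component just looks mysteriously broken.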
Check out what Claude made for us here: https://links.discussingstupid.com/s3e12-claude
ChatGPT's results
ChatGPT produced something that worked mechanically. The drawers opened and closed, the content was there, and it was the only one with no color contrast issues.
But, it was by far the most boring output visually - plain, flat, the kind of thing you would not feel good putting in front of anyone. And despite getting the visual basics right, it had the worst screen reader compliance of the three. Panels were not numbered correctly, focus behavior was off, and users relying on assistive technology would have had a genuinely frustrating experience.
Visually you would never know; it passed a basic eyeball test, and that is exactly the problem.
Check out what ChatGPT made for us here: https://links.discussingstupid.com/s3e12-chatgpt
GenSpark's results
GenSpark surprised everyone. Visually it was the strongest of the three: the nicest looking, the most polished, and its animation actually worked.
The follow-up suggestions it offered after generating the component were also genuinely useful. It flagged its own animation issues, offered to fix them, suggested a mobile-responsive version, and asked about touch interactions. That kind of self-awareness from a tool is new and worth noting.
It still had a contrast issue and its screen reader compliance landed in the middle of the pack, but overall it was the closest to something you could actually hand to a developer and move forward with quickly.
Check out what GenSpark made for us here: https://links.discussingstupid.com/s3e12-genspark
What the test showed
A seasoned developer reviewing these outputs would know within a few minutes what was wrong and how to fix it. They would know the JavaScript issue is probably a scoping problem or a missing condition. They would know to run a screen reader test before calling it done. They would know color contrast is not a style preference; it is a compliance requirement.
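Contrast is also the kind of thing you can check mechanically rather than by eye. WCAG 2.x defines it as the ratio of the relative luminances of the two colors, and the AA level requires at least 4.5:1 for normal body text. A small sketch of the math from the spec:

```javascript
// WCAG 2.x contrast ratio between two sRGB colors (0-255 channels).
// Relative luminance L = 0.2126R + 0.7152G + 0.0722B on linearized
// channels; contrast = (Lighter + 0.05) / (Darker + 0.05).
function relativeLuminance([r, g, b]) {
  const linearize = (channel) => {
    const c = channel / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

function contrastRatio(colorA, colorB) {
  const [lighter, darker] = [relativeLuminance(colorA), relativeLuminance(colorB)]
    .sort((x, y) => y - x);
  return (lighter + 0.05) / (darker + 0.05);
}

const max = contrastRatio([0, 0, 0], [255, 255, 255]); // 21:1, the spec's maximum
// WCAG AA requires at least 4.5:1 for normal body text.
```

Running a check like this (or any free contrast checker) on generated color schemes would have flagged both Claude's and GenSpark's issues immediately.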
The marketer looking at the same outputs would see something that mostly looks right, and that is the real risk. The danger is not that AI produces garbage; it often does not anymore. The danger is that it produces something that looks good enough to ship, and the person evaluating it does not have the context to know what they are missing.
That is because the value of AI in coding is directly tied to what you already know going in. Think about what a developer can actually do when using these tools effectively. A dev knows:
- What to ask for in the first place
- How to describe the functionality they need with enough specificity to get something useful back
- How to look at the output and spot what is wrong
- Where the generated code lives in the broader project, how it interacts with everything else, and what breaks if it is not quite right
If you can do all of that, AI makes you significantly faster. If you cannot, AI just gets you to the wrong place faster.
Where it does make sense for non-developers
This is not an argument that marketers should stay away from these tools entirely. There is a version of this that makes real sense for non-developers. Internal tools, prototypes nobody is going to rely on, quick experiments you are not putting in front of customers, low-stakes stuff where the consequence of it being a little broken is manageable. In those cases, AI absolutely lowers the barrier enough to be worth trying.
But a production-ready website component? A customer-facing feature? Something that needs to meet accessibility standards, work across devices, and hold up under real use? That is a different conversation. The gap between "AI generated this" and "this is ready" is still filled almost entirely by expertise.
The question worth watching
The more interesting question, and the one we genuinely do not have a complete answer to, is what happens as these tools keep improving. The gap is real today; how wide will it be in a year (or even a few months, at this rate)? That is worth watching closely, and it is a big part of why we keep running these tests.
For developers, the picture is already pretty good. The repetitive work, the boilerplate, the standard patterns they have built a hundred times - AI handles a lot of that now, and handles it well. That time goes somewhere else. That is a real win, and it compounds fast.
For marketers who want to go further than that, the tools are more accessible than they have ever been. But go in with a realistic sense of how much you don't know.
And if you are a marketer who is genuinely building and shipping things with AI, don't hesitate to reach out; that is a conversation we'd love to have.
We got into all of this on the latest episode of Discussing Stupid with our most frequent guest and High Monkey Solutions Architect, Chad, who works with these tools every day. If this is a conversation you are already in the middle of, it is worth a listen.