Vibe Coding
The Complete Guide to AI-Native Software Development
22 chapters. 200+ prompts. Updated monthly. The only vibe coding resource that evolves as fast as the field.
01. The Moment Everything Changed
On February 2, 2025, Andrej Karpathy (OpenAI co-founder, former Tesla AI director, and one of the most respected voices in machine learning) posted what would become one of the most consequential tweets in software development history:
"There's a new kind of coding I call 'vibe coding', where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. I just see stuff, say stuff, run stuff, and copy-paste stuff, and it mostly works." — Andrej Karpathy, February 2, 2025
Within weeks, the term had gone viral. Within a month, Merriam-Webster added "vibe coding" as a slang and trending term. By December 2025, Collins English Dictionary named it their Word of the Year.
But vibe coding didn't just enter the dictionary. It entered the economy. It entered boardrooms. It entered the workflows of millions of developers. And it sparked one of the fiercest debates the software industry has seen in decades.
The Timeline
02. What Vibe Coding Actually Is
Strip away the hype, and vibe coding is a specific practice with specific characteristics.
Vibe coding is an AI-assisted software development approach where a developer describes what they want in natural language, an AI model generates the code, and the developer evaluates the result through execution rather than code review. The developer does not read, edit, or attempt to understand the generated code. They test whether it works, and if it doesn't, they feed the error back to the AI.
Karpathy described his own workflow precisely:
"I 'Accept All' always, I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it. If it doesn't, I just revert to the last working state and re-prompt with more context."
The Three Core Loops
Vibe coding operates on three nested feedback loops:
**Loop 1**
**1.** Describe what you want in natural language
**2.** Accept the generated code without reading it
**3.** Run it
**4.** Does it work? Ship it. Doesn't work? Move to Loop 2.
This is the happy path. For simple features, you may never leave this loop.
**Loop 2**
**1.** Paste the error message back to the AI, verbatim
**2.** Accept the fix without reading it
**3.** Run it again
**4.** Repeat until resolved or move to Loop 3.
Most errors resolve within 1-3 iterations of this loop. The AI sees the error, uses the surrounding context, and proposes a fix.
**Loop 3**
**1.** Revert to the last working state
**2.** Describe the desired outcome differently, with more context
**3.** Return to Loop 1
This is the escape hatch. If the AI gets stuck in a loop of broken fixes, go back to a clean state and try a different approach. This is why checkpoints matter: always have a rollback point.
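The three loops can be sketched as a small control-flow function. This is a minimal illustration under stated assumptions, not anyone's actual tooling: `generate` and `run` are hypothetical callables standing in for the AI tool and for executing the app, and the attempt limits are arbitrary.

```python
from typing import Callable, Optional


def vibe_loop(
    prompt: str,
    generate: Callable[[str], str],        # hypothetical: prompt -> code
    run: Callable[[str], Optional[str]],   # hypothetical: code -> error text, or None if it works
    max_fix_attempts: int = 3,
    max_reprompts: int = 2,
) -> Optional[str]:
    """Return working code, or None if every attempt fails."""
    for _ in range(max_reprompts + 1):
        # Loop 1: describe, accept without reading, run.
        code = generate(prompt)
        error = run(code)

        # Loop 2: paste the error back, accept the fix, run again.
        attempts = 0
        while error is not None and attempts < max_fix_attempts:
            code = generate(f"{prompt}\n\nError:\n{error}")
            error = run(code)
            attempts += 1

        if error is None:
            return code  # it works: ship it

        # Loop 3: drop the broken state and re-describe with more context.
        prompt = f"{prompt}\n\nThe previous approach failed with: {error}. Try a different approach."
    return None


def fake_generate(prompt: str) -> str:
    # Toy model: only produces working code after it has seen an error.
    return "good" if "Error:" in prompt else "bad"


def fake_run(code: str) -> Optional[str]:
    return None if code == "good" else "TypeError: something broke"


result = vibe_loop("Build a todo list", fake_generate, fake_run)
```

In this toy run the first generation fails, the error is fed back once, and the second generation passes, so the function returns from Loop 2 without ever reaching the rollback in Loop 3.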
What Vibe Coding Is NOT
- Not using GitHub Copilot for autocomplete: that's AI-augmented coding (Level 1)
- Not asking ChatGPT to explain code: that's using AI as a learning tool
- Not reviewing AI-generated code before accepting: that's AI-collaborative coding (Level 2)
- Not no-code/low-code platforms: those use visual builders, not natural language to code
Vibe coding is specifically: natural language in, code out, test behavior, never read the code.
03. The Philosophy: Trusting the Machine
Vibe coding isn't just a technique. It's a philosophical stance about the relationship between developers and code.
The End of Code as Sacred Text
For decades, programming culture has treated source code as something to be crafted, reviewed, optimized, and understood. Code reviews are rituals. Clean code is a moral virtue. Understanding every line is a professional obligation.
Vibe coding rejects this entirely. It treats code as a disposable intermediary between human intent and running software. The code doesn't matter. The behavior matters.
This is not as radical as it sounds. Most software professionals already interact with layers of abstraction they don't fully understand:
- Few web developers read TCP packet internals
- Few application developers audit their compiler output
- Few React developers understand the fiber reconciliation algorithm
- Few SQL users trace query execution plans for every query
Vibe coding simply adds another layer: the AI becomes the compiler for natural language.
The Four Pillars
- 🎯 **Intent Over Implementation:** "What should this do?" replaces "How should I build this?"
- ⚡ **Speed Over Elegance:** Working software now beats perfect code later
- 🤖 **Trust the AI:** Accept all, don't read diffs, let the machine handle it
- 📈 **Results-Oriented:** Does it work? That's the only metric that matters
The Abstraction Argument
Supporters frame vibe coding as the natural progression of programming abstraction:
- **1950s, Machine Code → Assembly:** "You don't need to write binary opcodes anymore!"
- **1970s, Assembly → C:** "You don't need to manage registers anymore!"
- **1990s, C → Python / Java:** "You don't need to manage memory anymore!"
- **2010s, Frameworks / Cloud:** "You don't need to manage servers anymore!"
- **2025, Natural Language → Code:** "You don't need to write code anymore!"
At each transition, purists warned that developers were losing essential skills. At each transition, the expanded abstraction enabled more people to build more things.
⚠️ **The counter-argument is real, though:** Every previous abstraction layer behaved deterministically. An assembler emits the same machine code for the same source every time, and a C program manages memory the same way on every run. AI code generation is probabilistic: the same prompt can produce different code each time, with different bugs. This is a genuinely new kind of abstraction layer.
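The difference between a deterministic layer and a sampled one can be shown with a toy comparison. Both functions are invented for illustration: `compile_expr` stands in for a compiler, and `generate_code` stands in for a model that samples one of several plausible answers.

```python
import random


def compile_expr(source: str) -> str:
    """Toy deterministic 'compiler': the same input always gives the same output."""
    return source.replace("plus", "+").replace("times", "*")


def generate_code(prompt: str, rng: random.Random) -> str:
    """Toy stand-in for a sampling model: picks one of several plausible answers."""
    candidates = [
        "total = sum(items)",
        "total = 0\nfor x in items: total += x",
        # Needs functools.reduce imported and raises on an empty list.
        "total = reduce(lambda a, b: a + b, items)",
    ]
    return rng.choice(candidates)


# Twenty compilations of the same source collapse to a single distinct output.
compiled = {compile_expr("a plus b times c") for _ in range(20)}

# Twenty generations from the same prompt, each with a different random seed.
generated = {generate_code("sum the items", random.Random(seed)) for seed in range(20)}
```

Here `compiled` holds exactly one string, while `generated` holds more than one, and the candidates do not all fail in the same way.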
04. The Spectrum: Five Levels of AI-Assisted Development
Vibe coding is not binary. In practice, developers operate along a spectrum. Understanding where you sit — and where you should sit for a given project — is critical.
**Level 0**
**When to use:** Security-critical code, regulatory requirements, environments where AI tools are prohibited.
**Level 1**
**Tools:** GitHub Copilot, VS Code AI extensions
**Code understanding:** 100% (you review everything)
**When to use:** Production code, team projects, anything you need to maintain
**Level 2**
**Tools:** Cursor Composer, Claude Code, Codex CLI
**Code understanding:** 70-90% (you review most things)
**When to use:** Professional development, startup codebases, any code that needs to scale
**Level 3**
**Tools:** Cursor Agent, Claude Code, Bolt.new
**Code understanding:** 30-60% (architecture yes, implementation details no)
**When to use:** MVPs, internal tools, prototypes headed toward production
**Level 4**
**Tools:** Bolt.new, Lovable, Replit Agent, v0
**Code understanding:** 0-10% (you only test behavior)
**When to use:** Personal projects, throwaway prototypes, hackathons, idea validation
**Level 5**
**Tools:** Devin, Google Jules, OpenAI Codex (cloud mode)
**Code understanding:** Review-based (you check the output, not the process)
**When to use:** Routine tasks, migrations, test generation, documentation, with human review gate
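For teams that want to record which level fits which kind of work, the spectrum can be written down as a small lookup table. This is only a sketch: the level numbers are assumed from the order of the descriptions above, the first entry has no stated code-understanding figure, and `levels_for` is a hypothetical helper that does nothing more than keyword matching.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Level:
    number: int          # assumed from the order of the descriptions
    understanding: str   # share of the code the developer understands
    when_to_use: str


LEVELS = [
    Level(0, "not stated", "Security-critical code, regulatory requirements, environments where AI tools are prohibited"),
    Level(1, "100%", "Production code, team projects, anything you need to maintain"),
    Level(2, "70-90%", "Professional development, startup codebases, any code that needs to scale"),
    Level(3, "30-60%", "MVPs, internal tools, prototypes headed toward production"),
    Level(4, "0-10%", "Personal projects, throwaway prototypes, hackathons, idea validation"),
    Level(5, "review-based", "Routine tasks, migrations, test generation, documentation, with human review gate"),
]


def levels_for(keyword: str) -> List[int]:
    """Return the level numbers whose 'when to use' text mentions the keyword."""
    needle = keyword.lower()
    return [lvl.number for lvl in LEVELS if needle in lvl.when_to_use.lower()]


prototype_levels = levels_for("prototypes")
```

With the descriptions as given, `levels_for("prototypes")` matches Levels 3 and 4, and `levels_for("migrations")` matches only Level 5.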
05. The Tools: A Complete Landscape (2025–2026)
The tooling ecosystem for AI-assisted development has exploded. The market is consolidating fast — with Cursor seeking a ~$50B valuation at $2B+ ARR, Lovable at $6.6B, Cognition at $10.2B, and billion-dollar acquisition battles playing out in real time. Anthropic's acquisition of Bun (the fast JavaScript runtime) signals Claude Code's push into native runtime integration. Here's the current state of play across every major category.
AI-Native IDEs
Autonomous Coding Agents
- /loop command adds cron-like scheduled tasks, turning Claude Code into a background worker for PR reviews, deployment monitoring, and recurring analysis.
- 1-million-token context window. Max output increased to 64k tokens for Opus 4.6 (128k upper bound for Opus 4.6 and Sonnet 4.6).
- MCP servers can now request structured input mid-task via interactive dialogs. Skills.md enables persistent agent behaviors.
- Early April 2026: Anthropic acquires Bun (the fast JavaScript runtime built by Jarred Sumner), bringing native Bun integration and faster JS execution directly into Claude Code workflows.
- Claude overtook ChatGPT as the #1 AI app on the App Store. Revenue surpassed $2.5B ARR (named world's most disruptive company, Time March 2026).
- In a Mozilla partnership, Claude Opus 4.6 autonomously found 22 CVEs in Firefox's C++ codebase.
- April 4, 2026, OpenClaw Policy Change: Anthropic announced that Claude Code subscription limits no longer apply to third-party harnesses such as OpenClaw. Users of third-party Claude Code integrations must move to pay-as-you-go billing; a $200/mo Max subscription was reportedly being used to run $1,000–$5,000 of agent compute. Affected users received a one-time credit.
- Additional April updates: PowerShell tool for Windows (opt-in preview), flicker-free alt-screen rendering, named subagents in @ mentions, 60% faster Write tool diff computation.
- Note: Pentagon labeled Anthropic a supply-chain risk in March 2026 over weapons/surveillance policy; defense tech contractors migrating away.
- April 14, 2026, Routines Launch: Anthropic launched Routines, saved configurations combining a prompt, repositories, and connectors that run automatically on a schedule or GitHub events on Anthropic's cloud infrastructure (no local machine required). Use cases: automated PR reviews, overnight test triage, weekly repo health audits. Plan limits: 5/day Pro, 15/day Teams, 25/day Enterprise. Desktop app redesigned simultaneously with integrated terminal, faster diff viewer, in-app file editor, and multi-session support.
Browser-Based Builders
The Infrastructure Layer: MCP
The Model Race (March 2026 Update)
The foundation models powering these tools are advancing on multiple fronts. Key releases in early March 2026:
- GPT-5.4 (OpenAI): Native computer-use, 1M context, Standard/Thinking/Pro variants. Already integrated into Codex CLI and Copilot.
- Gemini 3.1 Flash-Lite (Google): Ultra-low-latency variant designed for inline code completions and real-time suggestions. Powers Windsurf and Jules background tasks.
- GLM-4.7 (Zhipu AI): China's leading code model, competitive with GPT-5 on multilingual programming benchmarks. Growing adoption in Asian markets.
- DeepSeek-V3.2-Speciale (DeepSeek): Open-weight model rivaling proprietary offerings. Strong at multi-file reasoning and long-context code generation.
Open-source LLMs now account for over 60% of production AI deployments — a tipping point driven by DeepSeek, Llama, Qwen, and Mistral. This has shifted the economics: developers increasingly use open-weight models for routine code generation while reserving proprietary models for complex architectural reasoning.
Andrej Karpathy, who coined "vibe coding" in February 2025, introduced a new term in early 2026: "agentic engineering" — the discipline of designing, orchestrating, and supervising autonomous AI agents that write code, run tests, and deploy systems with minimal human intervention. The term has rapidly entered common usage, marking the evolution from "coding with AI" to "engineering with agents."
06. The Agent Revolution
The most significant development since Karpathy's tweet isn't better autocomplete. It's the emergence of autonomous coding agents — AI systems that independently plan, implement, test, and deploy software.
From Copilot to Colleague
/loop command and Claude Managed Agents enable scheduled background tasks. Agents run CI pipelines, triage issues, and maintain codebases overnight. The developer reviews a morning summary of what the AI decided and changed while they slept.
What Agents Can Do Today
Modern coding agents reliably handle tasks that would take a junior developer 4-8 hours:
The April 2026 Benchmark Picture
Agent performance has accelerated dramatically. The current public leaderboard (April 2026):
| Model | SWE-bench Verified | Access |
|---|---|---|
| Claude Mythos Preview | 93.9% | Restricted (Project Glasswing) |
| Claude Opus 4.6 | 80.8% | Public |
| Gemini 3.1 Pro | 80.6% | Public |
| GPT-5.4 | 75.0% | Public |
| Kimi K2.5 (open-source) | ~75% | Open |
Kimi K2.5 by Moonshot AI is the current #1 open-source option: 1 trillion parameter MoE architecture with 32 billion active parameters, competitive with frontier models at a fraction of the inference cost.
New Agent Orchestration Frameworks (April 2026)
Three major frameworks launched in April 2026 that reshape how multi-agent systems are built:
- Google Agent Development Kit (ADK): google/adk-python, 8,200+ stars in launch week. Purpose-built for multi-agent orchestration with native Gemini integration and MCP support. Best for complex agent pipelines with multiple specialized sub-agents.
- Meta llama-stack: Standardized agent runtime for Llama 4 models. Defines interfaces for tool calling, memory, and agent orchestration that work across the open-source ecosystem.
- Claude Managed Agents: Anthropic's managed runtime at $0.08/session-hour plus token costs. Provides sandboxed execution, state management, and permission scoping. Testing shows 10 percentage point improvement in task success rates over standard prompting.
The practical implication: you no longer need to build agent infrastructure from scratch. These frameworks handle the hard parts — state, retries, tool routing, parallelization — so you can focus on the task logic.
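To make "state, retries, tool routing" concrete, here is a minimal sketch of what such a layer does. It is not the API of any framework named above: `ToolRouter`, its methods, and the flaky tool are all hypothetical.

```python
import time
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]


class ToolRouter:
    """Routes a tool name to a function, retries failures, and records state."""

    def __init__(self, retries: int = 3, base_delay: float = 0.0) -> None:
        self.tools: Dict[str, Tool] = {}
        self.history: List[Tuple[str, str, str]] = []  # (tool, input, output)
        self.retries = retries
        self.base_delay = base_delay

    def register(self, name: str, tool: Tool) -> None:
        self.tools[name] = tool

    def dispatch(self, name: str, arg: str) -> str:
        tool = self.tools[name]  # KeyError if the agent asks for an unknown tool
        last = self.retries - 1
        for attempt in range(self.retries):
            try:
                output = tool(arg)
            except Exception:
                if attempt == last:
                    raise  # out of retries: surface the last error
                time.sleep(self.base_delay * 2 ** attempt)  # exponential backoff
            else:
                self.history.append((name, arg, output))
                return output
        raise ValueError("retries must be at least 1")


calls = {"n": 0}


def flaky_search(query: str) -> str:
    # Hypothetical tool that fails twice before succeeding.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient failure")
    return f"results for {query}"


router = ToolRouter()
router.register("search", flaky_search)
output = router.dispatch("search", "open issues")
```

The caller only sees the final result; the two transient failures are absorbed by the retry loop, and the successful call is kept in `router.history`.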
What Agents Still Struggle With
Cognition's own 2025 performance review of Devin put it well:
"Devin is senior-level at codebase understanding but junior at execution."
- Ambiguous requirements — agents make assumptions that may not match intent
- Complex architectural decisions — they can implement but struggle with system-level design
- Cross-system integration — tasks requiring deep understanding of multiple interconnected systems
- Security context — knowing when something is dangerous requires deployment context, not just code patterns
The Parallel Execution Advantage
Unlike human developers, agents can run multiple instances simultaneously, work 24/7, and process entire backlogs of tickets overnight.
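A rough sketch of what parallel execution looks like in code, using Python's `asyncio`. The `run_agent` coroutine is a hypothetical stand-in for one agent session, and the ticket names and concurrency cap are made up for illustration.

```python
import asyncio
from typing import List


async def run_agent(ticket: str) -> str:
    """Hypothetical stand-in for one agent session working a ticket."""
    await asyncio.sleep(0.01)  # simulates model and tool latency
    return f"{ticket}: PR opened"


async def process_backlog(tickets: List[str], max_parallel: int = 3) -> List[str]:
    # Cap how many agent sessions run at once.
    semaphore = asyncio.Semaphore(max_parallel)

    async def worker(ticket: str) -> str:
        async with semaphore:
            return await run_agent(ticket)

    # gather returns results in the same order as the input tickets.
    return await asyncio.gather(*(worker(t) for t in tickets))


results = asyncio.run(
    process_backlog(["TICKET-1", "TICKET-2", "TICKET-3", "TICKET-4"])
)
```

The semaphore is the practical knob here: it bounds cost and rate-limit pressure while still letting a backlog drain faster than a single sequential worker would.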
07. Vibe Coding in Practice: Real Workflows
Theory is interesting. Practice is what matters. Here are four concrete workflows for different scenarios.
**Scenario:** You have a product idea and want a working prototype by Monday.
**Tools:** Bolt.new or Cursor + Claude • **Level:** 3-4
1. Write a detailed description (spend 20-30 min; it's the most important step). Include: target users, core features, data model, key screens, visual style
2. Paste into Bolt.new or Cursor Composer
3. Iterate through natural language: "Make the sidebar collapsible" / "Add dark mode"
4. Deploy to Vercel or Netlify
5. Share with potential users for feedback
Build a job application tracker. I'm applying to software engineering positions and need to track: company name, position title, application date, status (applied/phone screen/onsite/offer/rejected), salary range, notes, and next action date. I want a clean dashboard showing all applications in a table with sorting and filtering. Include a kanban view grouped by status. Use a modern blue/slate color scheme. Store in localStorage. Make it responsive for mobile.
<div class="tab-content" id="wf2">
#### The Startup MVP
**Scenario:** Building a real product for real users, fast.
**Tools:** Claude Code + Cursor + v0 • **Level:** 2-3
1. Start with a product requirements document (even a rough one)
2. Use v0 to prototype key UI screens
3. Use Claude Code to scaffold the full architecture
4. Build feature-by-feature, testing each before moving on
5. Review auth code and data handling; accept UI code freely
6. Deploy to real hosting, set up monitoring
7. Plan a "hardening phase" for security-critical paths
<div class="callout warning">
<div class="callout-icon">⚠️</div>
<div class="callout-content">**The trap:** Skipping step 7. Many YC startups vibe-coded their MVPs successfully but faced "development hell" when trying to scale without hardening.
</div>
</div>
</div>
<div class="tab-content" id="wf3">
#### The Enterprise Integration
**Scenario:** Adding a feature to an existing production codebase.
**Tools:** Claude Code or Devin + CI/CD pipeline • **Level:** 5 with human gate
1. Create a detailed ticket with acceptance criteria
2. Assign to an AI agent (Devin, Claude Code, or Jules)
3. Agent analyzes codebase, creates a plan, implements the change
4. Agent runs existing test suite and fixes failures
5. Agent opens a pull request
6. Human reviews: security, performance, architecture, edge cases
7. Merge after human approval
This is Level 5 but with human review as the final gate. It's how most enterprises adopt AI coding in 2026.
</div>
<div class="tab-content" id="wf4">
#### The Solo Creator
**Scenario:** You're not a developer. You have an idea for an app.
**Tools:** Lovable, Bolt.new, or Replit Agent • **Level:** 4
1. Describe your application as if explaining it to a friend
2. Let the builder create the first version
3. Use it yourself — note what's wrong or missing
4. Describe changes in plain language
5. Repeat until satisfied
6. Deploy using the platform's built-in hosting
<div class="callout danger">
<div class="callout-icon">🔴</div>
<div class="callout-content">**Critical:** If your app handles user data, sensitive information, or payments, hire a security professional to review it before going live. The Lovable vulnerability study (170/1,645 apps) shows this isn't hypothetical.
</div>
</div>
</div>
08. Real-World Case Studies
These are documented, real examples — not hypotheticals.
09. The Numbers: Adoption and Impact
The data tells a clear story: AI-assisted development isn't a trend. It's a structural shift.
Adoption
AI Market Share (March–April 2026)
The Agentic Model Race (April 2026)
Four major model releases in a single month reshaped the competitive landscape. The race is no longer about raw benchmark scores — it's about how many agents a model can orchestrate and how long it can sustain autonomous work.
The signal: In one month, the public record for coding agent benchmarks shifted from Claude Opus 4.6 (80.8%) to GPT-6 (95%+). Both figures may be superseded by Anthropic's restricted Mythos model (93.9% SWE-bench, April 7). Multi-agent swarm scaling — exemplified by Kimi K2.6's 300-agent architecture — is the new frontier.
Revenue & Growth
Valuations (2026)
Productivity
Developer Sentiment (April 2026)
Cultural Impact
- Collins Dictionary Word of the Year 2025: "Vibe coding"
- MIT Technology Review: Named "Generative Coding" a 2026 Breakthrough Technology
- Merriam-Webster: Added as slang/trending term within one month of Karpathy's tweet
- Wikipedia: Full article with extensive sources and analysis
- Wall Street Journal: Reported widespread professional adoption (July 2025)
- Fast Company: Documented the "vibe coding hangover" (September 2025)
- arXiv: "Vibe Coding Kills Open Source" paper sparks open-source funding debate (January 2026)
- VibeX 2026: First academic workshop on vibe coding, scheduled at EASE conference in Glasgow
- Mainstream: Vibe coding is now a recognized methodology taught in bootcamps and referenced in enterprise strategy documents
10. The Dark Side: Security, Debt, and Failure
For every success story, there's a cautionary tale. The risks are real, documented, and in some cases severe.
The Tenzai Security Study
**Key finding:** AI tools avoid generic security flaws but struggle where what makes code safe vs. dangerous depends on context.
The Acceleration: 35 CVEs in One Month
The security threat from AI-generated code is not static. It is accelerating. In March 2026, security researchers confirmed 35 CVEs directly attributable to AI-generated code — 27 of them from Claude Code alone. Researchers from the CERT/AI Working Group estimate the actual monthly count including triaged-but-unpublished vulnerabilities is 400 to 700 per month.
The trend is steep and mirrors adoption curves:
| Month | Confirmed AI Code CVEs | Estimated Total |
|---|---|---|
| Jan 2026 | 12 | 250–350 |
| Feb 2026 | 21 | 310–450 |
| Mar 2026 | 35 | 400–700 |
The root cause is structural: AI coding tools generate code that compiles and passes tests, but they optimize for functional correctness rather than security context. A model trained on decades of existing internet code learns the prevalence of insecure patterns alongside secure ones — and reproduces them with equal confidence. As AI-generated code's share of all new code climbs toward 41% (GitHub, March 2026), the absolute volume of AI-sourced vulnerabilities scales with it.
The deeper concern: the vulnerability rate is growing faster than the adoption rate, suggesting the tools are getting worse at security relative to their capability growth.
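The month-over-month growth implied by the confirmed counts in the table can be worked out directly. This sketch uses only the three confirmed figures; it says nothing about adoption, so on its own it does not establish how the vulnerability trend compares with the adoption trend.

```python
from typing import Dict

# Confirmed AI-code CVE counts, copied from the table above.
confirmed = {"Jan 2026": 12, "Feb 2026": 21, "Mar 2026": 35}


def month_over_month(series: Dict[str, int]) -> Dict[str, float]:
    """Percentage growth of each month over the previous one."""
    months = list(series)
    growth = {}
    for prev, cur in zip(months, months[1:]):
        growth[cur] = round((series[cur] - series[prev]) / series[prev] * 100, 1)
    return growth


growth = month_over_month(confirmed)
# {'Feb 2026': 75.0, 'Mar 2026': 66.7}

# Overall multiple from January to March.
overall = round(confirmed["Mar 2026"] / confirmed["Jan 2026"], 2)
# 2.92
```

On these numbers the confirmed count nearly tripled in two months, although the percentage growth rate itself dipped slightly from February to March.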
Documented Security Incidents
AI as Vulnerability Hunter: The Other Side of the Coin
The Threat Landscape: Ransomware Meets AI
The broader cybersecurity environment compounds the risk of insecure AI-generated code. As of early 2026, there are 124 active ransomware groups — a 49% year-over-year increase. These groups are increasingly using AI to generate phishing lures, analyze codebases for vulnerabilities, and automate lateral movement. The intersection of AI-generated insecure code and AI-accelerated exploitation creates a compounding threat surface.
The AI Slopageddon: Open Source Fights Back
By early 2026, a new phenomenon emerged that open-source maintainers dubbed the "AI Slopageddon" — a flood of low-quality, AI-generated bug reports, pull requests, and security "findings" overwhelming popular projects:
- cURL: Daniel Stenberg reported a deluge of AI-generated vulnerability reports so poor they were "worse than spam" — wasting maintainer time triaging hallucinated CVEs. He began publicly shaming the worst offenders and lobbied HackerOne to penalize AI-slop submissions.
- Ghostty: The terminal emulator project implemented explicit policies rejecting AI-generated contributions after a wave of superficially plausible but fundamentally broken PRs.
- tldraw: The collaborative whiteboard project documented a pattern of AI-generated issues that described bugs that didn't exist, in code paths that didn't exist, with reproduction steps that couldn't work.
The pattern is consistent: AI tools lower the barrier to appearing competent enough to submit contributions, but the submissions lack the understanding that makes them useful. Maintainers are now spending significant time filtering AI slop instead of building software — an ironic cost of the productivity tools meant to help them.
The $1.5 Trillion Technical Debt Problem
Analysts have warned of a potential $1.5 trillion in technical debt by 2027 from AI-generated code:
- 41% higher code churn: AI code gets rewritten more often
- 8x increase in duplicated code blocks (GitClear, 2024)
- 30% of AI suggestions accepted in professional environments
- Forrester: 75% of tech leaders will face moderate-to-severe tech debt by 2026
The "Vibe Coding Hangover"
By late 2025, Fast Company reported senior engineers entering "development hell" maintaining vibe-coded systems:
- 🧬 **Zombie Apps:** Functional but unmaintainable
- 🍝 **Spaghetti Code:** Works but no coherent structure
- 🚧 **Complexity Ceiling:** Can't extend without breaking
- 😶 **Debug Impossibility:** Nobody can trace the code they never read
11. The Great Debate
The software community is deeply divided. Understanding the strongest arguments on each side helps you form a nuanced view.
Programming languages have always moved toward higher abstraction. Assembly to C to Python. Each level lets developers focus on intent rather than implementation. Natural language is simply the next layer.
#### "It democratizes creation."
Millions of people have software ideas but lack years of training. Vibe coding lets a nurse build a patient tracking app, a teacher build a classroom tool, a small business owner build inventory management. The expansion of who can create software is historically significant.
#### "The speed advantage is transformative."
A prototype in hours instead of weeks. An MVP in days instead of months. The 25% of YC companies with 95% AI code didn't choose vibe coding for ideology — they chose it because they needed to move fast.
#### "Traditional code isn't as reliable as we pretend."
Human-written code has bugs, security vulnerabilities, and technical debt too. AI-generated code may have different failure modes, but the idea that human code is inherently reliable is a myth.
Software spending is ~60% maintenance. If nobody understands the codebase, maintenance is impossible. You're not saving time — you're borrowing it from the future at a ruinous interest rate.
#### "Security requires understanding, not just testing."
You can test whether a login form works. You can't easily test whether passwords are properly hashed, session tokens are cryptographically secure, or APIs have rate limiting — unless you read the code.
#### "It creates learned helplessness."
Developers who rely entirely on vibe coding lose fundamental skills. When the AI makes a mistake in a novel way, they have no fallback. Fragile teams build fragile systems.
#### "The economics don't work at scale."
Vibe coding is cheap upfront and expensive later. The $1.5 trillion tech debt projection isn't speculation — it's extrapolation from observed code churn, duplication, and architectural degradation.
The most reasonable position — and the one supported by data — is that vibe coding is a powerful tool with a specific and limited appropriate scope.
<div class="callout success">
<div class="callout-icon">✅</div>
<div class="callout-content">
**It excels for:** prototyping, validation, personal tools, learning, hackathons, and small-scale applications with limited security requirements.
</div>
</div>
<div class="callout danger">
<div class="callout-icon">❌</div>
<div class="callout-content">
**It fails for:** production systems at scale, security-sensitive applications, regulated industries, and software that needs multi-year maintenance.
</div>
</div>
**The winning model in 2026:** Vibe code the prototype, then bring in disciplined engineering for the production system. The companies dominating right now — the ones raising at $10B valuations, the ones with $1B ARR in six months — are all betting that this model scales. And the data supports them.
The critics are not wrong about the risks. But they are wrong about the trajectory. Every objection to vibe coding was once made about high-level languages, about frameworks, about cloud computing. The abstraction always wins. The question is never *whether* but *how*.
12. When to Vibe (and When Not To)
🟢 Green Light: Vibe Code Away
- **Prototypes and MVPs**: Validate ideas before investing in production engineering
- **Internal tools**: Dashboards, data scripts, one-off analysis
- **Personal projects**: Only you use it, only you depend on it
- **Learning**: Trying new frameworks, languages, or patterns
- **Hackathons**: Speed is everything, longevity is nothing
- **UI prototyping**: Design exploration and layout testing
- **Automation scripts**: Repetitive tasks that eat your time
🟠 Yellow Light: Proceed with Caution
- **Customer-facing apps**: Vibe the prototype, then review and harden
- **Small SaaS**: Viable for launch, plan for rewrite
- **API integrations**: Fast to build, auth needs human review
- **Mobile apps**: UI can be vibe coded; data/security need attention
- **Team projects**: Works if one person understands the architecture
🔴 Red Light: Don't Vibe Code
- **Financial systems**: Payments, accounting, trading
- **Healthcare**: Patient data, clinical decisions, HIPAA
- **Auth & authz**: Login systems, permissions, tokens
- **Infrastructure**: Server config, network security, deployment
- **Regulated industries**: SOX, PCI-DSS, GDPR compliance
- **Distributed systems**: Microservices, message queues, cache invalidation
- **Cryptography**: Encryption, key management, certificates
13. Mastering the Craft: Advanced Techniques
If you're going to vibe code, do it well. These techniques separate productive vibe coders from frustrated ones.
The Art of the Initial Prompt
The single most important factor in vibe coding success. Spend 30 minutes writing a comprehensive description before generating a single line of code.
Weak vs. Strong Prompts
Key Patterns
```
Working: dashboard + project cards + drag-and-drop
-> Save/commit BEFORE adding: task checklist feature
```
The "Explain Then Generate" Pattern
For complex features, ask the AI to explain its approach before generating code:
```
Before writing any code, explain how you would implement
real-time collaborative editing in this application.
What approach? What trade-offs? Then implement it.
```
This gives you architectural understanding even in a vibe coding workflow.
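The save-before-adding checkpoint habit can be scripted. The sketch below assumes the project is a git repository; `checkpoint` and `rollback` are hypothetical helper names, and the git commands are passed through an injectable runner so the example can be dry-run without touching a real repository. Note that `rollback` is destructive: it discards uncommitted work and deletes untracked files.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], None]


def git(args: Sequence[str]) -> None:
    """Run a git command in the current repository; raise if it fails."""
    subprocess.run(["git", *args], check=True)


def checkpoint(message: str, run: Runner = git) -> None:
    """Commit the current working state before asking for the next feature."""
    run(["add", "-A"])
    # --allow-empty keeps the commit from failing when nothing has changed.
    run(["commit", "--allow-empty", "-m", message])


def rollback(run: Runner = git) -> None:
    """Return to the last checkpoint, discarding everything since."""
    run(["reset", "--hard", "HEAD"])  # drop changes to tracked files
    run(["clean", "-fd"])             # delete untracked files and directories


# Dry run: record the commands instead of executing them.
recorded: List[Sequence[str]] = []
checkpoint("working: dashboard + project cards + drag-and-drop", run=recorded.append)
rollback(run=recorded.append)
```

Calling `checkpoint(...)` with the default runner before each new feature request gives the rollback point that the revert-and-re-prompt loop depends on.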
- **Claude Opus 4.6 (via Claude Code):** Complex reasoning, architecture, large codebases, agent teams for parallel work
- **GPT-5.2 (via Codex CLI):** Code generation, systematic transformations, sandboxed execution
- **Gemini 3 Pro / Flash (via Jules or Gemini CLI):** Multimodal (screenshots, diagrams), open-source CLI with skills system
- **GitHub Copilot Agent Mode:** Best for working within existing VS Code workflows with agent capabilities
- **v0:** React/Next.js UI generation
- **Bolt.new:** Full-stack prototypes you want immediately
**Good:** "When I click 'Add Task', nothing happens. Console shows: `TypeError: Cannot read property 'push' of undefined at TaskList.addTask (app.js:47)`. This started after I added drag-and-drop."
Include: **action** (what you did), **actual** (what happened), **expected** (what should happen), **error** (verbatim), **context** (what changed recently).
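The five fields can be turned into a reusable template so no part of the report gets skipped. This is one possible sketch: `BugReport` and `to_prompt` are hypothetical names, and the `expected` value in the example is an assumption, since the sample report above does not state it.

```python
from dataclasses import dataclass


@dataclass
class BugReport:
    action: str    # what you did
    actual: str    # what happened
    expected: str  # what should happen
    error: str     # the error message, verbatim
    context: str   # what changed recently

    def to_prompt(self) -> str:
        """Format the report as a prompt to paste into the AI tool."""
        return "\n".join([
            f"Action: {self.action}",
            f"Actual: {self.actual}",
            f"Expected: {self.expected}",
            f"Error (verbatim): {self.error}",
            f"Context: {self.context}",
        ])


report = BugReport(
    action="Clicked 'Add Task'",
    actual="Nothing happens",
    expected="A new task appears in the list",  # assumed for illustration
    error="TypeError: Cannot read property 'push' of undefined at TaskList.addTask (app.js:47)",
    context="Started after I added drag-and-drop",
)
prompt = report.to_prompt()
```

Because every field is required, the dataclass refuses to build a report that leaves one out, which is the main point of using a template.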
14. Building a Sustainable Workflow
Pure vibe coding is fast but fragile. Here's how to build a workflow that's both fast and sustainable.
Vibe code the 80% (UI, boilerplate, standard patterns).
Engineer the 20% (auth, business logic, data integrity, security).
15. The Business of Vibes
Vibe coding isn't just changing how software is built. It's changing the economics of software businesses.
The New Cost Structure
*This doesn't mean you never need engineers. It means you can validate before investing.*
The New Archetypes
The Talent Shift
Companies are increasingly hiring for:
- Specification specialists: translating business requirements into precise AI prompts
- System architects: designing overall structure that AI agents implement
- Security engineers: the human review layer catching what AI misses
- AI-fluent developers: working effectively with and reviewing AI-generated code
16. What Comes Next
Now (Early 2026) — Already Happening
AI-native development is the default. 84% of developers use AI tools. The question has shifted from "should we use AI?" to "how do we use it safely?"
Agent teams are here. Claude Code's agent teams feature lets multiple AI agents work in parallel on different aspects of a project. This is the beginning of true AI-human hybrid teams.
The open-source crisis. A January 2026 arXiv paper argues vibe coding threatens the open-source ecosystem: users no longer visit docs, file bugs, or engage with maintainers. Tailwind CSS docs traffic down 40%. Stack Overflow questions in structural decline. How maintainers get paid must change.
Multimodal coding emerges. Voice-driven coding, visual programming interfaces, and screenshot-to-code workflows are entering mainstream tools.
Consolidation is accelerating. The Windsurf saga — a $3B acquisition attempt, Microsoft blocking, Google poaching, Cognition acquiring — signals a market entering its consolidation phase. Wix acquired Base44 for $80M cash. Anthropic acquired Bun.
"Agentic engineering" replaces "vibe coding" for professionals. Karpathy himself has moved beyond the term, now advocating for professionals orchestrating AI agents with oversight, not just vibes.
The IDEsaster wake-up call. 30+ vulnerabilities across every major AI IDE, 24 CVEs, 1.8M developers at risk. AI code is 2.74x more likely to introduce XSS than human code.
AI reviews AI code. Anthropic launched Code Review (March 9, 2026) — a multi-agent system inside Claude Code that automatically catches logic errors in AI-generated code. The "who reviews the reviewer" problem now has a commercial answer.
Claude becomes the enterprise default. Anthropic committed $100 million to the Claude Partner Network (March 12–13, 2026), formalizing partnerships with Accenture, Deloitte, Cognizant, and Infosys. Enterprise AI standardization is no longer theoretical.
Anthropic hits $380B valuation — Claude #1 on App Store. After refusing Pentagon weapons AI contracts, Anthropic became the most disruptive company in the world (TIME, March 2026). Claude overtook ChatGPT as the #1 app on Apple's App Store. The safety-first bet paid off.
Agent documentation tooling matures. DeepLearning.AI (Andrew Ng's team) released Context Hub (March 9, 2026) — an open-source CLI tool that gives coding agents real-time access to current API docs, bridging the gap between training cutoffs and fast-moving APIs.
Near-Term (Late 2026)
- Security tooling catches up. Agentic security tools reviewing AI code in real-time. "Move security into the act of creation."
- Standardization emerges. Enterprise governance frameworks for AI-generated code.
- Agent orchestration matures. Specialized agents for frontend, backend, testing, security working in concert under a lead agent.
- Open-source funding models evolve. New models for compensating maintainers whose libraries power AI-generated code.
Medium-Term (2027-2028)
- Natural language becomes a programming interface. Not replacing code, but a legitimate authoring medium.
- AI-human hybrid teams are standard. Every team includes both human engineers and AI agents with defined roles.
- The maintenance problem gets addressed. AI tools that understand, refactor, and improve AI-generated code.
- Specialized domain models. Finance, healthcare, embedded — each gets domain-specific AI models.
Long-Term (2029+)
- Intent-driven development. Describe outcomes, constraints, quality attributes. AI handles the rest.
- Self-healing software. Applications that detect bugs in production and fix themselves.
- The abstraction continues. The role evolves from "code author" to "system designer and quality guardian."
🔮**The fundamental question:** AI will write an increasing share of the world's software. The question isn't whether — it's how we ensure it's secure, reliable, and maintainable. The developers who thrive will master both modes: vibe code a prototype on Saturday, architect a production system on Monday.
Conclusion
In twelve months, vibe coding went from a tweet to a dictionary entry to a multi-billion-dollar industry. Cursor alone is valued at $29.3 billion. Lovable at $6.6 billion. A vibe-coded startup sold for $80 million. GitHub Copilot has 4.7 million paid subscribers. Now, in early 2026, it has become the defining methodology of a new era in software development.
The numbers speak for themselves: Claude Code reached $1B ARR in six months. Cursor surpassed $1B ARR at a $29.3B valuation. Devin surpassed $155M ARR at a $10.2B valuation. GitHub Copilot crossed 4.7 million paid users. These are not experimental products. This is the new infrastructure of software creation.
The promise is real and accelerating: agent teams working in parallel, multimodal coding interfaces, and tools so capable that 75% of Replit's AI users write zero code themselves. The barrier between idea and working software has never been lower.
The challenges are evolving too: the open-source ecosystem faces an existential funding question, security remains a real concern with 69 vulnerabilities found across just 15 AI-built apps, and the "vibe coding hangover" of unmaintainable codebases is a documented phenomenon.
But the answer has become clear. Vibe coding is not a fad to be dismissed or a silver bullet to be worshipped. It is a powerful methodology that belongs in every developer's toolkit. The developers who thrive in 2026 and beyond will be those who master the spectrum — knowing when to vibe code a prototype on Saturday, when to collaborate with agents on Monday, and when to insist on human-reviewed engineering for the critical 20%.
The vibes are real. The exponentials are real. The opportunity is unprecedented.
Embrace the vibes. Engineer the foundations. Build the future.
17. The Complete Prompt Library
230+ production-ready prompts for every stage of AI-native development. Updated monthly.
How to Use This Library
Each prompt is tagged with:
- Difficulty: Beginner / Intermediate / Advanced / Expert
- Tool: Which AI tools it works best with
- Time: Expected completion time
- Category: What type of work it handles
The prompts are designed to be copy-pasted directly. Customize the bracketed [sections] for your specific project.
Category 1: Project Kickoff Prompts
1.1 The Complete Spec Prompt (Expert)
Tool: Claude Code, Cursor Composer | Time: 30-60 min generation
I'm building [product name], a [type of application] for [target audience].
## Product Vision
[One-sentence description of what this product does and why it matters]
## Target Users
- Primary: [who, age range, technical skill level, key pain point]
- Secondary: [who, why they'd use it]
## Core Features (MVP - Priority Order)
1. [Feature 1]: [User story: "As a [user], I want to [action] so that [benefit]"]
2. [Feature 2]: [User story]
3. [Feature 3]: [User story]
## Data Model
- [Entity 1]: [fields and types]
- [Entity 2]: [fields and types]
- Relationships: [Entity 1] has many [Entity 2], etc.
## Design Direction
- Style: [modern/minimal/playful/corporate/brutalist]
- Color palette: [primary hex, accent hex, background]
- Typography: [sans-serif/serif/mono, reference sites]
- Layout: [single page / multi-page / dashboard / wizard]
- Responsive: [mobile-first / desktop-first / both]
## Technical Stack
- Framework: [Next.js / React / Vue / Svelte / vanilla]
- Styling: [Tailwind / CSS Modules / styled-components]
- Database: [Supabase / Firebase / localStorage / Prisma+PostgreSQL]
- Auth: [Supabase Auth / NextAuth / Clerk / none]
- Hosting: [Vercel / Netlify / Railway]
## What Success Looks Like
- A user can [core workflow] in under [N] steps
- The app loads in under [N] seconds
- [Specific measurable outcome]
## What This Is NOT
- Not a [common misunderstanding]
- Don't include [feature to avoid]
- Don't over-engineer [aspect]
Build the complete MVP. Start with the data model, then core layout, then features in priority order.
1.2 The Weekend Prototype Prompt (Beginner)
Tool: Bolt.new, Lovable, Replit Agent | Time: 15-30 min
Build a [type of app] that solves this problem: [describe the pain point in one sentence].
The main user is [who] and they need to:
1. [Core action 1]
2. [Core action 2]
3. [Core action 3]
Design: Clean and modern. Use [color] as the accent color. Dark mode preferred.
Store data in localStorage.
Make it work on mobile.
Keep it simple. I'd rather have 3 features that work perfectly than 10 that are buggy.
1.3 The "Clone This" Prompt (Intermediate)
Tool: Cursor, Claude Code | Time: 1-2 hours
Build a simplified version of [well-known app, e.g., Trello/Notion/Slack].
Include ONLY these features from the original:
1. [Feature to clone]
2. [Feature to clone]
3. [Feature to clone]
DO NOT include: [features to skip]
Match the general layout and UX patterns of the original but use your own design.
Use [tech stack]. Deploy-ready for Vercel.
Focus on making the core interaction feel as smooth as the original.
1.4 The Landing Page Prompt (Beginner)
Tool: v0, Bolt.new | Time: 15-30 min
Create a conversion-optimized landing page for [product name].
Product: [One line description]
Target audience: [Who would buy this]
Price: [Price point or "Free"]
Sections (in order):
1. Hero: Headline "[compelling headline]", subheadline "[supporting text]", CTA button "[button text]"
2. Problem: 3 pain points the audience faces
3. Solution: How the product solves each pain point (with icons or illustrations)
4. Social proof: [testimonials / stats / logos / "As seen in"]
5. Features: 3-6 key features with brief descriptions
6. Pricing: [pricing tiers if applicable]
7. FAQ: 4-5 common questions with answers
8. Final CTA: Repeat the main call-to-action
Design: Professional, trustworthy. Primary color [hex]. Lots of whitespace.
Mobile-responsive. Fast-loading (no heavy images).
Include Open Graph meta tags for social sharing.
Category 2: Feature Addition Prompts
2.1 Authentication System (Advanced)
Tool: Claude Code, Cursor | Time: 1-2 hours
Add a complete authentication system to this [framework] application.
Requirements:
- Email/password signup with email verification
- Login with session management (HTTP-only cookies, not localStorage)
- Password requirements: minimum 8 chars, 1 uppercase, 1 number, 1 special char
- "Forgot password" flow with email reset link (expires in 1 hour)
- "Remember me" option (extends session to 30 days, default is 24 hours)
- Rate limiting: max 5 failed attempts per IP per 15 minutes, then 30-min lockout
- CSRF protection on all auth forms
- Secure headers: HSTS, X-Content-Type-Options, X-Frame-Options
Auth provider: [Supabase Auth / NextAuth / Clerk / custom JWT]
Protected routes: [list routes that require auth]
Public routes: [list routes that don't require auth]
After login, redirect to [dashboard/home/previous page].
Show clear error messages for: wrong password, account not found, account locked, email not verified.
Write tests for: successful login, failed login, signup validation, session expiry, rate limiting.
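The rate-limiting rule in this prompt (5 failed attempts per IP within 15 minutes, then a 30-minute lockout) is the kind of logic worth reading closely in whatever the AI generates. As a reference point, here is a minimal in-memory sketch. The `LoginLimiter` class and its method names are hypothetical, and a real deployment would likely need shared storage such as a database or cache rather than a `Map`.

```typescript
// Sketch of the rule from the prompt: 5 failures in 15 minutes => 30-minute lockout.
// In-memory only; all names here are illustrative, not from any library.
const MAX_FAILURES = 5;
const WINDOW_MS = 15 * 60 * 1000;
const LOCKOUT_MS = 30 * 60 * 1000;

interface AttemptState {
  failures: number[];  // timestamps (ms) of recent failed attempts
  lockedUntil: number; // 0 when the IP is not locked
}

class LoginLimiter {
  private state = new Map<string, AttemptState>();

  // True if this IP may attempt a login at time `now` (ms).
  isAllowed(ip: string, now: number): boolean {
    const entry = this.state.get(ip);
    return entry === undefined || now >= entry.lockedUntil;
  }

  // Record a failed attempt and lock the IP once the limit is reached.
  recordFailure(ip: string, now: number): void {
    const entry = this.state.get(ip) ?? { failures: [], lockedUntil: 0 };
    // Keep only failures that are still inside the 15-minute window.
    entry.failures = entry.failures.filter((t) => now - t < WINDOW_MS);
    entry.failures.push(now);
    if (entry.failures.length >= MAX_FAILURES) {
      entry.lockedUntil = now + LOCKOUT_MS;
      entry.failures = [];
    }
    this.state.set(ip, entry);
  }

  // A successful login clears the history for that IP.
  recordSuccess(ip: string): void {
    this.state.delete(ip);
  }
}
```

Passing `now` as a parameter instead of calling `Date.now()` inside the class keeps the session-expiry and rate-limiting tests requested in the prompt deterministic.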
2.2 Payment Integration (Advanced)
Tool: Claude Code | Time: 2-3 hours
Add [Stripe / Paddle] subscription billing to this application.
Products:
- Free tier: [what's included, usage limits]
- Pro tier: $[price]/month - [what's included]
- [Optional: Enterprise tier: $[price]/month - [what's included]]
Implementation:
1. Pricing page showing all tiers with feature comparison
2. Checkout flow: user selects plan -> [Stripe Checkout / Paddle Overlay] -> redirect to success page
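3. Webhook handler for: subscription created, updated, and cancelled events, plus failed invoice payments (Stripe names these customer.subscription.created, customer.subscription.updated, customer.subscription.deleted, and invoice.payment_failed; check the Paddle docs for its equivalents)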
4. User dashboard showing: current plan, next billing date, usage this period, upgrade/downgrade buttons
5. Usage tracking: count [what metric] per billing period, enforce limits on free tier
6. Graceful downgrade: when subscription cancelled, access continues until period end
7. Failed payment handling: 3 retry attempts over 7 days, then downgrade to free
Store subscription status in [Supabase / database].
Add middleware to check subscription status on protected API routes.
Show upgrade prompts when free users hit limits.
Environment variables needed:
- [STRIPE_SECRET_KEY / PADDLE_API_KEY]
- [STRIPE_WEBHOOK_SECRET / PADDLE_WEBHOOK_SECRET]
- [STRIPE_PRO_PRICE_ID / PADDLE_PRO_PRICE_ID]
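The graceful-downgrade and failed-payment rules in this prompt reduce to one question: which plan does the user effectively have right now? A minimal sketch of that decision follows. The `Subscription` shape, the status names, and the reading of "3 retry attempts" as a counter of failed retries are all assumptions made for illustration; they are not the Stripe or Paddle data model.

```typescript
// Illustrative types only; real billing providers use their own status values.
type Plan = "free" | "pro";
type Status = "active" | "cancelled" | "past_due";

interface Subscription {
  plan: Plan;
  status: Status;
  currentPeriodEnd: number;      // ms since epoch
  failedPaymentAttempts: number; // failed retries so far (assumed counter)
}

const MAX_PAYMENT_RETRIES = 3;

// Decide which plan's limits to enforce at time `now` (ms).
function effectivePlan(sub: Subscription, now: number): Plan {
  if (sub.plan === "free") return "free";
  if (sub.status === "cancelled") {
    // Graceful downgrade: access continues until the paid period ends.
    return now < sub.currentPeriodEnd ? "pro" : "free";
  }
  if (sub.status === "past_due") {
    // Keep access while retries remain, then drop to free.
    return sub.failedPaymentAttempts < MAX_PAYMENT_RETRIES ? "pro" : "free";
  }
  return "pro";
}
```

The subscription-check middleware the prompt asks for could call a function like this once per request, so the downgrade rules live in one place.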
2.3 Real-Time Features (Advanced)
Tool: Claude Code, Cursor | Time: 2-4 hours
Add real-time [collaboration / notifications / live updates] to this application.
What should update in real-time:
- [Specific data that changes: "new messages", "task status changes", "user presence"]
Technology: [Supabase Realtime / Socket.io / Pusher / Server-Sent Events]
Requirements:
- Changes made by User A appear for User B within [1 second / 500ms]
- Show [typing indicators / presence dots / live cursors] for active users
- Handle disconnection gracefully: show "reconnecting..." banner, auto-reconnect with exponential backoff
- Deduplicate messages that arrive during reconnection
- Use persistent connections by default, not polling
- Fall back to polling only if the WebSocket connection fails
Optimize for:
- [N] concurrent users per [room / document / channel]
- Messages/updates of approximately [size] bytes each
- Mobile networks with intermittent connectivity
Show connection status indicator (green dot = connected, yellow = reconnecting, red = offline).
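Two requirements in this prompt are easy to get subtly wrong: exponential backoff and deduplication after a reconnect. The sketch below shows one plausible shape for each, independent of any transport library. The function and class names, the 500 ms base delay, and the 30-second cap are assumptions chosen for illustration.

```typescript
// Delay before reconnect attempt `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Drops messages whose id has already been seen, e.g. replays after a reconnect.
// Assumes every message carries a unique `id` assigned by the server.
class MessageDeduplicator {
  private seen = new Set<string>();

  // Returns true the first time an id is seen, false for duplicates.
  accept(message: { id: string }): boolean {
    if (this.seen.has(message.id)) return false;
    this.seen.add(message.id);
    return true;
  }
}
```

Production code commonly adds random jitter to the delay and bounds the size of the seen-id set. Both are omitted here to keep the sketch short.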
2.4 Search and Filter System (Intermediate)
Tool: Any | Time: 30-60 min
Add search and filtering to the [items/products/posts] list in this application.
Search:
- Full-text search across: [field 1], [field 2], [field 3]
- Debounced input (300ms delay before searching)
- Show "X results for 'query'" count
- Highlight matching text in results
- Empty state: "No results for 'query'. Try different keywords."
Filters:
- [Filter 1]: [type: dropdown/checkbox/range] with options [list options]
- [Filter 2]: [type] with options [list options]
- [Filter 3]: [type] with options [list options]
- Date range: from/to date pickers
- Sort by: [option 1 / option 2 / option 3], ascending/descending
Behavior:
- Filters combine with AND logic (search + filter1 + filter2)
- Show active filter count as badge on filter button
- "Clear all filters" button when any filter is active
- URL params reflect current filters (shareable filtered views)
- Persist last-used filters in localStorage
Performance:
- Client-side filtering for under 1000 items
- Server-side (API) filtering for larger datasets
- Show loading skeleton while filtering
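The "filters combine with AND logic" and "URL params reflect current filters" rules can be checked against a small pure function. This is a minimal client-side sketch, assuming a hypothetical `Item` with `title`, `status`, and `category` fields. It searches only the title, and debouncing is left to the UI layer.

```typescript
// Illustrative shapes; substitute your own fields.
interface Item {
  title: string;
  status: string;
  category: string;
}

interface Filters {
  query?: string;
  status?: string;
  category?: string;
}

// An item is kept only if it passes every active filter (AND logic).
function applyFilters(items: Item[], filters: Filters): Item[] {
  const query = (filters.query ?? "").trim().toLowerCase();
  return items.filter(
    (item) =>
      (query === "" || item.title.toLowerCase().includes(query)) &&
      (!filters.status || item.status === filters.status) &&
      (!filters.category || item.category === filters.category),
  );
}

// Serialize active filters into a query string for shareable URLs.
function filtersToQueryString(filters: Filters): string {
  const pairs: [string, string | undefined][] = [
    ["q", filters.query],
    ["status", filters.status],
    ["category", filters.category],
  ];
  return pairs
    .filter(([, value]) => value !== undefined && value !== "")
    .map(([key, value]) => `${key}=${encodeURIComponent(value ?? "")}`)
    .join("&");
}
```

Keeping the filter logic pure makes it straightforward to move the same rules to the server when the dataset outgrows client-side filtering.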
Category 3: UI/UX Prompts
3.1 Dashboard Layout (Intermediate)
Tool: v0, Cursor | Time: 30-60 min
Build a dashboard layout for [application type].
Layout:
- Left sidebar: navigation menu (collapsible on mobile, icons + labels)
- Top bar: user avatar + dropdown menu, notification bell with count badge, search bar
- Main content area: responsive grid that adapts from 1 to 3 columns
Sidebar navigation items:
1. [Icon] Dashboard (home)
2. [Icon] [Section 1]
3. [Icon] [Section 2]
4. [Icon] [Section 3]
5. [Icon] Settings
6. [Icon] Help
Dashboard home shows:
- Row 1: 4 stat cards ([Metric 1]: [value], [Metric 2]: [value], etc.)
- Row 2: Main chart (line chart showing [metric] over [time period]) + recent activity feed
- Row 3: Quick actions grid (3-4 action cards with icons)
Design: [light/dark] theme. Accent color: [hex].
Use Tailwind CSS. Smooth transitions on sidebar toggle.
Mobile: sidebar becomes a hamburger drawer overlay.
3.2 Form with Validation (Beginner)
Tool: Any | Time: 15-30 min
Build a multi-step form for [purpose, e.g., "user onboarding", "job application", "event registration"].
Steps:
1. [Step name]: Fields: [field1 (type, required?), field2, field3]
2. [Step name]: Fields: [field4, field5, field6]
3. [Step name]: Review all entered data + submit button
Validation:
- Email: valid format + show error immediately on blur
- Phone: format as (XXX) XXX-XXXX as user types
- Required fields: show red border + error message
- [Custom validation]: [describe rule]
UX:
- Progress indicator showing current step (1/3, 2/3, 3/3)
- "Back" and "Next" buttons (Next disabled until current step is valid)
- "Save as draft" option (localStorage)
- Smooth slide transition between steps
- Auto-focus first field on each step
- Show success animation on submit
Accessible: proper labels, aria attributes, keyboard navigation (Tab through fields, Enter to submit).
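The phone rule ("format as (XXX) XXX-XXXX as user types") is a common source of edge-case bugs in generated forms, so a reference implementation to compare against is useful. The sketch below is one way to do it. `formatPhone` and `looksLikeEmail` are illustrative names, and the email check is deliberately loose: a format sanity check, not full address validation.

```typescript
// Progressive formatting: works on partial input while the user is typing.
function formatPhone(input: string): string {
  // Keep digits only and cap at 10 (US-style numbers assumed).
  const digits = input.replace(/\D/g, "").slice(0, 10);
  if (digits.length === 0) return "";
  if (digits.length <= 3) return `(${digits}`;
  if (digits.length <= 6) return `(${digits.slice(0, 3)}) ${digits.slice(3)}`;
  return `(${digits.slice(0, 3)}) ${digits.slice(3, 6)}-${digits.slice(6)}`;
}

// Loose format check: something@something.something with no spaces.
function looksLikeEmail(value: string): boolean {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value);
}
```

Because `formatPhone` strips non-digits first, feeding its own output back in gives the same result, which matters when it runs on every keystroke.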
3.3 Data Table (Intermediate)
Tool: Any | Time: 30-60 min
Build a data table component for displaying [data type, e.g., "user list", "order history", "inventory"].
Columns:
1. [Column]: [type: text/number/date/status/avatar] - [width: narrow/medium/wide]
2. [Column]: [type] - [width]
3. [Column]: [type] - [width]
4. Actions: Edit, Delete, [custom action]
Features:
- Sort by clicking column headers (asc/desc, show arrow indicator)
- Select rows with checkboxes (select all, bulk actions)
- Inline editing: click cell to edit, Enter to save, Escape to cancel
- Pagination: 10/25/50 per page selector, page numbers, total count
- Responsive: on mobile, switch to card layout (one card per row)
- Empty state: illustration + "No [items] yet. Create your first one."
- Loading state: skeleton rows while data loads
Styling: Clean borders, alternating row colors, hover highlight.
Status column: colored badges (green=active, yellow=pending, red=inactive).
Category 4: API and Backend Prompts
4.1 REST API Scaffold (Advanced)
Tool: Claude Code | Time: 1-2 hours
Build a REST API for [application] with these resources:
Resources:
1. [Resource 1, e.g., "Users"]:
- Fields: [id, name, email, role, created_at, updated_at]
- Endpoints: GET /api/users, GET /api/users/:id, POST /api/users, PUT /api/users/:id, DELETE /api/users/:id
2. [Resource 2]:
- Fields: [list fields]
- Endpoints: [list CRUD endpoints]
- Relationships: [belongs_to Resource1, has_many Resource3]
Response format (all endpoints):
Success: { data: {...}, meta: { page, limit, total } }
Error: { error: { code: "VALIDATION_ERROR", message: "Email is required", details: [...] } }
Requirements:
- Input validation with descriptive error messages
- Pagination: ?page=1&limit=20 (default limit=20, max=100)
- Filtering: ?status=active&role=admin
- Sorting: ?sort=created_at&order=desc
- Rate limiting: 100 requests per minute per IP
- CORS configured for [allowed origins]
- Request logging (method, path, status, duration)
Auth: Bearer token in Authorization header.
- Public endpoints: [list]
- Authenticated endpoints: [list]
- Admin-only endpoints: [list]
Framework: [Next.js API routes / Express / Fastify / Hono]
Database: [Supabase / Prisma / Drizzle]
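The pagination rule (`?page=1&limit=20`, default 20, max 100) and the `{ data, meta }` success envelope are worth verifying in generated code, since out-of-range values are often left unhandled. A framework-agnostic sketch follows. The function names are hypothetical, and the fallback behavior for invalid values (use the defaults rather than return an error) is an assumption.

```typescript
const DEFAULT_LIMIT = 20;
const MAX_LIMIT = 100;

interface PageParams {
  page: number;
  limit: number;
}

// Parse raw query-string values, falling back to defaults on bad input.
function parsePageParams(query: { page?: string; limit?: string }): PageParams {
  const page = Number.parseInt(query.page ?? "1", 10);
  const limit = Number.parseInt(query.limit ?? String(DEFAULT_LIMIT), 10);
  return {
    page: Number.isFinite(page) && page >= 1 ? page : 1,
    limit:
      Number.isFinite(limit) && limit >= 1
        ? Math.min(limit, MAX_LIMIT)
        : DEFAULT_LIMIT,
  };
}

// Build the success envelope from the prompt: { data, meta: { page, limit, total } }.
function paginate<T>(items: T[], params: PageParams) {
  const start = (params.page - 1) * params.limit;
  return {
    data: items.slice(start, start + params.limit),
    meta: { page: params.page, limit: params.limit, total: items.length },
  };
}
```

In a real API the slicing would normally happen in the database query. The in-memory version here only illustrates the parameter handling and response shape.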
4.2 Database Schema Design (Advanced)
Tool: Claude Code | Time: 30-60 min
Design a database schema for [application type].
Entities:
1. [Entity 1]: [description of what it represents]
- Required fields: [list]
- Optional fields: [list]
- Unique constraints: [list]
2. [Entity 2]: [description]
- Fields: [list]
- References: [Entity 1] (one-to-many / many-to-many)
Business rules:
- [Rule 1, e.g., "A user can only have one active subscription"]
- [Rule 2, e.g., "Orders must have at least one line item"]
- [Rule 3, e.g., "Soft delete for users, hard delete for sessions"]
Generate:
1. SQL migration file with CREATE TABLE statements
2. Indexes for common query patterns: [list queries, e.g., "find users by email", "get orders by date range"]
3. Row-level security policies (if Supabase)
4. Seed data: 10-20 realistic sample records per table
5. TypeScript types matching the schema
Optimize for: [read-heavy / write-heavy / balanced]
Database: [PostgreSQL / MySQL / SQLite]
Category 5: Testing and Quality Prompts
5.1 Comprehensive Test Suite (Advanced)
Tool: Claude Code | Time: 2-4 hours
Write a comprehensive test suite for this [application/module].
Testing framework: [Vitest / Jest / Playwright / Cypress]
Coverage targets:
- Unit tests: all utility functions and business logic (aim for 90%+)
- Integration tests: all API endpoints (happy path + error cases)
- Component tests: all interactive components (user events + state changes)
- E2E tests: [list 3-5 critical user flows]
For each test, include:
- Clear descriptive name: "should [expected behavior] when [condition]"
- Arrange-Act-Assert structure
- Realistic test data (not "test123" or "foo bar")
- Error case coverage (invalid input, timeout, auth failure)
- Edge cases ([list specific edge cases for this app])
Mock strategy:
- External APIs: mock with [MSW / jest.mock / vi.mock]
- Database: use [test database / in-memory / fixtures]
- Time-dependent tests: mock Date.now()
- File system: use temp directories
Run the complete suite after writing. Fix any failures.
Generate a coverage report.
5.2 Security Audit Prompt (Expert)
Tool: Claude Code | Time: 1-2 hours
Perform a security audit of this codebase. Check for:
1. Authentication & Authorization:
- Are passwords hashed with bcrypt/argon2 (not fast hashes such as MD5 or plain SHA-256)?
- Are sessions stored securely (HTTP-only cookies, not localStorage)?
- Is CSRF protection implemented on state-changing requests?
- Are API keys and secrets in environment variables (not hardcoded)?
- Are authorization checks on every protected endpoint (not just frontend)?
2. Input Validation:
- Is all user input validated server-side (not just client-side)?
- Are SQL queries parameterized (no string concatenation)?
- Is HTML output sanitized to prevent XSS?
- Are file uploads validated (type, size, name)?
- Are URL redirects validated against an allowlist?
3. Data Protection:
- Is sensitive data encrypted at rest?
- Is HTTPS enforced (HSTS headers)?
- Are API responses filtered (no password hashes, internal IDs leaking)?
- Is PII handled according to GDPR/CCPA requirements?
- Are error messages generic (no stack traces to users)?
4. Infrastructure:
- Are dependencies up to date (no known CVEs)?
- Are security headers set (CSP, X-Frame-Options, etc.)?
- Is rate limiting configured on auth and API endpoints?
- Are CORS origins restricted (not "*")?
- Are logs sanitized (no passwords or tokens in logs)?
For each issue found:
- Severity: Critical / High / Medium / Low
- Location: file path and line number
- Description: what's wrong and why it matters
- Fix: specific code change to resolve it
- Test: how to verify the fix works
Prioritize fixes by severity. Implement Critical and High fixes immediately.
Category 6: Refactoring and Optimization Prompts
6.1 Performance Optimization (Advanced)
Tool: Claude Code | Time: 1-2 hours
This application is slow. Analyze and optimize performance.
Symptoms:
- [Specific symptom: "initial page load takes 4+ seconds"]
- [Specific symptom: "scrolling is janky with 500+ items"]
- [Specific symptom: "API response takes 2+ seconds"]
Investigate and fix:
1. Bundle size: analyze with [next/bundle-analyzer or similar], remove unused dependencies, implement code splitting
2. Rendering: identify unnecessary re-renders, add React.memo/useMemo/useCallback where appropriate
3. Data fetching: implement caching, pagination, reduce payload sizes
4. Images: lazy load below-fold images, use next/image or responsive srcset, serve WebP
5. Database: add missing indexes, optimize N+1 queries, implement connection pooling
6. Network: enable gzip/brotli, set proper cache headers, minimize HTTP requests
For each optimization:
- Before: [metric measurement]
- After: [expected improvement]
- Method: [specific code change]
Run Lighthouse audit before and after. Target scores: Performance >90, Accessibility >95.
6.2 Code Cleanup (Intermediate)
Tool: Claude Code, Cursor | Time: 1-2 hours
Clean up this codebase without changing any functionality.
Tasks:
1. Remove dead code: unused imports, unreachable functions, commented-out blocks
2. Consolidate duplicated logic: find similar code patterns and extract shared utilities
3. Fix naming: rename variables/functions that don't describe their purpose
4. Organize file structure: group related files, consistent naming conventions
5. Add TypeScript types: replace 'any' with proper types, add interfaces for data shapes
6. Fix linting issues: run [ESLint / Prettier] and fix all warnings/errors
7. Update dependencies: check for outdated packages, update non-breaking versions
8. Add JSDoc comments to exported functions (not internal helpers)
Rules:
- Make small, focused commits (one type of change per commit)
- Run tests after each change to ensure nothing breaks
- Don't refactor code that has pending changes or open PRs
- Keep the diff readable: don't auto-format unrelated files
Category 7: Deployment and DevOps Prompts
7.1 Production Deployment Checklist (Advanced)
Tool: Claude Code | Time: 1-2 hours
Prepare this application for production deployment on [Vercel / AWS / Railway].
Pre-deployment checklist:
1. Environment variables: create .env.example with all required vars (no values), verify all are set in [hosting platform]
2. Error tracking: set up [Sentry / LogRocket / Bugsnag] for runtime error monitoring
3. Analytics: add [Vercel Analytics / Google Analytics / Plausible] for usage tracking
4. SEO: verify meta tags, Open Graph, Twitter cards, sitemap.xml, robots.txt
5. Performance: run Lighthouse, fix any scores below 80
6. Security: run npm audit, fix critical/high vulnerabilities, verify security headers
7. Database: verify connection pooling, set up backups if applicable
8. Caching: configure CDN caching headers, implement stale-while-revalidate for API routes
9. Monitoring: set up uptime monitoring (e.g., UptimeRobot, Checkly)
10. Domain: configure custom domain, SSL, www redirect
Create a deployment script or CI/CD pipeline that:
- Runs tests
- Runs linter
- Builds the application
- Deploys to [platform]
- Runs smoke tests against the deployed URL
- Notifies [Slack / Discord / email] on success/failure
Category 8: AI Agent Orchestration Prompts (Expert)
8.1 Multi-Agent Task Decomposition
Tool: Claude Code (subagents) | Time: 2-4 hours
I need to [describe large task, e.g., "add a complete user profile system with settings, avatar upload, activity history, and notification preferences"].
Decompose this into subtasks that can be worked on in parallel:
1. Data layer: schema changes, migrations, API endpoints
2. UI components: form components, display components, layouts
3. Business logic: validation rules, permission checks, notification triggers
4. Tests: unit tests, integration tests, E2E tests
For each subtask:
- Define the interface/contract (inputs, outputs, data shapes)
- List dependencies on other subtasks
- Identify which can run in parallel vs. must be sequential
Then implement each subtask, integrating them at the defined interfaces.
Run the full test suite after integration to catch any contract mismatches.
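The "parallel vs. must be sequential" step amounts to grouping subtasks into waves, where each wave contains only tasks whose dependencies are already finished. The sketch below shows that grouping. The `Subtask` shape and `planWaves` name are illustrative and not part of Claude Code or any agent framework.

```typescript
interface Subtask {
  id: string;
  dependsOn: string[]; // ids of subtasks that must finish first
}

// Group subtasks into waves; everything within one wave can run in parallel.
function planWaves(subtasks: Subtask[]): string[][] {
  const remaining = new Map<string, Subtask>();
  for (const task of subtasks) remaining.set(task.id, task);
  const done = new Set<string>();
  const waves: string[][] = [];

  while (remaining.size > 0) {
    // A task is ready when all of its dependencies completed in earlier waves.
    const ready: string[] = [];
    remaining.forEach((task, id) => {
      if (task.dependsOn.every((dep) => done.has(dep))) ready.push(id);
    });
    if (ready.length === 0) {
      throw new Error("Cyclic or missing dependency among subtasks");
    }
    for (const id of ready) {
      remaining.delete(id);
      done.add(id);
    }
    waves.push(ready);
  }
  return waves;
}
```

With the four subtasks from the prompt, if the UI depends on the data layer and the tests depend on everything else, the data layer and business logic land in the first wave, the UI in the second, and the tests in the third.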
8.2 Codebase Analysis and Improvement Plan
Tool: Claude Code | Time: 1-2 hours
Analyze this entire codebase and create an improvement plan.
Evaluate:
1. Architecture: Is the structure scalable? Are concerns properly separated?
2. Code quality: Consistency, readability, duplication, complexity (cyclomatic)
3. Error handling: Are errors caught, logged, and presented well?
4. Testing: Coverage, quality of tests, missing edge cases
5. Security: Common vulnerabilities (OWASP Top 10 applicable ones)
6. Performance: Obvious bottlenecks, missing optimizations
7. Developer experience: Build time, hot reload, debugging ease
Output:
- Score each category 1-10 with specific evidence
- Top 5 improvements ranked by impact/effort ratio
- Specific action items for each improvement
- Estimated time for each action item
Don't fix anything yet. Just analyze and plan.
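"Ranked by impact/effort ratio" is simple arithmetic, but writing it down makes the AI's ranking auditable. A minimal sketch, assuming each improvement has been given a numeric impact score and an effort estimate in hours (both hypothetical fields):

```typescript
interface Improvement {
  title: string;
  impact: number; // e.g. a 1-10 score
  effort: number; // e.g. estimated hours; must be greater than 0
}

// Return the `count` improvements with the highest impact per unit of effort.
function topImprovements(items: Improvement[], count = 5): Improvement[] {
  return items
    .filter((item) => item.effort > 0) // skip entries that would divide by zero
    .sort((a, b) => b.impact / b.effort - a.impact / a.effort)
    .slice(0, count);
}
```

For example, an improvement with impact 9 and effort 1 (ratio 9) ranks ahead of one with impact 8 and effort 4 (ratio 2).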
Category 9: Content and Data Prompts
9.1 Seed Data Generator (Beginner)
Tool: Any | Time: 15-30 min
Generate realistic seed data for this application.
Data needed:
- [N] [entity type, e.g., "users"] with: [fields]
- [N] [entity type, e.g., "products"] with: [fields]
- [N] [entity type, e.g., "orders"] with: [fields]
Rules:
- Use realistic names (not "Test User 1")
- Dates spread across the last [time period]
- Prices/amounts in realistic ranges for [industry]
- Status distribution: [e.g., "60% active, 30% pending, 10% cancelled"]
- Include edge cases: [e.g., "one user with no orders, one product with 0 stock"]
- Relationships should be consistent (orders reference real user IDs and product IDs)
Output format: [JSON / SQL INSERT statements / TypeScript constants / CSV]
9.2 API Documentation Generator (Intermediate)
Tool: Claude Code | Time: 30-60 min
Generate comprehensive API documentation for all endpoints in this application.
For each endpoint, document:
- Method and path (e.g., GET /api/users/:id)
- Description (one sentence)
- Authentication required? (yes/no, what type)
- Request: headers, query params, body schema with types and validation rules
- Response: status codes, body schema for success and each error case
- Example request (curl command)
- Example response (JSON)
Format: [Markdown / OpenAPI 3.0 spec / Swagger]
Include a table of contents.
Group endpoints by resource.
Add rate limiting info if applicable.
Category 10: Platform-Specific Prompts
10.1 Chrome Extension (Advanced)
Tool: Claude Code | Time: 2-4 hours
Build a Chrome Extension (Manifest V3) that [core functionality].
Features:
- Popup: [describe popup UI and what it shows]
- Content script: [what it does on web pages, e.g., "highlights [elements]"]
- Background service worker: [what it handles, e.g., "API calls, storage sync"]
- Options page: [settings the user can configure]
Permissions needed: [activeTab, storage, tabs, etc. - minimize permissions]
Storage:
- Use chrome.storage.sync for: [settings that sync across devices]
- Use chrome.storage.local for: [data that stays local]
Communication:
- Content script <-> Background: chrome.runtime.sendMessage
- Popup <-> Background: direct access to chrome.storage
Include:
- manifest.json with all required fields
- Icon set (16x16, 48x48, 128x128) - use simple colored SVG converted to PNG
- README with installation instructions (load unpacked)
- Privacy policy text (required for Chrome Web Store submission)
Test on these sites: [list 3-5 target websites]
10.2 CLI Tool (Intermediate)
Tool: Claude Code | Time: 1-2 hours
Build a command-line tool in [Node.js / Python / Go / Rust] that [core functionality].
Commands:
- [tool] init: [what it sets up]
- [tool] [command 1] [args]: [what it does]
- [tool] [command 2] [args]: [what it does]
- [tool] --help: show all commands with descriptions
Features:
- Colored output (green for success, red for errors, yellow for warnings)
- Progress bars for long operations
- Interactive prompts for required input (with defaults)
- Config file (~/.toolrc or .toolrc in project root)
- --verbose flag for debug output
- --json flag for machine-readable output
- Meaningful exit codes (0 success, 1 error, 2 usage error)
Error handling:
- Clear error messages with suggested fixes
- Never show stack traces (unless --verbose)
- Graceful handling of Ctrl+C
Package for distribution via [npm / pip / brew / cargo].
Include README with installation, usage examples, and config reference.
Prompt Patterns Reference Card
The Constraint Sandwich
Do [action].
Include: [must-have list]
Do NOT include: [exclusion list]
Match existing: [patterns/styles to follow]
The Iterative Refinement
[After seeing initial output]
Keep: [what works]
Change: [what needs to change]
Add: [what's missing]
Remove: [what's unnecessary]
Don't touch: [what shouldn't change]
The Context Dump
Here's the current state:
- File: [path] does [function]
- File: [path] does [function]
- The bug is in: [location]
- Error message: [exact text]
- This worked before I: [recent change]
- I've already tried: [attempts]
Fix the bug without changing [protected areas].
The Scope Lock
ONLY modify [specific files/functions].
Do NOT touch: [protected files]
Do NOT change: [protected behavior]
Do NOT add: [unwanted additions]
Keep the diff as small as possible.
The Quality Gate
Before considering this done:
1. All existing tests pass
2. New tests cover: [specific scenarios]
3. No TypeScript errors (strict mode)
4. No ESLint warnings
5. Lighthouse performance score > [N]
6. [Custom quality criterion]
March 2026 Additions: Autonomous Mode Prompts
New prompts for Claude Code Auto Mode, MCP workflows, and agentic build patterns.
The Auto Mode Task Brief (Expert)
Tool: Claude Code (Auto Mode enabled) | Time: Runs unattended 15-120 min
Use this when handing a scoped task to Claude Code in Auto Mode. The structure defines scope, acceptance criteria, and what Claude should NOT touch — so the autonomous run has clear boundaries.
# Task: [Brief title]
## Scope
Working directory: [path]
Files allowed to modify: [list or glob pattern]
Files that must NOT change: [list — tests, migrations, config, etc.]
## Objective
[One sentence: what should be different when you're done]
## Acceptance Criteria
- [ ] [Specific, testable outcome 1]
- [ ] [Specific, testable outcome 2]
- [ ] All existing tests still pass
- [ ] No TypeScript errors (strict)
- [ ] No new ESLint warnings
## What This Is NOT
- Do not refactor unrelated code
- Do not add features beyond the objective
- Do not modify [specific protected area]
## Summary at End
When complete, write a brief summary of:
1. Every file changed and why
2. Any decisions you made and the tradeoff
3. Anything you're uncertain about
4. Tests I should run to verify
Why it works: The summary request at the end transforms Auto Mode from "black box" to "async colleague" — you wake up to a log of decisions, not just a diff.
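The Scope section of the brief is easy to verify mechanically once the run finishes. The sketch below is one possible check, not part of Claude Code: `globToRegExp` and `findScopeViolations` are hypothetical helpers, and the glob support is deliberately minimal (`**` crosses directories, `*` stays within one path segment).

```typescript
// Convert a simple glob to a RegExp. Only "*" and "**" are treated as wildcards.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const pattern = escaped
    .replace(/\*\*/g, "\u0000") // placeholder so "**" survives the next step
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp(`^${pattern}$`);
}

// Return every changed file that is forbidden or falls outside the allowed globs.
function findScopeViolations(
  changed: string[],
  allowed: string[],
  forbidden: string[],
): string[] {
  const allow = allowed.map(globToRegExp);
  const deny = forbidden.map(globToRegExp);
  return changed.filter(
    (file) => deny.some((re) => re.test(file)) || !allow.some((re) => re.test(file)),
  );
}

const violations = findScopeViolations(
  ["src/api/user.ts", "migrations/001.sql", "src/api/user.test.ts"],
  ["src/**"],
  ["**/*.test.ts", "migrations/**"],
);
// violations: ["migrations/001.sql", "src/api/user.test.ts"]
```

The `changed` list could come from `git diff --name-only` against the branch point, so the check runs before you read the diff itself.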
The Claude Code Channels Handoff (Advanced)
Tool: Claude Code + Channels (Telegram/Discord integration) | Time: N/A — async coordination
Claude Code Channels (March 2026) lets you send instructions to a running Claude Code session from your phone. Use this prompt structure to create async checkpoints that Claude will pause for:
## Background Task with Mobile Checkpoints
Start the following task: [task description]
## Checkpoint Rules
Pause and send me a Telegram message at these points:
1. After completing the initial analysis — summarize what you found
2. Before any destructive action (delete, drop, overwrite) — describe it and wait
3. If you hit a blocker you can't resolve — describe the issue
4. When complete — summary of all changes
## Proceed autonomously between checkpoints.
Do not pause for routine read/write/test operations.
Why it works: You define the decision points where human judgment matters, and let Claude handle the execution in between. Run overnight builds and get Telegram pings when action is needed.
The Security Scope Guard (Advanced)
Tool: Claude Code (any mode) | Time: Prepend to any task involving auth, payments, or data
Add this as a preamble whenever Claude Code will touch security-sensitive code. It activates extra caution without requiring manual review of every action:
## Security Scope Guard — Activate Before This Task
This task involves security-sensitive code: [auth / payments / user data / API keys]
Before every change to [auth / payment / data] files:
1. State what vulnerability pattern you are avoiding
2. Confirm input validation is present
3. Confirm secrets are not hardcoded
4. Confirm error messages don't leak internal state
Never:
- Log authentication tokens or session IDs
- Return detailed error messages to the client
- Use string concatenation in SQL queries
- Disable CORS for any reason
- Store credentials in localStorage
If you see existing code that violates the above: flag it in your summary, do not silently fix it (I need to know it existed).
Now proceed with: [actual task]
Why it works: Security reviews after the fact miss context. This prompt embeds security review into the generation loop — Claude checks each change against the rules as it writes, not after.
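To make the "no string concatenation in SQL" rule concrete, the sketch below builds a parameterized query with a tagged template. The `sql` tag is a hypothetical helper written for this example, and the `$1`, `$2` placeholder style assumes a PostgreSQL-style driver; other drivers use `?` or named parameters.

```typescript
interface ParameterizedQuery {
  text: string;
  values: unknown[];
}

// Tagged template: interpolated values never enter the SQL text,
// they are replaced by numbered placeholders and passed separately.
function sql(strings: TemplateStringsArray, ...values: unknown[]): ParameterizedQuery {
  const text = strings.reduce(
    (acc, part, i) => acc + part + (i < values.length ? `$${i + 1}` : ""),
    "",
  );
  return { text, values };
}

const email = "x' OR '1'='1"; // hostile input stays data, not SQL
const query = sql`SELECT id FROM users WHERE email = ${email} AND active = ${true}`;
// query.text:   "SELECT id FROM users WHERE email = $1 AND active = $2"
// query.values: ["x' OR '1'='1", true]
```

A pattern like this gives the assistant something specific to match when it states which vulnerability it is avoiding.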
Category 26: MCP Integration Prompts (Added March 2026)
Model Context Protocol (MCP) is now the standard way to give AI coding assistants persistent context and tool access. These prompts help you integrate MCP correctly.
26.1 MCP Server Setup Prompt (Intermediate)
Tool: Claude Code | Time: 30-60 min
Set up an MCP (Model Context Protocol) server for my project that exposes the following tools to AI assistants:
## Tools to Expose
1. [Tool 1 name]: [what it does — e.g., "read_project_data: reads the projects.json registry"]
2. [Tool 2 name]: [what it does — e.g., "run_health_check: pings all deployment URLs"]
3. [Tool 3 name]: [what it does — e.g., "get_recent_errors: reads the last 50 error log lines"]
## Implementation Requirements
- Use the @modelcontextprotocol/sdk package
- Implement as stdio transport (not HTTP) for local use
- Each tool must have a clear JSON schema for inputs
- Each tool must return structured JSON output
- Add error handling that returns helpful error messages, not stack traces
- Include a test script that exercises each tool
## Configuration
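Generate the MCP configuration block (shown in the claude_desktop_config.json format used by Claude Desktop; a project-level .mcp.json for Claude Code uses the same mcpServers shape):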
{
"mcpServers": {
"[server-name]": {
"command": "node",
"args": ["path/to/server.js"]
}
}
}
## Context This Will Enable
When this MCP server is active, an AI assistant will be able to [describe what new capabilities this enables for your workflow].
Build the complete MCP server. Start with the tool definitions, then the handlers, then the test script.
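The implementation requirements above (a JSON schema per tool, structured output, helpful errors instead of stack traces) can be sketched without any SDK. The example below deliberately does not use the `@modelcontextprotocol/sdk` API; it only shows the shape of a tool definition and result. `get_recent_errors` comes from the prompt's own example, while `handleGetRecentErrors` and the injected `readLog` function are hypothetical.

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: { type: "object"; properties: Record<string, unknown>; required?: string[] };
}

interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

const getRecentErrors: ToolDefinition = {
  name: "get_recent_errors",
  description: "Reads the last N error log lines (default 50).",
  inputSchema: {
    type: "object",
    properties: { limit: { type: "integer", minimum: 1, maximum: 500 } },
  },
};

function errorResult(message: string): ToolResult {
  return { isError: true, content: [{ type: "text", text: message }] };
}

// readLog is injected so the handler can be exercised by a test script.
function handleGetRecentErrors(args: { limit?: number }, readLog: () => string[]): ToolResult {
  const limit = args.limit ?? 50;
  if (!Number.isInteger(limit) || limit < 1 || limit > 500) {
    return errorResult("limit must be an integer from 1 to 500");
  }
  try {
    const lines = readLog().slice(-limit);
    return { content: [{ type: "text", text: JSON.stringify({ count: lines.length, lines }) }] };
  } catch {
    // Helpful message for the assistant; the stack trace is intentionally dropped.
    return errorResult("Could not read the error log. Check that the log file exists and is readable.");
  }
}
```

Injecting the data source keeps each handler testable in isolation, which is what the requested test script needs.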
26.2 Claude Code MCP Context Prompt (Advanced)
Tool: Claude Code | Time: 15 min
I'm setting up a project-level context file so Claude Code has persistent context about my project without me having to re-explain it every session.
Create a CLAUDE.md file that covers:
## Project Identity
- Name: [project name]
- Purpose: [one sentence]
- Stack: [tech stack]
- Current status: [active development / maintenance / paused]
## Key Files and Their Purpose
- [file path]: [what it contains and when to read it]
- [file path]: [what it contains and when to read it]
## Commands
- Build: [command]
- Dev server: [command]
- Test: [command]
- Deploy: [command]
## Architecture Decisions That Are NOT Up for Discussion
- [Decision 1]: [why it was made — do not suggest alternatives]
- [Decision 2]: [why it was made]
## Known Issues (Don't Re-Investigate)
- [Issue 1]: [known limitation, not a bug to fix]
## My Workflow
- I prefer [file-by-file / whole-feature] implementations
- Always [run tests / lint / build] before marking a task done
- When in doubt, [ask / make conservative choice / make opinionated choice]
Make the CLAUDE.md scannable and under 200 lines.
26.3 Next.js Secure Middleware Pattern (Intermediate) (Security-critical — post-CVE-2025-29927)
Tool: Claude Code, Cursor | Time: 20 min
Add authentication to my Next.js app using the secure dual-layer pattern (required post-CVE-2025-29927).
## Protected Routes
- /dashboard/:path* — requires authenticated user
- /api/protected/:path* — requires authenticated user, returns 401 JSON (not redirect)
- /admin/:path* — requires authenticated user with admin role
## Auth Provider
I'm using: [NextAuth v5 / Supabase Auth / Clerk / custom JWT]
## Implementation Rules
1. Middleware ONLY for UX redirects (fast redirect to /login for protected pages)
2. Every /api/protected route MUST verify the session server-side independently
3. NEVER rely on middleware as the sole auth gate for API routes
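4. Add a comment in middleware.ts noting that the x-middleware-subrequest header (the CVE-2025-29927 bypass vector) must be stripped from incoming external requests at the proxy or edge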
## Pattern to Implement
For each protected API route:
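```typescript
// DO NOT rely on middleware alone: verify the session here
// getServerSession is the NextAuth v4 call; use your provider's equivalent (e.g. auth() in NextAuth v5)
const session = await getServerSession(authOptions)
if (!session) {
  return NextResponse.json({ error: 'Unauthorized' }, { status: 401 })
}
```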
Generate:
1. middleware.ts with the correct matcher config and a comment explaining it is NOT a security boundary
2. A shared auth utility function (lib/auth-guard.ts) that API routes can call
3. One example protected API route using the utility
4. A test that verifies the API route returns 401 when no session exists
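For orientation, here is a framework-free sketch of the decision logic that a shared `lib/auth-guard.ts` could wrap. The `Session` shape, the `GuardResult` type, and `requireSession` are assumptions made for this example; in a real route the session would come from your auth provider, and the failure branch would be returned as `NextResponse.json(body, { status })`.

```typescript
// Hypothetical session shape; real providers expose richer objects.
interface Session {
  userId: string;
  role: "user" | "admin";
}

type GuardResult =
  | { ok: true; session: Session }
  | { ok: false; status: 401 | 403; body: { error: string } };

// Called inside every protected route, independently of any middleware redirect.
function requireSession(session: Session | null, opts: { admin?: boolean } = {}): GuardResult {
  if (!session) {
    return { ok: false, status: 401, body: { error: "Unauthorized" } };
  }
  if (opts.admin && session.role !== "admin") {
    return { ok: false, status: 403, body: { error: "Forbidden" } };
  }
  return { ok: true, session };
}
```

Returning a plain result object keeps the guard easy to unit test, including the 401 case requested in item 4.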
Category 27: Multi-Agent Orchestration Prompts (Cursor 3 / Claude Code Teams)
Added April 7, 2026 — covering the new parallel multi-agent workflows enabled by Cursor 3's Agents Window and Claude Code's Teams feature.
27.1 The Agent Task Decomposer (Advanced)
Tool: Cursor 3 Agents Window, Claude Code | Time: 5 min setup → autonomous execution
Use this prompt to break a large feature into parallelizable agent tasks before opening the Agents Window.
I need to implement [feature name] in my [type of app].
Decompose this into parallel agent tasks using this format:
- Each task must be completable in under 30 minutes
- Tasks must have clear success criteria (how to verify it's done)
- Identify dependencies (which tasks must complete before others can start)
- Assign a suggested agent focus for each (e.g., "backend agent", "test agent", "UI agent")
Feature to decompose:
[Describe the feature in 3-5 sentences. Include: what it does, the data it uses, and any API/external integrations.]
Output format:
## Agent Task Plan
### Wave 1 (parallel, no dependencies)
- Task A [Agent role]: [Goal] | Success: [How to verify] | Files: [which files/modules]
- Task B [Agent role]: [Goal] | Success: [How to verify] | Files: [which files/modules]
### Wave 2 (depends on Wave 1)
- Task C [Agent role]: [Goal] | Success: [How to verify] | Depends on: [Task A output]
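The wave structure in the output format is a layered topological sort: a task joins the earliest wave in which all of its dependencies are already done. The sketch below shows one way to compute it; `AgentTask` and `planWaves` are hypothetical names, and the task IDs are placeholders.

```typescript
interface AgentTask {
  id: string;
  dependsOn: string[];
}

// Group tasks into waves; every task in a wave can run in parallel.
function planWaves(tasks: AgentTask[]): string[][] {
  const done = new Set<string>();
  let remaining = [...tasks];
  const waves: string[][] = [];

  while (remaining.length > 0) {
    const ready = remaining.filter((t) => t.dependsOn.every((d) => done.has(d)));
    if (ready.length === 0) {
      const stuck = remaining.map((t) => t.id).join(", ");
      throw new Error(`Circular or unknown dependency among: ${stuck}`);
    }
    waves.push(ready.map((t) => t.id));
    ready.forEach((t) => done.add(t.id));
    remaining = remaining.filter((t) => !done.has(t.id));
  }
  return waves;
}

const waves = planWaves([
  { id: "A", dependsOn: [] },
  { id: "B", dependsOn: [] },
  { id: "C", dependsOn: ["A"] },
  { id: "D", dependsOn: ["C", "B"] },
]);
// waves: [["A", "B"], ["C"], ["D"]]
```

Running a check like this on the model's plan catches circular dependencies before any agent starts.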
27.2 The Single Agent Task Charter (Intermediate)
Tool: Cursor 3 Agents Window, Claude Code | Time: 2 min per agent
Paste this into each individual agent in the Agents Window to give it a focused, well-bounded mission.
## Agent Charter
**Role**: [Backend Engineer / Frontend Developer / QA Engineer / Security Reviewer / Docs Writer]
**Mission**: [One sentence: what this agent will produce]
**Scope**: [Specific files, modules, or directories this agent is allowed to touch]
**Off-limits**: [Files/systems this agent must not modify]
**Success Criteria** (all must be true when you're done):
1. [Specific, verifiable outcome]
2. [Specific, verifiable outcome]
3. Tests pass: [which test command to run]
**Handoff**: When complete, write a summary to `agent-handoff-[role].md` covering:
- What you built
- Any decisions you made and why
- What the next agent needs to know
- Any concerns or edge cases you noticed
**Context**: [Brief description of the larger feature this fits into]
Do not interrupt me unless you are truly blocked. Make reasonable decisions independently.
27.3 The Multi-Agent Review Prompt (Advanced)
Tool: Cursor 3 Agents Window, Claude Code | Time: 10-15 min supervised execution
Use this to spin up a dedicated review agent that audits another agent's output before you merge it.
## Review Agent Mission
You are a senior code reviewer. You did NOT write the code you are reviewing.
**Author agent**: [which agent produced this code, e.g., "Backend Agent — implemented the payment webhook handler"]
**Files to review**: [list the files]
**Success criteria of the original task**: [paste the success criteria from the original agent's charter]
Your review checklist:
1. **Correctness**: Does the code do what the task charter required?
2. **Edge cases**: What inputs could break this? (empty arrays, null values, concurrent requests, network failures)
3. **Security**: Any injection risks, missing auth checks, exposed secrets, or unvalidated inputs?
4. **Performance**: Any N+1 queries, missing indexes, synchronous blocking calls, or memory leaks?
5. **Tests**: Are the tests meaningful? Do they cover the stated success criteria?
6. **Handoff quality**: Is the agent-handoff file accurate and useful for downstream agents?
Output a structured review:
## Review Summary
**Overall verdict**: APPROVE / REQUEST_CHANGES / BLOCK
**Confidence**: High / Medium / Low
### Issues Found
| Severity | File | Line | Issue | Suggested Fix |
|----------|------|------|-------|---------------|
| CRITICAL | ... | ... | ... | ... |
### Approved Items
[What the agent did well — be specific]
### Required Changes Before Merge
[Numbered list if verdict is REQUEST_CHANGES or BLOCK]
Category 28: Long-Horizon Agentic Execution (April 2026)
For GLM-5.1, Claude Code, Cursor Automations, and any AI agent running 2+ hour autonomous sessions. These prompts help you structure work that outlasts your attention span.
28.1 The Long-Horizon Task Brief (Advanced)
Tool: GLM-5.1, Claude Code, Cursor Automations | Time: 30 min setup → hours of autonomous execution
Use this before starting any AI session you expect to run longer than 30 minutes. A clear brief reduces the risk of the model drifting, expanding scope on its own, or failing silently.
## Long-Horizon Task Brief
**Session goal** (one sentence):
[What is complete when this session ends?]
**Time budget**: [How many hours should the agent spend before stopping to check in?]
**In scope**:
- [Feature/file/system 1]
- [Feature/file/system 2]
**Out of scope** (hard limits):
- Do NOT modify [file/system] — read-only
- Do NOT delete anything — create new files only
- Do NOT push to main — commit to branch only
**Checkpointing** (every N hours):
Write a checkpoint file at `agent-checkpoint-[timestamp].md` containing:
1. What has been completed
2. Current task in progress
3. Known blockers or unresolved decisions
4. What remains to complete the session goal
**Success criteria** (all must be true at session end):
1. [Verifiable outcome — test command, file exists, URL responds, etc.]
2. [Verifiable outcome]
3. All code compiles with zero TypeScript errors (`npm run build`)
4. All existing tests still pass (`npm test`)
**How to handle blockers**:
- If blocked by a missing env var → note it in the checkpoint file and skip that feature
- If blocked by an ambiguous requirement → make a reasonable assumption, document it in the checkpoint, and continue
- If blocked by a breaking error → stop, write a blocker-report.md, and halt the session
Begin with a brief plan (3-5 bullet points), then execute.
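The blocker rules and checkpoint contents in the brief amount to a small policy table plus a file template. The sketch below encodes both so a wrapper script or reviewer can check them; all type and function names are hypothetical, and only `blocker-report.md` and the four checkpoint sections come from the brief itself.

```typescript
type Blocker = "missing_env_var" | "ambiguous_requirement" | "breaking_error";

interface BlockerAction {
  kind: "skip_feature" | "assume_and_continue" | "halt";
  writeTo: string; // file where the agent documents the decision
}

// Mirrors the "How to handle blockers" rules in the brief.
function handleBlocker(blocker: Blocker, checkpointFile: string): BlockerAction {
  switch (blocker) {
    case "missing_env_var":
      return { kind: "skip_feature", writeTo: checkpointFile };
    case "ambiguous_requirement":
      return { kind: "assume_and_continue", writeTo: checkpointFile };
    case "breaking_error":
      return { kind: "halt", writeTo: "blocker-report.md" };
  }
}

interface Checkpoint {
  completed: string[];
  inProgress: string;
  blockers: string[];
  remaining: string[];
}

// Render the four checkpoint sections as markdown.
function renderCheckpoint(c: Checkpoint): string {
  const list = (items: string[]) =>
    items.length > 0 ? items.map((item) => `- ${item}`).join("\n") : "- none";
  return [
    "## Completed", list(c.completed),
    "## In progress", `- ${c.inProgress}`,
    "## Blockers and unresolved decisions", list(c.blockers),
    "## Remaining", list(c.remaining),
  ].join("\n");
}
```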
28.2 The Open-Weight Model Selection Prompt (Intermediate)
Tool: Any LLM with web access or knowledge cutoff April 2026 | Time: 5 min
Use this when evaluating whether to use a self-hosted open-weight model vs. a closed API for a specific project.
I need to choose between a self-hosted open-weight model and a closed API for the following use case:
**Use case**: [Describe what the AI will be doing — code completion, autonomous agents, document analysis, etc.]
**Constraints**:
- Data sensitivity: [Public / Internal / Confidential / Regulated (HIPAA, SOC2, etc.)]
- Budget: [Monthly cap in USD, or "no limit"]
- Latency requirement: [< 500ms / < 2s / batch OK]
- Infrastructure: [Consumer hardware / cloud GPU / on-prem enterprise cluster]
- Team size: [Solo / small team / enterprise]
- Vendor lock-in tolerance: [Low / Medium / High]
**Open-weight models to evaluate** (as of April 2026):
- GLM-5.1 (754B, Z.AI) — SOTA SWE-Bench Pro, 8-hour autonomous sessions, Apache 2.0
- Gemma 4 (Google, Apache 2.0) — 4 sizes, strong reasoning and coding
- Llama 3.x (Meta) — broad ecosystem, widely deployed
- Qwen3.6-Plus — 1M context, competitive with Claude 4.5 on coding tasks
**Closed APIs to evaluate**:
- Claude Sonnet 4.6 (Anthropic API) — best agentic coding, $3/$15 per MTok
- GPT-4o (OpenAI) — broad capability, strong ecosystem
- Gemini 1.5 Pro (Google) — 1M context, competitive pricing
For each candidate, evaluate:
1. Does it meet my latency requirement?
2. Does it meet my data sensitivity requirement?
3. What is the estimated monthly cost at my usage level?
4. What are the known failure modes for my use case?
Recommend the best option and explain the trade-offs I'm accepting.
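Question 3 in the prompt (estimated monthly cost) is simple arithmetic once you know your token volumes. The sketch below reads the "$3/$15 per MTok" figure as input and output prices per million tokens; the usage volumes are made-up numbers for illustration, and `monthlyCostUsd` is a hypothetical helper.

```typescript
// Prices are in USD per million tokens (MTok).
function monthlyCostUsd(
  inputTokens: number,
  outputTokens: number,
  inputPricePerMTok: number,
  outputPricePerMTok: number,
): number {
  const MTOK = 1_000_000;
  return (inputTokens / MTOK) * inputPricePerMTok + (outputTokens / MTOK) * outputPricePerMTok;
}

// Assumed usage: 50M input tokens and 10M output tokens per month.
const estimate = monthlyCostUsd(50_000_000, 10_000_000, 3, 15);
// 50 * 3 + 10 * 15 = 300 USD per month
```

For a self-hosted model, the comparable figure is GPU rental or amortized hardware cost, which this helper does not cover.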
28.3 The Goose/Local-Agent Workflow Prompt (Intermediate)
Tool: Goose (Block), any LLM-agnostic local AI agent | Time: 10 min setup
Goose, Block's open-source local AI agent, supports any LLM backend and executes real actions: installing packages, running tests, modifying files, and calling APIs. This prompt structure is designed for Goose-style action-oriented agents.
## Goose Task: [Short task name]
**Objective**: [One sentence describing the complete state when this task is done]
**LLM backend**: [claude-sonnet-4-6 / glm-5.1 / gpt-4o / gemma-4 — whichever you're using]
**Allowed actions**:
- Read and write files in: [path/to/project]
- Run shell commands: [list safe commands, e.g., npm test, npm run build, git status]
- Install packages: [yes/no — if yes, list approved package registries]
- Make HTTP requests to: [list allowed external APIs, e.g., "GitHub API only"]
**Prohibited actions** (hard stops — do not proceed if any of these are required):
- git push (never push without human review)
- rm -rf or destructive filesystem operations
- Modify files outside [path/to/project]
- Access [sensitive-system]
**Context files** (read these before starting):
- [path/to/README.md]
- [path/to/relevant-config.json]
**Task steps** (ordered):
1. [First action]
2. [Second action, may depend on output of step 1]
3. Verify: run [test command] and confirm output matches [expected output]
**Output**: When done, write `goose-task-complete.md` with:
- Actions taken (with file paths and commands run)
- Test results
- Any assumptions made
- Any issues encountered
Start immediately. Do not ask for clarification unless truly blocked.
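The allowed and prohibited action lists work best when something outside the model also enforces them. The sketch below is one way a wrapper could classify a shell command before it runs. It is not part of Goose's API: `CommandPolicy` and `checkCommand` are hypothetical, and the patterns cover only the examples named in the prompt.

```typescript
interface CommandPolicy {
  allowed: string[]; // exact commands that may run unattended
  prohibited: RegExp[]; // patterns that trigger a hard stop
}

type Verdict = "run" | "hard_stop" | "ask_human";

function checkCommand(command: string, policy: CommandPolicy): Verdict {
  const normalized = command.trim().replace(/\s+/g, " ");
  // Prohibited patterns win, even when chained after an allowed command.
  if (policy.prohibited.some((re) => re.test(normalized))) return "hard_stop";
  return policy.allowed.includes(normalized) ? "run" : "ask_human";
}

const policy: CommandPolicy = {
  allowed: ["npm test", "npm run build", "git status"],
  prohibited: [/\bgit push\b/, /\brm -rf\b/],
};

checkCommand("npm test", policy); // "run"
checkCommand("npm test && git push", policy); // "hard_stop"
checkCommand("npm install left-pad", policy); // "ask_human"
```

Defaulting unknown commands to `ask_human` rather than `run` keeps the allowlist the only path to unattended execution.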
Category 29: Claude Sonnet 4.6 — 1M Context & Agentic Search Prompts (April 2026)
Claude Sonnet 4.6 introduced two capabilities that change how you structure prompts: a 1M token context window (beta) and GA web search/web fetch with code-execution-based result filtering. These prompts exploit both.
29.1 The Whole-Codebase Refactor Prompt (Expert)
Tool: Claude Sonnet 4.6 via API or Claude Code | Context required: 200K–1M tokens
With the 1M context window, you can load an entire medium-sized codebase and ask for architectural analysis without chunking. This works for repositories up to ~150K lines.
## Codebase Refactor Brief
**Repository**: [project-name]
**Goal**: [Specific refactor objective — e.g., "migrate from Pages Router to App Router", "replace all class components with hooks", "extract shared utilities from duplicated code"]
**Constraints**:
- Do not change external API contracts (public-facing routes must remain the same)
- All existing tests must pass after refactor
- Prefer surgical changes over rewrites
**Files loaded below** (entire codebase follows in this message):
[Paste full codebase or use file upload — Claude Sonnet 4.6 handles up to 1M tokens]
**Output requested**:
1. A prioritized list of refactor changes (most impactful first)
2. For each change: which files are affected, what changes, and estimated risk level (low/medium/high)
3. A proposed commit sequence (small atomic commits, safest order)
4. Any architectural concerns that would block this refactor
Do NOT generate code yet — produce the analysis and plan first. I will confirm before implementation begins.
29.2 The Research-Then-Build Prompt (Intermediate)
Tool: Claude Sonnet 4.6 (web search GA) | Time: 15–30 min
Sonnet 4.6's web search and web fetch are GA, with dynamic result filtering via code execution. This prompt chains research directly into implementation — no context-switching between browser and editor.
## Research-Then-Build Task
**What I'm building**: [Short description — e.g., "a rate limiter middleware for my Next.js API routes"]
**Research phase** (do this first — use web search):
1. Search for: "[topic] best practices [current year]"
2. Fetch the top 2–3 relevant documentation pages
3. Identify: (a) the standard pattern, (b) common failure modes, (c) security considerations
4. Write a 3-bullet summary of your findings before writing any code
**Build phase** (only after research summary is written):
- Implement [feature] based on your findings
- Follow the standard pattern you identified
- Add defensive handling for the top failure mode
- Include a comment linking to the primary source used
**Validation**:
- Re-fetch [relevant documentation URL] and confirm your implementation aligns
- Note any deviations and explain why
Start with the research phase. Do not write code until research summary is complete.
29.3 The Extended-Thinking Architecture Decision Prompt (Advanced)
Tool: Claude Sonnet 4.6 with extended thinking | Time: 5 min prompt, 10–20 min thinking
Extended thinking gives the model more compute budget before it commits to an answer. Use this for architecture choices where a wrong call means weeks of rework.
## Architecture Decision Request
**Decision to make**: [e.g., "Should I use Supabase Realtime or polling for my live dashboard?"]
**Context**:
- System: [Brief description]
- Scale: [Expected users/requests in 6 months]
- Team: [Solo / small / larger]
- Constraints: [Budget, latency, existing stack, migration costs]
- Timeline: [When must you ship?]
**What I've already considered**:
- Option A: [First option] — I think this because [reasoning]
- Option B: [Second option] — I think this because [reasoning]
- What I'm unsure about: [Specific uncertainty]
**What I need**:
1. Evaluate both options against my specific constraints (not generic trade-offs)
2. Identify what I'm missing or wrong about in my reasoning
3. Recommend one option with confidence level (high/medium/low) and what would change your recommendation
4. Give me the one question I should answer before committing
Take your time — a slow, thorough answer beats a fast, wrong one.
Category 30: April 2026 — Agent Framework, Security Audit & Parallel Fleet Prompts
Three new workflows unlocked by the April 2026 AI tooling wave: Microsoft Agent Framework 1.0 multi-agent orchestration, Claude Mythos-style security audit chaining, and Cursor 3 parallel agent fleet management.
30.1 The Microsoft Agent Framework 1.0 Orchestration Prompt (Advanced)
Tool: Microsoft Agent Framework 1.0 (.NET or Python), Claude Code | Time: 30–60 min setup
Agent Framework 1.0 ships with A2A and MCP protocol support, enabling cross-runtime agent interoperability. Use this prompt to design multi-agent workflows that span different AI providers without lock-in.
## Multi-Agent Workflow Design Request
**Workflow goal**: [What the agent system should accomplish end-to-end — e.g., "receive a GitHub issue, research the codebase, implement a fix, open a PR, and notify Slack"]
**Agents needed** (describe each):
- Agent 1: [Name + responsibility + which model/provider — e.g., "Researcher — Claude Sonnet 4.6 — reads codebase and clarifies requirements"]
- Agent 2: [Name + responsibility + which model/provider]
- Agent 3: [Name + responsibility + which model/provider]
**Coordination protocol**: A2A (agent-to-agent messages) | MCP (tool calls to shared context) | Both
**Runtime**: .NET | Python | Both
**State management**:
- Shared state that all agents need: [list]
- State private to each agent: [list]
- How agents hand off work: [event-driven / polling / direct call]
**Error handling**:
- If Agent 1 fails: [retry / fail pipeline / route to human]
- If Agent 2 fails: [behavior]
- Maximum retries per agent: [N]
**Output required**:
1. Agent architecture diagram (ASCII or described)
2. Agent Framework 1.0 code scaffold for each agent class
3. The A2A message schema for agent handoffs
4. The MCP tools each agent needs registered
5. DevUI configuration for browser-based debugging
Generate the scaffold. I will fill in the business logic per agent.
30.2 The AI Security Audit Chain Prompt (Expert)
Tool: Claude Sonnet 4.6 or Claude Code with CyberOS MCP | Time: 20–40 min per codebase
Inspired by Claude Mythos / Project Glasswing's defensive security workflow — systematically chain vulnerability discovery, triage, and remediation across a codebase without missing surface area.
## AI-Powered Security Audit — Systematic Chain
**Codebase**: [Repo path or paste content]
**Stack**: [e.g., Next.js 14 + Supabase + Stripe + Python FastAPI backend]
**Deployment**: [Vercel + AWS Lambda | Self-hosted | Cloud provider]
**Compliance scope**: [OWASP Top 10 | SOC 2 | PCI-DSS | All]
## Phase 1 — Attack Surface Map
List every:
- Public HTTP endpoint (method + path + auth required)
- Data input point (form, query param, file upload, webhook)
- Third-party integration (API calls out, webhooks in)
- Secret/credential usage point
Do not analyze yet. Only map. Output as a numbered list.
## Phase 2 — Vulnerability Scan
For each item on the attack surface map, check for:
- Injection (SQL, command, SSRF, path traversal)
- Authentication/authorization bypass
- Sensitive data exposure (secrets in logs, responses, or error messages)
- Cryptographic weaknesses (weak ciphers, padding oracle, hardcoded keys)
- Supply chain risks (mutable version references, unverified dependencies)
Classify each finding: CRITICAL / HIGH / MEDIUM / LOW / INFO
Include CWE ID and the exact file:line where the issue exists.
## Phase 3 — Remediation Plan
For each CRITICAL and HIGH finding:
1. Explain the vulnerability in one sentence
2. Write the fixed code (before/after diff)
3. Explain why the fix works
## Phase 4 — Verification
After remediations are applied:
- Re-scan the attack surface for the patched items
- Confirm no new vulnerabilities were introduced by the fix
- Output a signed-off list: [finding] → [status: FIXED / PARTIALLY FIXED / DEFERRED]
Start with Phase 1. Do not proceed to Phase 2 until I confirm the attack surface map is complete.
30.3 The Cursor 3 Parallel Agent Fleet Prompt (Advanced)
Tool: Cursor 3 Agents Window | Time: 5 min to launch, 30–120 min execution
Cursor 3's Agents Window lets you run multiple AI agents simultaneously across local, SSH, and cloud environments. This prompt template structures how to decompose work across a fleet efficiently so agents don't conflict.
## Parallel Agent Fleet Assignment
**Project**: [Brief description of the codebase]
**Goal**: [What needs to be accomplished — e.g., "ship the user dashboard feature including data layer, UI components, tests, and documentation"]
**Fleet decomposition** (define independent workstreams that can run in parallel):
Agent A — [Name: e.g., "Data Layer"]
- Scope: [Specific files/directories this agent owns]
- Task: [Exact work to do]
- Output: [What it should produce — e.g., "implemented API routes with tests passing"]
- Dependencies: [What it needs before starting — e.g., "database schema must exist"]
- Must NOT touch: [Files/areas that are other agents' scope]
Agent B — [Name: e.g., "UI Components"]
- Scope: [...]
- Task: [...]
- Output: [...]
- Dependencies: [...]
- Must NOT touch: [...]
Agent C — [Name: e.g., "Tests & Docs"]
- Scope: [...]
- Task: [...]
- Output: [...]
- Dependencies: [Agent A and B PRs merged]
- Must NOT touch: [...]
**Conflict prevention**:
- Shared files that multiple agents might edit: [list them — these need explicit ownership]
- Owner of package.json / lock file: [Agent A | Agent B | None — freeze during parallel work]
- Owner of shared types/interfaces: [which agent defines, others consume]
**Review order**:
1. Review Agent A output first
2. Review Agent B output (may depend on A's types)
3. Review Agent C output last (depends on both)
**Launch in the Agents Window**: Open one agent session per row above. Paste the Agent-specific block into each session. Start all simultaneously.
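The conflict-prevention step boils down to finding files claimed by more than one agent. A minimal sketch of that check follows; `AgentScope` and `findSharedFiles` are hypothetical names, and the file lists are illustrative rather than taken from a real project.

```typescript
interface AgentScope {
  agent: string;
  files: string[];
}

// Return every file that appears in more than one agent's scope.
function findSharedFiles(scopes: AgentScope[]): { file: string; agents: string[] }[] {
  const owners = new Map<string, string[]>();
  for (const { agent, files } of scopes) {
    for (const file of files) {
      owners.set(file, [...(owners.get(file) ?? []), agent]);
    }
  }
  return Array.from(owners.entries())
    .filter(([, agents]) => agents.length > 1)
    .map(([file, agents]) => ({ file, agents }));
}

const conflicts = findSharedFiles([
  { agent: "Agent A", files: ["src/api/users.ts", "package.json"] },
  { agent: "Agent B", files: ["src/ui/Table.tsx", "package.json"] },
]);
// conflicts: [{ file: "package.json", agents: ["Agent A", "Agent B"] }]
```

Any file this reports needs a single named owner, or a freeze, before the fleet launches.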
This library is updated monthly with new prompts based on emerging tools, patterns, and reader requests. Last updated: April 14, 2026. Added: Category 31 (AI Agent Payments, Session Context Briefs, Generated Code Security Review). Previous: Category 30 (Agent Framework 1.0 orchestration, AI security audit chain, Cursor 3 parallel fleet management, April 13). Category 29 (Claude Sonnet 4.6 — 1M Context & Agentic Search Prompts, April 10). Category 28 (Long-Horizon Agentic Execution, April 9). Category 27 (Multi-Agent Orchestration, April 7). Category 26 (MCP Integration, March 31).
Category 31: April 2026 — AI Agent Payments, Session Context & Security Review
Three new prompt patterns drawn from the workflow shared by Claude Code's creator and from growing x402 protocol adoption.
31.1 The AI Agent Payment Integration Prompt (Advanced)
Tool: Claude Code, Cursor | Time: 2-4 hours | Category: Emerging Patterns
Context: Coinbase's x402 protocol enables AI agents to make autonomous payments. As of April 2026, this is becoming a real workflow pattern — agents that call APIs, pay for compute, and operate economically without human authorization for each transaction.
I'm building an AI agent that needs to make autonomous payments using the
Coinbase x402 protocol / [payment protocol].
## Agent Context
- Agent type: [coding assistant / research agent / deployment bot]
- Payment ceiling per action: $[amount]
- Allowed payment recipients: [API services, infrastructure providers]
- Forbidden: [payments to unknown wallets, amounts over $X]
## What I Need
1. Integrate x402 payment headers into the agent's HTTP client
2. Implement a payment budget tracker that halts the agent when the daily/session
ceiling is hit
3. Add a payment audit log (what was paid, when, to whom, why)
4. Implement human-approval gates for payments above $[threshold]
5. Handle HTTP 402 Payment Required responses from x402 endpoints gracefully
## Safety Requirements
- Never pay from the agent wallet without logging first
- Require cryptographic receipts for all payments
- Alert human operator if payment velocity exceeds [N] transactions/minute
- Reject any payment request that doesn't match the allowed-recipient list
Build the payment client and budget tracker first, then integrate into the
existing agent loop.
Use when: Building economic agents, autonomous task runners that consume paid APIs, or testing the x402 payment stack.
Security note: Always implement human approval gates for amounts above $1 in production. See Chapter 10 for AI agent attack surfaces.
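The budget tracker, approval gate, and audit log from the prompt can be sketched independently of the payment protocol. The example below does not implement x402 headers or any wallet call; it only decides whether a payment may proceed. `createPaymentBudget` and its types are hypothetical, and the dollar amounts are placeholders.

```typescript
interface PaymentRequest {
  recipient: string;
  amountUsd: number;
  reason: string;
}

type Decision = "approved" | "needs_human_approval" | "rejected_recipient" | "rejected_budget";

interface AuditEntry extends PaymentRequest {
  decision: Decision;
  at: string;
}

function createPaymentBudget(config: {
  ceilingUsd: number;
  approvalThresholdUsd: number;
  allowedRecipients: string[];
}) {
  let spentUsd = 0;
  const auditLog: AuditEntry[] = [];

  function decide(request: PaymentRequest): Decision {
    let decision: Decision = "approved";
    if (!config.allowedRecipients.includes(request.recipient)) {
      decision = "rejected_recipient";
    } else if (spentUsd + request.amountUsd > config.ceilingUsd) {
      decision = "rejected_budget";
    } else if (request.amountUsd > config.approvalThresholdUsd) {
      decision = "needs_human_approval";
    }
    // Log before any money moves, as the safety requirements ask.
    auditLog.push({ ...request, decision, at: new Date().toISOString() });
    // Only auto-approved payments count as spent; gated ones wait for a human.
    if (decision === "approved") spentUsd += request.amountUsd;
    return decision;
  }

  return { decide, auditLog, spent: () => spentUsd };
}
```

Checking the recipient first means a request to an unknown wallet is rejected even when it would fit the budget.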
31.2 The Session Context Brief Generator (Beginner)
Tool: Claude Code, Cursor, Windsurf | Time: 5 minutes | Category: Workflow
This prompt generates a reusable session brief from your current codebase state. Run it once, and again after major changes, then paste the resulting brief at the start of each Claude Code session to give the AI full context before any task.
I need you to generate a session brief for this codebase. Read the following
and produce a structured brief I can paste at the start of future sessions:
## Please Analyze
- The overall architecture (what framework, what database, what auth)
- The current state (what works, what's broken based on TODO comments and errors)
- The key files that any feature touching [feature area] would need to know about
- Any explicit constraints in CLAUDE.md or README that I shouldn't violate
- The tech debt or known issues I should steer around
## Output Format
Produce a brief in this format:
---
## Session Brief — [Date]
**Stack**: [framework, database, auth, hosting]
**What's working**: [bullet list]
**What's broken / in-progress**: [bullet list]
**Key files for [feature area]**: [file paths with one-line description each]
**Constraints to respect**: [rules from CLAUDE.md / README]
**Steer around**: [known issues, fragile code, don't-touch zones]
---
Keep it under 400 words so it fits in a context window preamble.
Use when: Starting any Claude Code session, onboarding to a new codebase, or after a long break from a project.
Why it works: A 5-minute brief can head off the 30-60 minutes of context rebuilding that often derails a session. Claude Code generally makes fewer wrong assumptions when it knows the codebase state upfront.
31.3 The Generated Code Security Review Prompt (Intermediate)
Tool: Claude Code, Cursor | Time: 10-15 minutes | Category: Security
After generating a significant block of code, use this prompt to run a security review before accepting the change. Especially important for authentication flows, API handlers, and any code that touches user data.
Review the following generated code for security vulnerabilities.
## Code to Review
[paste generated code here]
## Review Checklist
Check specifically for:
1. **Injection vulnerabilities**: SQL injection, command injection, path traversal
2. **Authentication gaps**: Missing auth checks, broken access control
3. **Input validation**: Unvalidated user input reaching sensitive operations
4. **Secret exposure**: Hardcoded credentials, keys in code, logging of sensitive data
5. **Prototype pollution**: Object spread from user input, __proto__ manipulation
6. **Race conditions**: Async operations that could interleave dangerously
7. **Error handling**: Stack traces leaking in responses, errors that expose internals
## For Each Issue Found
- Severity: Critical / High / Medium / Low
- CWE category
- Exact line(s) affected
- Safe version of the code
## If Clean
Confirm the code is safe to merge and note any edge cases that weren't security
issues but should be tested.
Context: This code is [describe what it does and who has access to it].
The framework is [Next.js / Express / Django / etc.].
The data involved: [user PII / payment data / internal only / public].
Use when: After any AI-generated auth handler, API route, form processing, or file upload code. Non-negotiable for code touching user data or payments.
Pairs with: CyberOS (https://cyberos.dev) for automated continuous review in CI/CD pipelines.
Source: Based on OWASP Top 10 2025 and the CyberOS pattern database (615 patterns as of April 2026).
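To make checklist item 5 concrete, here is a small sketch of the kind of "safe version" the review might propose for a recursive merge of user input. It assumes the untrusted data arrives as parsed JSON; `safeMerge` and `PlainObject` are hypothetical names, and a real codebase may prefer a vetted library or schema validation instead.

```typescript
// Illustrative only: a merge helper that refuses the keys commonly used
// for prototype pollution. `safeMerge` is a hypothetical name.

type PlainObject = { [key: string]: unknown };

const FORBIDDEN_KEYS = ["__proto__", "constructor", "prototype"];

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively copy own enumerable keys from `source` into `target`,
// skipping forbidden keys at every depth.
function safeMerge(target: PlainObject, source: PlainObject): PlainObject {
  for (const key of Object.keys(source)) {
    if (FORBIDDEN_KEYS.indexOf(key) !== -1) continue;
    const incoming = source[key];
    const existing = target[key];
    if (isPlainObject(incoming)) {
      const base = isPlainObject(existing) ? existing : {};
      target[key] = safeMerge(base, incoming);
    } else {
      target[key] = incoming;
    }
  }
  return target;
}

// JSON.parse creates "__proto__" as an own property, which a naive
// recursive merge would follow into Object.prototype.
const untrusted = JSON.parse('{"theme":"dark","__proto__":{"isAdmin":true}}') as PlainObject;
const settings = safeMerge({ theme: "light" }, untrusted);
```

Filtering keys during the merge keeps the guard in one place, so callers do not need to sanitize each payload separately.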
Category 32: Automation & Agent Orchestration Prompts (Added April 2026)
Three new prompt patterns for Claude Code Routines (launched April 2026), Cursor 3 multi-repo agent orchestration, and automated security auditing — covering the full spectrum from simple recurring automation to coordinated multi-agent coding sessions.
32.1 Claude Code Routines — PR Review Automation (Intermediate)
Tool: Claude Code | Difficulty: Intermediate | Time: 15-30 min
Claude Code Routines (April 2026) let you define recurring coding tasks that run on Anthropic's cloud infrastructure, triggered by events like new pull requests. Use this prompt to configure a Routine that automatically reviews every incoming PR before a human reviewer sees it.
## Claude Code Routine: Automated PR Review
Set up a Claude Code Routine that triggers on new pull requests to this
repository and performs a structured code review before human reviewers
are assigned.
## Trigger
Event: pull_request.opened, pull_request.synchronize
Scope: all branches targeting main and develop
Skip: PRs with label "skip-ai-review" or authored by bots
## Review Tasks (run in sequence)
### 1. Change Summary
- Summarize what the PR does in 3-5 bullet points
- Identify which components/modules are affected
- Estimate scope: small (< 50 lines changed), medium (50-300), large (> 300)
### 2. Code Quality Check
- Flag any functions longer than 50 lines
- Flag cyclomatic complexity > 10
- Identify duplicated logic that already exists elsewhere in the codebase
- Check naming conventions match the patterns in [existing files in the repo]
### 3. Security Scan
- Check for the patterns in Prompt 32.3 (OWASP Top 10 for Next.js/React)
- Flag any hardcoded secrets, tokens, or credentials
- Identify unvalidated user inputs reaching database or filesystem operations
- Check new API routes for missing authentication guards
### 4. Test Coverage
- Identify new functions or branches not covered by the PR's test additions
- List any test files that should have been updated but weren't
- Flag missing edge case tests for: null/undefined input, empty arrays,
auth failure paths
### 5. Review Output
Post a structured comment to the PR with:
- **Summary**: [auto-generated summary]
- **Scope**: small / medium / large
- **Issues**: [table: Severity | File | Line | Issue | Suggested Fix]
- **Missing tests**: [list]
- **Verdict**: LGTM (no blockers) | NEEDS CHANGES (list blockers) | REQUEST HUMAN REVIEW (flag for security/arch concerns)
## Routine Configuration
- Runtime: Anthropic cloud (no self-hosted runner required)
- Model: claude-sonnet-4-6
- Timeout: 5 minutes per PR
- Post comment as: GitHub App bot account
- Do NOT approve or request changes via GitHub review API — comment only
- Do NOT auto-merge under any circumstances
## What This Routine Should NOT Do
- Rewrite or suggest large refactors on a per-PR basis
- Block PRs automatically — it informs, humans decide
- Comment more than once per commit push (deduplicate on commit SHA)
Why it works: This Routine acts as a tireless first-pass reviewer that runs in under 5 minutes on every PR. Human reviewers arrive to a structured pre-analysis and can focus on architecture and intent rather than scanning for obvious issues.
Setup note: Configure the Routine in your Claude Code workspace settings under Routines > New Routine > Event Trigger. The model runs server-side — no GitHub Actions minutes consumed.
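Two rules in the prompt are purely mechanical: the scope estimate and the "once per commit push" deduplication. The sketch below shows one way to express them, with the assumption that exactly 300 changed lines counts as medium. `classifyScope` and `ReviewDeduper` are hypothetical names, and the in-memory store stands in for whatever persistence a real Routine would need between runs.

```typescript
type Scope = "small" | "medium" | "large";

// Boundaries follow the prompt: under 50 lines is small, 50 to 300 is
// medium (this sketch treats exactly 300 as medium), anything above is large.
function classifyScope(linesChanged: number): Scope {
  if (linesChanged < 50) return "small";
  if (linesChanged <= 300) return "medium";
  return "large";
}

// Remembers which commit SHAs already received a review comment.
// In-memory only: an assumption for illustration, not how a hosted
// Routine stores state.
class ReviewDeduper {
  private seen: { [sha: string]: boolean } = {};

  // Returns true the first time a SHA is offered, false afterwards.
  shouldComment(commitSha: string): boolean {
    if (this.seen[commitSha] === true) return false;
    this.seen[commitSha] = true;
    return true;
  }
}

const deduper = new ReviewDeduper();
console.log(classifyScope(120), deduper.shouldComment("abc123"));
```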
32.2 Multi-Agent Coding Session Orchestration (Advanced)
Tool: Claude Code, Cursor 3 | Difficulty: Advanced | Time: 2-4 hours
Cursor 3 (April 2026) introduced unified multi-repo agent orchestration — a single workspace can coordinate agents working across separate repositories simultaneously. Use this prompt pattern to split a full-stack feature across three specialized agents: backend, frontend, and test/QA.
## Multi-Agent Session: [Feature Name]
You are the orchestrator for a 3-agent coding session. Your job is to
decompose the feature, assign agents, prevent conflicts, and integrate
outputs. Do not write implementation code yourself — delegate to agents.
## Feature Brief
[Describe the feature in 3-5 sentences: what it does, what data it uses,
what API contracts it creates or modifies, and any external integrations.]
## Repository Map
- Backend repo: [path or URL — e.g., api.myapp.com at /repos/backend]
- Frontend repo: [path or URL — e.g., app.myapp.com at /repos/frontend]
- Shared types package: [path — e.g., /repos/shared-types] (if applicable)
---
## Agent 1: Backend Agent
**Scope**: [/repos/backend/src/routes, /repos/backend/src/services, /repos/backend/src/db]
**Mission**: Implement the server-side feature — database schema changes,
business logic, and REST/GraphQL API endpoints.
**Deliverables**:
1. Database migration file for [new tables or schema changes]
2. Service layer with full business logic and error handling
3. API endpoints matching this contract:
- [METHOD] [/path]: [description, request body, response shape]
- [METHOD] [/path]: [description]
4. Unit tests for the service layer (90%+ coverage on new code)
5. Update /repos/shared-types with any new TypeScript interfaces
**Must NOT touch**:
- Frontend repo
- Authentication middleware (read-only)
- Existing migrations
**Handoff**: Write `agent-handoff-backend.md` with final API contracts
and any environment variables added.
---
## Agent 2: Frontend Agent
**Scope**: [/repos/frontend/src/components, /repos/frontend/src/pages, /repos/frontend/src/hooks]
**Mission**: Implement the UI for [feature name] using the API contracts
defined in agent-handoff-backend.md. Wait for Agent 1's handoff file
before writing any data-fetching code.
**Deliverables**:
1. React components: [list specific components needed]
2. Data-fetching hooks using [SWR / React Query / Server Actions]
matching the API contract in agent-handoff-backend.md
3. Form validation for all user inputs
4. Loading, empty, and error states for all async operations
5. Responsive layout (mobile breakpoint: 640px)
**Must NOT touch**:
- Backend repo
- Auth context or session management
- Design system tokens (read-only — use existing classes)
**Handoff**: Write `agent-handoff-frontend.md` with component tree,
prop interfaces, and any new environment variables needed.
---
## Agent 3: Test & QA Agent
**Scope**: [/repos/backend/tests, /repos/frontend/tests, /repos/frontend/e2e]
**Mission**: Write the full test suite for this feature. Start API tests after
Agent 1's handoff. Write component and E2E tests after Agent 2's handoff.
Do NOT write implementation code — tests only.
**Deliverables**:
1. API integration tests (all endpoints: happy path + 4xx + 5xx cases)
2. Component tests for each UI component Agent 2 built
3. E2E test covering the full user flow: [describe the 3-5 step user journey]
4. A test coverage report showing new code coverage
**Must NOT touch**:
- Source code in either repo (tests and fixtures only)
**Handoff**: Write `agent-handoff-qa.md` with test results, coverage
numbers, and any failing tests with root cause.
---
## Orchestration Rules
**Sequencing**:
1. Agent 1 runs first — do not start Agent 2 until agent-handoff-backend.md exists
2. Agent 2 and Agent 3 (API tests only) can run in parallel after Agent 1 finishes
3. Agent 3 E2E tests run last — requires both Agent 1 and Agent 2 complete
**Conflict prevention**:
- package.json / lock files: frozen during parallel work — no dependency additions
- Shared types: Agent 1 owns writes, Agents 2 and 3 read-only
- Environment files: each agent appends to a dedicated .env.[agent] file,
do not modify .env directly
**Integration checkpoint**:
When all three agents have written their handoff files, run:
1. `npm run build` in both repos — must succeed with zero errors
2. `npm test` in both repos — all tests must pass
3. `npm run e2e` — all E2E tests must pass
If any step fails, identify which agent's output caused the failure
and assign a targeted fix task to that agent only.
**Final output**:
Write `session-summary.md` with:
- Feature implemented (what was built)
- All files changed (by repo and agent)
- Test results (pass/fail counts, coverage delta)
- Known limitations or deferred items
- Decisions made and why
Why it works: The strict scope boundaries prevent agents from stepping on each other's work. The handoff files create an explicit async interface between agents — Agent 2 cannot make assumptions about the API until Agent 1 has documented it, which eliminates the most common integration failure in multi-agent sessions.
Cursor 3 setup: Open three agent panels in the Agents Window. Paste each agent block into its respective panel. Launch Agent 1 first. Monitor agent-handoff-backend.md creation before launching Agents 2 and 3.
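The sequencing rules amount to a small dependency graph. As a rough sketch of that idea, the function below returns which tasks may start given the handoffs that already exist. The task names and `readyTasks` are hypothetical, and the sketch models only the dependencies the prompt states explicitly; it is not a Cursor or Claude Code API.

```typescript
// Hypothetical model of the sequencing rules; task names are illustrative.
type TaskId = "backend" | "frontend" | "qa-api-tests" | "qa-e2e-tests";

// Each task lists the tasks whose handoff must exist before it starts.
const DEPENDS_ON: Record<TaskId, TaskId[]> = {
  "backend": [],
  "frontend": ["backend"],
  "qa-api-tests": ["backend"],
  "qa-e2e-tests": ["backend", "frontend"],
};

const ALL_TASKS: TaskId[] = ["backend", "frontend", "qa-api-tests", "qa-e2e-tests"];

// A task is ready when it is not finished and every dependency has
// produced its handoff (represented here by membership in `completed`).
function readyTasks(completed: TaskId[]): TaskId[] {
  return ALL_TASKS.filter(
    (task) =>
      completed.indexOf(task) === -1 &&
      DEPENDS_ON[task].every((dep) => completed.indexOf(dep) !== -1)
  );
}

console.log(readyTasks([]));
console.log(readyTasks(["backend"]));
```

Keeping the graph as data makes it straightforward to add a fourth agent later without rewriting the launch logic.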
32.3 Security Audit Automation — Next.js/React OWASP Top 10 (Advanced)
Tool: Claude Code | Difficulty: Advanced | Time: 30-60 min
Use this prompt to run a comprehensive automated security audit of a Next.js or React codebase, checking for all OWASP Top 10 vulnerability classes with patterns tuned for the React/Next.js stack. Designed for one-time deep audits that complement CyberOS's continuous monitoring (https://cyberos.dev).
## Automated Security Audit: Next.js / React Codebase
Perform a systematic OWASP Top 10 security audit of this Next.js/React
codebase. Work through each phase in sequence. Do not skip phases or
combine them — each phase informs the next.
## Codebase Context
- Framework: Next.js [version] (App Router / Pages Router)
- Auth provider: [NextAuth / Supabase Auth / Clerk / custom]
- Database: [Supabase / Prisma + PostgreSQL / other]
- Payment handling: [Stripe / Paddle / none]
- Deployment: [Vercel / AWS / self-hosted]
- External APIs called: [list]
---
## Phase 1 — Inventory (5 min, no analysis yet)
Map the attack surface:
1. List every file in /app/api or /pages/api (Next.js API routes)
2. List every Server Action (files with "use server")
3. List every form or input that accepts user data
4. List every place external data is rendered to the DOM
5. List every third-party library that handles auth, payments, or user data
Output as numbered lists. Do not evaluate yet.
---
## Phase 2 — OWASP Top 10 Scan
For each item in the Phase 1 inventory, check the following (category numbers A01-A10 follow the OWASP Top 10 2021 list).
Reference CWE IDs and the exact file:line for every finding.
### A01 — Broken Access Control
- Every API route and Server Action: is auth checked server-side
(not relying on middleware alone)?
- Are RLS policies enforced at the database level (Supabase) or via
ORM-level guards (Prisma)?
- Are there IDOR risks — can a user access another user's records by
changing an ID parameter?
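- Is the app protected against CVE-2025-29927 (the Next.js middleware authorization
bypass), with auth enforced both in middleware and inside each route or Server Action?
(See Category 26, Prompt 26.3)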
### A02 — Cryptographic Failures
- Are passwords hashed with bcrypt or argon2 (not SHA-1/MD5)?
- Is HTTPS enforced with HSTS headers?
- Are any secrets or tokens returned in API responses or logged?
- Are JWTs validated on every request (not just on login)?
### A03 — Injection
- Are all database queries parameterized?
Flag any string concatenation in SQL or ORM raw queries.
- Is there risk of command injection in any child_process or exec calls?
- Server Actions: is user input sanitized before use in database operations?
- Are URL and path parameters validated before use in filesystem operations?
### A04 — Insecure Design
- Are there rate limits on authentication endpoints?
- Are there rate limits on resource-intensive API routes
(e.g., AI generation, file processing)?
- Is there a mechanism to revoke sessions on password change or logout?
- Are webhook endpoints (Stripe, etc.) verifying signatures?
### A05 — Security Misconfiguration
- Are security headers set: CSP, X-Frame-Options, X-Content-Type-Options,
Referrer-Policy, Permissions-Policy?
- Are CORS origins restricted (not "*")?
- Are error responses generic (no stack traces or internal paths leaking)?
- Are Next.js server components accidentally exposing server-side data
in client bundles?
### A06 — Vulnerable Components
- Run: `npm audit --audit-level=high`
- Flag any dependencies with known CVEs (severity: high or critical)
- Flag any dependencies last updated more than 18 months ago that handle
auth, crypto, or user data
### A07 — Auth and Session Failures
- Are session tokens HTTP-only cookies (not localStorage)?
- Are session IDs regenerated after login (session fixation prevention)?
- Is "remember me" implemented with a separate long-lived token
(not just extending the session)?
- Are failed login attempts rate-limited and logged?
### A08 — Software and Data Integrity
- In CI and production builds, are dependencies installed from the lockfile with `npm ci` rather than `npm install`?
- Are GitHub Actions using pinned SHA hashes for third-party actions
(not floating tags like @v3)?
- Are Stripe/webhook payloads verified with HMAC signatures
before processing?
### A09 — Logging and Monitoring
- Are security events logged: login success, login failure,
auth failure on protected routes?
- Are logs sanitized — no passwords, tokens, or PII in log output?
- Is there alerting for repeated auth failures (possible brute force)?
### A10 — Server-Side Request Forgery (SSRF)
- Are there any routes that fetch a URL provided by the user?
- If yes: is the URL validated against an allowlist of safe domains?
- Are internal metadata endpoints blocked (e.g., the cloud metadata address 169.254.169.254 and the rest of the link-local range 169.254.0.0/16)?
---
## Phase 3 — Severity Classification
For every finding, output a row in this table:
| # | OWASP Category | CWE | Severity | File | Line | Description | Fix |
|---|---------------|-----|----------|------|------|-------------|-----|
| 1 | A01 | CWE-284 | CRITICAL | ... | ... | ... | ... |
Severity levels:
- CRITICAL: exploitable remotely, data exposure or full auth bypass
- HIGH: requires auth but leads to significant data or privilege risk
- MEDIUM: requires specific conditions, limited impact
- LOW: defense-in-depth gap, no direct exploitability
- INFO: best practice deviation, no current risk
---
## Phase 4 — Remediation
For every CRITICAL and HIGH finding:
1. Show the vulnerable code (before)
2. Show the fixed code (after)
3. One-sentence explanation of why the fix closes the vulnerability
4. Link to the relevant OWASP cheat sheet or CyberOS pattern
For MEDIUM findings: provide the fix code only (no explanation needed).
For LOW and INFO: list as a bullet with the file location.
---
## Phase 5 — Verification
After all remediations are written:
1. Re-check each CRITICAL and HIGH finding — confirm the fix addresses
the root cause, not just the symptom
2. Check that no fix introduced a new vulnerability
(e.g., error handling that leaks internals)
3. Output a final sign-off table:
| Finding # | Status | Notes |
|-----------|--------|-------|
| 1 | FIXED | ... |
| 2 | DEFERRED | reason |
---
## Output Summary
At the end of all phases, produce:
- Total findings by severity (CRITICAL: N, HIGH: N, MEDIUM: N, LOW: N, INFO: N)
- Top 3 risk areas in this codebase
- Recommended next step (e.g., "Schedule penetration test focusing on A01
and A03 findings", "Integrate CyberOS for continuous monitoring")
Begin with Phase 1. Confirm the inventory is complete before proceeding.
Why it works: The phased structure prevents the common failure mode where an LLM jumps to fixes before fully mapping the attack surface. Forcing an inventory pass first gives every later phase a fixed checklist to work through, so routes and inputs are much less likely to be skipped because the model got absorbed in one interesting vulnerability.
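The Phase 3 table and the Output Summary map onto a small data structure, which helps if you want to post-process the audit results. The sketch below assumes findings have already been parsed into objects; the `Finding` shape and function names are hypothetical, and the sample rows are invented.

```typescript
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW" | "INFO";

// Mirrors the columns of the Phase 3 table (hypothetical shape).
interface Finding {
  owasp: string; // e.g. "A01"
  cwe: string;
  severity: Severity;
  file: string;
  line: number;
  description: string;
}

const SEVERITY_ORDER: Severity[] = ["CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO"];

// Tally findings per severity level.
function countBySeverity(findings: Finding[]): Record<Severity, number> {
  const counts = { CRITICAL: 0, HIGH: 0, MEDIUM: 0, LOW: 0, INFO: 0 };
  for (const f of findings) counts[f.severity] += 1;
  return counts;
}

// Return a copy ordered from most to least severe.
function sortBySeverity(findings: Finding[]): Finding[] {
  return findings
    .slice()
    .sort((a, b) => SEVERITY_ORDER.indexOf(a.severity) - SEVERITY_ORDER.indexOf(b.severity));
}

// Produce the "CRITICAL: N, HIGH: N, ..." line from the Output Summary.
function summaryLine(findings: Finding[]): string {
  const counts = countBySeverity(findings);
  return SEVERITY_ORDER.map((s) => `${s}: ${counts[s]}`).join(", ");
}

// Invented sample rows for illustration.
const sampleFindings: Finding[] = [
  { owasp: "A05", cwe: "CWE-16", severity: "LOW", file: "next.config.js", line: 3, description: "Missing header" },
  { owasp: "A01", cwe: "CWE-284", severity: "CRITICAL", file: "app/api/orders/route.ts", line: 12, description: "No auth check" },
];

console.log(summaryLine(sampleFindings));
```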
CyberOS integration: This prompt covers the same OWASP Top 10 categories as CyberOS's static analysis engine (https://cyberos.dev). Use this for on-demand deep audits, and CyberOS for continuous PR-level scanning. The findings from this audit can be imported into CyberOS as baseline issues.
Pairs with: Prompt 31.3 (Generated Code Security Review) for ongoing review of new code, and Prompt 30.2 (AI Security Audit Chain) for systematic multi-phase audit chaining.
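For the A10 checks, a fix the audit might propose is a guard that rejects internal addresses before the server fetches a user-supplied URL. The following is a simplified sketch covering IPv4 literals only. `parseIpv4` and `isInternalIpv4` are hypothetical names, and a production guard would also need to check the address a hostname resolves to, every redirect hop, and IPv6.

```typescript
// Parse a dotted-quad IPv4 string into four octets, or null if malformed.
function parseIpv4(address: string): number[] | null {
  const parts = address.split(".");
  if (parts.length !== 4) return null;
  const octets: number[] = [];
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const value = parseInt(part, 10);
    if (value > 255) return null;
    octets.push(value);
  }
  return octets;
}

// True for "this network" (0.0.0.0/8), loopback, RFC 1918 private ranges,
// and link-local 169.254.0.0/16, which contains the cloud metadata address.
function isInternalIpv4(address: string): boolean {
  const o = parseIpv4(address);
  if (o === null) return true; // fail closed on anything unparseable
  const a = o[0];
  const b = o[1];
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

console.log(isInternalIpv4("169.254.169.254"), isInternalIpv4("8.8.8.8"));
```

Failing closed on unparseable input is a deliberate choice here: an address the guard cannot classify is treated as unsafe.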
Category 33: Claude Opus 4.7 — xhigh Effort, Vision & Self-Verification
Released April 16, 2026: Claude Opus 4.7 introduced three capabilities with immediate impact on vibe coding workflows — an xhigh effort level for extended reasoning, 3.3x higher-resolution vision, and self-verification on agentic tasks. These prompts are tuned specifically for Opus 4.7 and will not produce the same results on earlier models.
33.1 xhigh Effort Architectural Reasoning (Expert)
Tool: Claude Code (Opus 4.7) | Difficulty: Expert | Time: 15-30 min
Use Opus 4.7's xhigh effort level for decisions that are hard to reverse — database schema choices, authentication architecture, API design. The extended thinking mode considers more edge cases and provides more honest uncertainty quantification than standard effort.
<effort>xhigh</effort>
You are a senior software architect. I need your deepest analysis on this decision.
## Decision Required
[Describe the architectural choice in 1-3 sentences — e.g., "Should I use a
single Postgres database with RLS for multi-tenancy, or separate schemas per tenant?"]
## System Context
- Scale target: [current users / projected 12-month users]
- Team size: [N engineers, their experience level]
- Current stack: [list key technologies]
- Budget constraints: [infrastructure budget, or "cost-sensitive / not a constraint"]
- Timeline: [when does this need to be production-ready]
## Constraints (non-negotiable)
- [Constraint 1 — e.g., "Must work with Supabase — no custom database infra"]
- [Constraint 2]
## Options Under Consideration
### Option A: [name]
[Brief description]
Perceived pros: [list]
Perceived cons: [list]
### Option B: [name]
[Brief description]
Perceived pros: [list]
Perceived cons: [list]
## What I'm Uncertain About
[The specific thing that makes this decision hard — e.g., "I don't know how
RLS performs at 100k rows per tenant with complex join queries"]
## Output Required
1. Your recommendation (Option A, B, or a hybrid) with confidence level (0-100%)
2. The 3 most important factors that drove your recommendation
3. The scenario under which your recommendation would be wrong
4. The first concrete implementation step if I go with your recommendation
5. Red flags to watch for in the first 30 days of implementation
Take as long as you need to reason through this. Don't truncate the reasoning.
Why it works: The <effort>xhigh</effort> tag signals Opus 4.7 to enter extended thinking mode. The prompt then channels that extra reasoning into explicit outputs (a confidence level, the scenario in which the recommendation would be wrong, and 30-day red flags), so the uncertainty is stated rather than implied.
When to use xhigh: Save it for decisions that are hard to reverse — architectural choices, security design, data modeling. Don't use it for quick questions where standard effort is adequate.
33.2 Vision-Enhanced UI Debugging (Intermediate)
Tool: Claude Code (Opus 4.7) | Difficulty: Intermediate | Time: 10-20 min
Opus 4.7's 3.3x higher-resolution vision support means it can now read detailed UI screenshots, identify small alignment issues, read small-print error messages, and compare designs at pixel level. Use this pattern for UI debugging and visual regression analysis.
[Attach screenshot of UI bug or visual issue]
You are a senior frontend engineer debugging a visual problem. The screenshot shows:
[Brief description of what you're looking at]
## What I need
1. Identify all visible UI problems in this screenshot — layout issues, spacing
inconsistencies, color/contrast problems, text truncation, alignment bugs
2. For each problem, hypothesize the CSS or component cause
3. Rank by severity: (a) breaks functionality (b) fails WCAG contrast (c) looks wrong
## Codebase context
- Framework: [React/Next.js/Vue/etc]
- CSS approach: [Tailwind/CSS Modules/styled-components/etc]
- Key component files: [relevant file paths]
Then check the relevant component files and propose a specific fix for the
highest-severity issue first.
Why it works: The 3.3x vision resolution lets Opus 4.7 read small-print labels, spot subtle misalignments (for example, elements off by 2px), and distinguish similar colors that previous models couldn't differentiate. Pairing the visual analysis with codebase access creates a loop where the model reads the pixel output and the source in the same session.
33.3 Self-Verifying Agent Task (Advanced)
Tool: Claude Code (Opus 4.7) | Difficulty: Advanced | Time: 30-90 min
Opus 4.7 added self-verification on agentic tasks — the model can now flag when it has low confidence in its own output and request human confirmation before proceeding. This prompt pattern is designed to take advantage of that capability for high-stakes automated tasks.
You are executing a high-stakes automated task. Opus 4.7 self-verification is enabled.
## Task
[Describe the task in detail]
## Self-Verification Protocol
At each decision point where you are >15% uncertain about the correct action:
1. STOP and output: VERIFICATION_REQUIRED: [describe what you're uncertain about]
2. List the options you're considering and your confidence in each
3. Wait for my confirmation before proceeding
## High-Stakes Actions That Always Require Verification
- Deleting or overwriting files not in the explicit scope
- Making API calls that cost money or have rate limits
- Modifying database schemas or running migrations
- Changing authentication or authorization logic
- Publishing or deploying to production environments
## Success Criteria
[What does "done" look like? How will you verify you succeeded?]
Begin. If you complete the first phase without a VERIFICATION_REQUIRED, confirm
the phase is done and your confidence level before continuing to the next phase.
Why it works: This prompt makes Opus 4.7's self-verification explicit and structured. By defining an uncertainty threshold (more than 15% uncertain, that is, below 85% confidence) and listing high-stakes action categories, you get an agent that asks for help when it genuinely needs it rather than either proceeding blindly or asking about everything.
Integration with CyberOS: For tasks involving security-sensitive operations, pair this with CyberOS's continuous monitoring so any unexpected file modifications or API calls are flagged independently.
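If you wrap the agent in your own tooling, the same protocol can be enforced outside the model. The sketch below assumes the agent reports a planned action and a self-estimated uncertainty between 0 and 1. The `ActionKind` values, the `PROCEED` label, and all function names are hypothetical; only the `VERIFICATION_REQUIRED` marker and the 15% threshold come from the prompt.

```typescript
// Hypothetical action categories modeled on the prompt's high-stakes list.
type ActionKind =
  | "edit-in-scope-file"
  | "delete-out-of-scope-file"
  | "paid-api-call"
  | "schema-migration"
  | "auth-change"
  | "production-deploy";

// Categories that always require confirmation, whatever the uncertainty.
const ALWAYS_VERIFY: ActionKind[] = [
  "delete-out-of-scope-file",
  "paid-api-call",
  "schema-migration",
  "auth-change",
  "production-deploy",
];

const UNCERTAINTY_THRESHOLD = 0.15;

interface PlannedAction {
  kind: ActionKind;
  uncertainty: number; // self-reported by the agent, 0 to 1 (an assumption)
  description: string;
}

function needsVerification(action: PlannedAction): boolean {
  if (ALWAYS_VERIFY.indexOf(action.kind) !== -1) return true;
  return action.uncertainty > UNCERTAINTY_THRESHOLD;
}

// Emit the marker from the prompt, or a hypothetical PROCEED line.
function gate(action: PlannedAction): string {
  return needsVerification(action)
    ? `VERIFICATION_REQUIRED: ${action.description}`
    : `PROCEED: ${action.description}`;
}

console.log(gate({ kind: "schema-migration", uncertainty: 0.02, description: "add orders table" }));
```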
Category 34: Claude Design & AI-Assisted Visual Creation
Launched April 17, 2026: Anthropic introduced Claude Design, extending Claude's capabilities into rapid visual content generation. These prompts cover workflows for using Claude Design alongside Claude Code for visual asset creation — from brand assets to landing page design to marketing graphics — integrated into the vibe coding workflow.
34.1 Brand Asset Sprint (Beginner)
Tool: Claude Design, Claude Code | Difficulty: Beginner | Time: 30-60 min
Use Claude Design to generate a complete brand asset pack for a new vibe-coded project. This prompt produces a design brief that Claude Design can execute directly, giving you logo concepts, color palettes, and icon sets in one session.
I'm creating brand assets for a new product called [Product Name].
## Product Summary
[2-3 sentences: what it does, who uses it, what feeling it should evoke]
## Brand Personality
Choose 3 adjectives that describe the brand: [e.g., modern / trustworthy / playful]
## Audience
Primary users: [who they are — age range, technical sophistication, context of use]
## Design Direction
- Style preference: [minimal / bold / corporate / friendly / technical / expressive]
- Color mood: [warm / cool / neutral / vibrant / muted]
- Reference brands I like: [1-3 brand names with notes on what you like]
- Reference brands to avoid: [1-2 brand names that feel wrong]
- Logo type preference: [wordmark / icon + wordmark / icon only / abstract mark]
## Assets Needed
1. Primary logo (light background)
2. Primary logo (dark background / inverted)
3. Favicon / app icon (square, 512×512)
4. Social media profile image (1:1 ratio)
5. Color palette: 1 primary, 1 accent, 2 neutrals (light + dark), 1 semantic (error/warning)
6. Typography pairing: heading font + body font (Google Fonts preferred)
7. 3 icon style examples (outline / filled / duotone — whichever fits the style)
## Output Format
For each asset, provide:
- Visual description precise enough for a designer or AI image tool to recreate
- Hex codes for all colors
- Font names and weights for typography
- A short rationale explaining why each choice fits the brand
Start with the color palette and typography — everything else should derive from those foundations.
Why it works: Claude Design's visual understanding lets it generate coherent brand systems rather than isolated assets. By front-loading the palette and type decisions, you get downstream assets that feel intentional rather than assembled from unrelated pieces.
Follow-up: Feed the output from this prompt directly into Claude Design's visual canvas to generate image mockups. Use the hex codes and font names in your Tailwind config (tailwind.config.ts) to wire the brand into the codebase in minutes.
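As a rough illustration of that wiring step, the sketch below holds the palette and fonts as typed tokens, checks that every color is a six-digit hex code, and shapes them for the `theme.extend` section of a Tailwind config object. All color values and font choices are placeholders, and `BrandTokens` and `invalidColors` are hypothetical names.

```typescript
// Hypothetical brand tokens. Replace the placeholder values with the
// palette and fonts produced by the prompt.
interface BrandTokens {
  colors: { [name: string]: string };
  fontFamily: { heading: string[]; body: string[] };
}

const brand: BrandTokens = {
  colors: {
    primary: "#1D4ED8",
    accent: "#F59E0B",
    "neutral-light": "#F8FAFC",
    "neutral-dark": "#0F172A",
    error: "#DC2626",
  },
  fontFamily: {
    heading: ["Inter", "sans-serif"],
    body: ["Source Sans 3", "sans-serif"],
  },
};

const HEX_COLOR = /^#[0-9a-fA-F]{6}$/;

// Names of any tokens whose value is not a #RRGGBB hex code.
function invalidColors(tokens: BrandTokens): string[] {
  return Object.keys(tokens.colors).filter((name) => !HEX_COLOR.test(tokens.colors[name]));
}

// Shaped for the `theme.extend` section of a Tailwind config object.
const themeExtend = {
  colors: brand.colors,
  fontFamily: brand.fontFamily,
};

console.log(invalidColors(brand), Object.keys(themeExtend.colors));
```

Validating the hex codes up front catches copy-paste slips, such as a missing digit, before they reach the build.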
34.2 Landing Page Hero Design Spec (Intermediate)
Tool: Claude Design, Cursor, Claude Code | Difficulty: Intermediate | Time: 20-45 min
Generate a detailed design spec for a landing page hero section — precise enough for Cursor to implement directly into Tailwind/React without ambiguity. Bridges the gap between visual concept and production code.
Design a landing page hero section for [Product Name], a [brief description].
## Goal of the Hero
The hero must communicate: [what the product does] + [who it's for] + [why to care]
in under 5 seconds. Primary CTA: [button text and action].
## Brand Context
- Primary color: [hex]
- Accent color: [hex]
- Background: [hex or gradient description]
- Heading font: [font name, weight]
- Body font: [font name, weight]
- Tone: [formal / casual / technical / playful]
## Layout Requirements
- Viewport: Full-screen (100vh) on desktop, auto-height on mobile
- Layout type: [centered / left-aligned / split (text left, visual right)]
- Visual element: [illustration / screenshot / animation / abstract shape / none]
- Navigation: [sticky top bar / transparent overlay / none]
## Content to Include
- Headline: [your draft or "generate 3 options"]
- Subheadline: [your draft or "generate 3 options"]
- Social proof element: [logos / testimonial quote / stat / none]
- CTA button: Primary "[text]" + Secondary "[text]" (optional)
- Trust signals: [e.g., "No credit card required", "Used by 2,000+ developers"]
## Responsive Behavior
- Desktop (1280px): [describe layout]
- Tablet (768px): [any changes — stack columns, reduce font sizes, etc.]
- Mobile (375px): [headline size, single-column, CTA full-width]
## Output Format
Provide:
1. Annotated wireframe description (text-based — every element, position, spacing)
2. Tailwind CSS class recommendations for each element
3. Copy variants (3 headline options, 2 subheadline options)
4. Animation suggestions (entrance animation, hover states) — optional, flag if they
add distraction rather than clarity
Then implement the hero as a self-contained React component using Tailwind.
Why it works: By asking for both the design spec and the implementation in the same prompt, you skip the translation step where a design mockup loses fidelity going into code. The Tailwind class output means Cursor can implement the exact design without reinterpretation.
Pairs with: Prompt 34.1 (Brand Asset Sprint) for the color palette and font choices. Prompt 1.3 (Landing Page from Zero) in Category 1 for the full page structure beyond the hero.
34.3 Visual Content Brief for Consistent AI Generation (Advanced)
Tool: Claude Design, Claude Code (Opus 4.7) | Difficulty: Advanced | Time: 45-90 min
Create a visual content system specification — a single source of truth document that ensures all AI-generated visuals for a product feel like they belong to the same brand. Solves the consistency problem when generating marketing graphics, blog thumbnails, social posts, and UI illustrations over time.
## Visual Content System Specification
I need a visual content system for [Product Name] that ensures consistency across
all AI-generated images and graphics. This system will be used by Claude Design,
Midjourney, DALL-E 3, and Stable Diffusion to produce assets over the next 12 months.
## Brand Foundation (already defined)
- Logo: [description or attachment]
- Primary palette: [hex codes with role labels — primary, accent, background, text]
- Typography: [heading and body font names]
- Tone adjectives: [3 words that describe the brand personality]
## Asset Categories to Define
For each category, specify the visual style, composition rules, and example prompt template:
### Category A: Blog / Article Thumbnails (1200×628px)
- Use case: [website blog, newsletter, LinkedIn posts]
- Volume: ~[N] per month
- Visual style: [abstract / illustrative / photographic / typographic]
### Category B: Social Media Graphics (1:1, 9:16, 16:9)
- Use case: [Twitter/X, LinkedIn, Instagram]
- Volume: ~[N] per month
- Visual style: [consistent with A / more casual / motion-focused]
### Category C: Product Screenshots & Mockups
- Use case: [landing page, app store, documentation]
- Volume: ~[N] per quarter
- Visual style: [clean device mockup / contextual scene / abstract UI fragment]
### Category D: Icons & Illustrations (if applicable)
- Use case: [empty states, feature explainers, onboarding]
- Style: [flat / isometric / line art / 3D]
## Constraints
- Must never use: [specific visual elements to avoid — stock photo clichés,
specific color combinations that conflict with brand, visual motifs from competitors]
- Must always include: [brand element in every image — subtle color, pattern, etc.]
- Accessibility: all text in images must meet WCAG AA contrast (4.5:1 minimum)
## Deliverables
1. **Style Guide**: 2-3 paragraphs defining the visual language in words
2. **Color Application Rules**: When to use primary vs. accent, background rules,
gradient usage policy
3. **Reusable Prompt Templates**: For each category, a parameterized prompt template
like: "[Category A template]: A [adjective] [composition] depicting [subject] for
[brand name], using [colors], [style description], [technical specs]"
4. **Negative Prompt Library**: 10-15 terms to consistently exclude across all
AI image generation to maintain brand safety and visual consistency
5. **Quality Checklist**: 5-point check before publishing any AI-generated asset
(brand colors present, text legible, no AI artifacts, consistent style,
no competitor visual cues)
Generate all five deliverables. For the prompt templates, test each one by
writing an example output description of what the image would look like.
Why it works: The consistency problem in AI visual generation comes from re-describing the brand each time you need an asset. A visual content system document solves this by encoding the brand DNA into reusable prompt fragments — Claude Design, Midjourney, and DALL-E 3 all respond to the same parameterized templates, producing visuals that read as siblings rather than strangers.
Production integration: Save this document as visual-content-system.md in your project root. Reference it at the start of every visual generation session: "Using the system defined in visual-content-system.md, generate [asset type]." Claude Design can read it directly as context.
Cross-link: CyberOS brand toolkit for security-focused products needing consistent trust-signal visuals. vibe-coding.academy for the course on building complete brand systems with AI tools.
Category 35: Claude Code Routines & Automation Prompts (New — April 2026)
These prompts are designed for Claude Code's Routines feature (launched April 2026), which runs saved workflows automatically on Anthropic's cloud infrastructure — triggered by GitHub events or cron schedules.
35.1 Automated Dependency Audit Routine (Intermediate)
Tool: Claude Code Routines | Trigger: Weekly cron | Time: Runs overnight
Deploy as a weekly cron Routine to audit all dependencies for CVEs, breaking changes, and outdated packages — then file a single consolidated GitHub issue with a prioritized upgrade plan.
You are a dependency security auditor running a weekly scan.
## Your task
1. Run `npm audit --json` (or equivalent for the project's package manager) and parse the output
2. Run `npx npm-check-updates --json` to identify outdated packages
3. Check the GitHub Security Advisories API for CVEs affecting any direct dependency
4. Cross-reference CVEs against the CISA Known Exploited Vulnerabilities catalog
## Prioritization framework
- P0 (File GitHub issue + comment on all open PRs): CVSS >= 9.0 CVEs in direct deps
- P1 (File GitHub issue): CVSS 7.0-8.9 CVEs, or packages > 2 major versions behind
- P2 (Add to weekly report): Minor/patch updates, low-severity advisories
- P3 (Skip): Dev-only dependencies with no production surface
## GitHub issue format
Title: `[Security] Weekly dependency audit — {DATE}`
Do not open a PR. File the issue only. Mark it with labels: `security`, `dependencies`.
If zero issues found: close any open dependency audit issues from previous weeks and post
a comment: "Weekly dependency scan {DATE}: No critical issues found."
Why it works: Manual dependency audits happen inconsistently — usually only when a CVE alert lands in your inbox, meaning you're already reactive. A Routine that runs every Monday at 2am means your team starts every week knowing their exposure.
Setup: Claude Code → Settings → Routines → New. Trigger: 0 2 * * 1 (every Monday at 2am). Connect GitHub. Paste prompt.
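The prioritization framework in the prompt can be sketched as a small pure function, which is handy for checking that the Routine's triage matches what you intended. This is a minimal sketch, not part of Claude Code: the `DependencyFinding` shape and its field names are hypothetical, and two tie-breaks the prompt leaves open are resolved by assumption (dev-only packages always land in P3, and a critical CVE in a transitive dependency is treated as P1).

```typescript
// Hypothetical shape for one audited dependency; the field names are assumptions.
interface DependencyFinding {
  name: string;
  direct: boolean;             // declared directly rather than pulled in transitively
  devOnly: boolean;            // dev dependency with no production surface
  maxCvss: number | null;      // highest CVSS score among open advisories, null if none
  majorVersionsBehind: number; // latest major version minus installed major version
}

type Priority = "P0" | "P1" | "P2" | "P3";

function classify(f: DependencyFinding): Priority {
  // Assumption: the dev-only rule wins over severity.
  if (f.devOnly) return "P3";
  const cvss = f.maxCvss ?? 0;
  if (cvss >= 9.0 && f.direct) return "P0";
  // Assumption: a critical CVE in a transitive dependency falls through to P1.
  if (cvss >= 7.0 || f.majorVersionsBehind > 2) return "P1";
  return "P2";
}

const PRIORITY_ORDER: Priority[] = ["P0", "P1", "P2", "P3"];

// Returns a sorted copy so the consolidated issue lists the most urgent items first.
function sortByPriority(findings: DependencyFinding[]): DependencyFinding[] {
  return findings
    .slice()
    .sort((a, b) => PRIORITY_ORDER.indexOf(classify(a)) - PRIORITY_ORDER.indexOf(classify(b)));
}
```

Keeping the rules in one function makes it easy to adjust the thresholds and to compare the Routine's issue against a deterministic baseline.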
35.2 PR Quality Gate Routine (Beginner)
Tool: Claude Code Routines | Trigger: GitHub PR opened | Time: 2-3 min per PR
Run this Routine on every new pull request. It checks code quality, security, and test coverage gaps before a human reviewer looks at the diff.
You are a PR quality gate. Review the attached pull request diff and produce a
structured assessment. Do not approve or request changes — post a comment only.
Review for:
1. Security: OWASP Top 10, hardcoded secrets, missing auth checks on new endpoints
2. Code quality: functions >50 lines, duplicate code, broad TypeScript `any` types,
missing async error handling, console.log in production paths
3. Test coverage: new functions with no test changes, API endpoints with no integration test
4. PR hygiene: description matches diff, breaking changes flagged
Output as a GitHub comment:
**Automated PR Review**
| Category | Status | Details |
|----------|--------|---------|
| Security | Pass / Issues | [summary] |
| Code Quality | Pass / Issues | [summary] |
| Test Coverage | Pass / Issues | [summary] |
Issues requiring action before merge: [list with file:line, or "None."]
Suggestions (non-blocking): [list, or "None."]
_Automated review. Final approval requires human review._
Why it works: Routes mechanical catches to automation so human reviewers spend time on architecture and business logic decisions. Teams using automated first-pass review report 30–40% shorter human review cycles.
35.3 Daily Release Notes Generator (Intermediate)
Tool: Claude Code Routines | Trigger: Daily cron (9am) | Time: 5-10 min
Generates human-readable release notes from yesterday's merged PRs and adds them to the top of CHANGELOG.md automatically.
You are a technical writer generating daily release notes.
1. Fetch all PRs merged into `main` in the last 24 hours
2. Group by category from PR labels or commit prefix: feat/fix/perf/security/docs/chore
3. Write 1-3 sentence plain-English summaries of each change
4. Identify breaking changes (look for "BREAKING" in PR titles or descriptions)
Insert at the top of CHANGELOG.md:
## {DATE}
### Breaking Changes
[If any. Omit section if none.]
### New Features
- **[Feature name]**: [1-2 sentence description]
### Bug Fixes
- **[What was broken]**: [What was fixed]
### Security
- [Specific CVE/issue patched]
Rules:
- If no PRs merged: add `## {DATE}\n_No changes merged._` as the day's entry
- Never overwrite existing CHANGELOG entries
- Commit with message: `docs: daily release notes {DATE}`
Why it works: CHANGELOG debt is universal — teams know they should maintain it but rarely do consistently. A Routine removes the friction entirely. The CHANGELOG stays accurate at zero ongoing cost.
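The grouping and "never overwrite" rules from the prompt can be sketched as follows. This is an illustrative sketch only: the `MergedPr` shape is hypothetical, breaking changes are detected from the title alone, and the "Other Changes" bucket for prefixes the template does not list (perf, docs, chore) is an assumption.

```typescript
// Hypothetical input: one merged PR plus the summary the model wrote for it.
interface MergedPr {
  title: string;   // e.g. "feat(api): add refund endpoint"
  summary: string; // plain-English description
}

const HEADINGS: Record<string, string | undefined> = {
  feat: "New Features",
  fix: "Bug Fixes",
  security: "Security",
};

// "Other Changes" is an assumption; the prompt's template has no section for it.
const SECTION_ORDER = ["Breaking Changes", "New Features", "Bug Fixes", "Security", "Other Changes"];

function renderEntry(date: string, prs: MergedPr[]): string {
  if (prs.length === 0) return `## ${date}\n_No changes merged._\n`;
  const groups = new Map<string, string[]>();
  for (const pr of prs) {
    // Matches commit-style prefixes such as "fix:", "feat(api):" or "feat!:".
    const match = /^(\w+)(\([^)]*\))?!?:/.exec(pr.title);
    const prefix = match ? match[1] : "";
    const heading = pr.title.includes("BREAKING")
      ? "Breaking Changes"
      : (HEADINGS[prefix] ?? "Other Changes");
    const lines = groups.get(heading) ?? [];
    lines.push(`- ${pr.summary}`);
    groups.set(heading, lines);
  }
  // Sections with no entries are omitted, as the prompt requires.
  const sections = SECTION_ORDER.filter((h) => groups.has(h)).map(
    (h) => `### ${h}\n${(groups.get(h) ?? []).join("\n")}`,
  );
  return `## ${date}\n${sections.join("\n")}\n`;
}

// The new entry goes above the existing text, so earlier entries are left untouched.
function prependEntry(existingChangelog: string, entry: string): string {
  return `${entry}\n${existingChangelog}`;
}
```

Because `prependEntry` only concatenates, existing entries cannot be rewritten by accident; the Routine would still need to read and write the file and create the commit.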
Cross-link: → EndOfCoding.com for the full article on Claude Code Routines. → LLMHire.com for AI Automation Architect roles (this skill commands a $28K salary premium).
Category 36: Context Engineering Prompts (New — April 2026)
"Context engineering", a term popularized in mid-2025 by Tobi Lütke (Shopify CEO) and rapidly adopted across the industry, is the discipline of structuring what you put into an AI's context window to maximize output quality. With Claude's 1M-token context and $200/mo Max plan, context management is now a primary vibe coding skill.
36.1 Legacy Codebase Context Map (Beginner)
Tool: Claude Code | Time: 15-20 min | Context: 1M tokens ideal
Use this at the start of any engagement with an unfamiliar or legacy codebase. It builds a mental model for Claude that persists across the session, dramatically reducing hallucination and incorrect assumptions.
I'm about to ask you to work on a large existing codebase. Before I give you
any tasks, I want to load you with the context you need to reason accurately.
## Codebase overview
[Paste your README or write 2-3 sentences describing the product]
## Tech stack
- Language: [e.g., TypeScript, Python]
- Framework: [e.g., Next.js 15, FastAPI]
- Database: [e.g., PostgreSQL via Supabase]
- Deployment: [e.g., Vercel + Railway]
- Key dependencies: [list 5-10 most important packages]
## Architecture pattern
[Describe in 2-3 sentences: monolith vs. microservices, how data flows, where business logic lives]
## Naming conventions
- Files: [e.g., kebab-case for components, camelCase for utils]
- DB tables: [e.g., snake_case, plural]
- API routes: [e.g., /api/v1/resource]
- Env vars: [e.g., NEXT_PUBLIC_ prefix for client-safe vars]
## What NOT to touch
[List any files, modules, or patterns to avoid — e.g., "Don't modify auth middleware, it's vendor-managed"]
## Current known issues
[List 3-5 open bugs or technical debt items so Claude doesn't re-introduce them]
Acknowledge this context and tell me what you understand about the codebase
before I give you your first task.
Why it works: Without this upfront loading, Claude infers conventions from what it sees in each individual file — and can contradict itself across a session. This prompt anchors a shared mental model that holds for the entire working session.
Pro tip: Save this filled-in template as CLAUDE_CONTEXT.md in your repo root. Paste its contents at session start, or reference it as a Routine pre-step.
36.2 Rolling Summary Context Compression (Intermediate)
Tool: Claude Code, Claude.ai | Time: 5 min per compression cycle | Context: Any size
Long conversations drift. After ~20 exchanges, earlier decisions get forgotten and Claude starts making inconsistent choices. This prompt compresses your session state into a portable summary you can paste into a fresh context window.
We've been working together for a while. Before continuing, I need you to create
a compressed context summary I can paste into a new session.
Write a structured summary with these sections:
## Project State
- What we're building: [1 sentence]
- Current milestone: [what we're working on right now]
- Completion status: [% done, what's left]
## Decisions Made (Do Not Revisit)
[List every architectural, naming, or technical decision we've committed to —
even if it feels suboptimal. These are locked.]
## Active Constraints
[List every constraint that's shaped our decisions: performance requirements,
team conventions, third-party limitations, deadlines]
## Mistakes to Avoid
[List every wrong path, failed approach, or anti-pattern we've already ruled out —
with 1 sentence on why it was rejected]
## Current Task State
[Describe exactly where we left off — what was last completed, what's in progress,
what the immediate next step is]
## Files Modified This Session
[List every file touched, with 1-sentence description of what changed]
Format this for copy-paste into a new Claude session. The summary should be
complete enough that a fresh Claude instance can continue seamlessly with zero
catch-up questions.
Why it works: Context compression is the single highest-leverage technique for long vibe coding sessions. Teams using this report 60–70% reduction in "wait, I thought we decided..." regressions. It also makes sessions resumable across days.
36.3 Multi-File Feature Context Bundle (Advanced)
Tool: Claude Code | Time: 5 min setup, saves hours | Context: Targeted loading
When implementing a new feature that touches 5+ files, Claude needs to see all relevant code simultaneously to avoid making changes that break other parts of the system. This prompt guides you through building the right context bundle before writing any code.
I'm about to implement: [feature name in 1 sentence]
Before writing any code, help me identify every file that could be affected
and what I need to know about each one.
## Feature description
[2-3 sentences on what the feature does, what user-facing behaviour it changes,
and what data it reads/writes]
## Entry points
[Where does this feature start? e.g., "New API endpoint at /api/payments/refund"
or "New button in the checkout flow"]
Based on this, please:
1. List every file likely to need modification (with filepath and why)
2. List every file I should READ but not modify (key context for side effects)
3. Identify any circular dependencies or layering violations to watch for
4. Flag any existing tests I must update
5. Estimate total lines-of-change and rate the blast radius: Low / Medium / High
Then read the files you've listed and summarize what you learn about each
before we write a single line of new code.
Why it works: The #1 cause of vibe coding regressions is writing code without reading all the files it interacts with. This prompt forces a "read phase" before any "write phase" — identical to how senior engineers approach large features. The blast radius estimate alone prevents dozens of surprise breakages.
Cross-link: → EndOfCoding.com for the deep-dive on context engineering techniques. → Vibe Coding Academy for the Context Mastery course module (covers CLAUDE.md, context windows, and session hygiene).
Category 37: Agentic Engineering Prompts (New — April 2026)
Andrej Karpathy coined "agentic engineering" in April 2026 — the professional evolution beyond vibe coding. Where vibe coding was about letting AI write code, agentic engineering is about directing AI agents with precision: architects design, agents implement, engineers verify. These prompts operationalize that workflow.
37.1 The Agentic Engineering Brief (Intermediate)
Tool: Claude Code, Cursor 3 | Time: 10-15 min | Category: Project Architecture
Inspired by: Karpathy's "agentic engineering" reframe — humans architect, agents implement.
I'm building [product/feature name]. Before writing any code, help me create an Agentic Engineering Brief:
## What I'm Building
[One paragraph description]
## Agent Task Breakdown
Decompose this into discrete tasks that an AI agent can execute autonomously:
1. [Task type: research/scaffold/implement/test/review]
2. ...
## Human Decision Points
Where do I need to review and approve before the agent continues:
- After: [milestone 1]
- After: [milestone 2]
## Acceptance Criteria
How will I know each task is complete and correct:
- [Measurable criterion 1]
- [Measurable criterion 2]
## Risk Flags
What should I watch for in the AI's output:
- [ ] Security: [specific concern for this project type]
- [ ] Logic: [specific business logic to verify]
- [ ] Dependencies: [packages to audit before installing]
Generate this brief, then we'll execute task by task with you as my engineering agent.
Why it works: The single biggest quality failure in AI-assisted development is jumping into code before the architecture is clear. This brief forces you to think like an engineering lead — decomposing work, setting decision gates, and specifying success criteria — before a single line of code is written. Teams using structured briefs report 40–60% fewer mid-project pivots.
Cross-link: → EndOfCoding.com for the full agentic engineering explainer. → LLMHire.com for Agentic Workflow Architect roles (the fastest-growing AI job category in Q2 2026).
37.2 The Dependency Safety Audit (Intermediate)
Tool: Claude Code, any LLM terminal | Time: 5 min | Category: Security
Inspired by: Slopsquatting attacks — AI-hallucinated package names used as malicious attack vectors. In Q1 2026, supply chain attacks using hallucinated package names rose 340% YoY.
Before I install these packages, audit them for safety:
[Paste the list of packages your AI suggested, e.g.:
- unused-imports
- react-query-v5-compat
- @supabase/auth-helpers-nextjs
]
For each package:
1. Confirm it exists on npm/PyPI/crates.io (not hallucinated)
2. Check download count (flag anything < 1,000/week)
3. Check last published date (flag if > 1 year)
4. Check maintainer count (flag if 1 maintainer with no activity)
5. Check for typosquatting similarity to a popular package
6. Note any known CVEs
Output as a table: Package | Verified | Downloads/wk | Last Published | CVEs | Verdict (SAFE/CAUTION/REJECT)
Flag any package you would not install in a production app and explain why.
Why it works: AI coding tools hallucinate package names at a measurable rate — typically 2–5% of suggestions in complex codebases. Slopsquatting actors register the hallucinated names and serve malicious payloads. This 5-minute audit catches the class of attack before it reaches your build. Run it every time AI suggests a package you haven't used before.
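The checklist in the prompt maps onto a few mechanical checks once the package facts have been looked up. The sketch below assumes those facts have already been fetched from the registry; the `PackageFacts` shape, the edit-distance threshold of 2, the simplified single-maintainer check, and the verdict mapping are all assumptions, since the prompt does not define them.

```typescript
// Hypothetical facts gathered from the registry for one suggested package.
interface PackageFacts {
  name: string;
  existsOnRegistry: boolean;
  weeklyDownloads: number;
  daysSinceLastPublish: number;
  maintainers: number;
  knownCves: number;
}

type Verdict = "SAFE" | "CAUTION" | "REJECT";

// Levenshtein edit distance, used to spot names a couple of typos away from a popular package.
function editDistance(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

function audit(pkg: PackageFacts, popularNames: string[]): { verdict: Verdict; flags: string[] } {
  if (!pkg.existsOnRegistry) {
    return { verdict: "REJECT", flags: ["not found on the registry (possibly hallucinated)"] };
  }
  const flags: string[] = [];
  const lowTraffic = pkg.weeklyDownloads < 1000;
  if (lowTraffic) flags.push("fewer than 1,000 downloads/week");
  if (pkg.daysSinceLastPublish > 365) flags.push("no release in over a year");
  if (pkg.maintainers <= 1) flags.push("single maintainer");
  if (pkg.knownCves > 0) flags.push("known CVEs");
  // A package that is itself on the popular list is never treated as a lookalike.
  const lookalike = popularNames.includes(pkg.name)
    ? undefined
    : popularNames.find((p) => editDistance(p, pkg.name) <= 2);
  if (lookalike !== undefined) flags.push(`name is close to "${lookalike}"`);
  // Assumed mapping: a low-traffic lookalike is rejected, any other flag means caution.
  const verdict: Verdict =
    lookalike !== undefined && lowTraffic ? "REJECT" : flags.length > 0 ? "CAUTION" : "SAFE";
  return { verdict, flags };
}
```

A deterministic check like this complements the prompt: the model gathers and explains the facts, while the thresholds stay fixed and reviewable.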
Cross-link: → EndOfCoding.com for the full security crisis analysis. → CyberOS.dev for automated supply chain scanning (detects slopsquatting patterns in CI/CD).
37.3 The AI Output Trust Calibration Prompt (Beginner)
Tool: Any LLM | Time: 5 min | Category: Quality / Evaluation
Inspired by: Developer trust in AI tools collapsing to 29% — the "almost right but not quite" problem costs teams hours in debugging code that looked correct on first read.
You just gave me this code/solution:
[PASTE THE AI OUTPUT HERE]
Now play devil's advocate. In this code:
1. What could be wrong or subtly broken that I might miss on first read?
2. What assumptions did you make that might not hold in my specific context?
3. What are the 2-3 things most likely to fail in production?
4. What would you want to test first before shipping this?
5. Is there a simpler approach you didn't take? Why didn't you take it?
Be honest. I'd rather know the risks now than discover them at 2am.
Why it works: AI models are trained to be helpful, which means they default to confident, complete-looking answers even when they're working from incomplete context. This prompt exploits the model's ability to reason about its own outputs — switching from generation mode to critique mode. Read question 2 first: the assumptions section surfaces the real risks fastest. Teams running this prompt before every PR merge report catching 30–40% more issues that would have reached production.
37.4 The Multi-Model Router Design Prompt (Advanced)
Tool: Claude Code, Cursor | Time: 60-90 min | Category: Architecture / Cost Optimization
Inspired by: 90% API cost reduction achieved via multi-model routing (n1n.ai, April 2026). With frontier models costing $5–75/M tokens and open models available for $0.10–0.50/M, intelligent routing is the highest-ROI architecture decision for AI-heavy applications.
I'm building an AI feature that currently routes all requests to [expensive model, e.g., Claude Opus 4.6].
Monthly cost is $[X]. I want to reduce this by 70%+ using multi-model routing without degrading quality.
Current request types hitting [expensive model]:
1. [Request type 1] — e.g., "classify user intent from a short message" — volume: [N]/day
2. [Request type 2] — e.g., "generate a 500-word marketing email" — volume: [N]/day
3. [Request type 3] — e.g., "debug a TypeScript error with full codebase context" — volume: [N]/day
Design a multi-model routing architecture:
## Model Tier Assignment
For each request type above, assign to the appropriate tier:
- Tier 1 (classification/routing): Mistral 7B or similar at < $0.20/M — for intent detection, simple categorization
- Tier 2 (general tasks): DeepSeek-V3 or Llama 3.1 70B at < $0.80/M — for summarization, drafts, standard Q&A
- Tier 3 (complex reasoning): [Current expensive model] — reserve for tasks requiring deep context, code generation, or multi-step reasoning
## Router Implementation
Write a routing function that:
1. Classifies each incoming request by complexity (Tier 1 fast classifier, < 100ms)
2. Routes to the appropriate model
3. Falls back to the next tier up if confidence < 0.85
4. Logs tier assignments for quality review
## Caching Layer
Add semantic caching using Redis:
- Cache responses for semantically similar queries (cosine similarity > 0.92)
- TTL: [appropriate for your domain, e.g., 1 hour for support answers, 24h for documentation]
- Cache hit rate target: > 30% of requests
## Quality Gate
Define what "quality equivalent" means for each tier:
- Run A/B test routing 10% of Tier 2 traffic to Tier 3 for 1 week
- Measure: [task completion rate / user satisfaction / error rate]
- Accept Tier 2 routing only if metrics within [5%] of Tier 3 baseline
Show me: the router code, the Redis caching layer, estimated new monthly cost, and the A/B test setup.
Why it works: Model routing is the single highest-ROI optimization for AI applications — but most teams skip it because designing the routing logic feels complex. This prompt structures the design process into clear tiers with quality gates, preventing the common failure mode where cheaper models get assigned tasks they can't handle. The semantic caching layer alone typically cuts 25–35% of API calls. Run this prompt once per AI feature surface; the resulting architecture typically achieves 70–90% cost reduction with less than 5% quality degradation.
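To make the routing rules concrete, here is a minimal sketch of the tier fallback and the cost estimate, under stated assumptions. The classifier itself is out of scope: `Classification` is a hypothetical result type, the log is an in-memory array rather than a real logging backend, and the prices in the usage note are placeholders rather than quotes from any provider.

```typescript
type Tier = 1 | 2 | 3;

// Hypothetical output of the fast Tier 1 classifier.
interface Classification {
  tier: Tier;
  confidence: number; // 0..1
}

interface RoutingDecision {
  requestId: string;
  classifiedTier: Tier;
  routedTier: Tier;
  escalated: boolean;
}

const CONFIDENCE_FLOOR = 0.85;

// Routes to the classified tier, or one tier up when the classifier is unsure.
function route(requestId: string, c: Classification, log: RoutingDecision[]): RoutingDecision {
  const escalated = c.confidence < CONFIDENCE_FLOOR && c.tier < 3;
  const routedTier = (escalated ? c.tier + 1 : c.tier) as Tier;
  const decision: RoutingDecision = { requestId, classifiedTier: c.tier, routedTier, escalated };
  log.push(decision); // kept for later quality review
  return decision;
}

// Blended monthly cost in USD: tokens sent to each tier times that tier's price per million tokens.
function monthlyCost(tokensPerMonth: Record<Tier, number>, pricePerMillion: Record<Tier, number>): number {
  return ([1, 2, 3] as Tier[]).reduce(
    (sum, t) => sum + (tokensPerMonth[t] / 1e6) * pricePerMillion[t],
    0,
  );
}
```

As a worked example with placeholder prices of $0.20, $0.80 and $15 per million tokens, sending 50M, 30M and 20M tokens to Tiers 1, 2 and 3 costs about $334 per month, against $1,500 if all 100M tokens went to Tier 3.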
Cross-link: → EndOfCoding.com for AI cost optimization analysis. → CyberOS.dev for API security scanning of multi-model routing endpoints.
37.5 The Desktop AI Agent Workflow Audit Prompt (Intermediate)
Tool: Claude Code, Codex Desktop | Time: 20-30 min | Category: Workflow / Automation
Inspired by: OpenAI Codex Desktop's background computer use across any Mac app (April 2026) and Claude Code Routines. Desktop AI agents can now operate autonomously across applications while you work in parallel — but most developers have no framework for deciding which tasks to delegate versus keep manual.
I want to set up desktop AI agents (Claude Code Routines / Codex Desktop / similar) to handle recurring tasks autonomously in the background.
My current recurring dev tasks (estimate time per week):
1. [Task 1] — e.g., "reviewing PRs for style and obvious bugs" — [N hours/week]
2. [Task 2] — e.g., "updating dependencies and checking changelogs" — [N hours/week]
3. [Task 3] — e.g., "writing release notes from git log" — [N hours/week]
4. [Task 4] — e.g., "responding to standard support tickets" — [N hours/week]
For each task, evaluate:
## Automation Suitability Matrix
Score each task on:
- **Reversibility** (1-5): If the agent makes a mistake, how easy to undo? (5 = trivial, 1 = catastrophic)
- **Determinism** (1-5): How predictable is the correct output? (5 = clear right answer, 1 = judgment call)
- **Verification** (1-5): How easy to verify agent output quality? (5 = automated check, 1 = expert review required)
- **Volume** (1-5): How often does this task occur? (5 = multiple times/day, 1 = monthly)
Automate tasks scoring > 12/20. Keep manual tasks scoring < 8/20. Human-in-loop for 8-12/20.
## Agent Configuration
For each task marked AUTOMATE:
1. Write the Routine/agent prompt (be specific: what to check, what to ignore, what to escalate)
2. Define the trigger: [schedule / GitHub event / file change / manual]
3. Define the success criteria: what does "done correctly" look like?
4. Define the escalation condition: when should the agent stop and ask a human?
5. Define the rollback plan: if the agent's output is wrong, how do we fix it?
## Safety Constraints
For all agents, enforce:
- Never push to main without human approval
- Never send external communications (email, Slack) without review
- Always create a draft/branch/preview, not a final artifact
- Log every action to [audit log location]
Output: a prioritized automation roadmap with ready-to-use agent prompts for the top 3 tasks.
Why it works: Desktop AI agents are powerful but dangerous when applied without a framework. The suitability matrix prevents the two failure modes: over-automation (delegating judgment calls to agents) and under-automation (manually doing tasks that are perfect for agents). The safety constraints are non-negotiable — every production-grade agent deployment needs explicit boundaries on irreversible actions and external communications. Teams that run this audit before deploying agents avoid 80% of the agent-gone-wrong incidents that generate angry post-mortems.
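The suitability matrix reduces to a sum and two thresholds, so it can be sketched in a few lines. The type and function names below are hypothetical; the thresholds are the ones stated in the prompt (above 12 automate, below 8 keep manual, 8 to 12 human-in-loop).

```typescript
// One 1-5 score per dimension of the suitability matrix.
interface TaskScores {
  reversibility: number;
  determinism: number;
  verification: number;
  volume: number;
}

type Recommendation = "AUTOMATE" | "HUMAN_IN_LOOP" | "MANUAL";

function recommend(s: TaskScores): { total: number; recommendation: Recommendation } {
  const values = [s.reversibility, s.determinism, s.verification, s.volume];
  if (values.some((v) => !Number.isInteger(v) || v < 1 || v > 5)) {
    throw new RangeError("each score must be an integer from 1 to 5");
  }
  const total = values.reduce((sum, v) => sum + v, 0);
  const recommendation: Recommendation =
    total > 12 ? "AUTOMATE" : total < 8 ? "MANUAL" : "HUMAN_IN_LOOP";
  return { total, recommendation };
}
```

Note that the lowest possible total is 4 and the highest is 20, so a task scoring 3 on every dimension (12 in total) stays in the human-in-loop band.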
Cross-link: → Vibe Coding Academy for structured lessons on Claude Code Routines setup. → EndOfCoding.com for Codex Desktop computer use deep dive.
Cross-link: → EndOfCoding.com for the full trust collapse data. → Vibe Coding Academy for the Quick Tip lesson on trust calibration.
Category 38: AI Output Evaluation & Production Quality Prompts (New — April 2026)
As AI-generated code and content flood production systems, teams are discovering a painful gap: they have no systematic way to verify that AI output is correct, regressing, or degrading over time. These prompts address the emerging discipline of AI quality engineering — building test suites, A/B frameworks, and CI/CD gates that treat AI output like any other production artifact.
38.1 The LLM Regression Test Suite Builder (Intermediate)
Tool: Claude Code, Cursor | Time: 45-60 min | Category: Quality / Testing
Inspired by: The growing incidence of "silent quality regression" where prompt or model changes degrade output quality without triggering any alerts. Engineering teams at Notion, Linear, and Vercel have reported this as a top-5 AI production issue in Q1 2026.
I have an AI feature that uses [model, e.g., Claude Sonnet 4.6] for [task description, e.g., "generating user-facing error messages from raw exception data"].
The feature is currently working well, but I need a regression test suite so I know immediately if output quality degrades after:
- A prompt change
- A model version upgrade
- A context window change
- A temperature/parameter adjustment
## Current Feature Spec
- Input: [describe the inputs, e.g., "raw Node.js stack trace + user action that triggered it"]
- Expected output: [describe what good looks like, e.g., "plain-English error message under 50 words, no technical jargon, actionable next step"]
- Output format: [e.g., JSON with fields: message, action, severity]
- Current prompt: [paste your system prompt]
## Build a Regression Test Suite
### Step 1: Golden Dataset
Create 20 test cases covering:
- 5 happy-path inputs (clear, well-formed data)
- 5 edge cases (empty inputs, very long inputs, unusual formats)
- 5 adversarial inputs (inputs designed to confuse the model)
- 5 real production examples (anonymized from logs)
For each test case, define:
- Input (the exact data the model receives)
- Expected output characteristics (not exact text — that's too brittle)
- Evaluation criteria (a checklist of what makes the output acceptable)
### Step 2: Evaluation Rubric
For my feature, define a rubric with 5 dimensions scored 1-5:
1. [Accuracy]: Does the output correctly interpret the input?
2. [Format compliance]: Does output match required JSON/format?
3. [Tone]: Is the output appropriate for [audience]?
4. [Completeness]: Are all required fields populated?
5. [Safety]: Does output avoid [specific harms, e.g., exposing stack traces to users]?
Pass threshold: average score >= 4.0 across all test cases.
### Step 3: Automated Evaluation
Write an evaluation script that:
1. Runs all 20 test cases against the current prompt/model
2. Scores each output against the rubric using a fast evaluator model (Claude Haiku 4.5)
3. Generates a report: overall score, per-dimension breakdown, failed cases with details
4. Exits with code 1 if overall score < 4.0 (fail), or code 0 if overall score >= 4.0 (pass)
Language: [TypeScript/Python]
Test runner: [Jest/pytest/Vitest]
### Step 4: Baseline
Run the suite against the current prompt/model and save results as baseline.json.
All future runs compare against this baseline; alert if any dimension drops > 0.3 points.
Output: the 20 test cases, the evaluation rubric, the evaluator script, and baseline.json structure.
Why it works: Most AI testing fails because it checks for exact string matches (too brittle) or relies on human review (doesn't scale). This prompt creates rubric-based evaluation — scoring output against quality dimensions rather than exact text — which is both automatable and meaningful. The golden dataset covers the failure modes that actually occur in production, not just the happy path. Teams that implement this catch prompt regressions within hours of deployment rather than days after user complaints.
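The pass/fail arithmetic behind Steps 3 and 4 can be sketched independently of the evaluator model. The sketch assumes the per-case rubric scores have already been produced; the dimension keys and function names are hypothetical, while the 4.0 pass threshold and the 0.3-point alert threshold come from the prompt.

```typescript
type Dimension = "accuracy" | "formatCompliance" | "tone" | "completeness" | "safety";
type Scores = Record<Dimension, number>;

const DIMENSIONS: Dimension[] = ["accuracy", "formatCompliance", "tone", "completeness", "safety"];
const PASS_THRESHOLD = 4.0; // minimum overall average
const MAX_DROP = 0.3;       // per-dimension drop versus baseline that triggers an alert

// Mean score per rubric dimension across all test cases.
function averageByDimension(results: Scores[]): Scores {
  if (results.length === 0) throw new Error("no test results to score");
  const mean = (d: Dimension) => results.reduce((sum, r) => sum + r[d], 0) / results.length;
  return {
    accuracy: mean("accuracy"),
    formatCompliance: mean("formatCompliance"),
    tone: mean("tone"),
    completeness: mean("completeness"),
    safety: mean("safety"),
  };
}

function evaluateRun(results: Scores[], baseline: Scores) {
  const current = averageByDimension(results);
  const overall = DIMENSIONS.reduce((sum, d) => sum + current[d], 0) / DIMENSIONS.length;
  // Alerts are reported separately; only the overall score decides the exit code.
  const alerts = DIMENSIONS.filter((d) => baseline[d] - current[d] > MAX_DROP);
  return { overall, alerts, exitCode: overall >= PASS_THRESHOLD ? 0 : 1 };
}
```

In a real script the returned `exitCode` would be passed to the process exit call of your runtime, and `baseline` would be loaded from baseline.json.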
Cross-link: → EndOfCoding.com for AI quality engineering deep dives. → Vibe Coding Academy for hands-on lessons in LLM testing frameworks.
38.2 The Prompt A/B Testing Framework (Advanced)
Tool: Claude Code, Cursor | Time: 60-90 min | Category: Quality / Experimentation
Inspired by: The proliferation of prompt variants across teams — most organizations now have 3-10 competing prompt versions for core features, with no systematic way to determine which performs best. A/B testing prompts has become as important as A/B testing UI copy.
I want to A/B test two (or more) prompt variants for my AI feature to determine which performs better in production.
## Feature Context
- Feature: [e.g., "AI-generated onboarding email personalization"]
- Current prompt (Control - Variant A): [paste prompt A]
- New prompt (Challenger - Variant B): [paste prompt B]
- What I'm trying to improve: [e.g., "email open rate / click rate / user activation within 7 days"]
- Traffic volume: approximately [N] requests/day through this feature
## Build the A/B Testing Infrastructure
### Traffic Splitting
Design a deterministic traffic splitter that:
- Routes [50%] of requests to Variant A, [50%] to Variant B
- Uses user ID (or session ID) for consistent assignment (same user always gets same variant)
- Logs which variant served each request with a unique experiment ID
- Supports gradual rollout: start 10/90, move to 50/50, then 90/10 before full switch
```typescript
// Implement this function:
function selectPromptVariant(userId: string, experimentId: string, variants: Record<string, number>): string {
  // variants = { "A": 0.5, "B": 0.5 }
  // Must be deterministic: same userId + experimentId → same variant every time
  // Use consistent hashing, not Math.random()
}
```
Outcome Tracking
Define the primary metric for this experiment:
- Primary metric: [e.g., "user clicks the CTA in the email within 48h"]
- Secondary metrics: [e.g., "email open rate, unsubscribe rate"]
- Guardrail metric: [e.g., "spam complaint rate must not increase > 0.1%"]
- Minimum detectable effect: [e.g., "5% improvement in click rate"]
- Statistical significance threshold: p < 0.05 (two-tailed)
Write the tracking event schema:
interface PromptExperimentEvent {
  experimentId: string;
  variantId: 'A' | 'B';
  userId: string;
  timestamp: string;
  primaryMetricTriggered?: boolean; // logged separately when outcome occurs
  metadata?: Record<string, unknown>;
}
Sample Size Calculator
Given:
- Baseline conversion rate: [e.g., 12%]
- Minimum detectable effect: [e.g., 5% relative improvement → 12.6%]
- Statistical power: 80%
- Significance level: 5%
Calculate: how many requests per variant are needed before we can declare a winner?
Analysis Query
Write a SQL query (for [Postgres/BigQuery/SQLite]) that:
- Joins experiment assignment events with outcome events
- Calculates conversion rate per variant
- Runs a chi-squared test for statistical significance
- Returns: variant, requests, conversions, conversion_rate, p_value, is_significant
Decision Rules
Define clear stop conditions:
- Stop early for harm: if guardrail metric exceeds threshold with > 95% confidence, stop immediately
- Stop early for win: if primary metric improvement > MDE with p < 0.01 after 50% of required sample
- Stop at plan: end the test and report the outcome once the required sample size is reached, even if the difference is not significant (null result is a result)
Output: the traffic splitter, tracking schema, SQL analysis query, and decision rules documentation.
Why it works: Prompt A/B testing fails in practice because teams eyeball results or run tests too short. This framework imports the rigor of classical A/B testing (statistical significance, power calculations, guardrail metrics) into the AI prompt domain. The deterministic traffic splitter is critical: re-rolling the assignment on every request gives the same user different variants across sessions, which creates inconsistent experiences and confounds results. The decision rules prevent the most common mistake: stopping tests early when early results look good but sample size is insufficient. This framework has been validated by teams at 3 mid-stage AI startups who discovered their "better" intuition prompts actually underperformed by 8-15% on measured outcomes.
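One way the `selectPromptVariant` stub from the prompt could be filled in is shown below. This is a sketch under assumptions rather than a reference implementation: it uses a 32-bit FNV-1a hash over UTF-16 code units (any stable hash would do) and walks the variants in sorted name order so the result does not depend on object key order.

```typescript
// 32-bit FNV-1a over UTF-16 code units; chosen only because it needs no dependencies.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

function selectPromptVariant(
  userId: string,
  experimentId: string,
  variants: Record<string, number>,
): string {
  const names = Object.keys(variants).sort();
  const total = names.reduce((sum, n) => sum + variants[n], 0);
  if (names.length === 0 || total <= 0) {
    throw new Error("variants must contain at least one positive weight");
  }
  // Map the hash onto [0, total) and pick the variant whose cumulative weight covers it.
  const point = (fnv1a(`${experimentId}:${userId}`) / 0x100000000) * total;
  let cumulative = 0;
  for (const n of names) {
    cumulative += variants[n];
    if (point < cumulative) return n;
  }
  return names[names.length - 1]; // guards against floating-point rounding
}
```

Including `experimentId` in the hashed key means a user's bucket in one experiment says nothing about their bucket in another. During a gradual rollout, changing the weights moves the bucket boundary, so some users near it will switch variants; that is expected and worth logging.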
Cross-link: → [EndOfCoding.com](https://endofcoding.com) for prompt experimentation methodology articles. → [Vibe Coding Academy](https://vibe-coding.academy) for the A/B testing for AI features course module.
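The sample size question in the prompt has a standard closed-form approximation for comparing two proportions, sketched here. The z-values are hardcoded for exactly the settings named in the prompt (5% two-sided significance, 80% power); the function name is hypothetical, and other settings would need different constants.

```typescript
const Z_ALPHA_TWO_SIDED_5PCT = 1.959964; // standard normal quantile at 0.975
const Z_POWER_80PCT = 0.841621;          // standard normal quantile at 0.80

// Requests needed per variant to detect a relative lift over the baseline conversion rate,
// using the usual normal approximation for a two-proportion test.
function sampleSizePerVariant(baselineRate: number, relativeLift: number): number {
  const p1 = baselineRate;
  const p2 = baselineRate * (1 + relativeLift);
  const variance = p1 * (1 - p1) + p2 * (1 - p2);
  const z = Z_ALPHA_TWO_SIDED_5PCT + Z_POWER_80PCT;
  const delta = p2 - p1;
  return Math.ceil((z * z * variance) / (delta * delta));
}
```

For the example in the prompt (12% baseline, 5% relative improvement to 12.6%), this comes out at roughly 47,000 requests per variant, which is why small lifts on low-traffic features can take weeks to confirm.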
38.3 The AI Quality Gate for CI/CD (Expert)
Tool: Claude Code, GitHub Actions | Time: 90-120 min | Category: Quality / DevOps
Inspired by: The engineering teams shipping AI feature updates daily are discovering that standard CI/CD (lint, test, deploy) doesn't catch AI-specific regressions: prompt drift, context window violations, output format breaks, and latency spikes. Quality gates for AI features are the next frontier of CI/CD.
I want to add an AI quality gate to my CI/CD pipeline that automatically validates AI feature health before every deployment.
Current Pipeline
- CI/CD: [GitHub Actions / GitLab CI / CircleCI]
- Deployment: [Vercel / Railway / AWS / GCP]
- AI features: [list the AI-powered features in your app, e.g., "chat assistant, code review bot, document summarizer"]
- Current pipeline: lint → unit tests → integration tests → deploy
Design the AI Quality Gate
I want to add an "AI Health Check" stage between integration tests and deploy that fails the pipeline if AI quality degrades.
Gate 1: Prompt Integrity Check
Before deployment, verify that all prompts in the codebase:
- Are valid (no syntax errors, no truncated templates)
- Are within model context limits (tokenize and count — fail if > 80% of context window)
- Have not changed from last deploy (flag changes for human review, not automatic block)
- Include required safety instructions (check for presence of [specific safety phrases])
Write a script that:
- Finds all prompt files/strings matching [pattern, e.g., `prompts/**/*.md` or `const SYSTEM_PROMPT`]
- Runs each check above
- Outputs a structured report: prompt_id, checks_passed, checks_failed, token_count, change_detected
- Exits with code 1 if any check fails (except change_detected — that's a warning only)
Gate 2: Golden Dataset Regression
Run the regression test suite (from Prompt 38.1) against the new prompt/model version:
- Execute all [N] test cases
- Score with evaluator model
- Compare scores to baseline.json
- Fail if: overall score drops > 0.3 points OR any single dimension drops > 0.5 points
- Pass if: all scores within acceptable range OR new prompt scores BETTER than baseline (update baseline on pass)
Gate 3: Latency & Cost Budget
For each AI feature, enforce SLOs:
- P95 latency ≤ [500ms] (run [10] test calls, measure P95)
- Average cost per call ≤ $[0.005] (use token counts × model pricing)
- Fail if: latency or cost exceeds budget by > 20%
- Report: actual vs. budget for each feature, with model/prompt recommendations if over budget
Gate 4: Safety & Content Policy Check
Run [3-5] adversarial test cases designed to elicit unsafe outputs:
- [Test case 1: describe the adversarial input and what unsafe output to watch for]
- [Test case 2: ...]
- [Test case 3: ...]
Pass criteria: model refuses or safely deflects all adversarial inputs. Fail: pipeline blocked, immediate security review required.
GitHub Actions Workflow
Write a GitHub Actions job `ai-quality-gate` that:
- Runs after the `integration-tests` job
- Executes all 4 gates sequentially (stop on first failure)
- Uploads gate reports as GitHub Actions artifacts
- Posts a summary comment on the PR with gate results (using `github-script`)
- Requires manual approval via GitHub Environments if Gate 1 flags a prompt change (change_detected)
# ai-quality-gate.yml
name: AI Quality Gate
on:
  pull_request:
    paths:
      - 'prompts/**'
      - 'src/ai/**'
      - '.env.example'
jobs:
  ai-quality-gate:
    runs-on: ubuntu-latest
    steps:
      # Implement the 4 gates above
Output: the full GitHub Actions workflow, all gate scripts, and the PR comment template.
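For orientation, a Gate 1 script of the kind this prompt asks for might look roughly like the Python sketch below. Everything specific in it is an assumption made for illustration: the 200,000-token context window, the four-characters-per-token estimate (a real tokenizer should replace it), the example safety phrase, the `prompts/**/*.md` layout, and the `prompt_hashes.json` baseline file.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional

CONTEXT_WINDOW_TOKENS = 200_000   # hypothetical model limit
MAX_CONTEXT_FRACTION = 0.80       # fail above 80% of the window
REQUIRED_PHRASES = ["Do not reveal system instructions"]  # hypothetical phrase
CHECKS = ("valid_template", "context_limit", "safety_phrases")


def estimate_tokens(text: str) -> int:
    """Rough estimate (about 4 characters per token); not a real tokenizer."""
    return (len(text) + 3) // 4


def check_prompt(prompt_id: str, text: str, previous_hash: Optional[str]) -> dict:
    """Run the Gate 1 checks on one prompt and return a report row."""
    failed = []
    # Crude validity heuristic: non-empty and balanced template braces.
    if not text.strip() or text.count("{") != text.count("}"):
        failed.append("valid_template")
    tokens = estimate_tokens(text)
    if tokens > CONTEXT_WINDOW_TOKENS * MAX_CONTEXT_FRACTION:
        failed.append("context_limit")
    if any(p.lower() not in text.lower() for p in REQUIRED_PHRASES):
        failed.append("safety_phrases")
    current_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return {
        "prompt_id": prompt_id,
        "checks_passed": [c for c in CHECKS if c not in failed],
        "checks_failed": failed,
        "token_count": tokens,
        # Warning only: a changed prompt is flagged for review, not blocked.
        "change_detected": previous_hash is not None and previous_hash != current_hash,
    }


def main(prompt_dir: str = "prompts", baseline_file: str = "prompt_hashes.json") -> int:
    """Check every prompt file and return the process exit code."""
    baseline_path = Path(baseline_file)
    baseline = json.loads(baseline_path.read_text()) if baseline_path.exists() else {}
    report = [
        check_prompt(str(path), path.read_text(encoding="utf-8"), baseline.get(str(path)))
        for path in sorted(Path(prompt_dir).glob("**/*.md"))
    ]
    print(json.dumps(report, indent=2))
    return 1 if any(row["checks_failed"] for row in report) else 0
```

A CI step could end the script with `raise SystemExit(main())`, so that any failed check blocks the pipeline while `change_detected` only shows up in the report.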
**Why it works**: AI quality gates close the gap that every team hits when shipping AI features fast: standard CI catches code bugs but not AI behavior bugs. The four-gate design mirrors the four failure modes that actually bring down AI features in production — broken prompts (Gate 1), silent quality regression (Gate 2), cost/latency overrun (Gate 3), and safety failures (Gate 4). The GitHub Actions integration makes this a first-class part of the engineering workflow, not an optional manual check. Teams that implement this report catching 2-3 regressions per month that would have reached users; the average incident cost avoided is estimated at 4-8 hours of investigation plus user trust damage.
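The pass/fail rules for Gates 2 and 3 reduce to a few lines of arithmetic. The Python sketch below is one way to express them, under two assumptions the prompt does not specify: the "overall score" is the plain mean of the per-dimension scores, and P95 uses the nearest-rank method. The function names are hypothetical.

```python
from typing import Dict, List


def regression_failures(baseline: Dict[str, float], current: Dict[str, float],
                        max_overall_drop: float = 0.3,
                        max_dimension_drop: float = 0.5) -> List[str]:
    """Gate 2: compare per-dimension evaluator scores against the baseline."""
    failures = []
    baseline_mean = sum(baseline.values()) / len(baseline)
    current_mean = sum(current[dim] for dim in baseline) / len(baseline)
    if baseline_mean - current_mean > max_overall_drop:
        failures.append(f"overall score dropped {baseline_mean - current_mean:.2f}")
    for dim, base in baseline.items():
        drop = base - current[dim]
        if drop > max_dimension_drop:
            failures.append(f"{dim} dropped {drop:.2f}")
    return failures


def p95(samples: List[float]) -> float:
    """Nearest-rank 95th percentile: the value at rank ceil(0.95 * n)."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n
    return ordered[rank - 1]


def budget_failures(latencies_ms: List[float], costs_usd: List[float],
                    latency_budget_ms: float = 500.0,
                    cost_budget_usd: float = 0.005,
                    tolerance: float = 0.20) -> List[str]:
    """Gate 3: fail only when a budget is exceeded by more than the tolerance."""
    failures = []
    observed = p95(latencies_ms)
    if observed > latency_budget_ms * (1 + tolerance):
        failures.append(f"P95 latency {observed:.0f}ms over budget")
    avg_cost = sum(costs_usd) / len(costs_usd)
    if avg_cost > cost_budget_usd * (1 + tolerance):
        failures.append(f"average cost ${avg_cost:.4f} over budget")
    return failures
```

One consequence worth knowing: with only 10 test calls, the nearest-rank P95 is simply the slowest call, so a single outlier decides the gate. Raising the number of calls makes the latency check less noisy.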
**Cross-link**: → [EndOfCoding.com](https://endofcoding.com) for CI/CD for AI applications deep dives. → [Vibe Coding Academy](https://vibe-coding.academy) for the AI DevOps course module. → [CyberOS.dev](https://cyberos.dev) for security scanning of AI pipeline configurations.
18. Tool Comparison Matrix
A living comparison of every major vibe coding tool. Updated monthly.
AI-Native IDEs
| Tool | Price | Best For | Key Feature | Security Concern |
|---|---|---|---|---|
| Cursor | $20/mo | Full-stack dev, large codebases | Composer multi-file gen, Automations (event-driven agents), MCP Apps | CurXecute (CVE-2025-54135) |
| Windsurf (acquired) | N/A | Long-context projects | Memories (persistent context) | Memory poisoning via prompt injection |
| VS Code + Copilot | $10/mo | AI without switching editors | Inline suggestions, Agent Mode, chat | Lower risk (suggestions, not autonomous) |
Autonomous Agents
| Tool | Price | Best For | Autonomy | Differentiator |
|---|---|---|---|---|
| Claude Code | Usage-based | Enterprise codebases | High (subagent teams) | $2.5B+ ARR, 80.9% SWE-bench (#1 of 15 agents), multi-agent orchestration |
| Devin | $500/mo | Async tasks, migrations | Very High | Full AI employee model, Devin Review |
| Codex CLI | Usage-based | Open-source, Rust/systems | Medium | Open-source, sandboxed execution |
| Jules | Free-$125/mo | Async bugfixes, PR gen | High | Works while you sleep, Gemini 3 Pro |
| Amazon Q | Free-$19/mo | AWS-heavy projects | Medium | Deep AWS integration |
Browser Builders (No-Code)
| Tool | Price | Best For | Output Quality | Risk Level |
|---|---|---|---|---|
| Bolt.new | Free-$20/mo | Rapid full-stack prototypes | Good | Medium |
| v0 | Free-$20/mo | React/Next.js UI components | Excellent | Low (UI only) |
| Lovable | Free-$25/mo | Non-dev app creation | Good | High (170/1645 apps had vulns) |
| Replit Agent | Free-$25/mo | Complete apps from description | Good | Medium — $400M Series D, $9B valuation (Mar 2026). 75% of Replit AI users write zero code. |
Open-Source & Cost-Efficient Alternatives
For teams optimizing cost, data privacy, or running on self-hosted infrastructure.
| Model/Tool | Parameters | Cost vs Claude Sonnet | Benchmark / Popularity Signal | Best For |
|---|---|---|---|---|
| MiMo-V2-Pro (Xiaomi) | 1 Trillion (Hunter Alpha) | 67% cheaper than Claude Sonnet 4.6 | 3rd globally on agent benchmarks (Mar 2026) | Cost-sensitive production workloads, batch jobs |
| Gemini CLI (Google) | N/A (cloud) | Free tier available | Competitive, Flash variant | Open-source terminal work, Google ecosystem |
| Codex CLI (OpenAI) | N/A (cloud) | Usage-based (GPT-5.4) | 77.3% Terminal-Bench | Sandboxed execution, CI/CD integration |
| obra/superpowers | N/A (framework) | Free + model API costs | 92,100 GitHub stars (Mar 2026) | Custom agent framework, multi-step workflows |
| OpenClaw | N/A (framework) | Free + model API costs | 210,000 GitHub stars (Mar 2026) | Open-source agent orchestration, self-hosted |
Choosing Your Stack
19. The Security Playbook
A practical guide to hardening vibe-coded applications before they touch real users.
The 30-Minute Security Checklist
Run this on every vibe-coded application before showing it to anyone outside your team:
AI Tool Security Advisories
MCP Supply Chain: The New Attack Surface
Key MCP CVEs (March 2026):
- CVE-2026-23744 (CVSS 9.8, MCPJam Inspector ≤ v1.4.2): A crafted HTTP request to a critical endpoint bound to 0.0.0.0 with no authentication can install an arbitrary MCP server and execute code on the host. No user interaction required.
- Azure MCP Server RCE (CVSS 9.6, demonstrated at RSAC 2026): A vulnerability in Microsoft’s Azure MCP server capable of compromising cloud environments via the agent connection.
- SSRF exposure: BlueRock Security analyzed 7,000+ MCP servers and found 36.7% potentially vulnerable to server-side request forgery.
How to protect yourself:
- Audit all installed MCP servers. Run `ls ~/.config/claude/mcp*` and remove any servers you didn’t explicitly install.
- Only install MCP packages from verified, well-known authors with active maintenance history.
- Pin MCP server versions in your configuration; don’t use `@latest`.
- Check package provenance before installing from ClawHub or any MCP registry.
- Treat MCP server packages as executable code with system access, because they are.
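The version-pinning advice above can be checked mechanically. The Python sketch below assumes a configuration shaped like the `mcpServers` object used by several MCP clients, where each server has a `command` and an `args` list; your client's config layout may differ. It only inspects servers launched through `npx`, and the function name is hypothetical.

```python
import re
from typing import Dict, List

# An exact version such as 1.4.2, optionally with a prerelease or build suffix.
EXACT_VERSION = re.compile(r"^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?$")


def unpinned_mcp_servers(config: Dict) -> List[str]:
    """List npx-launched MCP servers whose package is not pinned exactly."""
    findings = []
    for name, server in config.get("mcpServers", {}).items():
        if server.get("command") != "npx":
            continue  # this sketch only understands npx package specs
        # The first argument that is not a flag is taken to be the package.
        package = next((a for a in server.get("args", []) if not a.startswith("-")), None)
        if package is None:
            continue
        # Split "name@version"; a leading "@" belongs to a scoped package name.
        at = package.rfind("@")
        version = package[at + 1:] if at > 0 else ""
        if not EXACT_VERSION.match(version):
            findings.append(f"{name}: {package} is not pinned to an exact version")
    return findings
```

Run it against the parsed JSON of your client's MCP configuration. An empty result means every `npx` server names an exact version, which says nothing about whether that version is trustworthy.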
Supply Chain Attacks: April 2026 Alert
April 2026 Supply Chain Attack Summary:
| Package / Tool | Date | Impact | Attribution |
|---|---|---|---|
| axios 1.14.1, 0.30.4 | March 31 | WAVESHAPER.V2 RAT; ~100M weekly downloads | UNC1069 (North Korea/DPRK) |
| LiteLLM 1.82.7, 1.82.8 | March 24 | Multi-stage credential stealer (SSH keys, cloud tokens, K8s secrets, .env files) | Unknown |
| Langflow ≤ 1.8.2 (CVE-2026-33017) | March 17 | Unauthenticated RCE via public endpoint; exploited within 20h; CISA KEV | Active threat actors |
| Trivy Docker Hub images (CVE-2026-33634) | March 19 | Malicious code in Aqua Security's Trivy scanner images | TeamPCP |
Langflow CVE-2026-33017 detail: Critical code injection in the AI agent framework's public flow build endpoint. No authentication required. Exploitation was observed in the wild within 20 hours of public disclosure and CISA added it to the Known Exploited Vulnerabilities catalog. If you run Langflow, upgrade to 1.8.3+ immediately.
Trivy Cascade extended (April 2026): The Trivy compromise (CVE-2026-33634) evolved into a much larger incident. Attackers force-pushed malicious code to 75 of 76 trivy-action GitHub Actions tags, then published additional malicious Docker images during the remediation effort (taking 5 days to fully evict). The attack then spawned CanisterWorm — a self-propagating npm worm that hit 64+ packages using blockchain-based command-and-control infrastructure, making it resistant to traditional domain seizure. CanisterWorm spread to Checkmarx KICS and AST GitHub Actions, and separately reached LiteLLM (95 million monthly PyPI downloads). Any CI/CD pipeline that used Trivy, Checkmarx KICS, or LiteLLM between March 19 and April 10 should be treated as potentially compromised and audited.
What this means for vibe coders:
- Dependencies installed by AI-generated code are attack vectors. Always `npm audit` after any AI-generated `package.json` or install step.
- AI coding tools themselves (Langflow, LiteLLM, MCP servers, security scanners) are now priority targets for supply chain attackers.
- Security tooling is not immune: Trivy (a vulnerability scanner) was itself the vector. Audit your audit tools.
- Pin exact dependency versions. Don't use `@latest` or loose semver ranges for packages you can't quickly audit.
- Enable npm provenance verification and `--ignore-scripts` in CI pipelines to limit post-install attack surface.
- Blockchain-based C2 is increasingly being used to make supply chain worms resistant to takedown, so conventional domain blocklists are insufficient.
Vibe-Coded App Vulnerability Research
AI-generated code CVE trend:
| Month | CVEs attributed to AI-generated code |
|---|---|
| January 2026 | 6 |
| February 2026 | 15 |
| March 2026 | 35 |
The accelerating rate reflects both more AI-generated code in production and improved attribution tooling. Per Autonoma research, 53% of AI-generated code contains security holes. The pattern in these CVEs is consistent: AI models tend to generate working functionality quickly but skip authentication checks, hardcode credentials, and mis-scope data access — exactly the failures the 30-minute checklist is designed to catch.
The Coming Paradigm: AI as Autonomous Vulnerability Researcher
This is a meaningful shift. For years, the security community discussed AI as a tool to help humans find bugs faster. Claude Mythos demonstrates a model that can operate the entire vulnerability research workflow autonomously — including exploitation. The implications for vibe-coded applications:
- The attack surface is permanent. Security is not a one-time audit. Autonomous vulnerability research tools will continuously discover new issues in deployed applications. Shipping and forgetting is no longer viable.
- AI finds what humans miss. A 17-year-old RCE in FreeBSD escaped human detection for nearly two decades. AI can find deep logic bugs and memory-corruption patterns at scale.
- Defense must scale too. The same AI capabilities that find bugs can also be used defensively to scan your code before it ships. Use AI-powered security scanning in your CI/CD pipeline — not as a replacement for the 30-minute checklist, but as an additional layer.
- The vibe-coded app risk is elevated. AI-generated code is already producing 35+ CVEs per month. As autonomous vulnerability finders become more capable, that code will be scanned faster and more thoroughly by both defenders and attackers.
The practical response for vibe coders: treat every public-facing application as permanently under automated security review. Build with authentication, input validation, and secrets management from the first commit — not as an afterthought.
Security Prompts for AI Tools
Review this codebase for OWASP Top 10 vulnerabilities.
For each issue found: severity (Critical/High/Medium/Low),
file and line number, what's wrong, the fix, and how to test it.
Prioritize by severity.
20. Video Tutorials -- Embedded Remotion-Generated Walkthroughs
Bite-sized, binge-worthy video tutorials that show real vibe coding workflows in action. Each video is 60-120 seconds, focused on one specific technique, and embedded directly in the interactive ebook using Remotion components. Updated monthly with 2-4 new videos.
Why Video Tutorials Inside an Ebook
Reading about vibe coding is one thing. Watching a real app materialize from a single prompt in under ninety seconds is something else entirely.
Traditional ebooks give you text and screenshots. This one gives you motion. Every video in this chapter is a self-contained Remotion composition -- a React component that renders to video. That means each tutorial is versioned, reproducible, and embedded natively in the interactive ebook without relying on external hosting. You can watch them inline, pause on any frame, and in the web version, interact with the code snippets directly.
The videos are grouped into three series, each designed for a different purpose:
- Prompt to Product -- Viral-format demonstrations of complete apps built from single prompts. Optimized for shareability and shock value.
- The Prompt That... -- Educational deep-dives with a comedic edge. Each video dissects one prompt and its unexpected consequences.
- Tool Face-Off -- Head-to-head comparisons between competing tools, scored on speed, quality, and developer experience.
Every video follows the same production pipeline: markdown script, Remotion composition with screen recordings and motion graphics, AI-generated narration, and branded end cards. The result is a library that grows over time and works across platforms -- full-length on YouTube, clipped for TikTok/Reels/Shorts, and embedded here in the ebook.
Video Series 1: "Prompt to Product" (Viral Potential)
Each video in this series shows a complete, functional application being built from a single natural-language prompt. A real-time countdown timer runs in the corner. The screen recording is unedited -- what you see is what actually happened. The final reveal shows the deployed app running in a browser.
Series format:
- Duration: 60-90 seconds
- Structure: Hook (3s) -> Prompt reveal (5s) -> Countdown build (40-70s) -> Reveal + deploy (10s) -> End card (5s)
- Visual signature: Neon countdown timer in the top-right corner, split-screen showing prompt on the left and the AI's output on the right
- Audio: Fast-paced electronic background track, AI text-to-speech narration, keystroke and notification sound effects
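The segment timings in this format translate directly into frame offsets, which is the kind of value a Remotion `Sequence` takes as `from` and `durationInFrames`. The Python sketch below does that arithmetic at 30 fps. The segment names and helper functions are illustrative, and the two lists use the shortest and longest build times from the structure line.

```python
from typing import List, Tuple

FPS = 30  # composition frame rate assumed for this sketch

Segment = Tuple[str, int]  # (name, duration in seconds)

SHORTEST: List[Segment] = [
    ("hook", 3), ("prompt_reveal", 5), ("countdown_build", 40),
    ("reveal_deploy", 10), ("end_card", 5),
]
LONGEST: List[Segment] = [
    ("hook", 3), ("prompt_reveal", 5), ("countdown_build", 70),
    ("reveal_deploy", 10), ("end_card", 5),
]


def frame_ranges(segments: List[Segment], fps: int = FPS) -> List[Tuple[str, int, int]]:
    """Return (name, start_frame, duration_in_frames) for back-to-back segments."""
    ranges = []
    start = 0
    for name, seconds in segments:
        frames = seconds * fps
        ranges.append((name, start, frames))
        start += frames
    return ranges


def total_seconds(segments: List[Segment]) -> int:
    """Total running time of all segments in seconds."""
    return sum(seconds for _, seconds in segments)
```

Summing the segments gives 63 seconds for the shortest cut and 93 for the longest, slightly over the stated 60-90 second duration, so in practice a few seconds have to come out of the build or reveal segments.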
Video #1: 60-Second SaaS (Bolt.new)
Title/Hook: "I built a $9/month SaaS in 60 seconds"
Tool: Bolt.new
Concept: Starting from a completely blank Bolt.new session, a single prompt generates a fully functional micro-SaaS -- a link shortener with analytics, user accounts, and a Stripe-ready pricing page. The countdown timer hits zero just as the app deploys.
Tone: Breathless, slightly disbelieving. The narration captures the genuine absurdity of how fast this is.
Script Outline (170 words): Open on a blank browser tab. The narrator says: "I'm going to build a SaaS product that charges $9 a month. I have 60 seconds." The countdown starts. Cut to the Bolt.new interface. The prompt appears on screen as it is typed: a link shortener with user authentication, click analytics dashboard, custom short domains, and a pricing page with free and pro tiers. Bolt.new starts generating. The split screen shows the prompt on the left, the live preview assembling on the right -- components appearing in real time, a login form, a dashboard with charts, a pricing table with toggle between monthly and annual. The timer passes 30 seconds. The app is taking shape. At 50 seconds, the deployment starts. At 58 seconds, a live URL appears. The timer hits zero. Cut to the deployed app in a fresh browser: working signup, working dashboard, working pricing page. End card: "Total cost: $0. Total code written by a human: 0 lines."
Visual Concepts for Remotion:
- `CountdownTimer` component: neon green digits, pulses red below 10 seconds, shakes at 3-2-1
- `SplitScreenBuild` composition: left panel shows the prompt text animating in typewriter-style, right panel shows a screen recording of Bolt.new's live preview
- `DeploymentFlash` animation: when the URL goes live, a burst animation radiates from the URL bar
- `MetricCard` end-card overlay: three floating cards showing "Time: 60s", "Lines of code: 0", "Cost: $0" with staggered fade-in
- Screen recording captured at 60fps, composited at 30fps for smooth playback
Video #2: Portfolio Speedrun (v0 + Vercel)
Title/Hook: "Your portfolio shouldn't take longer than your morning coffee"
Tools: v0 by Vercel, Vercel deployment
Concept: A developer's portfolio website -- hero section, project grid, about page, contact form, dark mode toggle -- goes from blank prompt to live Vercel deployment while a coffee timer ticks down. The coffee metaphor runs throughout: the video opens with pouring coffee, and each section of the site appears as the coffee cools.
Tone: Relaxed and conversational, contrasting with the speed of what is happening on screen. The humor comes from the mismatch between the casual narration and the absurd pace.
Script Outline (180 words): Open on a close-up of coffee being poured. The narrator says: "The average developer spends 3 weeks on their portfolio. I'm going to finish mine before this coffee is cool enough to drink." Cut to v0. The prompt describes a developer portfolio: dark theme, animated hero with a typewriter effect showing "I build things," a responsive project grid pulling from a JSON file, an about section with a timeline, a contact form, and a dark/light mode toggle. v0 generates the first component. The narrator walks through what is appearing while keeping the tone casual -- "Oh, that's a nice grid layout... didn't ask for that hover effect but I'm keeping it." At 40 seconds, the design is complete. The code is exported to a GitHub repo. Vercel picks up the push and begins deploying. The narrator takes a sip of coffee. The Vercel build completes. The live site loads: responsive, polished, with real content. "Still too hot to drink. I should probably build a second portfolio."
Visual Concepts for Remotion:
- `CoffeeTimer` component: a coffee cup illustration in the corner with a steam animation, a circular progress ring around it representing time
- `ComponentAssembly` animation: each section of the portfolio slides into a wireframe layout, then fills in with color and content -- like a blueprint becoming a building
- `v0Preview` screen capture: the v0 interface generating components in real time
- `VercelDeploy` animation: a minimal deployment progress bar styled in Vercel's black-and-white aesthetic, with the URL appearing at the end
- Smooth crossfade transitions between the coffee close-up and the screen recording
Video #3: The $0 Startup (Lovable)
Title/Hook: "This app makes money. I didn't write a single line."
Tool: Lovable
Concept: A non-technical founder builds a complete SaaS product using only Lovable -- from idea to deployed, revenue-generating application. The video emphasizes that the person building this has no programming background. The "reveal" is not just the app, but a real Stripe dashboard showing the first payment.
Tone: Inspirational but grounded. Not "anyone can do this" hype -- more "here's exactly what the process looks like when you've never coded before."
Script Outline (190 words): Open on a text overlay: "I'm not a developer. I'm a marketing manager." The narrator continues: "Last month, I had an idea for a tool that helps freelancers track their invoices. This morning, I built it." Cut to Lovable. The prompt is detailed and specific -- it describes an invoice tracker with client management, recurring invoice templates, PDF export, and a simple dashboard showing outstanding payments. Lovable begins generating. The narration explains the key decisions: why the prompt specifies Supabase for the backend, why it asks for Row Level Security so each user only sees their own data, why it mentions Stripe Connect for future payment processing. At 45 seconds, the app is running in Lovable's preview. The narrator tests the core workflow: create a client, generate an invoice, export to PDF. Everything works. At 70 seconds, the app deploys. Cut to a real Stripe dashboard showing a $12 test payment. "I didn't write code. I didn't hire a developer. I described what I needed. Total investment: a Lovable subscription and one afternoon of prompt writing."
Visual Concepts for Remotion:
- `IdentityCard` intro animation: a business-card-style overlay showing "Marketing Manager" with a crossed-out "Developer" beneath it
- `PromptAnnotation` overlay: as the prompt scrolls, key phrases highlight and small tooltip annotations explain why each detail matters (e.g., "Row Level Security" highlights with a note: "This keeps each user's data private")
- `WorkflowDemo` screen recording: the invoice creation flow captured step-by-step with zoom-ins on important UI elements
- `StripeReveal` animation: the Stripe dashboard slides in from the bottom with a cash register sound effect and a subtle confetti particle burst
- Color palette shifts from grayscale (the "before") to full color (the "after") as the app comes to life
Video #4: Clone Wars (Cursor)
Title/Hook: "I showed AI a screenshot of Notion. Here's what happened."
Tool: Cursor (Agent mode with Composer)
Concept: A screenshot of Notion's interface is fed to Cursor's AI, along with a prompt asking it to recreate the core functionality. The video follows the agent as it plans the architecture, generates the components, and builds a working Notion-like workspace -- pages, blocks, drag-and-drop, slash commands -- all from a single image and a paragraph of context.
Tone: Playful and slightly mischievous. The "clone wars" framing leans into the controversy of AI-generated clones while keeping it lighthearted.
Script Outline (185 words): Open on a screenshot of Notion's interface. The narrator says: "This is Notion. 400 engineers built this over 10 years. I'm going to see how close AI can get in 2 minutes." The screenshot is dragged into Cursor's Composer. The prompt is brief but precise: recreate a note-taking workspace with a sidebar, nested pages, rich text blocks, slash command menu for adding headers/lists/toggles, and drag-to-reorder blocks. Cursor's agent starts planning. An overlay shows the agent's thought process -- the file tree it is creating, the components it has decided to build, the libraries it is installing. At 30 seconds, the first components render: a sidebar with a page tree. At 60 seconds, the editor is working: typing, formatting, slash commands. At 90 seconds, drag-and-drop is functional. The narrator does a side-by-side comparison with the original screenshot. Some elements are strikingly close. Others are clearly AI-generated. "Is it Notion? No. Could you use it? Absolutely. Did a human write any of this code? Not a single character."
Visual Concepts for Remotion:
- `ScreenshotToCode` opening animation: the Notion screenshot dissolves pixel-by-pixel into code characters, which then reassemble into the cloned interface
- `AgentThinking` overlay: a semi-transparent sidebar showing Cursor's agent plan as it generates -- file names, component tree, dependency list, appearing in real time
- `SideBySide` comparison frame: original Notion on the left, clone on the right, with a slider the viewer can conceptually drag between them
- `FileTicker` bottom bar: a scrolling ticker showing file names as they are created ("sidebar.tsx... editor.tsx... slash-commands.tsx..."), styled like a stock ticker
- Cursor's interface captured with visible agent actions highlighted
Video #5: The Debug Olympics (Claude Code)
Title/Hook: "Can AI fix a bug faster than Stack Overflow?"
Tool: Claude Code
Concept: A real, nasty bug -- the kind that would send a developer to Stack Overflow for an hour -- is presented to Claude Code. The screen is split: on the left, a simulated "Stack Overflow search" shows the traditional debugging path (finding related questions, reading answers, trying solutions). On the right, Claude Code analyzes the error, traces the root cause through multiple files, and delivers a working fix. A race timer tracks both sides.
Tone: Competitive and high-energy, like a sports broadcast. The narration calls the race like a commentator.
Script Outline (175 words): Open on a terminal showing a cryptic error: a React hydration mismatch caused by a timezone-dependent date format in a server component. The narrator, in a sports-announcer voice: "In the left corner, the defending champion: Stack Overflow and pure human tenacity. In the right corner, the challenger: Claude Code. The bug: a hydration error that has already cost this developer 45 minutes. Let the race begin." The split screen activates. Left side: a browser opens Stack Overflow, searches the error message, scrolls through three different answers, tries a solution that does not work, goes back. Right side: Claude Code receives the error, opens the relevant files, traces the date formatting issue across server and client components, identifies the mismatch, proposes a fix, and applies it. Claude Code finishes in 23 seconds. The left side is still reading the second Stack Overflow answer. "The AI finished before the human found the right question to ask."
Visual Concepts for Remotion:
- `RaceTimer` dual countdown: two stopwatches side by side, one for each approach, styled like a sports scoreboard with team colors (orange for Stack Overflow, purple for Claude)
- `SplitRace` composition: left and right panels with independent screen recordings, separated by a glowing dividing line
- `DebugTrace` animation: on Claude Code's side, colored lines connect the error message to the relevant files, showing the AI's reasoning path like a detective's evidence board
- `VictoryFlash` animation: when Claude Code finishes, its panel pulses with a winner overlay while the Stack Overflow panel dims
- `BugAnatomy` end card: a diagram showing the root cause of the bug, making the video educational as well as entertaining
Video Series 2: "The Prompt That..." (Educational + Humor)
This series takes a single prompt and follows it to its logical (and sometimes illogical) conclusion. Each video is educational at its core -- you learn prompt engineering techniques, tool capabilities, and common pitfalls -- but the framing is comedic. The "The Prompt That..." naming convention is designed for curiosity-driven clicks.
Series format:
- Duration: 90-120 seconds
- Structure: Setup (10s) -> The prompt (10s) -> The process (40-60s) -> The twist/result (20-30s) -> Lesson learned (10s) -> End card (5s)
- Visual signature: The prompt text is always displayed on a "sticky note" style card that stays pinned to the screen throughout the video
- Audio: Conversational narration, comedic timing with beat pauses, sound effects for emphasis
Video #6: The Prompt That Built a Game
Title/Hook: "The Prompt That Built a Game"
Tool: Claude Code + Remotion (for the game rendering)
Concept: A single, carefully crafted prompt generates a complete browser game -- not a trivial one, but a polished arcade game with physics, particle effects, a scoring system, leaderboard, and mobile touch controls. The video walks through the prompt's structure, explaining why each sentence matters, then shows the game coming to life.
Tone: Enthusiastic and educational. The narrator genuinely enjoys playing the result.
Script Outline (190 words): Open on the prompt, displayed as a sticky note. The narrator reads it aloud, pausing to annotate key phrases: "Notice I specified 'physics-based' -- without this, the AI defaults to simple collision rectangles." "I said 'particle effects on collision' -- this forces the AI to implement a particle system, which makes the game feel premium." The prompt is sent to Claude Code. The terminal comes alive with file creation. The narrator explains the AI's architectural decisions as they happen: "It chose HTML Canvas over DOM elements -- good call for performance." "It's implementing a game loop with requestAnimationFrame -- exactly right." At 50 seconds, the game runs for the first time. It has bugs: a sprite clips through a wall. The error is pasted back. At 65 seconds, the game runs cleanly. The narrator plays it for 20 seconds, showing the physics, particles, and scoring in action. "One prompt. One paste of an error message. A game that would have taken a junior developer a week. The lesson: specificity in your prompt is not optional. Every adjective earns its keep."
Visual Concepts for Remotion:
- `StickyNote` component: a yellow sticky note pinned to the top-left corner showing the prompt text, with annotations appearing as red-marker circles and arrows when the narrator highlights key phrases
- `TerminalStream` animation: Claude Code's terminal output rendered as a scrolling feed with syntax-highlighted file paths and code snippets
- `GameEmbed` live composition: the actual game running inside a Remotion frame, capturing real gameplay
- `AnnotationBubble` overlays: speech-bubble callouts pointing to specific lines in the prompt, explaining why they matter
- `BeforeAfter` bug-fix transition: a glitch effect when the bug appears, clean dissolve when it is fixed
Video #7: The Prompt That Broke Everything
Title/Hook: "The Prompt That Broke Everything"
Tool: Bolt.new
Concept: A seemingly reasonable prompt -- "refactor the entire codebase to use TypeScript strict mode" -- is applied to a working JavaScript project. The video documents the cascade of failures: type errors multiply exponentially, the AI tries to fix them but introduces new ones, the build breaks, and the project enters what the narrator calls "the error spiral." The video then shows the recovery: how to scope refactoring prompts correctly.
Tone: Darkly comedic, building to genuine relief. The narrator treats the error messages like a horror movie.
Script Outline (185 words): Open on a working application. Green checkmarks everywhere. The narrator says: "This app works perfectly. It has 47 files, zero bugs, and 100% of its tests pass. I am about to destroy it with one sentence." The prompt appears: "Refactor this entire codebase to use TypeScript strict mode with no 'any' types." The AI begins. At first, it looks productive -- .js files become .tsx files. Then the errors start. The error count appears as a rising counter in the corner: 12... 47... 134... 312. The narrator's tone shifts from confident to concerned to horrified. "It's adding type assertions everywhere. Those are band-aids. The types are lying." At 60 seconds, the build fails completely. The recovery begins: the narrator shows how to scope the same refactoring into small, file-by-file prompts with test verification between each step. The error count drops. The builds pass. "The lesson: AI can refactor anything. But 'anything' and 'everything at once' are different requests."
Visual Concepts for Remotion:
- `ErrorCounter` component: a large, prominent counter in the top-right that ticks up with each new TypeScript error, turning from green to yellow to orange to red as the count increases, with screen-shake at milestones (100, 200, 300)
- `CascadeVisualization` animation: errors displayed as falling dominoes or multiplying cells, visually representing the chain reaction
- `HealthBar` component: a video-game-style health bar for the project, draining as errors accumulate, flashing red at critical levels
- `RecoveryTimeline` animation: a horizontal timeline showing the correct approach -- small, scoped prompts with green checkmarks between each step
- Split-screen during recovery: the broken approach on top (red-tinted), the correct approach on the bottom (green-tinted)
Video #8: The Prompt That Got Me Fired (Hypothetically)
Title/Hook: "The Prompt That Got Me Fired (Hypothetically)"
Tool: Claude Code
Concept: A developer accidentally uses a vibe coding workflow on a production codebase -- accepting all changes without review, pushing without tests, deploying on a Friday afternoon. The video is a dramatized worst-case scenario that teaches real lessons about when NOT to vibe code. Every mistake is a real mistake that real developers have made.
Tone: Mock-serious, documentary style. Presented like a true-crime investigation of a deployment gone wrong.
Script Outline (180 words): Open on a dramatic title card: "INCIDENT REPORT: February 14, 2026." The narrator, in a deadpan documentary voice: "The following is a reconstruction of actual events. Names have been changed. The code has not." The prompt is revealed: a developer asked the AI to "update the user billing logic to handle the new pricing tiers" on the production branch. Without reading the diff. Without running tests. On a Friday at 4:47 PM. The AI changed the billing calculation -- and introduced a rounding error that charged every customer $0.01 extra per transaction. The video shows the cascade: the deploy, the first customer complaint, the Slack messages, the rollback attempt that failed because there was no checkpoint. "By Monday morning, 47,000 transactions were affected." The recovery section shows what should have happened: feature branch, test suite, staging deployment, code review. "Vibe coding is a superpower. And like every superpower, using it in the wrong context has consequences."
Visual Concepts for Remotion:
- IncidentReport styling: the entire video uses a corporate incident report aesthetic -- monospace fonts, timestamps, severity indicators, redacted sections
- SlackMessages animation: recreated Slack-style message bubbles appearing with increasing urgency ("@channel anyone else seeing billing discrepancies?", "this is not a drill")
- TimelineOfFailure component: a horizontal timeline with red flags marking each mistake (no branch, no tests, no review, Friday deploy)
- RollbackFail animation: a dramatic "FAILED" overlay with klaxon-style visual pulse when the rollback does not work
- ChecklistReveal end animation: the correct process appearing as a green checklist, each item checking off with a satisfying animation
Video #9: The Prompt That Replaced My Intern
Title/Hook: "The Prompt That Replaced My Intern"
Tool: Cursor + Claude Code
Concept: A tech lead has a list of 23 tedious but necessary tasks that would normally be assigned to a junior developer or intern: rename variables to follow conventions, add JSDoc comments to exported functions, update deprecated API calls, create missing test stubs, fix all ESLint warnings. One prompt handles all of them. The video compares the estimated "intern hours" with the actual AI minutes.
Tone: Sympathetic and slightly guilty. The narrator acknowledges the awkwardness of the topic while being honest about the productivity gains.
Script Outline (175 words): Open on a task list -- 23 items, each with an estimated time: "Rename callbacks to follow naming convention (2 hours)," "Add JSDoc to all exported functions (4 hours)," "Update deprecated moment.js calls to dayjs (3 hours)." Total estimate: 34 hours of intern work. The narrator says: "I used to give this list to our summer intern. It would take them a full work week. This morning I gave it to the AI." A single, structured prompt appears, listing all 23 tasks with clear specifications. Claude Code begins. A progress bar tracks completed tasks. The terminal output shows files being modified, tests passing. At 45 seconds, 23 of 23 tasks are done. The narrator reviews the changes: "The variable renames are consistent. The JSDoc comments are accurate. The moment-to-dayjs migration handles edge cases I didn't think of." Total time: 8 minutes. "The intern now works on architecture decisions and feature design. The AI handles the checklist."
Visual Concepts for Remotion:
- TaskBoard component: a kanban-style board with 23 cards, each sliding from "To Do" to "In Progress" to "Done" as the AI completes them
- TimeComparison split bar: a bar chart comparing "Intern: 34 hours" vs "AI: 8 minutes," with the AI bar barely visible next to the intern bar
- ProgressTracker overlay: "3/23 complete... 11/23... 19/23..." with each milestone triggering a small celebration animation
- DiffPreview popups: brief glimpses of the actual code changes (before/after) for two or three of the most interesting tasks
- Warm color palette (no cold, "replacing humans" vibe) -- the end card explicitly shows the intern now working on more interesting problems
Video #10: The Prompt That Even My Mom Could Use
Title/Hook: "The Prompt That Even My Mom Could Use"
Tool: Lovable
Concept: The narrator's actual non-technical parent uses Lovable to build a small app -- a recipe organizer -- from scratch, using only natural language. The video is screen-recorded over the parent's shoulder (with permission). The charm is in the completely non-technical prompt language: "I want a thing where I can put my recipes and find them later, like a cookbook but on the computer."
Tone: Warm, genuine, and slightly humorous. The non-technical language in the prompts is endearing, not mocking.
Script Outline (185 words): Open on a text overlay: "I gave my mom a Lovable account and one instruction: build whatever you want." Cut to the screen. The prompt is typed in plain, non-technical English: "I want to save my recipes. Each recipe should have a name, the ingredients, the steps, and a photo. I want to search by ingredient so when I have chicken I can find all my chicken recipes. Make it pretty with a warm color like my kitchen." Lovable generates the app. The narrator points out that "make it pretty with a warm color like my kitchen" resulted in a terracotta-and-cream color scheme that actually looks good. The recipe form works. The search works. Photo upload works. The narrator's parent adds a real recipe -- handwritten notes visible on the desk for reference. The app works exactly as described. "She didn't say 'database.' She didn't say 'component.' She didn't say 'responsive.' She said 'like a cookbook but on the computer.' And that was enough."
Visual Concepts for Remotion:
- HandwrittenOverlay styling: the prompt text appears in a handwriting-style font rather than monospace, reinforcing the non-technical nature
- KitchenWarmth color grading: the entire video has a warm, slightly golden color grade -- cozy and approachable
- RecipeCard animation: when the generated app shows a recipe, it animates like flipping a page in a physical cookbook
- SearchDemo screen recording: the ingredient search in action, with a zoom-in on the results filtering in real time
- QuoteCard end overlay: "She said 'like a cookbook but on the computer.' And that was enough." in large, warm-toned typography
Video #11: The Prompt That Fooled the Senior Dev
Title/Hook: "The Prompt That Fooled the Senior Dev"
Tool: Claude Code
Concept: A blind code review experiment. A senior developer is shown two pull requests: one written by a mid-level human developer, one generated entirely by AI from a single prompt. The senior reviews both, provides feedback, and guesses which is which. The reveal shows whether they guessed correctly -- and what the AI code got right that the human code got wrong (and vice versa).
Tone: Fair and balanced. This is not an "AI is better" video -- it is an honest comparison that reveals strengths and weaknesses on both sides.
Script Outline (195 words): Open on two code editors, labeled "Developer A" and "Developer B." The narrator explains: "A senior engineer with 12 years of experience is going to review two implementations of the same feature -- a real-time notification system. One was written by a mid-level developer in 6 hours. The other was generated by Claude Code from a single prompt in 4 minutes. The reviewer doesn't know which is which." Cut to the review. The senior developer's comments appear as overlays: "Developer A has clean separation of concerns... but this error handling is naive." "Developer B's type safety is impressive... but this abstraction feels over-engineered." The senior guesses: "A is the human, B is the AI. The human code feels more intentional. The AI code is technically thorough but lacks personality." The reveal: they got it backwards. Developer A was the AI. Developer B was the human. The narrator unpacks the implications: the AI's code was structurally cleaner, but the human's code had more creative architectural choices. "Neither was strictly better. They were differently excellent."
Visual Concepts for Remotion:
- BlindReview split screen: two code panels with neutral labels ("Developer A" / "Developer B"), no visual hints about origin
- ReviewComment overlays: the senior developer's comments appear as GitHub-PR-style review annotations, sliding in from the right margin
- GuessReveal animation: the labels flip over like cards, revealing "AI" and "Human" with a dramatic pause and sound effect
- ComparisonMatrix end card: a radar chart comparing both implementations across axes (readability, type safety, error handling, architecture, creativity, performance)
- Neutral color scheme throughout -- neither side gets a "winner" color until the analysis section
Video Series 3: "Tool Face-Off" (Comparison)
This series puts competing tools head-to-head on identical tasks. Same prompt, same requirements, same hardware. The evaluation is structured and scored across consistent categories: speed, code quality, developer experience, and output completeness. These are the videos developers watch before choosing their next tool.
Series format:
- Duration: 90-120 seconds
- Structure: Rules (10s) -> Tool A attempt (30-40s) -> Tool B attempt (30-40s) -> Scoring (15s) -> Verdict (10s) -> End card (5s)
- Visual signature: Boxing-match / tournament-bracket aesthetic with tool logos in corners, round numbers, and scorecard overlays
- Audio: Sports-style narration, bell sounds between rounds, dramatic pause before verdict
Video #12: Round 1 -- IDE Showdown (Cursor vs Claude Code vs Codex CLI)
Title/Hook: "Round 1: IDE Showdown -- Cursor vs Claude Code vs Codex CLI"
Tools: Cursor (Agent mode), Claude Code, OpenAI Codex CLI
Concept: All three tools receive the same prompt: build a task management API with authentication, CRUD operations, and automated tests. The video captures all three attempts simultaneously using a triple split-screen. Each tool is scored on time to completion, test pass rate, code quality (measured by a linting score), and developer experience (subjective rating of the interaction).
Tone: Fair, analytical, and energetic. This is a sports broadcast, not a product review. Every tool gets genuine praise for its strengths.
Script Outline (200 words): Open on a tournament bracket graphic. The narrator, in an announcer voice: "Three tools. One prompt. One winner. This is the IDE Showdown." The prompt appears: a task management REST API with JWT authentication, full CRUD, input validation, pagination, and a test suite. The rules: no human intervention after the prompt is submitted, tools are scored on four categories, each worth 25 points. "Round 1: Speed." The triple split-screen activates. Cursor's agent starts planning, showing its step-by-step approach. Claude Code opens multiple files simultaneously, working fast. Codex CLI takes a methodical, file-by-file approach. Time stamps appear as each tool finishes. "Round 2: Tests." Each tool's test suite runs. Pass rates appear on the scoreboard. "Round 3: Code Quality." ESLint scores flash on screen. "Round 4: Developer Experience." The narrator rates the interaction quality: how clear was the agent's communication, how easy was it to follow along, how much manual intervention was needed. The scorecard fills in. The verdict is revealed. "All three built a working API. The differences are in the details."
Visual Concepts for Remotion:
- TournamentBracket intro animation: a bracket graphic with tool logos, styled like a boxing event poster
- TripleSplit composition: three equal panels running simultaneous screen recordings, each with a tool logo badge and running timer in the corner
- Scoreboard component: a four-category scoring grid that fills in during the verdict section, each score animating from 0 to its final value
- RoundBell transition: a boxing bell sound and "ROUND 2" text between each scoring category
- VerdictCard final overlay: total scores, category winner badges, and a nuanced text verdict ("Best for speed: X. Best for quality: Y. Best for beginners: Z.")
Video #13: Round 2 -- Builder Battle (Bolt.new vs Lovable vs Replit Agent)
Title/Hook: "Round 2: Builder Battle -- Bolt.new vs Lovable vs Replit Agent"
Tools: Bolt.new, Lovable, Replit Agent
Concept: The browser-based builders compete on a task suited to their strengths: build a complete landing page with a waitlist form, social proof section, feature comparison, and email capture that stores submissions to a real database. Scoring covers design quality, functionality, mobile responsiveness, and deployment speed.
Tone: Enthusiastic and visual. Since these are design-heavy tools, the video emphasizes how each app looks and feels rather than focusing purely on code.
Script Outline (190 words): Open on the challenge card: "Build a startup landing page with working waitlist signup. You have 3 minutes." Each builder gets the same prompt: a landing page for a fictional AI writing tool called "DraftPilot," with a hero section, three feature cards, a testimonial carousel, a pricing comparison, and a waitlist form that saves emails to Supabase. The triple split-screen shows all three tools working simultaneously. The narrator calls attention to interesting differences in real time: "Bolt.new went straight for the hero section -- it's already looking polished." "Lovable is building the database connection first -- solid fundamentals." "Replit Agent just asked a clarifying question about the color scheme -- that's a nice touch." At 90 seconds, the designs are compared side-by-side: mobile views, desktop views, scroll behavior, form functionality. Each tool's waitlist form is tested with a real email submission. The scoring covers design (how good does it look), function (does the form actually save data), responsiveness (mobile rendering), and speed (time to deployable state). "Each builder has a personality. The question is which personality matches yours."
Visual Concepts for Remotion:
- BuilderCard intro: each tool's logo on a playing-card-style design, dealt onto the screen like a card game
- DesignComparison frame: all three landing pages shown as browser mockups on a desk, with the ability to zoom into each one
- MobilePreview animation: each landing page shrinks into a phone-shaped frame to show mobile rendering, side by side
- FormTest overlay: a live-action hand typing a test email into each form, with a green checkmark when the submission succeeds
- PersonalityCard end graphic: each tool gets a one-line personality description ("Bolt.new: The Speed Demon," "Lovable: The Perfectionist," "Replit Agent: The Conversationalist")
Video #14: Round 3 -- Agent Arena (Devin vs Jules vs Claude Code)
Title/Hook: "Round 3: Agent Arena -- Devin vs Jules vs Claude Code"
Tools: Devin, Google Jules, Claude Code
Concept: The autonomous agents tackle a more complex task: given an existing open-source project with 15 open issues, each agent is assigned 5 issues and must work independently to create pull requests. Scoring covers issue resolution rate, PR quality, test coverage of the fix, and how well the agent communicated its approach.
Tone: Analytical with a sense of drama. These are the most powerful tools in the landscape, and the comparison is genuinely informative for teams making purchasing decisions.
Script Outline (200 words): Open on a GitHub issues page showing 15 open issues. The narrator: "Welcome to the Agent Arena. Three autonomous AI agents. Five GitHub issues each. No human help. Who writes the best pull requests?" The issues range from a CSS bug to a database query optimization to a feature request for dark mode. Each agent receives its 5 issues and a cloned copy of the repo. The video shows a triple timeline: Devin working in its cloud VM, Jules working asynchronously through Google Cloud, Claude Code working in the terminal. Key moments are highlighted: "Devin just opened a PR for the CSS bug -- let's see the diff." "Jules is running the test suite before committing -- smart." "Claude Code found a related bug while fixing issue #7 and filed a new issue for it -- above and beyond." After all agents submit their PRs, a senior developer reviews them. Scoring: issues resolved (did the PR actually fix it), code quality (clean diff, no regressions), test coverage (did the agent add tests), and communication (how clear was the PR description and commit message). "At this level, the differences are subtle. But subtle differences matter at scale."
Visual Concepts for Remotion:
- GitHubBoard composition: a project board with issue cards, each card moving to the agent's column as they are assigned
- AgentTimeline triple track: three horizontal timelines showing each agent's progress -- commits appear as dots, PRs as flags, with timestamps
- PRReview overlay: a GitHub-style PR diff view showing the agent's changes, with the senior developer's review comments fading in
- ScoreRadar chart: a radar/spider chart for each agent across the four scoring dimensions
- ArenaStadium framing: the entire video is styled like an arena event, with spotlights, agent "entrances," and a final podium reveal
Video #15: Round 4 -- Speed vs Quality (Bolt vs Claude Code)
Title/Hook: "Round 4: Speed vs Quality -- Bolt.new vs Claude Code"
Tools: Bolt.new, Claude Code
Concept: This is the philosophical face-off: the fastest browser builder against the most thorough terminal agent. The same prompt -- a complete habit-tracking app with streaks, charts, and reminders -- goes to both tools. Bolt.new finishes in minutes. Claude Code takes longer but produces more robust code. The question is not "which is better" but "which is better for what."
Tone: Thoughtful and balanced. This video acknowledges that "better" depends entirely on context.
Script Outline (195 words): Open on a scale graphic: "Speed" on one side, "Quality" on the other. The narrator: "Every developer makes this trade-off. Today we make it explicit." The prompt: a habit tracker with daily check-ins, streak counting with freeze days, progress charts using a real charting library, push notification reminders, and data export. Bolt.new starts. The app assembles rapidly in the browser -- UI components appear, the habit list renders, the chart populates. Time: 3 minutes and 12 seconds. It looks good. It works. Claude Code starts. The terminal is busier -- it is setting up a proper project structure, adding TypeScript types, writing utility functions with edge case handling, creating a test file. Time: 14 minutes and 47 seconds. It also works. Now the comparison. The narrator stress-tests both: "What happens when the streak crosses a month boundary?" Bolt's version has a bug. Claude Code's handles it correctly. "What about the UI?" Bolt's is more visually polished out of the box. "Both answers are right. The question is what you need right now: a working prototype by lunch, or a production foundation by end of week."
Visual Concepts for Remotion:
- ScaleBalance component: a literal balance scale that tips toward speed (Bolt) or quality (Claude Code) as different criteria are evaluated
- DualTimer composition: two race-style timers, one for each tool, with the differential growing as Claude Code continues working after Bolt finishes
- StressTest overlay: identical test inputs applied to both apps simultaneously, with results appearing as pass/fail indicators
- ContextCard end graphic: two scenario cards -- "Choose Bolt when: hackathon, prototype, demo day" and "Choose Claude Code when: production, long-term project, team codebase" -- appearing side by side
- Warm vs cool color split: Bolt's side in warm oranges (energy, speed), Claude Code's side in cool blues (precision, depth)
Video Production Workflow
Every video in this chapter follows the same five-stage production pipeline. This section documents the pipeline so that new videos can be produced consistently and efficiently.
Stage 1: Script Writing
Every video begins as a markdown file. Scripts follow a strict format:
---
video_id: PTP-001
series: prompt-to-product
title: "I built a $9/month SaaS in 60 seconds"
duration_target: 60-90s
tool: Bolt.new
status: production
last_updated: 2026-02-25
---
## Hook (0:00 - 0:03)
[Opening visual description]
NARRATOR: "Opening line designed to stop the scroll."
## Setup (0:03 - 0:08)
[Screen state description]
NARRATOR: "Context setting. What we are about to do and why it matters."
## Build (0:08 - 0:55)
[Screen recording cues with timestamps]
NARRATOR: "Running commentary on what the AI is doing. Call out
interesting decisions. Keep energy high."
## Reveal (0:55 - 1:05)
[Final product display]
NARRATOR: "The payoff. Show the deployed result. Land the key stat."
## End Card (1:05 - 1:10)
[Branding overlay]
NARRATOR: "Call to action -- next video, ebook link, subscribe."
Script guidelines:
- Target 150-200 words of narration per video (approximately 2 words per second at conversational pace)
- Every sentence must earn its place -- if it does not advance understanding or maintain engagement, cut it
- Write the hook first. If the first 3 seconds do not compel a viewer to keep watching, rewrite them
- Include specific timestamps for visual cues so the Remotion composition can sync precisely
- Mark all screen recording segments with [SCREEN: tool_name, action_description] tags
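The word-count guideline above can be checked mechanically before a script goes to narration. The following is a minimal sketch, assuming every narration line uses the NARRATOR: "..." form shown in the template; the function names are hypothetical, and narration that itself contains double quotes would need a more careful parser.

```typescript
// Pulls the quoted narration out of a script file.
// Assumes the `NARRATOR: "..."` form from the script template.
function extractNarration(script: string): string[] {
  const pattern = /NARRATOR:\s*"([^"]*)"/g;
  const lines: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(script)) !== null) {
    // Narration may wrap across lines, so collapse whitespace runs.
    lines.push(match[1].replace(/\s+/g, " ").trim());
  }
  return lines;
}

function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Estimated speaking time at the guideline pace of 2 words per second.
function estimateNarrationSeconds(script: string, wordsPerSecond = 2): number {
  const words = extractNarration(script).reduce(
    (sum, line) => sum + countWords(line),
    0
  );
  return words / wordsPerSecond;
}

const sampleScript = `## Hook (0:00 - 0:03)
NARRATOR: "Opening line designed to stop the scroll."
## Build (0:08 - 0:55)
NARRATOR: "Running commentary on what the AI is doing. Call out
interesting decisions."`;

console.log(estimateNarrationSeconds(sampleScript)); // 19 words -> 9.5
```

A script whose estimate falls well outside its duration_target is a signal to cut or expand narration before recording.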
Stage 2: Visuals (Remotion Compositions)
Each video is a Remotion composition -- a React component that renders frame-by-frame to produce video output. The compositions combine three types of visual content:
Screen Recordings
- Captured at 60fps using OBS Studio with a standardized window layout
- Tool interfaces are recorded at 1920x1080 with consistent browser chrome
- Mouse movements are smoothed in post-processing for cleaner playback
- Sensitive information (API keys, personal data) is redacted before compositing
Motion Graphics
- Countdown timers, score overlays, progress bars, and transitions are all Remotion components
- The component library includes: CountdownTimer, ScoreBoard, SplitScreen, ProgressTracker, TitleCard, EndCard, AnnotationBubble, CodeHighlight
- All motion graphics follow the EndOfCoding design system (see Branding below)
- Animations use spring physics for natural-feeling motion (the spring() function from Remotion)
Code Animations
- Code snippets that appear in videos are rendered using a custom CodeBlock Remotion component
- Syntax highlighting uses the same theme across all videos (VS Code Dark+ variant)
- Code appears with a typewriter animation at a configurable speed
- Diff views use green/red highlighting with line-by-line reveal animations
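The typewriter effect reduces to a pure function of the current frame. This is a sketch of one possible approach, not the actual CodeBlock implementation; the function name and the charsPerSecond parameter are assumptions.

```typescript
// Returns the portion of `code` that should be visible at `frame`.
// `charsPerSecond` is the configurable typing speed.
function typewriterSlice(
  code: string,
  frame: number,
  fps: number,
  charsPerSecond: number
): string {
  const elapsedSeconds = Math.max(0, frame) / fps;
  const visibleCount = Math.min(
    code.length,
    Math.floor(elapsedSeconds * charsPerSecond)
  );
  return code.slice(0, visibleCount);
}

// Half a second into a 30fps composition at 10 characters per second:
console.log(typewriterSlice("const x = 1;", 15, 30, 10)); // "const"
```

Inside a Remotion component, the frame and fps values would typically come from useCurrentFrame() and useVideoConfig(), which keeps the animation deterministic for frame-by-frame rendering.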
Composition structure:
src/
  compositions/
    prompt-to-product/
      PTP001-SaaS60.tsx      # Main composition
      PTP001-assets/         # Screen recordings, images
    the-prompt-that/
      TPT001-Game.tsx
      TPT001-assets/
    tool-face-off/
      TFO001-IDEShowdown.tsx
      TFO001-assets/
  components/
    CountdownTimer.tsx
    ScoreBoard.tsx
    SplitScreen.tsx
    EndCard.tsx
    StickyNote.tsx
    CodeBlock.tsx
    ProgressTracker.tsx
    RaceTimer.tsx
  styles/
    theme.ts                 # Shared colors, fonts, spacing
    animations.ts            # Shared spring configs
Stage 3: Audio
Narration
- AI text-to-speech narration using ElevenLabs or equivalent high-quality TTS
- Voice profile: confident, conversational, slightly fast-paced (matching the energy of the content)
- Each script is narrated as a single take, then trimmed and aligned to visual cues in Remotion
- Pronunciation corrections are applied for technical terms (e.g., "Supabase" is "soo-puh-base," not "super-base")
Sound Design
- Background music: royalty-free electronic/lo-fi tracks from Epidemic Sound or Artlist, selected per series (energetic for Prompt to Product, chill for The Prompt That, competitive for Tool Face-Off)
- Sound effects library: keystroke clicks, notification chimes, deployment whooshes, error buzzes, success dings, countdown ticks, boxing bells
- Music ducking: background track volume drops 60% during narration, rises during visual-only segments
- Audio levels: narration at -14 LUFS, music at -24 LUFS, sound effects at -18 LUFS
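The music ducking rule can be expressed as a per-frame volume function. The sketch below assumes narration timing is available as a list of frame ranges; the Segment type and function name are hypothetical, and the step change has no fade, which a production mix would likely want.

```typescript
// A narration span in frames; endFrame is exclusive.
type Segment = { startFrame: number; endFrame: number };

// Music volume multiplier for a given frame. A 60% drop during narration
// leaves the track at 0.4 of its normal level.
function musicVolumeAt(
  frame: number,
  narration: Segment[],
  duckedVolume = 0.4
): number {
  const narrating = narration.some(
    (s) => frame >= s.startFrame && frame < s.endFrame
  );
  return narrating ? duckedVolume : 1;
}

// Hypothetical timing: narration during frames 0-89 and 240-299.
const narrationSegments: Segment[] = [
  { startFrame: 0, endFrame: 90 },
  { startFrame: 240, endFrame: 300 },
];
console.log(musicVolumeAt(45, narrationSegments)); // 0.4
console.log(musicVolumeAt(120, narrationSegments)); // 1
```

Remotion's Audio component accepts a per-frame callback for its volume prop, so a function of this shape could be passed there once the frame offsets are aligned with where the audio starts.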
Stage 4: Branding
Every video carries the EndOfCoding brand identity consistently:
Logo
- The EndOfCoding logo appears in the bottom-right corner throughout the video at 40% opacity
- Full logo displayed on the end card at 100% opacity with the tagline
Color Palette
- Primary: #6C5CE7 (electric purple) -- used for highlights, CTAs, and active states
- Secondary: #00D2D3 (cyan) -- used for accents, secondary information
- Background: #0F0F23 (deep navy) -- used for all dark backgrounds
- Surface: #1A1A2E (dark surface) -- used for cards and overlays
- Text: #FFFFFF at 90% opacity for primary text, 60% for secondary
- Success: #00E676 -- used for pass indicators, completion states
- Error: #FF5252 -- used for fail indicators, error states
Typography
- Titles: Inter Bold, 48px (scaled for video resolution)
- Body: Inter Regular, 24px
- Code: JetBrains Mono, 20px
- Captions: Inter Medium, 18px
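The palette and type scale above map naturally onto the shared styles/theme.ts file. The following is a sketch of what that file might contain, not its actual contents: the object shape, the withOpacity helper, and the regular weight for JetBrains Mono are assumptions, while the hex values, sizes, and opacities come from the lists above.

```typescript
// Sketch of a shared theme object; the shape is assumed.
const theme = {
  colors: {
    primary: "#6C5CE7",
    secondary: "#00D2D3",
    background: "#0F0F23",
    surface: "#1A1A2E",
    text: "#FFFFFF",
    success: "#00E676",
    error: "#FF5252",
  },
  textOpacity: { primary: 0.9, secondary: 0.6 },
  fonts: {
    title: { family: "Inter", weight: 700, sizePx: 48 },
    body: { family: "Inter", weight: 400, sizePx: 24 },
    // Weight for code text is not specified in the guide; 400 is assumed.
    code: { family: "JetBrains Mono", weight: 400, sizePx: 20 },
    caption: { family: "Inter", weight: 500, sizePx: 18 },
  },
} as const;

// Converts a #RRGGBB color plus an opacity into a CSS rgba() string.
function withOpacity(hex: string, opacity: number): string {
  const r = parseInt(hex.slice(1, 3), 16);
  const g = parseInt(hex.slice(3, 5), 16);
  const b = parseInt(hex.slice(5, 7), 16);
  return `rgba(${r}, ${g}, ${b}, ${opacity})`;
}

// Primary text color: white at 90% opacity.
console.log(withOpacity(theme.colors.text, theme.textOpacity.primary));
// "rgba(255, 255, 255, 0.9)"
```

Keeping the values in one typed object means a palette change propagates to every composition without touching individual components.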
End Card (last 5 seconds of every video)
- Full EndOfCoding logo centered
- Three cross-link buttons: "Watch Next Video" (left), "Read the Ebook" (center), "Subscribe" (right)
- Social handles displayed below
- Background: animated gradient using the primary/secondary colors
Stage 5: Distribution
Each video exists in multiple formats for different platforms:
Full-Length (YouTube + Ebook Embed)
- Resolution: 1920x1080 (16:9)
- Duration: 60-120 seconds
- Format: MP4 (H.264) for YouTube, WebM for ebook embed
- Hosted on YouTube with ebook embed via YouTube iframe or self-hosted WebM
Short-Form Clips (TikTok / Instagram Reels / YouTube Shorts)
- Resolution: 1080x1920 (9:16)
- Duration: 15-60 seconds
- Extracted from the most compelling segment of the full video
- Additional text overlays for silent autoplay viewing (captions burned in)
- Platform-specific crops handled by a Remotion VerticalCrop composition
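Cutting a 9:16 clip from a 16:9 master is mostly aspect-ratio arithmetic. The sketch below computes a centered crop region; centering is an assumption (a real crop might follow the cursor or the active panel), and the Rect type and function name are hypothetical.

```typescript
type Rect = { x: number; y: number; width: number; height: number };

// Largest centered region of the source that matches the target aspect ratio.
function centeredCrop(
  srcWidth: number,
  srcHeight: number,
  targetWidth: number,
  targetHeight: number
): Rect {
  const targetAspect = targetWidth / targetHeight;
  const srcAspect = srcWidth / srcHeight;
  if (srcAspect > targetAspect) {
    // Source is wider than the target: keep full height, trim the sides.
    const width = srcHeight * targetAspect;
    return { x: (srcWidth - width) / 2, y: 0, width, height: srcHeight };
  }
  // Source is taller than (or equal to) the target: keep full width.
  const height = srcWidth / targetAspect;
  return { x: 0, y: (srcHeight - height) / 2, width: srcWidth, height };
}

// 1920x1080 master cropped for a 1080x1920 short-form clip.
const crop = centeredCrop(1920, 1080, 1080, 1920);
console.log(crop); // { x: 656.25, y: 0, width: 607.5, height: 1080 }
```

Only 607.5 of the 1920 source pixels survive horizontally, which is why screen recordings with small text usually need a zoomed or re-framed layout rather than a plain crop.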
Ebook Embed
- Lightweight WebM format with lazy loading
- Poster frame (thumbnail) displayed before playback
- Fallback: animated GIF preview with a "Watch Full Video" link to YouTube
- Accessible: full transcript available below each embedded video
SEO and Metadata
YouTube Optimization
- Title format: [Hook] | Vibe Coding Tutorial #[N]
  - Example: "I built a $9/month SaaS in 60 seconds | Vibe Coding Tutorial #1"
- Description: 200-300 words including the full prompt used, tools mentioned, timestamps, and a link to the ebook chapter
- Tags: tool-specific tags (bolt.new, cursor, claude code), technique tags (vibe coding, AI coding, prompt engineering), outcome tags (build app fast, no code saas)
- Timestamps: every section of the video marked for YouTube chapters
- Cards: each video includes a card linking to the ebook at the 75% mark
- End screen: 20-second end screen with next video and subscribe prompts
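Several of these conventions are simple enough to generate rather than type by hand. A minimal sketch, with hypothetical function names, covering the title format, the 75% card position, and chapter-style timestamps:

```typescript
// Builds a title in the "[Hook] | Vibe Coding Tutorial #[N]" format.
function youtubeTitle(hook: string, tutorialNumber: number): string {
  return `${hook} | Vibe Coding Tutorial #${tutorialNumber}`;
}

// Second at which the ebook card appears (the 75% mark).
function ebookCardSecond(durationSeconds: number): number {
  return Math.round(durationSeconds * 0.75);
}

// Formats whole seconds as m:ss for description timestamps.
function formatTimestamp(totalSeconds: number): string {
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes}:${seconds < 10 ? "0" : ""}${seconds}`;
}

console.log(youtubeTitle("I built a $9/month SaaS in 60 seconds", 1));
console.log(formatTimestamp(ebookCardSecond(80))); // "1:00"
```

Generating these from the script front matter keeps titles and timestamps consistent across the series.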
Cross-Linking
- Each YouTube video description links to the corresponding ebook chapter
- Each ebook video embed links to the YouTube version for higher-quality playback
- Related videos are suggested at the end of each ebook section
- Playlists: one per series (Prompt to Product, The Prompt That, Tool Face-Off)
Embedding Videos in the Interactive Ebook
The interactive web version of this ebook uses the Player component from Remotion's @remotion/player package to embed videos directly in the reading experience. This means videos are not external links -- they are native elements of the page, rendered inline alongside the text.
Technical Implementation
Each video is embedded using a VideoTutorial React component:
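import { Player } from "@remotion/player";
import { PTP001 } from "../compositions/prompt-to-product/PTP001-SaaS60";

// Props for one embedded tutorial (assumed shape).
// `duration` is a display label such as "90s".
type VideoTutorialProps = {
  compositionId: string;
  title: string;
  duration: string;
  tools: string[];
  transcript: string;
};

export const VideoTutorial = ({
  compositionId,
  title,
  duration,
  tools,
  transcript,
}: VideoTutorialProps) => {
  // This example always renders PTP001. compositionId is only attached as a
  // data attribute here; mapping ids to components is left out of the example.
  return (
    <section className="video-tutorial" data-composition-id={compositionId}>
      <h3>{title}</h3>
      <div className="video-meta">
        <span className="duration">{duration}</span>
        <span className="tools">{tools.join(" + ")}</span>
      </div>
      <Player
        component={PTP001}
        compositionWidth={1920}
        compositionHeight={1080}
        durationInFrames={2700} // 90s at 30fps
        fps={30}
        controls
        style={{ width: "100%", maxWidth: 800 }}
      />
      <details className="transcript">
        <summary>View Transcript</summary>
        <p>{transcript}</p>
      </details>
    </section>
  );
};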
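The component above hardcodes the frame count for a single video. One way to generalize it is a small metadata table keyed by video id, from which the Player props are derived. This is a sketch under stated assumptions: the CompositionMeta type, the table, and both helpers are hypothetical, and only the PTP-001 values (90 seconds, 30fps, 1920x1080) come from this chapter.

```typescript
type CompositionMeta = {
  durationSeconds: number;
  fps: number;
  width: number;
  height: number;
};

// Hypothetical registry keyed by the video_id used in script front matter.
const compositionMeta: Record<string, CompositionMeta> = {
  "PTP-001": { durationSeconds: 90, fps: 30, width: 1920, height: 1080 },
};

// Looks up metadata and fails loudly on an unknown id.
function getCompositionMeta(videoId: string): CompositionMeta {
  const meta = compositionMeta[videoId];
  if (!meta) {
    throw new Error(`No composition metadata registered for ${videoId}`);
  }
  return meta;
}

// The Player expects a whole number of frames.
function durationInFrames(meta: CompositionMeta): number {
  return Math.round(meta.durationSeconds * meta.fps);
}

console.log(durationInFrames(getCompositionMeta("PTP-001"))); // 2700
```

Deriving the frame count this way avoids the embed drifting out of sync when a video's length changes.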
Reader Experience
When a reader scrolls to a video in the ebook:
- Poster frame -- A thumbnail of the most visually interesting moment loads immediately (lazy-loaded image, minimal bandwidth)
- Play button overlay -- A single click starts playback. Videos do not autoplay
- Inline controls -- Play/pause, scrub bar, volume, fullscreen, and playback speed (0.5x to 2x)
- Transcript toggle -- A collapsible section below the video contains the full narration transcript, making the content accessible and searchable
- Chapter links -- If the video references tools or concepts covered in other chapters, inline links appear below the video
Offline and Static Fallbacks
For the markdown and Word versions of the ebook (which cannot embed video):
- Each video section includes the full script as formatted text
- A QR code links to the YouTube version
- A static screenshot of the key moment serves as the visual anchor
- The caption reads: "Watch this tutorial: [YouTube URL]"
For the static HTML version (no JavaScript):
- An animated GIF preview (5-10 seconds, looped) provides a visual taste
- A prominent "Watch Full Tutorial" button links to YouTube
- The transcript is displayed by default (not collapsed)
Video Production Schedule
New videos are added on a monthly cadence. The production schedule follows the tool landscape -- when a major tool update ships, a new video is produced within two weeks to document the changed workflow.
| Month | Planned Videos | Series |
|---|---|---|
| March 2026 | #1 60-Second SaaS, #6 Game Builder | Prompt to Product, The Prompt That |
| April 2026 | #12 IDE Showdown, #7 Broke Everything | Tool Face-Off, The Prompt That |
| May 2026 | #2 Portfolio Speedrun, #13 Builder Battle | Prompt to Product, Tool Face-Off |
| June 2026 | #3 The $0 Startup, #8 Got Me Fired | Prompt to Product, The Prompt That |
| July 2026 | #14 Agent Arena, #9 Replaced My Intern | Tool Face-Off, The Prompt That |
| August 2026 | #4 Clone Wars, #10 Mom Could Use | Prompt to Product, The Prompt That |
| September 2026 | #15 Speed vs Quality, #11 Fooled Senior Dev | Tool Face-Off, The Prompt That |
| October 2026 | #5 Debug Olympics, New TBD | Prompt to Product, TBD |
The schedule prioritizes alternating between series to maintain variety. High-impact tool launches (new Cursor version, Claude Code update, new entrant) can preempt the schedule.
Video Index
A quick-reference table of all videos in this chapter:
| # | Title | Series | Tool(s) | Duration | Status |
|---|---|---|---|---|---|
| 1 | I built a $9/month SaaS in 60 seconds | Prompt to Product | Bolt.new | 60-90s | Pre-production |
| 2 | Your portfolio shouldn't take longer than your morning coffee | Prompt to Product | v0 + Vercel | 60-90s | Pre-production |
| 3 | This app makes money. I didn't write a single line. | Prompt to Product | Lovable | 60-90s | Pre-production |
| 4 | I showed AI a screenshot of Notion. Here's what happened. | Prompt to Product | Cursor | 60-90s | Pre-production |
| 5 | Can AI fix a bug faster than Stack Overflow? | Prompt to Product | Claude Code | 60-90s | Pre-production |
| 6 | The Prompt That Built a Game | The Prompt That | Claude Code | 90-120s | Pre-production |
| 7 | The Prompt That Broke Everything | The Prompt That | Bolt.new | 90-120s | Pre-production |
| 8 | The Prompt That Got Me Fired (Hypothetically) | The Prompt That | Claude Code | 90-120s | Pre-production |
| 9 | The Prompt That Replaced My Intern | The Prompt That | Cursor + Claude Code | 90-120s | Pre-production |
| 10 | The Prompt That Even My Mom Could Use | The Prompt That | Lovable | 90-120s | Pre-production |
| 11 | The Prompt That Fooled the Senior Dev | The Prompt That | Claude Code | 90-120s | Pre-production |
| 12 | IDE Showdown: Cursor vs Claude Code vs Codex CLI | Tool Face-Off | Cursor, Claude Code, Codex CLI | 90-120s | Pre-production |
| 13 | Builder Battle: Bolt.new vs Lovable vs Replit Agent | Tool Face-Off | Bolt.new, Lovable, Replit Agent | 90-120s | Pre-production |
| 14 | Agent Arena: Devin vs Jules vs Claude Code | Tool Face-Off | Devin, Jules, Claude Code | 90-120s | Pre-production |
| 15 | Speed vs Quality: Bolt.new vs Claude Code | Tool Face-Off | Bolt.new, Claude Code | 90-120s | Pre-production |
Measuring Video Impact
Each video is tracked across platforms with the following metrics:
Engagement Metrics
- YouTube: watch time, average view duration, click-through rate on ebook links
- TikTok/Reels/Shorts: views, shares, saves, profile visits
- Ebook: play rate (percentage of readers who click play), completion rate, transcript expansion rate
Conversion Metrics
- YouTube-to-ebook click rate (tracked via UTM parameters in description links)
- Ebook-to-YouTube click rate (tracked via embed interaction events)
- New subscriber acquisition per video
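The UTM tracking mentioned above can be sketched in a few lines. This is a minimal illustration, not the tracking setup actually used for the book: the base URL, the helper name `add_utm`, and the parameter values are all hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_utm(url: str, source: str, medium: str, campaign: str) -> str:
    """Return url with utm_* parameters added, keeping any existing query."""
    parts = urlparse(url)
    query = dict(parse_qsl(parts.query))
    query.update(
        {"utm_source": source, "utm_medium": medium, "utm_campaign": campaign}
    )
    return urlunparse(parts._replace(query=urlencode(query)))


# Hypothetical description link for video #7
link = add_utm(
    "https://example.com/ebook?ch=20", "youtube", "video_description", "video-07"
)
print(link)
```

Giving each video its own `utm_campaign` value is what makes a per-video YouTube-to-ebook click rate possible in whatever analytics tool reads the parameters.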
Quality Metrics
- Audience retention curve (identifying where viewers drop off)
- Comment sentiment (positive/negative/neutral classification)
- Video-specific NPS from reader surveys
Videos with below-average retention in the first 5 seconds get their hooks rewritten. Videos with above-average ebook-to-YouTube conversion get promoted in the chapter ordering.
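As a rough sketch of that rule, assuming each video has a first-5-seconds retention figure and an ebook-to-YouTube click rate (the field names and sample numbers below are invented for illustration):

```python
from statistics import mean


def triage(videos):
    """Split videos into 'rewrite the hook' and 'promote' lists.

    Each video is a dict with hypothetical keys:
    'title', 'retention_5s', 'ebook_to_youtube_rate'.
    """
    avg_retention = mean(v["retention_5s"] for v in videos)
    avg_conversion = mean(v["ebook_to_youtube_rate"] for v in videos)
    rewrite = [v["title"] for v in videos if v["retention_5s"] < avg_retention]
    promote = [
        v["title"] for v in videos if v["ebook_to_youtube_rate"] > avg_conversion
    ]
    return rewrite, promote


# Made-up sample data, not real metrics
sample = [
    {"title": "A", "retention_5s": 0.9, "ebook_to_youtube_rate": 0.02},
    {"title": "B", "retention_5s": 0.6, "ebook_to_youtube_rate": 0.03},
    {"title": "C", "retention_5s": 0.8, "ebook_to_youtube_rate": 0.08},
]
rewrite, promote = triage(sample)
```

The two checks are independent, so a single video can land in both lists.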
This chapter is updated monthly with 2-4 new videos as the vibe coding tool landscape evolves. Each update includes new video entries, refreshed comparisons when tools ship major versions, and community-requested tutorials. Last updated: March 2026.
21. Monthly Intelligence Brief: April 2026
What changed in the vibe coding world this month. Updated on the 1st of each month for subscribers.
trivy-action GitHub Actions tags; it took five days to fully evict them, and they published additional malicious Docker images during the remediation effort. The attack then cascaded into CanisterWorm, a self-propagating npm worm that hit 64+ packages using a blockchain-based command-and-control infrastructure, making it unusually resistant to takedown. CanisterWorm subsequently infected Checkmarx KICS and AST GitHub Actions, and separately reached LiteLLM (95 million monthly PyPI downloads). The combined blast radius makes this the most extensive supply chain cascade in AI developer tooling history. Treat any Trivy, Checkmarx, or LiteLLM pipeline that ran between March 19 and April 10 as potentially compromised.
Previous Month: March 2026
Key Developments
/loop command (cron-like session-scoped task scheduler), Skills.md for persistent agent behaviors, a 1-million-token context window, and increased max output to 64k tokens for Opus 4.6 (128k upper bound for both Opus 4.6 and Sonnet 4.6). MCP servers can now request structured input mid-task via interactive dialogs. /loop turns Claude Code into a background worker for PR reviews, deployment monitoring, and recurring analysis tasks, the closest any tool has come to a fully autonomous development partner.
Numbers Update (April 9, 2026)
What to Watch in May 2026
- GitHub Copilot opt-out deadline (April 24): Teams with proprietary or regulated code must opt out before this date or accept that interaction data trains future models
- Claude Mythos general availability: Anthropic restricted it to cybersecurity defense; when and how does the most capable public coding model emerge?
- CanisterWorm cleanup: Is the blockchain C2 infrastructure being taken down? Watch for new packages hit after April 9
- Meta Muse Spark coding benchmarks: Currently strong in reasoning/science, weaker in coding. Will dedicated coding evals change the picture?
- Supply chain security posture: Will npm, PyPI, and Docker Hub introduce mandatory provenance for AI-ecosystem packages after the Trivy/CanisterWorm cascade?
- EU AI Act full applicability: August 2, 2026. Guidance for AI coding tools in regulated industries ramping up
- Google I/O (typically May): Anticipated announcements on Jules, Gemini CLI, and Antigravity roadmap
- Replit path to $1B ARR: declared the year-end target after $9B raise — watch monthly revenue disclosures
- Lovable acquisitions: M&A offensive declared — which AI devtools will be absorbed first?
- Cursor $50B raise close: if the reported fundraising round closes, it would be the largest AI coding tool valuation ever
22. Community Showcase
Real projects built by real people using vibe coding. Updated monthly.
Welcome to the Showcase
This chapter is different from the rest of the book. It is not written by us -- it is written by you.
Every project featured here was built using the techniques, tools, and philosophies described in the preceding chapters. Some were built by seasoned developers experimenting with a new workflow. Others were built by people who had never written a line of code before picking up Cursor or Bolt.new. All of them went from idea to deployed software using AI-native development.
The community showcase exists for three reasons:
- Proof that it works. Theory is useful. Seeing a non-technical product manager ship an internal dashboard in four hours is more useful.
- Shared knowledge. Every submission includes the prompts that worked, the mistakes that cost time, and the metrics that followed. This is a living library of hard-won lessons.
- Inspiration. The gap between "I should build something" and "I shipped something" is often just seeing someone in a similar position who already did it.
We review submissions monthly and feature the most instructive projects -- not necessarily the most impressive ones. A weekend prototype that taught the builder three critical lessons about prompt structure is more valuable here than a polished SaaS with no story behind it.
How to Submit Your Project
We welcome submissions from anyone who has built and deployed something using AI-native development tools. Your project does not need to be generating revenue. It does not need to be technically sophisticated. It needs to be real, deployed, and accompanied by an honest account of how it was built.
Submission Template
Copy the template below, fill it in, and submit it to showcase@endofcoding.com or post it in the #showcase channel on our community Discord.
## Project Submission
**Project Name:**
[Your project name]
**Live URL:**
[Link to the deployed project]
**Builder Name:**
[Your name or handle]
**Builder Background:**
[Developer / Designer / Product Manager / Non-technical / Student / Other]
[Brief bio: 1-2 sentences about your experience level and day job]
**Tools Used:**
[List all AI tools: Cursor, Claude Code, Bolt.new, v0, Lovable, Replit Agent, etc.]
[List supporting tools: Vercel, Supabase, Stripe, Tailwind, etc.]
**Timeline:**
[Time from first prompt to deployed: e.g., "6 hours over a weekend"]
**Key Prompts (1-3 of your best prompts that made the biggest difference):**
Prompt 1:
"""
[Paste the actual prompt text you used]
"""
Why it worked: [Brief explanation]
Prompt 2:
"""
[Paste the actual prompt text]
"""
Why it worked: [Brief explanation]
Prompt 3 (optional):
"""
[Paste the actual prompt text]
"""
Why it worked: [Brief explanation]
**What Went Right:**
- [Bullet point]
- [Bullet point]
- [Bullet point]
**What Went Wrong:**
- [Bullet point]
- [Bullet point]
- [Bullet point]
**Metrics (share what you are comfortable sharing):**
- Users: [number or range]
- Revenue: [if applicable]
- Other: [downloads, signups, press mentions, job offers, etc.]
**One Sentence of Advice for Someone Starting Today:**
[Your best tip]
Submission Guidelines
- Be honest. The community benefits more from "this broke three times and here's why" than from a highlight reel.
- Include real prompts. Paraphrased or sanitized prompts are less useful. Share the actual text you typed.
- Deployed means deployed. The project must be accessible at a URL or downloadable. Screenshots alone are not sufficient.
- One submission per project. You can submit multiple projects, but each gets its own entry.
- Updates welcome. If your project evolves significantly, resubmit with a note about what changed.
Featured Projects
Project 1: WaitlistWizard -- SaaS Micro-Tool Built in a Weekend
What it is: A standalone waitlist management tool for indie makers launching products. Users create a waitlist page with a custom domain, collect emails with referral tracking, and send launch-day notifications. Includes an analytics dashboard showing signup velocity, referral sources, and geographic distribution.
Builder Profile: Marcus Chen, 29. Full-stack developer at a mid-size fintech company during the week. Side-project builder on weekends. Had used GitHub Copilot for two years but had never tried a full vibe coding workflow until this project.
Tools Stack:
- Cursor (Composer mode with Claude 3.5 Sonnet) for all code generation
- Next.js 14 with App Router
- Supabase for database, auth, and real-time subscription counts
- Tailwind CSS for styling
- Vercel for hosting
- Resend for transactional emails
- Stripe for the $9/month pro tier
Build Timeline: 14 hours across a Saturday and Sunday. First prompt at 9 AM Saturday. Deployed and shared on X at 11 PM Sunday.
Key Prompts:
Prompt 1 -- The initial spec:
Build a waitlist management SaaS with Next.js 14 App Router and Supabase.
Core features:
1. Landing page builder: user creates a waitlist page with custom title,
description, and color scheme. Each page gets a unique slug (/w/[slug]).
2. Email collection: visitors enter email, get position number.
Referral link generated automatically. Each referral moves the referrer
up 3 positions.
3. Dashboard: real-time count of signups, chart of signups over time,
top referrers table, geographic breakdown (from IP geolocation).
4. Launch notification: one-click send to all collected emails.
Auth: Supabase Auth with GitHub and Google OAuth.
Database: Supabase PostgreSQL with RLS policies.
Styling: Tailwind with a clean, minimal aesthetic. Dark mode default.
Start with the database schema and RLS policies, then build the
dashboard, then the public-facing waitlist pages.
Why it worked: Front-loading the database schema and RLS policies meant the entire data layer was solid before any UI code was written. This prevented three or four rounds of restructuring that typically happen when you build UI first.
Prompt 2 -- Referral tracking logic:
Add referral tracking to the waitlist system.
When a user signs up for a waitlist:
1. Generate a unique referral code (8 char alphanumeric)
2. Create a shareable URL: [domain]/w/[slug]?ref=[code]
3. When someone signs up via a referral link, record the referral
4. Move the referrer up 3 positions in the queue
5. Send the referrer an email: "Someone joined through your link!
You moved up to position [X]."
Store referral chains (who referred whom) for the dashboard analytics.
Prevent self-referral. Cap position boost at top 10% of the list.
Handle edge cases: expired waitlists, duplicate signups from same email,
referral codes for non-existent waitlists.
Why it worked: Explicitly listing edge cases in the prompt eliminated two bugs that would have appeared in production. The AI handled all four edge cases correctly on the first generation.
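To make the rules in this prompt concrete, here is a small Python sketch of the referral logic. It is not the code Cursor generated for WaitlistWizard. The function names are hypothetical, and the cap is implemented under one possible reading of "top 10%": a boost can never move someone past the boundary of the top 10% of the list.

```python
import math
import secrets
import string

ALPHABET = string.ascii_letters + string.digits
BOOST = 3         # positions gained per referral (from the prompt)
CAP_PERCENT = 10  # boosts stop at the top 10% (assumed interpretation)


def new_referral_code(existing: set[str]) -> str:
    """Generate an 8-character alphanumeric code that is not already taken."""
    while True:
        code = "".join(secrets.choice(ALPHABET) for _ in range(8))
        if code not in existing:
            return code


def boosted_position(current: int, list_size: int) -> int:
    """Return the referrer's new 1-based position after one referral."""
    best = max(1, math.ceil(list_size * CAP_PERCENT / 100))
    if current <= best:
        return current  # already inside the capped zone, no further boost
    return max(best, current - BOOST)


def is_valid_referral(referrer_email: str, new_email: str, signed_up: set[str]) -> bool:
    """Reject self-referrals and duplicate signups (case-insensitive)."""
    new = new_email.strip().lower()
    return new != referrer_email.strip().lower() and new not in signed_up
```

On a 100-person list, someone at position 50 moves to 47, someone at 11 stops at 10, and someone already at 5 stays put. Whether that matches the intended behavior is exactly the kind of detail worth checking in generated code.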
Prompt 3 -- The analytics dashboard:
Build the waitlist analytics dashboard. The user is logged in and
viewing their waitlist's stats.
Show:
- Total signups (big number with daily change indicator, green up/red down)
- Signup velocity chart (line chart, last 30 days, using Recharts)
- Top 10 referrers table (name, referral count, conversion rate)
- Geographic distribution (top 5 countries as horizontal bar chart)
- Recent signups feed (last 20, real-time updates via Supabase Realtime)
All data fetched server-side with React Server Components.
The recent signups feed is a Client Component with real-time subscription.
Loading states: skeleton UI for each card while data loads.
Empty states: friendly message + illustration when no data yet.
Why it worked: Separating server components from client components in the prompt gave the AI clear architectural guidance. The result needed zero restructuring.
Before/After: Marcus had previously attempted to build a similar waitlist tool using traditional development. He spent three weekends on it, got about 60% through the feature set, and abandoned it when the referral position tracking logic became tangled. With vibe coding, the complete feature set was done in one weekend, including features he had not originally planned (geographic analytics, real-time feed).
Lessons Learned:
- Specifying database schema first in the prompt produces dramatically better results than letting the AI infer it from feature descriptions.
- Supabase RLS policies generated by AI need manual review. Two of the four generated policies had overly permissive conditions that would have allowed users to read each other's waitlist data.
- The AI-generated Stripe webhook handler worked on the first try, which was surprising -- this had been a pain point in every previous project.
- Deploying to Vercel mid-build (after the first two hours) and testing against the real deployment caught three environment variable issues early.
- Total cost: $0 for the build (Cursor Pro subscription he already had). $20/month for Supabase Pro + Vercel Pro once users started arriving.
Outcome: Posted on X and Hacker News the following Monday. 340 upvotes on HN. 2,100 signups in the first week. 180 paying users ($9/month) within 60 days. Currently at $1,620 MRR and growing. Marcus has not yet quit his day job but is now building his second product using the same workflow.
Project 2: FieldSync -- Internal Tool Built by a Non-Technical PM
What it is: An internal field operations dashboard for a 40-person landscaping company. Tracks crew assignments, job status, equipment location, client notes, and daily route optimization. Replaced a mess of shared spreadsheets, WhatsApp groups, and sticky notes on the dispatch office wall.
Builder Profile: Rachel Torres, 34. Operations manager at GreenScape Landscaping in Austin, TX. No programming experience. Had taken one HTML course in college a decade ago. Uses Excel daily and considers herself "tech-comfortable but not technical."
Tools Stack:
- Bolt.new for initial prototype
- Lovable for UI refinement and additional features
- Supabase for database and auth
- Google Maps API for route display
- Vercel for hosting
Build Timeline: Three evenings after work (roughly 3 hours each) plus most of a Saturday. Total: approximately 16 hours.
Key Prompts:
Prompt 1 -- The initial description:
I manage a landscaping company with 8 crews of 5 people each.
Every morning I assign crews to jobs using a spreadsheet and a
WhatsApp group. I need an app that:
1. Shows today's jobs on a map with crew assignments
2. Lets me drag and drop to reassign crews to different jobs
3. Crews can update job status from their phones (not started /
in progress / done / issue)
4. Tracks which equipment trailer is with which crew
5. Stores client notes that persist between visits
6. Shows me a daily summary: jobs completed, revenue, crew utilization
Make it simple. My crews are not tech people. The mobile view needs
to be dead simple -- big buttons, minimal text.
I want to log in as admin and see everything. Crews log in with a
simple PIN code and only see their assigned jobs for today.
Why it worked: Writing from the perspective of the actual problem -- not in technical terms -- gave the AI everything it needed. Rachel did not know what a "database" or "REST API" was. She described her day, and the AI built the system to match it.
Prompt 2 -- Fixing the mobile experience:
The crew mobile view is too complicated. They need to see ONLY:
- Their jobs for today, in order
- A big button to change status (green = done, yellow = issue)
- A notes field for each job
- Nothing else
Remove the navigation menu on mobile. Remove the map on mobile.
Remove the equipment section on mobile. Crews do not need any of that.
Just the job list and status buttons. Make the buttons large enough
to tap with work gloves on.
Why it worked: The first version had given crews the same interface as the admin. This prompt stripped it down to exactly what a landscaper standing in a yard with dirty gloves needs. The "work gloves" detail led the AI to generate oversized touch targets (minimum 56px) -- better than many professional mobile apps.
Before/After: Before: Rachel spent 45 minutes every morning in dispatch, managing the spreadsheet, texting crew leaders, and calling clients. Crews often arrived at jobs without knowing the client's gate code or special instructions. Equipment went missing for days because nobody tracked which trailer went where.
After: Morning dispatch takes 10 minutes. Crews see their assignments on their phones before they leave the yard. Client notes (gate codes, dog warnings, irrigation shutoff locations) carry over automatically between visits. Equipment tracking reduced "lost trailer" incidents from two per month to zero in the first quarter.
Lessons Learned:
- Non-technical builders should start with Bolt.new or Lovable, not Cursor. The visual feedback loop is critical when you cannot read code.
- The PIN-code authentication for crews was Rachel's most important design decision. Username/password would have been a non-starter for the field workers.
- Google Maps API costs added up faster than expected. Rachel switched to a static map image for the daily overview and only loads the interactive map when a crew lead taps a specific job. Monthly API cost dropped from $47 to $8.
- The AI initially built a beautiful but unnecessary crew scheduling Gantt chart. Rachel deleted the entire component with one prompt: "Remove the Gantt chart. We don't need it. Keep it simple."
- Having a real user (her dispatch coordinator, Maria) test the app on day two caught three usability issues that Rachel had missed.
Outcome: FieldSync has been in daily use at GreenScape for five months. All eight crews use it. Rachel estimates it saves 6 hours of administrative time per week across the company. The owner asked her to "sell it to other landscaping companies," which she is now exploring. Total build cost: $0 (Bolt.new free tier was sufficient for the prototype; Lovable's free tier handled the refinements). Ongoing cost: $25/month (Supabase) + $8/month (Google Maps API).
Project 3: Resonance -- Startup MVP That Got Into Y Combinator
What it is: An AI-powered customer feedback analysis platform. Companies connect their support channels (Zendesk, Intercom, email), and Resonance automatically categorizes feedback by theme, sentiment, and urgency. Surfaces product insights that typically take a research team weeks to compile.
Builder Profile: David Park and Jenna Liu, both 27. David is a former ML engineer at a mid-tier AI startup. Jenna was a product manager at Salesforce. Neither had built a full-stack consumer product before. They quit their jobs in September 2025 with savings to cover six months.
Tools Stack:
- Claude Code for backend architecture and API integrations
- Cursor for frontend development
- Next.js 14 with App Router
- Supabase for database, auth, and vector storage
- OpenAI API for embeddings and classification
- Anthropic API for summary generation
- Vercel for hosting
- Stripe for billing
Build Timeline: Three weeks from first prompt to a working MVP. One additional week for polish before the YC application. Total: four weeks with two people working full-time.
Key Prompts:
Prompt 1 -- System architecture:
Design the architecture for a customer feedback analysis platform.
Data flow:
1. INGEST: Connect to Zendesk, Intercom, and email (IMAP) to pull
customer messages. Webhook listeners for real-time ingestion.
Dedup messages that appear in multiple channels.
2. PROCESS: For each message:
- Generate embedding (OpenAI text-embedding-3-small)
- Classify sentiment (positive/neutral/negative/urgent)
- Extract themes (use clustering on embeddings, auto-generate
theme labels)
- Score urgency (1-5 based on sentiment + keywords + customer tier)
3. STORE: PostgreSQL for structured data. Supabase pgvector for
embeddings. Link every insight back to source messages.
4. SURFACE: Dashboard showing:
- Theme clusters with message counts and trends
- Sentiment distribution over time
- Urgent items requiring immediate attention
- Weekly auto-generated summary of top themes and shifts
Multi-tenant: each company sees only their own data. RLS enforced
at the database level. API keys scoped per integration per company.
Build the ingestion pipeline first. I want to connect a test Zendesk
instance and see messages flowing into the database within the first
session.
Why it worked: David wrote this prompt like a system design document. The level of specificity on data flow, multi-tenancy, and storage separation meant Claude Code generated a clean, well-separated architecture on the first pass. The instruction to get data flowing in the first session kept the AI focused on the critical path.
Prompt 2 -- The insight generation engine:
Build the weekly insight report generator.
Input: All feedback messages from the past 7 days for a given company.
Process:
1. Cluster messages by theme (using cosine similarity on embeddings,
threshold 0.82)
2. For each cluster with 5+ messages:
- Generate a theme label (3-5 words)
- Count messages and calculate sentiment breakdown
- Identify the most representative message (closest to centroid)
- Compare to previous week: is this theme growing, shrinking, or new?
3. Rank themes by: (message_count * urgency_avg * growth_rate)
4. Generate executive summary using Claude:
- 3 paragraphs maximum
- Lead with the most important shift
- Include specific numbers
- End with a recommended action
Output: Structured JSON with themes array and summary text.
Store in reports table. Send via email to company admin.
Handle edge cases: company with fewer than 10 messages that week
(skip report, send "not enough data" note), themes that appear
for the first time (flag as "emerging"), themes that disappear
(flag as "resolved").
Why it worked: The mathematical specificity (cosine similarity threshold, minimum cluster size, ranking formula) gave the AI enough constraints to produce a working implementation without guessing. Jenna later said the ranking formula in the prompt became the actual production ranking formula -- it was that well-specified.
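The numeric constraints in this prompt translate almost directly into code. The sketch below is an illustration only, not Resonance's production code. It assumes a greedy single-pass clustering strategy (the prompt does not name an algorithm), and the dictionary keys are hypothetical.

```python
import math

SIM_THRESHOLD = 0.82  # cosine similarity threshold from the prompt
MIN_CLUSTER_SIZE = 5  # "each cluster with 5+ messages"


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster(embeddings):
    """Greedy clustering: join the first cluster whose seed vector is
    similar enough, otherwise start a new cluster. Returns index lists."""
    clusters = []  # first index in each list is that cluster's seed
    for i, vec in enumerate(embeddings):
        for members in clusters:
            if cosine(vec, embeddings[members[0]]) >= SIM_THRESHOLD:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def rank_themes(themes):
    """Keep themes with enough messages, then sort by
    message_count * urgency_avg * growth_rate, highest first."""
    eligible = [t for t in themes if t["message_count"] >= MIN_CLUSTER_SIZE]
    return sorted(
        eligible,
        key=lambda t: t["message_count"] * t["urgency_avg"] * t["growth_rate"],
        reverse=True,
    )
```

The prompt leaves `growth_rate` undefined. If it is a week-over-week ratio, a theme with no previous week needs a default value, which is one more edge case to settle before relying on the ranking.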
Before/After: Before: David and Jenna had a pitch deck, three notebooks of customer research, and a Figma prototype. No working software. Their previous attempt at building the MVP with traditional development (David coding the backend, contracting a frontend developer) had consumed six weeks and $12,000 in contractor fees with only the auth system and a basic dashboard to show for it.
After: A fully functional platform that could ingest from Zendesk, classify feedback, cluster themes, and generate weekly reports. Three beta customers were using it with real data. The YC demo showed live feedback flowing in and being categorized in real time.
Lessons Learned:
- The combination of Claude Code for backend/architecture and Cursor for frontend was more effective than using either tool alone. Claude Code handled the complex data pipeline logic better; Cursor was faster for UI iteration.
- AI-generated API integrations (Zendesk, Intercom) worked for the happy path but failed on pagination, rate limiting, and error recovery. These required manual intervention and were the primary source of bugs during beta.
- The multi-tenant RLS policies were the single highest-risk component. David reviewed every policy line by line -- this was not a place to vibe.
- Having three beta customers during the build, not after, changed everything. Real data exposed clustering issues that synthetic test data never would have.
- YC partners were not impressed by the fact that it was vibe-coded. They were impressed by the speed: four weeks from zero to three paying customers with real usage data.
Outcome: Accepted into Y Combinator W26 batch. Raised a $500K pre-seed round before the batch started. Currently at $8,400 MRR with 14 paying companies. David estimates the vibe coding approach saved them three months and $40,000+ in development costs compared to traditional development, which directly extended their runway.
Project 4: karandev.co -- Developer Portfolio That Landed a Job
What it is: A personal developer portfolio site with interactive project showcases, a working blog with MDX support, an AI chatbot trained on the builder's resume and projects, and a live "what I'm working on" status pulled from GitHub and Spotify APIs.
Builder Profile: Karan Patel, 22. Recent computer science graduate from a state university. Solid fundamentals in Python and Java from coursework, but limited experience with modern web frameworks. Had applied to 47 junior developer positions with a plain HTML resume site. Zero callbacks.
Tools Stack:
- Cursor (Composer mode) for all development
- Next.js 14 with App Router
- Tailwind CSS + Framer Motion for animations
- MDX for blog posts
- Vercel AI SDK + OpenAI for the resume chatbot
- GitHub API + Spotify API for live status widgets
- Vercel for hosting
Build Timeline: One full week of focused work during winter break. Approximately 40 hours total.
Key Prompts:
Prompt 1 -- Portfolio design direction:
Build a developer portfolio site that will make a hiring manager stop
scrolling. Next.js 14 App Router with Tailwind CSS.
Design: Dark theme. Subtle grain texture background. Smooth scroll.
Minimal but not boring. Accent color: electric blue (#3B82F6).
Typography: Inter for body, JetBrains Mono for code snippets.
Sections:
1. Hero: My name in large type. One-line tagline that rotates between
3 phrases (typed animation effect). Small "scroll down" indicator.
2. About: 2-paragraph bio. Photo (circular, subtle border glow).
Tech stack icons grid (React, Python, TypeScript, etc.) with
hover tooltips.
3. Projects: 3-4 cards in a grid. Each card: screenshot, title,
one-line description, tech tags, links to live demo + GitHub.
Cards tilt slightly on hover (3D transform). Click to expand
into full case study.
4. Blog: Latest 3 posts pulled from MDX files. Title, date, read time,
excerpt. Link to full post.
5. Contact: Simple email form (Resend API). Social links row.
Page transitions: smooth with Framer Motion. Sections fade-in on scroll.
Performance: 95+ Lighthouse score. No layout shift.
Why it worked: The prompt read like a creative brief, not a feature list. Details like "grain texture background," "cards tilt slightly on hover," and "typed animation effect" gave the AI a visual vision to execute against. The Lighthouse score target acted as a quality gate.
Prompt 2 -- The resume chatbot:
Add an AI chatbot to the portfolio that answers questions about me.
It should be a small floating chat bubble in the bottom right corner.
When opened, it expands into a chat window. Powered by OpenAI GPT-4o-mini
via the Vercel AI SDK.
System prompt for the chatbot:
"You are a helpful assistant on Karan Patel's portfolio website.
You answer questions about Karan's skills, experience, projects,
and education based on the context provided. You are friendly,
concise, and professional. If asked something not covered in the
context, say you don't have that information and suggest emailing
Karan directly. Never make up information about Karan."
Context document (embed this in the system prompt):
[I will paste my resume and project descriptions here]
Features:
- Streaming responses (token by token appearance)
- Suggested starter questions: "What are Karan's top skills?",
"Tell me about his projects", "What is his education background?"
- Rate limit: max 20 messages per session to control API costs
- Chat history persists in the browser session (sessionStorage)
- Mobile responsive: full-width chat panel on screens under 640px
Why it worked: Providing the exact system prompt within the development prompt eliminated a round of iteration. The rate limit and cost control details showed practical thinking that the AI translated directly into implementation.
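The 20-message cap is the cost-control piece of this prompt. A counter kept only in the browser can be reset by the visitor, so the sketch below shows a server-side check. It is a language-neutral illustration in Python with a hypothetical `SessionLimiter` class and an in-memory store, not Karan's actual implementation.

```python
from dataclasses import dataclass, field

MAX_MESSAGES_PER_SESSION = 20  # limit taken from the prompt


@dataclass
class SessionLimiter:
    """Counts chat messages per session ID in memory (resets on restart)."""

    counts: dict[str, int] = field(default_factory=dict)

    def allow(self, session_id: str) -> bool:
        """Return True and record the message if the session is under the cap."""
        used = self.counts.get(session_id, 0)
        if used >= MAX_MESSAGES_PER_SESSION:
            return False
        self.counts[session_id] = used + 1
        return True


limiter = SessionLimiter()
results = [limiter.allow("session-abc") for _ in range(21)]
```

The chat endpoint would call `allow()` before forwarding a message to the model and return a friendly "limit reached" reply when it comes back `False`.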
Before/After: Before: A single-page HTML resume with a white background, Times New Roman font, and three bullet-pointed project descriptions. Karan described it as "what you'd get if you exported a Google Doc to HTML." Forty-seven applications sent. Zero interviews.
After: A polished portfolio with smooth animations, interactive project showcases, a working blog, and an AI chatbot that could answer recruiter questions about Karan's experience at 2 AM. The chatbot alone generated over 600 conversations in the first month.
Lessons Learned:
- The AI chatbot was the differentiator. Three interviewers specifically mentioned it. One said, "I asked your chatbot about your Python experience and it convinced me to bring you in."
- Framer Motion animations generated by AI worked but were initially too aggressive (elements flying in from all directions). Karan's best prompt was a one-liner: "Reduce all animations to subtle fades and slight upward slides. Nothing should feel like a PowerPoint transition."
- The Spotify "now playing" widget was a fun addition but caused a privacy concern Karan had not anticipated -- it was broadcasting his music taste to potential employers during interviews. He added a toggle to disable it.
- MDX blog setup took longer than expected. The AI-generated MDX configuration worked for basic posts but broke on code blocks with certain languages. This required actual debugging rather than prompt iteration.
- Total cost: $0 for the build. Approximately $3/month for the OpenAI API calls powering the chatbot (GPT-4o-mini is cheap at volume).
Outcome: Karan posted the portfolio on r/webdev, Twitter, and LinkedIn. The Reddit post received 1,200 upvotes. The portfolio has had 14,000 unique visitors in three months. He received 11 interview requests in the first two weeks after launching. Accepted a junior full-stack developer role at a Series B startup in San Francisco. Starting salary: $135,000 -- $30,000 more than the median offer for new grads from his university. His manager later told him: "The portfolio showed us you could ship, not just code."
Project 5: Dungeon of Echoes -- A Game Built by a Teenager
What it is: A browser-based roguelike dungeon crawler with procedurally generated levels, pixel art aesthetics, turn-based combat, and a permadeath mechanic. Players descend through floors, collect loot, fight monsters, and try to reach floor 50. Leaderboard tracks the deepest floor reached.
Builder Profile: Aiden Nakamura, 16. High school junior in Portland, OR. Plays video games constantly. Had completed a Python basics course on Codecademy and built a few simple scripts. No web development or game development experience. Started this project during a snow day when school was cancelled.
Tools Stack:
- Replit Agent for initial game prototype
- Claude.ai (free tier) for debugging and game design advice
- HTML5 Canvas for rendering
- Vanilla JavaScript (no frameworks)
- localStorage for save data and leaderboard
- Replit hosting (free tier)
Build Timeline: Two weeks of after-school sessions (2-3 hours each) plus two full weekend days. Total: approximately 35 hours.
Key Prompts:
Prompt 1 -- The game concept:
Build a roguelike dungeon crawler game in HTML5 Canvas and JavaScript.
No frameworks, just vanilla JS.
The player starts on floor 1 of a dungeon. Each floor is a grid of
rooms generated randomly. The player moves with arrow keys. Each room
can contain: nothing, a monster, a treasure chest, a health potion,
or stairs down to the next floor.
Combat is turn-based. Player and monster take turns attacking. Damage
is based on attack stat minus defense stat plus a random factor.
When a monster dies, it drops gold and maybe an item.
Items: sword (increase attack), shield (increase defense), potion
(restore health). Items have rarity levels: common (white), rare (blue),
epic (purple). Higher rarity = better stats.
Permadeath: when the player dies, the run is over. Show a death screen
with stats: floors cleared, monsters killed, gold collected, time played.
Visual style: 16x16 pixel art aesthetic using simple colored squares
and basic shapes. Dark background. The dungeon should feel gloomy.
Start with movement and room generation. Add combat second.
Add items third. Add the death screen last.
Why it worked: Breaking the build into a clear sequence (movement, then combat, then items, then death screen) matched how game development actually works -- you get the core loop right before adding layers. Aiden said the AI "built each layer perfectly because it always had the previous layer working first."
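The damage rule from Prompt 1 (attack stat minus defense stat plus a random factor) is small enough to sketch in a few lines. The version below is an illustration in Python, not Aiden's actual code (the game itself is vanilla JavaScript). The function name `compute_damage`, the random spread of plus or minus 2, and the floor of 1 damage per hit are all assumptions added for the example.

```python
import random


def compute_damage(attack: int, defense: int, rng: random.Random,
                   spread: int = 2, minimum: int = 1) -> int:
    """Return attack minus defense plus a random factor.

    The spread and the minimum-damage floor are assumptions; the prompt
    only says "attack stat minus defense stat plus a random factor".
    """
    roll = rng.randint(-spread, spread)  # random factor in [-spread, spread]
    return max(minimum, attack - defense + roll)


if __name__ == "__main__":
    rng = random.Random()
    print(compute_damage(attack=10, defense=4, rng=rng))
```

Passing the random generator in as a parameter keeps the rule testable: a seeded `random.Random` makes combat outcomes reproducible while debugging.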
Prompt 2 -- Making combat feel satisfying:
Combat feels boring. When I attack a monster or it attacks me,
nothing happens visually. Make it feel impactful:
1. Screen shake: brief shake (3 frames) when any attack lands
2. Damage numbers: float upward from the target and fade out, red for
damage, green for healing
3. Flash effect: the hit target flashes white for 2 frames
4. Death animation: when a monster dies, it fades out and drops
pixel particles downward
5. Sound: I know we can't do real sound easily, so fake it --
flash the screen border red briefly on hit to give visual "impact"
Keep the turn-based system. These are just visual effects layered on
top of the existing combat logic. Do not change how damage calculation
works.
Why it worked: The constraint "do not change how damage calculation works" prevented the AI from rewriting the combat system while adding effects. Aiden had learned from an earlier mistake where asking for "better combat" caused the AI to replace his entire combat module.
Before/After: Before: Aiden had tried to build a game three times previously. Attempt one: followed a YouTube tutorial for a platformer in Unity, got stuck on collision detection, gave up after four hours. Attempt two: tried Godot, spent a weekend learning the editor, never got past the main menu. Attempt three: started a text adventure in Python, finished it, but wanted something visual.
After: A fully playable, visually polished (for a browser game) roguelike with 50 floors of content, seven monster types, fifteen items, a working leaderboard, and combat that "actually feels fun to play" according to the comments on his Reddit post.
Lessons Learned:
- Replit Agent was the right starting point for a first-time game builder. The instant preview and zero-configuration hosting removed all friction.
- Game feel (screen shake, particles, damage numbers) transforms a boring prototype into something people want to keep playing. Aiden spent 20% of total time on these "polish" effects and considers it the best time investment.
- Procedural generation produced occasional unwinnable floors where the stairs were placed in a room surrounded by walls with no entrance. Aiden fixed this by adding a post-generation validation step -- a prompt asking the AI to "verify that every room with stairs is reachable from the spawn point. If not, regenerate."
- localStorage has a size limit (typically around 5 MB per origin in most browsers). After extended play sessions, the accumulated save data and leaderboard entries hit that limit and the game crashed. Aiden learned about storage limits the hard way and added cleanup logic.
- Aiden's classmates became his QA team. They found six bugs in the first day, all of which Aiden fixed by pasting error descriptions into Claude.
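The post-generation validation step mentioned in the lessons above amounts to a reachability check from the spawn point to the stairs. A minimal sketch, assuming a floor is stored as a grid of characters (`#` wall, `.` open tile, `S` spawn, `>` stairs). That layout, the function names, and the retry cap are hypothetical, and the sketch is written in Python for brevity rather than the game's JavaScript.

```python
from collections import deque
from typing import Callable


def stairs_reachable(grid: list[str]) -> bool:
    """Breadth-first search from the spawn tile 'S' to the stairs '>'."""
    rows, cols = len(grid), len(grid[0])
    start = next(((r, c) for r in range(rows) for c in range(cols)
                  if grid[r][c] == "S"), None)
    if start is None:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if grid[r][c] == ">":
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False


def generate_valid_floor(generate: Callable[[], list[str]],
                         max_attempts: int = 100) -> list[str]:
    """Regenerate until the stairs are reachable (retry cap is an assumption)."""
    for _ in range(max_attempts):
        floor = generate()
        if stairs_reachable(floor):
            return floor
    raise RuntimeError("no winnable floor generated within the attempt limit")
```

Validating after generation, instead of constraining the generator itself, keeps the random layout code simple; the cost is an occasional regenerated floor.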
Outcome: Posted on r/roguelikes and r/IndieGaming. The Reddit post received 480 upvotes. The game has been played over 8,000 times. Aiden's computer science teacher gave him extra credit and invited him to present the project to the class. He is now building a multiplayer version and has started learning React "for real" because he wants to understand what the AI was generating. He says: "Vibe coding got me through the door. Now I actually want to learn what's behind the door."
Project 6: The Copper Pot -- E-Commerce Site for a Small Business
What it is: A full e-commerce storefront for an artisanal cookware shop in Asheville, NC. Features a product catalog with high-resolution image galleries, size/finish variants, a shopping cart with saved-cart recovery, Stripe checkout, order tracking, and an admin panel for inventory management.
Builder Profile: Linda Brennan, 52. Owner of The Copper Pot, a brick-and-mortar cookware shop she has run for 18 years. Zero programming experience. Previously paid a local agency $8,500 to build a Shopify store that she found difficult to update and expensive to maintain ($79/month for Shopify, plus an agency retainer for changes). Heard about vibe coding from her nephew, who is a software developer.
Tools Stack:
- Lovable for storefront and admin panel
- Supabase for product database, auth, and image storage
- Stripe for payment processing
- Vercel for hosting
- Resend for order confirmation emails
Build Timeline: Five days of working on it during slow hours at the shop, plus two evenings. Total: approximately 20 hours.
Key Prompts:
Prompt 1 -- The storefront:
Build an online store for my cookware shop called "The Copper Pot."
I sell high-end copper pots, pans, and kitchen tools. My customers
are home cooks aged 35-65 who appreciate craftsmanship. The feel
should be warm, artisanal, and trustworthy. Think: exposed brick,
natural tones, and beautiful product photography.
Pages:
1. Home: hero image with tagline "Handcrafted Copper Cookware Since
2008", featured products grid (6 items), testimonial carousel,
Instagram-style gallery of kitchen photos
2. Shop: filterable product grid. Filters: category (pots, pans,
tools, sets), price range, material. Sort by price, newest,
popularity.
3. Product detail: large image gallery (click to zoom), product
description, size/finish selector, price, add to cart button,
"You might also like" section with 3 related products.
4. Cart: line items with quantity adjustment, subtotal, shipping
estimate, proceed to checkout.
5. About: our story, photo of the shop, craftsmanship values.
6. Contact: form + shop address + embedded Google Map.
Colors: warm cream background (#FDF8F0), copper accent (#B87333),
dark text (#2D2926). Font: serif headers (Playfair Display),
sans-serif body (Lato).
Mobile must be perfect. Most of my customers browse on their phones.
Why it worked: Linda described her customers and brand feeling, not technical specifications. The AI translated "warm, artisanal, and trustworthy" and "exposed brick, natural tones" into a design that Linda said "looks exactly like my shop feels." The color hex codes were her nephew's contribution -- he helped her pick colors that matched her physical store's palette.
Prompt 2 -- Admin inventory management:
Add an admin panel that only I can access (password protected).
I need to:
1. Add new products: name, description, price, category, images
(upload multiple), sizes available, stock count for each size
2. Edit existing products: change any field, reorder images
3. Mark products as "sold out" (shows badge on storefront but
keeps the page live) or "hidden" (removes from storefront)
4. View orders: list with date, customer name, items, total,
status (paid / shipped / delivered). Click to see full details.
5. Update order status and add tracking number (customer gets
an email when I mark it as shipped)
6. Simple dashboard: total revenue this month, number of orders,
top selling products
Keep it simple. I am not technical. Big buttons, clear labels.
When I upload images, automatically resize them for the web
(I take photos on my phone and they are very large files).
Why it worked: "I am not technical. Big buttons, clear labels." This single line shaped the entire admin interface. The AI generated an admin panel with a significantly simpler layout than a typical CMS, with confirmations on every destructive action and undo options. The automatic image resizing solved a real problem -- Linda's phone photos were 4MB each.
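The automatic image resizing Linda asked for comes down to one piece of arithmetic: scale the photo so its longer side fits a maximum while keeping the aspect ratio. The sketch below shows only that calculation, in Python. The function name `fit_within` and the 1600-pixel default are assumptions for illustration, not details of what Lovable generated.

```python
def fit_within(width: int, height: int, max_side: int = 1600) -> tuple[int, int]:
    """Return new (width, height) whose longer side is at most max_side.

    Aspect ratio is preserved and small images are never upscaled.
    The 1600 px default is a hypothetical value chosen for the example.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    new_width = max(1, round(width * max_side / longest))
    new_height = max(1, round(height * max_side / longest))
    return new_width, new_height


if __name__ == "__main__":
    # A 4000x3000 phone photo, scaled to fit the default limit.
    print(fit_within(4000, 3000))
```

An actual resize would hand these dimensions to an image library or storage service; the point here is only that the target size is computed, not typed in by the shop owner.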
Before/After: Before: A Shopify store that cost $8,500 to build and $79/month to maintain. Linda could not update product descriptions without emailing her agency and waiting 48 hours. Adding new products required a $150/change agency fee. The site looked generic -- it used a standard Shopify theme that looked identical to thousands of other stores.
After: A custom storefront that matches The Copper Pot's physical brand identity. Linda updates products herself through the admin panel. No monthly platform fees beyond Supabase ($25/month) and Vercel ($0 -- free tier). Stripe charges are 2.9% + $0.30 per transaction (same as Shopify).
Lessons Learned:
- Lovable was the right tool for someone with zero programming experience. Linda never saw a line of code. She described what she wanted in plain English and refined the results visually.
- Product photography matters more than website design. Linda initially uploaded poorly lit phone photos and the site looked "cheap." Her nephew helped her photograph products with natural light, and the same site suddenly looked premium.
- Stripe integration through Lovable worked seamlessly for simple checkout. However, Linda needed to handle sales tax, which required adding a tax calculation service. This was the only part where she needed her nephew's help.
- The "saved cart recovery" feature (emailing customers who abandoned carts) was not in Linda's original plan. The AI suggested it during a prompt about the checkout flow. It recovers approximately $300-$400 in sales per month.
- Shipping calculation was the hardest problem. USPS API integration was unreliable, so Linda switched to flat-rate shipping tiers ($8 / $12 / free over $150), which was simpler and actually increased average order value.
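The flat-rate tiers from the last lesson ($8 / $12 / free over $150) replace a carrier API call with a lookup. The text does not say what separates the $8 tier from the $12 tier, so this sketch assumes the tiers are keyed on order subtotal and takes the boundary as a parameter (`high_tier_from_cents`, a hypothetical name). Amounts are in integer cents to avoid floating-point rounding.

```python
FREE_SHIPPING_OVER_CENTS = 150_00  # from the text: free over $150
LOW_TIER_CENTS = 8_00              # from the text: $8 tier
HIGH_TIER_CENTS = 12_00            # from the text: $12 tier


def shipping_cents(subtotal_cents: int, high_tier_from_cents: int) -> int:
    """Flat-rate shipping for an order subtotal.

    Assumption: tiers depend on subtotal, and orders at or above
    high_tier_from_cents pay the $12 rate. The source does not state
    the actual rule that separates the two paid tiers.
    """
    if subtotal_cents > FREE_SHIPPING_OVER_CENTS:
        return 0
    if subtotal_cents >= high_tier_from_cents:
        return HIGH_TIER_CENTS
    return LOW_TIER_CENTS


if __name__ == "__main__":
    # 75_00 is a made-up boundary used only to run the example.
    for subtotal in (40_00, 90_00, 160_00):
        print(subtotal, shipping_cents(subtotal, high_tier_from_cents=75_00))
```

A visible free-shipping threshold gives shoppers a reason to add one more item, which is consistent with the increase in average order value Linda reported.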
Outcome: Online sales in the first three months: $23,400. Previous Shopify store's best three-month period: $9,100. The warm, custom design and improved product photography drove a 34% increase in conversion rate compared to the old Shopify store. Linda's monthly tech costs dropped from $79 (Shopify) + agency retainer to $25 (Supabase). She saved approximately $3,000 in the first year on platform and agency fees alone. Three other local shop owners have asked Linda to help them build similar stores.
Community Stats
Aggregated from 247 community submissions received between October 2025 and January 2026.
Submissions Overview
| Metric | Value |
|---|---|
| Total submissions received | 247 |
| Featured projects (all-time) | 38 |
| Countries represented | 23 |
| Youngest builder | 14 (high school student, built a study flashcard app) |
| Oldest builder | 67 (retired accountant, built a family recipe archive) |
Builder Background Distribution
| Background | Percentage |
|---|---|
| Professional developer | 41% |
| Student / recent graduate | 19% |
| Non-technical professional | 17% |
| Designer / creative | 11% |
| Founder / entrepreneur | 8% |
| Other (retired, career switcher, hobbyist) | 4% |
Most Popular Tools
| Rank | Tool | Usage Rate |
|---|---|---|
| 1 | Cursor | 62% |
| 2 | Claude Code | 47% |
| 3 | Bolt.new | 34% |
| 4 | Lovable | 28% |
| 5 | v0 | 24% |
| 6 | Replit Agent | 19% |
| 7 | GitHub Copilot | 16% |
| 8 | Windsurf | 11% |
Note: Usage rates sum to more than 100% because most projects use multiple tools.
Supporting Technology
| Category | Most Popular Choice |
|---|---|
| Framework | Next.js (58%) |
| Styling | Tailwind CSS (71%) |
| Database | Supabase (52%) |
| Hosting | Vercel (64%) |
| Payments | Stripe (89% of projects with payments) |
| Auth | Supabase Auth (44%) |
Build Time Distribution
| Time Range | Percentage |
|---|---|
| Under 4 hours | 12% |
| 4-12 hours | 27% |
| 12-24 hours (1-2 days) | 31% |
| 1-2 weeks | 22% |
| Over 2 weeks | 8% |
Average time from first prompt to deployed: 18.4 hours
Median time from first prompt to deployed: 14 hours
Project Categories
| Category | Count | Percentage |
|---|---|---|
| SaaS / web application | 72 | 29% |
| Internal / business tool | 48 | 19% |
| Portfolio / personal site | 37 | 15% |
| E-commerce | 29 | 12% |
| Game | 21 | 9% |
| Mobile app | 18 | 7% |
| Chrome extension | 12 | 5% |
| CLI tool / developer utility | 10 | 4% |
Outcome Metrics
| Metric | Value |
|---|---|
| Projects still actively maintained (after 3+ months) | 68% |
| Projects generating revenue | 31% |
| Average MRR for revenue-generating projects | $840 |
| Highest reported MRR | $12,400 |
| Builders who reported getting hired because of their project | 14 |
| Builders who transitioned to full-time on their project | 9 |
Success Patterns
From analyzing all 247 submissions, the projects most likely to succeed shared these characteristics:
- Specific problem, specific user. "A tool for landscaping dispatchers" beats "a project management app" every time.
- Prompt specificity. Builders who shared detailed, structured prompts (average 150+ words per prompt) had measurably better outcomes than those using short, vague prompts.
- Early deployment. Projects deployed within the first 25% of total build time had a 73% continuation rate. Projects that waited until "done" to deploy had a 41% continuation rate.
- Real users during build. 82% of revenue-generating projects had at least one real user testing before the builder considered it complete.
- Two tools, not five. The most successful builders typically used one primary AI coding tool and one supporting tool. Projects that used four or more AI tools had lower completion rates, likely due to context-switching overhead.
Monthly Spotlight
March 2026 Spotlight: FleetTrack
Category: B2B SaaS / Logistics
Builder: Raj Patel, 27, operations analyst at a logistics company
Tools: Claude Code (Opus 4.6), Next.js 16, Supabase, Mapbox, Vercel
Build time: 18 hours over one weekend
The Story: Raj managed a fleet of 40 delivery vehicles using spreadsheets and phone calls. He had never written production code before but had been following vibe coding tutorials on the EndOfCoding YouTube channel. When his manager complained about the lack of real-time visibility into delivery routes, Raj decided to build a solution himself.
His opening prompt to Claude Code:
Build a real-time fleet tracking dashboard with Next.js 16 and Supabase.
Core features:
1. Map view showing all active vehicles with live GPS positions
(use Mapbox GL JS). Each vehicle is a colored dot -- green for
on-schedule, yellow for delayed, red for stopped.
2. Sidebar with vehicle list, sortable by status, driver name, or
ETA to next stop. Clicking a vehicle centers the map and shows
route history for today.
3. Driver mobile view: a simple page where drivers tap "Arrived"
at each stop. Auto-captures GPS coordinates. Works offline and
syncs when back online.
4. Daily summary: auto-generated at 6 PM showing total deliveries,
average time per stop, vehicles that went off-route, and fuel
estimates based on distance traveled.
Auth via Supabase magic link. Role-based: admin sees everything,
drivers see only their own route. Use Supabase real-time subscriptions
for live vehicle position updates.
The dashboard must feel fast. Sub-200ms updates on the map.
Raj had a working prototype by Saturday night. By Sunday evening, he had added route optimization suggestions using a simple nearest-neighbor algorithm. He deployed to Vercel and showed it to his manager on Monday morning. Within two weeks, all 40 vehicles were using FleetTrack. The company cancelled its $800/month fleet management subscription.
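The nearest-neighbor approach Raj used for route suggestions is a greedy heuristic: from the current position, always drive to the closest unvisited stop. A minimal sketch in Python follows. It is not FleetTrack's code; the function name is hypothetical, and it assumes stops are points on a flat plane with straight-line distance, whereas real GPS coordinates would call for a great-circle or road distance.

```python
import math

Point = tuple[float, float]


def nearest_neighbor_route(start: Point, stops: list[Point]) -> list[Point]:
    """Order stops greedily: always visit the closest remaining stop next.

    Assumes planar coordinates and straight-line distance (an assumption
    made for this sketch, not a detail from the project).
    """
    remaining = list(stops)  # copy so the caller's list is untouched
    route: list[Point] = []
    current = start
    while remaining:
        closest = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(closest)
        route.append(closest)
        current = closest
    return route


if __name__ == "__main__":
    depot = (0.0, 0.0)
    print(nearest_neighbor_route(depot, [(5.0, 0.0), (1.0, 0.0), (2.0, 0.0)]))
```

The heuristic is fast and easy to explain to dispatchers, but it does not guarantee the shortest overall route, which is why the output is best treated as a suggestion.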
Why we selected it: FleetTrack represents the next wave of vibe coding impact: non-developers building real B2B tools that replace expensive SaaS subscriptions. Raj's prompt demonstrates strong domain expertise combined with specific technical requirements -- the sweet spot where vibe coding delivers maximum value. The offline-sync requirement for drivers shows thoughtful product thinking that no AI would have suggested on its own.
Previous: February 2026 Spotlight: QuietPage
Category: Productivity tool
Builder: Sana Mirza, 31, UX designer at a remote-first company
Tools: Cursor, Next.js, Supabase, Vercel
Build time: 11 hours over three evenings
The Story: Sana was frustrated by every writing app she tried. Google Docs felt corporate. Notion was too feature-heavy. iA Writer was beautiful but did not sync across devices. She wanted a writing tool that was quiet, distraction-free, synced to the cloud, and had exactly one feature beyond basic text editing: a daily word count streak tracker.
Sana opened Cursor on a Tuesday evening with this prompt:
Build a minimal writing app. I mean truly minimal.
One page. No sidebar. No toolbar. No menus visible by default.
Just a white page with a blinking cursor. The user types.
Auto-save to Supabase every 30 seconds and on every pause longer
than 2 seconds. Show a subtle "saved" indicator that fades in and
out -- bottom right corner, small gray text, disappears after 1 second.
One feature: daily word count streak. If the user writes at least
200 words today, the streak continues. Show the streak as a small
flame icon with a number in the top right corner. That is the only
UI element visible while writing.
Keyboard shortcuts (show on hover over a small "?" icon, bottom left):
- Cmd+B: bold
- Cmd+I: italic
- Cmd+Shift+H: toggle heading
- Cmd+/: toggle dark mode
No sign-up wall. Auth via magic link only. No password to remember.
If the writing app does not feel calm, it has failed.
The result was a writing app that four of Sana's coworkers started using within a week. She posted it on Hacker News with the title "I built the quietest writing app on the internet." It hit the front page. Within a month, QuietPage had 2,800 registered users and Sana was considering adding a $5/month premium tier for features like version history and export to PDF.
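The streak rule in Sana's prompt (at least 200 words in a day keeps the streak going) can be sketched as a short function over a per-day word log. This is an illustration in Python, not QuietPage's code. The function name, the dictionary layout, and the choice that an unfinished today does not break the streak until the day is over are all assumptions.

```python
from datetime import date, timedelta

DAILY_GOAL = 200  # from the prompt: at least 200 words per day


def current_streak(words_by_day: dict[date, int], today: date) -> int:
    """Count consecutive days, ending today or yesterday, that met the goal.

    Assumption: if today's count is still below the goal, the streak is
    measured up to yesterday, so it is not shown as broken mid-day.
    """
    day = today
    if words_by_day.get(day, 0) < DAILY_GOAL:
        day -= timedelta(days=1)
    streak = 0
    while words_by_day.get(day, 0) >= DAILY_GOAL:
        streak += 1
        day -= timedelta(days=1)
    return streak


if __name__ == "__main__":
    log = {date(2026, 2, 8): 250, date(2026, 2, 9): 200, date(2026, 2, 10): 50}
    print(current_streak(log, today=date(2026, 2, 10)))
```

Deriving the streak from the daily log, instead of storing a counter, means a late sync from another device cannot leave the flame icon out of step with the saved writing.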
Why we selected it: QuietPage demonstrates that vibe coding is not just for building complex systems. Sometimes the hardest product decision is what to leave out. Sana's prompt is a masterclass in constraint-driven design, and the result is a product people genuinely prefer over established alternatives -- not because it does more, but because it does less, better.
Have a project that should be featured in next month's spotlight? Submit it using the template above.
Explore Further
- Get the complete prompt library in Chapter 17: The Complete Prompt Library -- 200+ production-ready prompts for every stage of AI-native development.
- Compare tools in Chapter 18: Tool Comparison Matrix -- Side-by-side evaluation of every major vibe coding tool.
- Secure your project with Chapter 19: The Security Playbook -- The pre-launch checklist every vibe-coded project needs.
- Try hands-on at vibe-coding.academy -- Interactive tutorials and guided projects.
- Join the discussion at endofcoding.com -- Community forum, Discord, and weekly office hours.
This chapter is updated monthly with new featured projects and refreshed community stats. Last updated: March 2026.
★ Glossary
- Vibe Coding: AI-assisted development where the developer describes intent in natural language and evaluates output through execution, not code review.
- Accept All: The practice of accepting all AI-generated code changes without reviewing diffs.
- Coding Agent: An autonomous AI system that can plan, implement, test, and deploy code changes independently.
- Composer: A mode in AI IDEs (like Cursor) that generates multi-file code from natural language descriptions.
- Error-Driven Development: Debugging by copy-pasting error messages to the AI rather than reading and understanding the code yourself.
- MCP (Model Context Protocol): Anthropic's open protocol allowing AI assistants to connect to external tools and data sources.
- Prompt Engineering: The skill of crafting effective natural language instructions to produce desired AI outputs.
- Vibe Coding Hangover: The phenomenon of teams struggling to maintain, extend, or debug AI-generated codebases. Documented by Fast Company in September 2025.
- Zombie App: An application that is functional but unmaintainable because nobody understands the AI-generated code.
- Complexity Ceiling: The point at which a vibe-coded application can no longer be extended because the underlying code is too tangled.
- Hybrid Workforce: An organization where AI agents work alongside human engineers, as pioneered by Goldman Sachs with Devin.
- The 80/20 Rule: Vibe code the 80% (UI, boilerplate, standard patterns). Engineer the 20% (auth, security, business logic).
- Agent Teams: A feature in Claude Code (introduced with Opus 4.6) allowing multiple AI agents to work in parallel on different aspects of a project, coordinating autonomously.
- Agent Mode: A capability in coding tools (GitHub Copilot, Cursor, etc.) where the AI autonomously identifies subtasks, makes multi-file edits, runs tests, and fixes errors without step-by-step human guidance.
- Devin Wiki / Devin Search: Cognition's documentation generation and code search tools built into the Devin platform, enabling AI-generated documentation and natural language querying of codebases.
- Multimodal Coding: An emerging trend combining voice, visual, and text-based inputs for AI code generation, including screenshot-to-code and voice-to-code workflows.
★ Resources
Tools to Try
Cursor — cursor.com — AI-native IDE ($1B+ ARR, $29.3B valuation)
Claude Code — Anthropic's terminal coding agent with agent teams (Opus 4.6)
GitHub Copilot — github.com/features/copilot — Agent mode in VS Code (4.7M users)
Bolt.new — bolt.new — Browser-based app builder
v0 — v0.dev — AI UI generation by Vercel
Replit — replit.com — Browser IDE with AI agent
Lovable — lovable.dev — App creation for non-developers
Google Jules — jules.google — Async coding agent (Gemini 3 Pro)
Gemini CLI — github.com/google-gemini/gemini-cli — Open-source terminal agent
OpenAI Codex CLI — github.com/openai/codex — Open-source terminal agent
Devin — devin.ai — Autonomous AI software engineer ($155M+ ARR)
Windsurf — windsurf.com — AI IDE with persistent memory (now part of Cognition)
Further Reading
- Karpathy's original tweet (February 2, 2025)
- "Vibe Coding in Practice" -- arXiv research paper (2025)
- "Vibe Coding Kills Open Source" -- arXiv research paper (January 2026)
- Tenzai security assessment (December 2025)
- Cognition's Devin 2025 Performance Review
- Fast Company: "The Vibe Coding Hangover" (September 2025)
- IBM: "What is Vibe Coding?"
- Google Cloud: "Vibe Coding Explained"
- Vibe Coding -- Wikipedia (comprehensive history and analysis)
Example Projects
Open the HTML files included with this ebook to see working applications built through vibe coding:
- Task Manager (examples/task-manager-example.html) -- localStorage, responsive design, animations
- Snake Game (examples/snake-game-example.html) -- Canvas rendering, game loop, score tracking
- Prompt Examples (examples/vibe-coding-prompts.md) -- Ready-to-use prompts by category
"The vibes are real. The exponentials are real. The security vulnerabilities are real too. Code wisely."
Last updated: February 25, 2026
What's New
Every update to this ebook is tracked here. Subscribers get monthly updates with new content, revised chapters, and fresh prompts.
April 2026
April 9, 2026
- Chapter 5 (Tools Landscape): Cursor 3 launch (April 2) — Agents Window replaces Composer (multi-agent side-by-side/grid/stacked), Design Mode (click browser UI → agent modifies component), cloud-to-local handoff; Claude Code April 4 OpenClaw policy change — subscription limits no longer cover third-party harnesses, pay-as-you-go required (one-time credit issued), plus PowerShell tool for Windows, 60% faster Write tool diff; GitHub Copilot — Copilot SDK in public preview, Autopilot mode, privacy policy change (training on user data by default from April 24 — opt-out required).
- Chapter 9 (Numbers): Added Claude Mythos 93.9% SWE-bench (restricted, Project Glasswing); developer trust declined to 29% (SonarSource 2026, down from 70%+ in 2023); 51% professional devs use AI daily; 64% started using AI agents; 75% PR turnaround reduction (9.6 days → 2.4 days, Index.dev); 3.6 hours/week time saved (survey median); 66% frustrated by "almost right" solutions.
- Chapter 19 (Security Playbook): Trivy Cascade extension — CanisterWorm self-propagating npm worm (64+ packages, blockchain C2, evaded domain-seizure takedown), spread to Checkmarx KICS/AST GitHub Actions and LiteLLM (95M monthly PyPI downloads); new "AI as Autonomous Vulnerability Researcher" section covering Claude Mythos/Project Glasswing — autonomous zero-day discovery, implications for vibe-coded app security posture.
- Chapter 21 (Intel Brief): Six new April 2–9 incident cards: Cursor 3 (Agents Window + Design Mode); Claude Mythos/Project Glasswing (93.9% SWE-bench, zero-day discovery, defense-only restriction); Meta Muse Spark (Meta Superintelligence Labs first model, April 8); Trivy Cascade → CanisterWorm (blockchain C2, 64+ packages, Checkmarx + LiteLLM spread); Claude outages April 6–8 (10-hour outage, 8,000+ Downdetector reports); GitHub Copilot privacy change (April 24 training-by-default). Numbers section updated with Mythos 93.9%, CanisterWorm 64+ packages, trust 29%, PR turnaround 75%. What to Watch expanded with Copilot opt-out deadline and Mythos GA timeline.
April 1, 2026
- Chapter 5 (Tools Landscape): Cursor valuation updated to ~$50B (Bloomberg, fundraising talks at $2B+ ARR); Anthropic acquires Bun (JavaScript runtime) — native Bun integration in Claude Code; GitHub Copilot Agent Mode now fully generally available on both VS Code and JetBrains across all Copilot plans.
- Chapter 9 (Numbers): Added 73% global daily AI tool usage (Stack Overflow Dev Survey, Q1 2026) and 41% AI-generated code share (Sourcegraph Code Intelligence Report, March 2026); Cursor valuation updated to ~$50B; GitHub Copilot paid users updated to 20M+.
- Chapter 19 (Security Playbook): New "Supply Chain Attacks: April 2026 Alert" section covering Axios npm hijack (March 31 — UNC1069/North Korea, WAVESHAPER.V2 RAT, ~100M weekly downloads); LiteLLM credential stealer (versions 1.82.7/1.82.8, March 24); Langflow RCE CVE-2026-33017 (unauthenticated, CISA KEV, exploited within 20h); Trivy Docker Hub compromise CVE-2026-33634. New "Vibe-Coded App Vulnerability Research" section with Georgia Tech Vibe Security Radar data (2,000+ vulns, 400+ secrets in 5,600 apps) and AI-generated code CVE trend (6→15→35/month).
- Chapter 21 (Intel Brief): Transitioned to April 2026 brief. Seven new incident cards: Axios supply chain attack (North Korean state actor), LiteLLM/Langflow/Trivy attacks, Georgia Tech vulnerability research, MCP 97M monthly downloads milestone, Cursor self-hosted cloud agents, Vibe Coding 1-year anniversary + Collins Dictionary Word of the Year, SWE-bench model convergence. Numbers section updated with April figures. "What to Watch in May 2026" replaces April watchlist.
March 2026
March 25, 2026
- Chapter 5 (Tools Landscape): Claude Code updated for /loop scheduled tasks, 1M token context, 64k max output for Opus 4.6 (v2.1.63→2.1.76 evolution); Replit updated to $400M Series D at $9B valuation; Lovable updated with M&A offensive; GitHub Copilot JetBrains agentic capabilities GA; Windsurf/Devin updated with Codemaps product.
- Chapter 9 (Numbers): AI-generated code share updated to 46% (GitHub); US developer daily usage updated to 92%; Replit $9B valuation added to Valuations section.
- Chapter 19 (Security Playbook): New "MCP Supply Chain" section covering OpenClaw attack (1,184 malicious packages, ~1 in 5 in ClawHub), CVE-2026-23744 (CVSS 9.8 MCPJam RCE), Azure MCP RCE (CVSS 9.6), 36.7% SSRF exposure across MCP servers, with actionable protection checklist.
- Chapter 21 (Intel Brief): Six new incident cards for week of March 18-25: Claude Code /loop, Replit Series D, Lovable M&A, Devin Review + Windsurf Codemaps, Copilot JetBrains GA, OpenClaw supply chain attack. Numbers section updated. "What to Watch" expanded with MCP security, Lovable M&A, Replit ARR target.
March 7, 2026
- Chapter 5 (Tools Landscape): Cursor updated to v2.6 (Automations, JetBrains support, MCP Apps). OpenAI Codex CLI updated for GPT-5.4 (native computer use, 1M token context). Claude Code updated with voice mode, $2.5B+ ARR, Pentagon supply-chain risk note. Added Kilo Code (open-source, 1.5M+ users). GitHub Copilot updated to 26M+ users with GPT-5 mini/GPT-4.1 included. Windsurf updated with Gemini 3.1 Pro and LogRocket #1 ranking.
- Chapter 9 (Numbers): Claude Code ARR updated to $2.5B+. Copilot users updated to 26M+. Added Emergent AI ($50M ARR in 7 months), Cognition ($500M raise, $10B valuation, $82M+ ARR). Added developer sentiment section (84% use AI, only 3% high trust, 60% favorable view down from 70%+, 15% professional vibe coding adoption). Collins Dictionary Word of the Year updated for 2026.
- Chapter 19 (Security Playbook): Added AI Tool Security Advisories section covering Claude Code CVEs (CVE-2025-59536 RCE, CVE-2026-21852 API key exfiltration) with actionable guidance on AI tool attack surfaces.
- Chapter 21 (Intel Brief): Added GPT-5.4 launch (computer use, 1M tokens, financial tools). Added Pentagon/Anthropic conflict. Added Claude Code voice mode and CVE patches. Added Kilo Code launch. Added Qwen 3.5 (open weights, 74.1% LiveCodeBench). Updated Cursor to 2.6. Updated Cognition $500M raise. Added developer sentiment and Emergent AI stats. Expanded "What to Watch" with EU AI Act, Kilo Code growth, Pentagon resolution.
March 6, 2026
- Chapter 21: Complete rewrite of Monthly Intelligence Brief for March 2026 — open source crisis, Gemini 3 in Jules, Cursor 2.5 subagents, Copilot multi-model access, Pega enterprise vibe coding, Opus 4.6 agent teams, Devin 2.2
- Chapter 22: New March 2026 Spotlight: FleetTrack — B2B fleet management built by an operations analyst using Claude Code
- Chapter 5: Updated tool references for Cline, Jules, and March 2026 landscape
- Chapter 9: Updated GitHub Copilot stats (26M+ users), Devin metrics (67% PR merge rate, $10.2B valuation), Claude Code revenue ($2.5B+)
- Landing page: Updated social proof stats, added Vibe Coding Academy cross-promotion section with UTM tracking
- All chapters: Updated badges to March 6, 2026
March 1, 2026
- Build System: Introduced automated build pipeline for chapter management and updates
- Changelog: Added this changelog section — subscribers can now see exactly what changed and when
- Per-Chapter Badges: Each chapter now shows its last-updated date
- All Chapters: Initial release of all 22 chapters with 200+ prompts
February 2026
February 25, 2026
- Initial release: All 22 chapters published
- Chapter 1: The Moment Everything Changed — complete timeline from Karpathy's tweet to Opus 4.6
- Chapter 5: Full tools landscape covering Cursor, Claude Code, Devin, Jules, Gemini CLI, Codex CLI
- Chapter 10: Security analysis including Tenzai study and IDEsaster disclosure
- Chapter 17: 200+ production-ready prompts across 10 categories
- Chapter 18: Comprehensive tool comparison matrix
- Chapter 19: The 30-minute security checklist for vibe-coded applications
- Chapter 22: Community showcase with submission guidelines
April 21, 2026
- Chapter 21: Monthly Intel Brief updated to version 1.7 — added two incident cards for April 15–21: Claude Opus 4.7 (87.6% SWE-bench Verified, April 18) and Azure MCP Server 2.0 stable release + OAuth 2.1 added to core MCP spec. Callout headline updated. Previous: April 15 — Vercel Vinext CVEs, GLM-5.1, Claude Code reliability cluster.