Our Best AI Investment of 2025: Time to Fail

Your engineers have the tools. What they don't have is permission to fail. Sprint pressure turns every AI experiment into a risk. Nobody wants to be the person who spent three days on a ticket because they were "trying something new." So adoption stays stuck in the margins—safe, shallow, going nowhere.
Sound familiar? That was us at Spectora a few months ago. Thirty engineers, all aware that AI was important, all dabbling on the edges, none of us really going deep. We were collectively stuck in AI purgatory. Like a bunch of people standing at the edge of a cold plunge, dipping our toes in.
Then we tried something deceptively simple that actually worked: we gave our team real time to fail.
We ran a two-week AI hackathon at Spectora with one rule: failure was the goal. Week one: individual experimentation. Week two: team projects. The result wasn't fancy metrics—it was a team that finally had permission to go deep, share what broke, and build real intuition. The key insight: stop waiting for organic adoption. Create structured space to fail.
The Problem With Learning in the Margins
Here's what I've observed across every engineering org I've talked to: AI adoption happens in the margins. Engineers squeeze in a prompt between meetings. They try a new tool on a Friday afternoon. They watch a YouTube video while eating a sad desk salad.
The problem? Margins aren't enough.
You can't develop real intuition in stolen moments between sprint work. It's like trying to learn Spanish from Duolingo notifications while your house is on fire.
And here's the kicker: as one part of your SDLC gets more productive with AI, the rest of the pipeline feels the pressure. It's like putting a turbocharger on one wheel of a shopping cart. Congrats, you now have a shopping cart that spins in circles really fast. The whole system needs to level up together, or you just create new and exciting bottlenecks.
The Simple Insight (And How We Implemented It)
The fix is embarrassingly obvious: give your team dedicated time to go deep on AI. We pulled 30 engineers (plus PMs, QA, and Design) for two full weeks with 100% buy-in—no sprint pressure, no "squeeze this in between meetings." Just protected time to learn.
Our implementation was a two-week hackathon—which, to be clear, took real work to pull off. But the core insight is simpler than the execution.
I know. "Hackathon" has been beaten into meaninglessness by a thousand corporate team-building exercises and way too much pizza. This wasn't that.
The entire product organization (Engineering, PMs, QA, Design) stepped away from sprint pressure for two weeks. Week one: individual learning and experimentation. Week two: team projects tackling real problems.
The key wasn't the hackathon format. It was creating structured space for depth. No velocity expectations. No sprint points. No "squeeze this in between your real work." Just time to go deep, fail spectacularly, and learn. We explicitly told engineers: If you don't fail at a few attempts this week, you haven't tried hard enough.
The Curriculum: What We Actually Had People Do
The Mindset Shift
Before diving into tools and techniques, we established two mental models:
- Adopt your agent's perspective. This is the single most important paradigm shift for agentic coding. Stop thinking about what you know. Consider what your agent sees, what it has access to, what context it has about your project and goals. It's like training a very smart golden retriever. You have to think about what they understand, not what you understand.
- Stop coding. Seriously. For the learning week, we told engineers: do your absolute best to prompt, not type. Hands off the keyboard. This forces you to actually learn the tools instead of falling back on muscle memory. It's uncomfortable. That's the point.
Week 1: Learning Week
Everyone worked mostly independently, with explicit permission to experiment wildly. The vibe was less "corporate training" and more "unsupervised science fair."
The deliverable? Friday presentations on what they learned, suggestions for new processes, and tool recommendations. Nothing fancy. Just "here's what I tried and here's what I learned." The forcing function of having to present keeps people honest.
Resources We Gave Everyone
We pointed people at courses from IndyDevDan, key videos, and essential reading. Full disclosure: my whole team took his courses and it lit a fire under me to build AgentCMD.
- Principled AI Coding
Course · IndyDevDan. Foundational principles: Context, Prompt, and Model. Start here.
- Tactical Agentic Coding
Course · IndyDevDan. Build autonomous agent pipelines. This is where engineering is heading.
- Prompt Engineering Deep Dive
Video. Foundational. Everyone should watch this.
- Context Engineering Masterclass
Video. The unlock for making AI work in enterprise codebases.
- 6 Months of Claude Code Lessons in 27 Minutes
Video. Compressed wisdom from someone in the trenches.
- Spec Driven Development with Claude Code Plan Mode
Video. Changed how we think about breaking down work.
- Getting AI to Work in Complex Codebases
GitHub. The bible. Stop listing reasons why AI won't work. Start assuming it will.
- AI Coding Is Massively Overhyped, Report Finds
Article. Important: optimize your entire SDLC, not just coding.
Techniques to Experiment With
We gave engineers a menu of techniques to try. No need to hit them all. Pick a few that seem interesting and go embarrassingly deep:
- Run multiple agent instances in parallel
One helping you scope a feature, one auditing your codebase, one implementing. Feels like cheating. It's not. (The worktree sketch after this list shows one way to run agents side by side.)
- Leverage subagents concurrently
Gather research from multiple angles, then combine into a single recommendation. Like having a research team that doesn't need coffee breaks.
- Compare models on the same task
Do the same task with Claude, GPT, and Gemini. Build intuition for when each shines (and when each hallucinates confidently).
- Use git worktrees
Execute several independent features at once with git worktree add -b feat/add-mfa ../add-mfa origin/main. Parallelism is your friend. (See the worktree sketch after this list.)
- Spec-Driven Development
Use Claude Code's plan mode, turn it into an executable spec with /create-plan, clear context, then /implement. Rinse and repeat until it works or you lose your mind.
- Give your agent a browser
Experiment with Playwright MCP so your agent can actually render and test frontend work. Give it eyes. Spooky, but effective. (See the Playwright sketch after this list.)
- Evaluate MCPs
Sentry, Linear, Figma, Slite. What integrations would give your agent the context it needs? The more context, the less hallucination. Usually.
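To make the parallel-agent and worktree items concrete, here's a minimal sketch. It assumes Claude Code's headless print mode (claude -p); the branch names, spec paths, and prompts are made up for illustration, so swap in whatever agent CLI and workflow you actually use.

```bash
# Illustrative only: two independent features, two worktrees, two headless agent runs.
# Branch names, spec paths, and prompts are placeholders.
git worktree add -b feat/add-mfa   ../add-mfa   origin/main
git worktree add -b feat/audit-log ../audit-log origin/main

# Kick off a headless Claude Code run in each worktree (claude -p prints the result and exits).
(cd ../add-mfa   && claude -p "Implement MFA following specs/add-mfa.md")  > add-mfa.log   2>&1 &
(cd ../audit-log && claude -p "Add audit logging per specs/audit-log.md") > audit-log.log 2>&1 &
wait  # both runs finish independently; review each branch before opening PRs

# Clean up once the branches are merged or abandoned (add --force if stray files remain).
git worktree remove ../add-mfa
git worktree remove ../audit-log
```

The worktrees are what make this safe: each agent gets its own checkout and its own context, so their edits never collide.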
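For the "give your agent a browser" item, here's the shape of the setup, assuming the public @playwright/mcp server and Claude Code's MCP support. The exact claude mcp invocation can differ between versions, so treat it as a starting point.

```bash
# Register the Playwright MCP server with Claude Code.
# Exact syntax may vary by version; check your Claude Code docs if this fails.
claude mcp add playwright npx @playwright/mcp@latest

# Verify it's registered, then hand the agent UI work, e.g.
# "Open http://localhost:3000/login and confirm the MFA prompt actually renders."
claude mcp list
```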
Role-Specific Focus Areas
For Engineers:
- Grab tickets from your backlog (or someone else's, we're all friends here) to experiment with
- Try techniques you've never tried before. Go well outside your comfort zone. The cringe is part of the learning.
- Analyze the state of context in your applications. Add CLAUDE.md files, slash commands, cursor rules. (See the CLAUDE.md sketch after this list.)
- Do your own QA. No sprint expectations.
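As a starting point for that context audit, here's a hedged sketch of seeding a CLAUDE.md. Everything inside the file is a placeholder; the goal is simply to hand the agent the commands and conventions it can't infer from the code alone.

```bash
# Create a minimal CLAUDE.md at the repo root; Claude Code picks it up automatically.
# Every value below is a placeholder. Fill in your own stack, commands, and rules.
cat > CLAUDE.md <<'EOF'
# Project context
- Stack: <framework / language>
- Run tests: <your test command>
- Lint / format: <your lint command>
- Conventions: <where business logic lives, naming rules, files not to touch>
- Gotchas: <flaky suites, generated files, anything the agent keeps getting wrong>
EOF
```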
For Product / QA / Design:
- Explore how AI can help with customer discovery
- Look at AI-powered testing tools (Mabl, etc.)
- Consolidate customer calls into a searchable, discoverable repository
For Everyone:
- Experiment with frontier models you haven't touched (ChatGPT, Claude, Gemini)
- Evaluate tools that could increase your efficiency
- Identify process changes that embrace AI tooling
Week 2: Building Week
Cross-functional teams (devs, QA, design, PM) tackled real challenges. The overarching question: How do we continue to support increased velocity across the entire SDLC by properly leveraging AI?
Challenges You Could Pose
Here are some prompts to get your teams thinking about real problems AI could help solve:
- Bug Resolution
How can AI accelerate debugging and issue triage?
- Technical Debt
How can AI help us chip away at the backlog of tech debt that's been accumulating for years?
- QA & Testing
How can AI speed up testing without sacrificing coverage?
- Code Review
How can AI improve PR reviews while maintaining quality standards?
- Environment Stability
How do we keep staging stable as velocity increases?
- Documentation
How can AI help us maintain better documentation across the SDLC?
Teams proposed project ideas, we collaborated on scoping, and they built. Friday demos. Real output. Some projects were duds. That's fine. That's learning.
What Actually Happened
I'll be honest: I don't have fancy metrics to share. No "47% productivity increase" charts for your board deck. If that's what you need, this isn't the blog post for you. But if you want something harder to measure and more valuable, keep reading.
We could suddenly talk at a deeper level. Before the hackathon, conversations about AI were surface-level. After the hackathon, we could discuss context window management, prompt patterns for complex refactors, when to use multi-agent approaches vs. single prompts. We had shared vocabulary. Shared intuition. Shared war stories from the trenches.
More concretely, we adopted a policy that would have been impossible before: templating our engineering workflows and building a shared library of slash commands. Everyone understood why this mattered and how to contribute. That kind of organizational alignment doesn't come from Slack threads and lunch-and-learns. It comes from shared experience.
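Mechanically, "a shared library of slash commands" is less exotic than it sounds: in Claude Code, a project slash command is just a markdown prompt template under .claude/commands/ that gets checked into the repo. The example below is a simplified, hypothetical command, not one of ours verbatim.

```bash
# A project slash command is a markdown prompt template in .claude/commands/.
# This /implement is a simplified, hypothetical example.
mkdir -p .claude/commands
cat > .claude/commands/implement.md <<'EOF'
Read the plan file passed as $ARGUMENTS.
Implement it one step at a time, running the test suite after each step.
If the plan is ambiguous, stop and summarize the open questions instead of guessing.
EOF
```

Because the files live in the repo, improving a command in one PR improves it for the whole team.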
Some examples that stuck with me: One engineer discovered parallel Claude Code instances and immediately became the team's resident "why are you only running one agent?" guy. Our QA lead prototyped AI-assisted test generation and now won't shut up about it (affectionately). And two weeks later, I overheard someone say "let me spin up a subagent to audit this." Words that would've gotten you blank stares before.
Weeks Later
The momentum didn't stop when the hackathon ended:
- A growing library of slash commands
Teams are actively using, improving, and sharing them across the org
- A two-person team completed 109 points in a single sprint
By investing heavily in context setup and giving their agents a visual feedback loop on what they were building
- Developers adopting agent orchestration frameworks
Some are now adapting our slash commands to work with tools like AgentCMD
- Slash command show-and-tell at engineering all-hands
It's now a regular segment, and people actually contribute what they've been building
There's no doubt in my mind that this hackathon instantly and permanently changed the quality of conversation we can have about AI adoption. That's not something you can put on a slide, but it's the foundation everything else gets built on.
How To Run Your Own
If you're convinced and want to try this with your team:
- You don't need two weeks
In hindsight, we could have gotten similar results in three days. The magic isn't in the duration. It's in the dedicated, protected time.
- Make it real time off
Not "hackathon but also check Slack." Actual protected time. If people are still getting pinged, it doesn't work.
- Set the right expectations
Success isn't shipping features. Success is experimenting, learning, and sharing. Say this explicitly and repeatedly.
- Include everyone
Engineers, PMs, QA, Design. The whole SDLC needs to level up, not just the people writing code.
- Open the budget
Tell people you'll buy whatever tools or courses they want to try. The ROI on a $200 course that levels up even one engineer is absurd.
- End with sharing
A presentation or demo at the end. The forcing function of "I need to show what I learned" drives deeper engagement.
- Don't over-structure
Point people at resources, but let them follow their curiosity. Adults learn best when they're curious, not following a syllabus.
Key Takeaways
Why It Worked
- Permission to fail
Sprint work has expectations attached. Nobody wants to be the person who spent three days on a ticket because they were "experimenting with AI." A dedicated learning period removes that pressure entirely. Failure isn't just allowed. It's expected.
- Protected time is non-negotiable
Not "hackathon but also check Slack." Actual protected time away from sprint pressure. If people are still getting pinged, it doesn't work.
- Cross-functional leveling
AI doesn't just affect engineering. When your whole product org levels up together, you avoid the bottleneck problem where one function gets faster and everyone else drowns in the wake.
- Shared vocabulary
When everyone goes deep at the same time, you create shared reference points. "Remember when we tried X during the hackathon?" becomes institutional knowledge. Metrics are nice, but this foundation is what everything else gets built on.
- Stop waiting for organic adoption
It won't happen. Create the space intentionally.
What We'd Do Differently
- One week, not two
By week two, energy was flagging. The learning and bonding mostly happened in the first week anyway.
- Clearer expectations for new leaders
We gave leadership opportunities to people who hadn't led before (great for growth), but didn't define what "team lead" meant or how senior engineers should support without taking over.
- Less developer-centric challenges
Our prompts were too engineering-focused, which sidelined Design and PM. True cross-functional participation needs challenges that genuinely require everyone's expertise.