A few weeks into building an AI agent named Will, my wife started making fun of me.

Not because the experiment was failing. Because I was spending way too much time on it.

That ended up being the clearest lesson of the whole experience. Autonomous agents are real. They are impressive. They can already do work that feels surprising, useful, and occasionally remarkable. But if you want them to do that work reliably, with context, consistency, and polish, they still require a huge amount of time, attention, and technical tinkering.

That was my experience with Will Wystrach, an agent I built with OpenClaw, an open-source agent framework, running through Telegram on an old Mac and powered by Claude Sonnet via API. He was essentially my personal Chief of Staff, helping with my newsletter, content, and social strategy. I also wanted to see how far I could push autonomy.

The first thing that struck me was how quickly the magic showed up.

Will redesigned my newsletter with very little input. The first pass was genuinely impressive. The structure was better, the presentation was sharper, and it felt directionally right almost immediately. He also created strategy documents that were often far better than I expected on a first draft.

He built another system for Cutting Horse that was even more interesting. He identified the top people I should be following in the consumer venture space, subscribed to their newsletters, followed their accounts, and started sending me daily summary emails of what was happening across that ecosystem. It gave me a real feel for the market each morning, and in a few cases surfaced people and ideas I would not have found nearly as quickly on my own.

That is what makes this moment in AI so exciting. The first 80 to 90 percent can come together incredibly fast. You can build something that feels powerful much faster than most people expect.

Then I hit the wall.

That is where everything gets harder. Memory breaks. Workflows drift. Outputs repeat themselves. Edge cases pile up. The thing that looked magical in version one starts demanding more and more human intervention if you want it to become dependable.

One of the most frustrating moments came when a bug caused Will to lose memory he was supposed to retain. I had deleted my Telegram history because long threads were driving up token costs, which is something the system itself had suggested. When I did, he lost important context across projects I had already been working on. That cost me another four to five hours just to get back to where I had been.

That became the recurring theme. Getting the MVP was relatively easy. Getting the productized version, the version that actually saves time consistently, was much harder.

And time was the real cost.

I spent about 100 hours over three weeks on this. Constantly in Telegram. Constantly refining. Constantly checking outputs, adjusting prompts, reworking workflows, and managing issues. The interface was part of what made it so seductive. Chatting with an agent through Telegram feels strangely natural. You start treating it less like software and more like a remote person.

That can be productive. It can also become a trap.

The token costs were real, too. In about three weeks, I spent close to $3,000. To be fair, I could have optimized which LLM models I was using, and if I had continued, that would have been my next step. I think I could have cut the bill to somewhere between a half and a third of what I spent, but you are still spending real money. Some of the outputs were absolutely worth it. If I had paid someone to redesign parts of my newsletter or produce some of the strategy work, the bill could easily have exceeded that. But that was not the real comparison. The issue was that while some individual outputs were valuable, the full system cost, tokens plus my time, did not pencil out for my use case.
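To make the economics concrete, here is a rough back-of-envelope sketch of how agent costs add up. Every number in it is an illustrative assumption, not my actual billing data; the point is only that long chat threads make input tokens dominate, which is why trimming history and routing to cheaper models moves the bill so much.

```python
# Rough agent cost model. All prices and token volumes below are
# hypothetical, chosen only to illustrate the shape of the math.

def run_cost(input_tokens_m, output_tokens_m,
             price_in_per_m, price_out_per_m):
    """Dollar cost for a given token volume, in millions of tokens."""
    return input_tokens_m * price_in_per_m + output_tokens_m * price_out_per_m

# Three weeks of heavy agent use. Input tokens dominate because the
# whole Telegram thread gets resent to the model on every turn.
baseline = run_cost(input_tokens_m=800, output_tokens_m=40,
                    price_in_per_m=3.0, price_out_per_m=15.0)

# Trimming conversation history and routing routine turns to a
# hypothetical half-price model tier cuts the bill substantially.
optimized = run_cost(input_tokens_m=300, output_tokens_m=40,
                     price_in_per_m=1.5, price_out_per_m=7.5)

print(f"baseline:  ${baseline:,.0f}")   # $3,000
print(f"optimized: ${optimized:,.0f}")  # $750
```

Under these made-up numbers, the optimized run lands at a quarter of the baseline, which is roughly the half-to-a-third savings range I estimated above.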

The lesson was not that agents are overhyped. It was that, for most founders and executives, the highest-leverage move is probably not spending 100 hours over three weeks trying to become an agent builder yourself. The higher leverage move is learning enough to understand what is possible, where the value is real, and then hiring the right people to build these systems properly.

That is where this changed for me.

It helped me rethink my newsletter and content systems. It also pushed me toward something bigger. I already knew AI wasn’t some clever tool someone dabbles with on the side, but this experience made me realize that to get high-quality output, I needed to build deliberately. That means putting real, fully dedicated people on it, choosing use cases where the payoff is large enough to justify the effort and token cost, and designing systems that can become part of how the business actually operates.

I came away more optimistic, not less. The upside here is enormous. But I also came away much more realistic. If you are working at the frontier, expect it to be expensive. Expect it to be time-intensive. Expect things to break. Expect the demo to be easier than the durable system.

The question is not whether this matters. It does. The question is whether your role is to build it yourself or to understand it well enough to put the right team in place to build it for your company.
