Fable Engineering #8: The Missing Physical Interface
Why AI hardware keeps failing, and what comes next.
Hello again!
Most of what I’ve shared here so far has been concrete: what we’re building at Fable, what we’re learning from pilots and prototypes.
This one is intentionally more conceptual.
Every product we ship sits on top of an invisible set of assumptions about how people and intelligence should relate to each other. If we never stop to examine those assumptions, we end up optimizing the wrong things.
This essay is an attempt to surface those deeper questions so that the concrete choices that follow—what we build, what we kill, what we say no to—are anchored in something more solid than hype cycles or spec sheets.
The Failures of the First Wave
We have JARVIS. We don’t have the suit.
That’s the simplest way to describe where we are in AI hardware right now. The intelligence exists. ChatGPT can reason, summarize, plan, and execute. Claude can write code and analyze documents. These systems are genuinely useful, not vaporware. But the interface is wrong.
Right now, we access this intelligence through chat windows. We type prompts into boxes. We talk to our phones. The paradigm is conversational. You summon the AI, it responds. This works fine when you’re sitting at a desk, but it falls apart everywhere else.
The first wave of AI hardware tried to solve this by putting the chatbot in a new form factor. The Humane AI Pin. The Rabbit R1. Both promised to replace your smartphone. Both failed spectacularly.
The conventional explanation is that they shipped too early. The AI wasn’t good enough. The battery life was bad. The Humane Pin’s projected display was unreadable in sunlight. All of that is true, but the real failure was conceptual. They had hardware that could listen to your life, and they used it to build a chatbot.
Two Models of AI Interaction
There’s a fundamental tension in how we can relate to AI, and most people haven’t fully articulated it yet.
The first model is the chatbot. You initiate. You ask a question or give a command. The AI responds. Then it waits for you to initiate again. This is synchronous interaction. It demands your attention in the moment. It’s pull-based. You pull value from the system when you need it.
The second model is the always-on device, or more precisely, continuous capture. The AI is already there. It captures passively, listens ambiently, and processes in the background. It surfaces value later, when it has something worth saying. This is asynchronous interaction. It earns your attention after the fact. It’s push-based. The system pushes value to you when it’s ready.
These are fundamentally different relationships with intelligence.
The chatbot says, “Ask me anything.” Continuous capture says, “I’m paying attention so you don’t have to.” But that promise hides a critical problem. Capturing everything is not the same as understanding it. Volume without curation dilutes signal and erodes trust.
There is a middle ground that many teams never name explicitly, and it is the real product opportunity: intentional capture. It is not synchronous like a chat window, and it is not indiscriminate like continuous capture. It is event-based and user-directed. You choose low-friction, high-signal moments to record, and the system curates, processes, and returns them to you as organized, useful output.
Where the First Wave Went Wrong
The Humane Pin and Rabbit R1 made the same mistake. They wanted to be your new phone.
Both devices positioned themselves as smartphone replacements. The pitch was that you’d leave your phone at home and interact with the world through this new thing on your chest or in your hand. They tried to replicate everything your phone does, just with voice instead of a screen.
This was a catastrophic misread of the opportunity.
Your phone is great at being a phone. It has a screen, a keyboard, apps, decades of interface refinement. Trying to replace all of that with voice commands is like trying to replace a Swiss Army knife with a spoon. A spoon is the better tool for soup, but it was never going to replace the knife.
The opportunity for AI wearables isn’t replacement. It’s capture.
There are moments throughout your day when pulling out your phone is friction. A thought in the shower. An idea while driving. A commitment made in passing conversation. These moments evaporate because the barrier to recording them is just high enough that you don’t bother.
A wearable can live in that gap. You press, you speak, you forget. The AI does the work later. Organizing, structuring, connecting, reminding. The output shows up in your existing tools, on your phone, on your computer, wherever you already live.
The device doesn’t replace your phone. It feeds your phone. It captures what your phone can’t, and then gets out of the way. Crucially, it captures what you decide is worth capturing. Intentional, time-bounded, and user-visible.
What Tony Stark Actually Got Right
The Iron Man metaphor is more instructive than it seems.
Tony Stark is brilliant, but he’s physically limited. JARVIS is intelligent, but it has no body. Neither is useful alone. The suit is what connects them. It translates Tony’s intentions into action, gives JARVIS a physical presence in the world, and amplifies human capability without replacing human agency.
But here’s the part people miss. JARVIS is not an indiscriminate eavesdropper.
Tony doesn’t summon JARVIS every time he needs something. JARVIS is already monitoring the situation, already processing, already prepared with relevant information before Tony asks. But those interventions are triggered by intent and context: stances Tony takes, gestures he makes, decisions he signals. JARVIS surfaces insight when it matters, not because it recorded everything.
When Tony speaks, it’s to make a decision, not to wake the AI. The intelligence is running in the background in service of the human, but it is selective, attentive to relevance rather than raw volume. The suit translates intention into capability while preserving agency.
The Trust Problem
Continuous capture is a bigger ask than a chatbot. With a chatbot, you choose when to engage. You control what the AI hears. The interaction is bounded.
With continuous capture, you’re letting the AI into your ambient life. It hears your conversations, your half-formed thoughts, your private moments. This requires a different kind of trust, and frankly, most companies haven’t earned it.
The privacy question isn’t just about data security, though that matters too. It’s about the fundamental relationship. Do I trust this system to listen to my life and use that information in my interest? Do I believe it’s working for me rather than extracting from me?
This is why the value proposition has to be overwhelming. Continuous capture can’t be slightly better than pulling out your phone. It has to be transformatively better. It has to take the raw material of your day and turn it into something you couldn’t have created yourself. Organized thoughts, surfaced patterns, remembered commitments, structured output from unstructured life.
But the most defensible path to that exchange is explicit. I give you selected moments. You give me back clarity and leverage. Intentional capture, not indiscriminate listening, is where trust can be built.
Three Principles for What Comes Next
The hardware is the unlock, not the product.
The real value lives in the software layer. The intelligence. The processing. The understanding. Hardware is just the doorway that lets you access that intelligence in new contexts.
But the doorway matters. If nobody wants to wear it, carry it, or use it, the intelligence behind it never gets a chance. This is where the first wave failed. The Humane Pin and Rabbit R1 asked people to adopt an entirely new device category just to access a chatbot. The friction was too high and the form factor too strange.
The successful AI products of the next decade will use hardware people already want. Glasses, rings, watches, earbuds. Not because the hardware is the point, but because familiar form factors remove the barrier between users and the intelligence underneath. Meta’s Ray-Ban glasses work because they’re glasses first. You wear them because they look good. That comfort is what unlocks the AI.
Capture should be effortless but intentional, and output should be structured.
The killer feature of capture systems isn’t answering questions in the moment. It’s transforming the raw stream of your life into usable artifacts. Voice memos into action items. Conversations into meeting notes. Scattered thoughts into organized ideas.
Hardware should make capture feel like nothing. A single gesture. A quick press. Or a contextual cue you control. The AI makes sense of what was captured later, asynchronously, without demanding attention in the moment.
This is the opposite of the pure chatbot paradigm. Chatbots demand synchronous attention. They want you to engage right now. Continuous capture promises to act later. Intentional capture promises to act later, but only on the moments you or your signals mark as relevant.
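To make the intentional capture model concrete, here is a minimal sketch of the flow described above: a press starts a time-bounded recording, every recording lands in a log the user can inspect, and structuring happens later in a separate step. This is an illustration only, not a description of any shipping system. The names Capture, CaptureLog, process_later, naive_extract, and max_seconds are hypothetical, and the keyword matcher stands in for whatever model would do the real extraction.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Capture:
    """One user-initiated, time-bounded recording (hypothetical data model)."""
    started_at: float  # seconds, on whatever clock the device uses
    ended_at: float
    transcript: str


@dataclass
class CaptureLog:
    """User-visible log: every capture is listed here, nothing exists outside it."""
    max_seconds: float = 60.0  # assumed cap on a single capture
    entries: List[Capture] = field(default_factory=list)

    def record(self, started_at: float, released_at: float, transcript: str) -> Capture:
        # Enforce the time bound: one press never records longer than max_seconds.
        ended_at = min(released_at, started_at + self.max_seconds)
        capture = Capture(started_at, ended_at, transcript)
        self.entries.append(capture)
        return capture


def process_later(log: CaptureLog, extract: Callable[[str], List[str]]) -> List[dict]:
    """Asynchronous step, run after the fact: structure each logged capture."""
    return [
        {
            "captured_at": c.started_at,
            "duration": c.ended_at - c.started_at,
            "action_items": extract(c.transcript),
        }
        for c in log.entries
    ]


def naive_extract(transcript: str) -> List[str]:
    # Placeholder for a model call: keep sentences that contain "need to".
    sentences = [s.strip() for s in transcript.split(".") if s.strip()]
    return [s for s in sentences if "need to" in s.lower()]


# Usage: two presses during the day, processed in one batch afterwards.
log = CaptureLog(max_seconds=30.0)
log.record(0.0, 12.0, "Great chat with Sam. I need to send the deck by Friday.")
log.record(100.0, 200.0, "Idea for the onboarding flow.")
notes = process_later(log, naive_extract)
```

The design choice worth noticing is that nothing reaches process_later unless it first passed through record, so the log the user sees and the data the system processes are the same list.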
AI should be infrastructure, not interface.
When you use Google Maps, you’re using AI, but you’re not talking to AI. You’re navigating. The intelligence is invisible and powers the experience without announcing itself.
The same should be true for wearables. The question isn’t “how do I talk to this?” but “what does this help me do?” The best AI hardware will feel like capability, not conversation. You won’t interact with the AI. You’ll just be more organized, more prepared, more capable than you were before.
Building Toward the Suit
There’s a phrase from Raffi Krikorian that keeps surfacing in my thinking. “If we want to protect what makes us human, we don’t just need smarter algorithms. We need systems that strengthen our capacity to choose, to doubt, and to think for ourselves.”
This is what the real opportunity looks like. Not AI that replaces your thinking, but AI that handles the cognitive overhead so you have more room to think in the first place.
The first wave of AI hardware got this backwards. They built chatbots and put them in new enclosures. They kept the interaction model that demands your attention and missed the opportunity to build something that earns it.
The suit isn’t a chatbot you wear. It’s an ambient layer that captures, processes, and supports without requiring you to ask. But it only records what you allow it to, and it only surfaces what matters. JARVIS doesn’t wait for Tony to open a chat window. JARVIS is already there, selectively attentive, acting on signals of intent.
We’re still early. The trust has to be built. The value has to be proven. But the winning products won’t put an LLM in a box and ask you to chat with it. They’ll give you tools to curate your context: low-friction ways to mark relevance, transparent logs, time-bounded capture, and structured outputs. Then they will quietly turn your chosen moments into leverage. They will recede until you need them, and appear instantly when you decide something matters.
Thanks for reading.
We’re still in the very first inning of this shift. The intelligence is here, but the suit doesn’t exist yet. That’s the gap we’re building into at Fable: hardware that feels invisible, capture that feels intentional, and AI that feels more like infrastructure than interface.
If this resonates with how you think about tools, attention, and trust, I’d love to hear from you.
— Pierre-Louis Soulié
Founder, Fable Engineering

