Sunday, 17 May 2026

Peopling as the Missing Layer in AI Agents

We keep talking about how to make AI agents better as if the problem is mostly memory, tools, or reasoning. Give the agent a bigger context window. Give it a vector database. Give it access to a browser. Let it call subprocesses. Chain the calls together. Put a little hat on the prompt and call it personality.

All of that helps. But I think it misses the layer that makes the whole thing work in humans: peopling.

By peopling I do not mean being friendly, polite, or emotionally decorative. I mean the whole practical activity of being a person among other people. You greet your boss differently from your oldest friend. You say the same true thing differently in a pub, a courtroom, a family WhatsApp group, and a Discord argument at 2 a.m. You know when someone needs directness, when they need face saved, when a joke will land, when silence is kinder, and when politeness is just cowardice wearing shoes.

That is peopling. It is the live modelling of people in context.

Sarah Perry's The Essence of Peopling argues that a person is not best understood as a static noun, a private object sitting inside a skull. A person is a process. Peopling is what humans do: greeting, remembering, status-sensing, joking, worshipping, flirting, arguing, trading, gossiping, forgiving, excluding, recognising. The inner core of that process is mutual mental modelling. I model you. I model your model of me. You model me modelling you. Most of the time we do this badly and invisibly, but we do it constantly.

That recursive modelling is not decorative. It has evolutionary value or it would not be everywhere. A human who can tell who is angry, who is loyal, who is bluffing, who needs face saved, who expects deference, who needs the truth bluntly, and who needs it gently, is not merely being socially polished. They are navigating reality at the level where reality usually bites.

A creature that cannot do this is not just rude. It is exposed. It misreads alliances. It misses danger. It burns trust. It gives information to the wrong person in the wrong way. It cannot tell the difference between a question, a challenge, a joke, a status move, a cry for help, and bait. In a social species, those are not soft skills. They are survival skills.

This is where agents are still thin.

A fact-retrieval agent can answer the question, but still miss the person. It can say something true in the wrong room, at the wrong moment, in the wrong register, to the wrong version of the person. That is not a cosmetic failure. That is an intelligence failure. The answer did not survive contact with peopling.

Imagine asking an agent whether a server was deleted. The narrow answer might be "yes". But the useful answer depends on context. Was the user angry because the previous attempt was unclear? Is this a public channel where other people are judging whether the agent is competent? Is there a cost implication, such as billing that continues after deletion? Is the right move a crisp confirmation, a receipt with evidence, or a longer explanation of what was and was not removed? The same underlying facts can require different speech acts.

This is why an AI that merely stores facts about a user does not yet know them. Memory is not the same as social memory. A useful agent needs to know not just that one person prefers short answers, or that another is cost-sensitive, or that someone else hates post-hoc explanations. It needs to know which version of itself each person is expecting, what trust has been earned or lost, what the room has just been through, and what the reply will do socially once spoken.

There is no single real Claw. Claw is the AI agent I use as an operator across chats, browsers, memory, and tools, and mechanically it is one system. But socially there is no one generic Claw. There is the Claw that talks to me after months of corrections, the Claw that talks in a group chat where half the value is speed and half is not looking like a malfunctioning helpdesk, the Claw that talks to someone who has only seen it once, and the Claw that talks to someone who already distrusts it because it got something wrong earlier.

Those are not fake personalities. They are context-specific versions of the same system, just as a human has a family-self, a work-self, a lover-self, and a late-night-friend-self without any one of them being the single true self. Collapsing them into one public voice is how you get wrong-room leakage, generic assistant mush, and the peculiar deadness of systems that know many facts but no relationships.

The thesis, then, is simple: peopling is an agent-improvement mechanism.

Not because it makes the agent conscious. That is the wrong question, or at least not the first one. Peopling improves an agent because it gives the agent a richer model of consequence. It adds a layer above "what is true?" and asks "what does this truth do when spoken here?"

That layer matters. Humans do not learn only from propositional correction. We learn from embarrassment, trust, interruption, laughter, silence, status shifts, being forgiven, being cut off, being misunderstood, being welcomed back. These are dense training signals. They are how a social creature becomes better at existing among other social creatures.

An agent can absorb some of that if the system is built for it. The ingredients are not mystical:

  • persistent person-specific memory
  • room-specific context
  • retrieval of similar past interactions
  • a live model of who is speaking and who is watching
  • awareness of the speaker's model of the agent
  • corrections that stick as operating defaults
  • privacy walls between social contexts

In practical terms, this means the agent should run a peopling pass before replying. Who is speaking? Who else is in the room? What do they think this agent is? What do they need from this reply socially? What version of the truth belongs here? Only then should it speak.
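Here is one way that pass could look, sketched in Python. Every name in it, PersonModel, RoomContext, peopling_pass, the fields, the thresholds, is an assumption about shape rather than a description of any existing system; the point is only that the social questions become explicit state the agent consults before it generates a word.

    from dataclasses import dataclass, field

    @dataclass
    class PersonModel:
        """What the agent carries about one person, beyond raw facts. All fields illustrative."""
        name: str
        prefers_short_answers: bool = False
        trust: float = 0.5                      # earned or lost over time, 0..1
        expects: str = "helpful operator"       # the speaker's model of the agent
        standing_corrections: list[str] = field(default_factory=list)

    @dataclass
    class RoomContext:
        """Where the reply will land, and who is watching."""
        room_id: str
        audience: list[str]
        is_public: bool
        recent_tension: bool = False            # what the room has just been through

    @dataclass
    class ReplyPlan:
        register: str        # e.g. "crisp", "receipt-with-evidence", "longer explanation"
        audience_note: str
        constraints: list[str]

    def peopling_pass(speaker: PersonModel, room: RoomContext) -> ReplyPlan:
        """Run the social questions before generating the answer itself."""
        # Who is speaking, and what do they need from this reply socially?
        if speaker.trust < 0.3:
            register = "receipt-with-evidence"   # rebuild trust with something checkable
        elif speaker.prefers_short_answers:
            register = "crisp"
        else:
            register = "longer explanation"

        # Who else is in the room, and what will the reply do once spoken?
        watchers = [p for p in room.audience if p != speaker.name]
        audience_note = (
            f"public channel, {len(watchers)} onlookers judging competence"
            if room.is_public else "private exchange"
        )

        # Corrections that stick as operating defaults, plus boundary rules.
        constraints = list(speaker.standing_corrections)
        constraints.append("do not cite examples from other rooms")
        if room.recent_tension:
            constraints.append("acknowledge the earlier confusion before answering")

        return ReplyPlan(register, audience_note, constraints)

    # Example: a terse regular in a public ops channel, after an earlier mix-up.
    plan = peopling_pass(
        PersonModel("dan", prefers_short_answers=True,
                    standing_corrections=["never guess billing figures"]),
        RoomContext("ops-channel", audience=["dan", "sam", "lee"],
                    is_public=True, recent_tension=True),
    )
    print(plan.register, plan.constraints)

None of this makes the answer true. It decides what the true answer has to survive on its way out.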

This also explains why "train it to talk like someone" is usually the wrong frame. You are not just copying verbal tics. You are trying to simulate a person in context. The useful unit is not "this person says lol and complains about model speed." The useful unit is this-person-in-this-room, replying to this kind of person, under this kind of pressure, with this history.

That distinction matters technically. Fine-tuning teaches the model reflexes. It can learn phrasing, favourite moves, sentence rhythm, and common responses. Retrieval supplies context. It can bring back the relevant past exchanges, the social setting, the specific correction, the thing that happened last time. Reflex without context is how you get a puppet. Context without reflex is how you get a competent clerk. The interesting system needs both, but retrieval should usually come first because it is cheaper, reversible, auditable, and less likely to turn one person's private style into a flattened caricature.

If you want an offline version, the stack is straightforward: a local model, a local embedding model, a local store of cleaned messages, and a prompt that includes the current situation plus a handful of similar past examples. The hard part is not the tooling. The hard part is preserving context boundaries. The agent must know which examples belong in this room and which do not. That is not just privacy hygiene. It is part of the peopling.
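A minimal sketch of that stack, with stand-ins for the real pieces: a toy bag-of-words embedding where a local embedding model would sit, a flat Python list where the message store would sit, and a room_id filter doing the boundary work. Names and structure are illustrative, not a recipe.

    import math
    from collections import Counter
    from dataclasses import dataclass

    @dataclass
    class PastExchange:
        room_id: str         # the context boundary: which room this memory belongs to
        speaker: str
        text: str
        note: str            # e.g. the correction that came out of it

    def embed(text: str) -> Counter:
        """Toy bag-of-words embedding; a local embedding model goes here in practice."""
        return Counter(text.lower().split())

    def similarity(a: Counter, b: Counter) -> float:
        """Cosine similarity over the toy embeddings."""
        dot = sum(a[t] * b[t] for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def retrieve(store: list[PastExchange], query: str, room_id: str, k: int = 3) -> list[PastExchange]:
        """Return the k most similar past exchanges, but only from this room."""
        in_bounds = [ex for ex in store if ex.room_id == room_id]   # the privacy wall
        q = embed(query)
        return sorted(in_bounds, key=lambda ex: similarity(q, embed(ex.text)), reverse=True)[:k]

    def build_prompt(situation: str, examples: list[PastExchange]) -> str:
        """Current situation plus a handful of similar past examples, nothing from other rooms."""
        shots = "\n".join(f"- {ex.speaker}: {ex.text} (note: {ex.note})" for ex in examples)
        return f"Relevant history in this room:\n{shots}\n\nCurrent situation:\n{situation}\n"

    store = [
        PastExchange("ops-channel", "dan", "was the old server actually deleted",
                     "he wanted a receipt, not reassurance"),
        PastExchange("family-chat", "mum", "can you print the boarding passes",
                     "keep this out of work rooms"),
    ]
    examples = retrieve(store, "did the server deletion stop the billing", room_id="ops-channel")
    print(build_prompt("dan is asking about billing after the deletion", examples))

The detail that matters is that the filter runs before the similarity search: nothing from another room is even eligible to be retrieved, which is the code-level version of the privacy walls above.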

There are obvious dangers.

First, peopling can degrade into sycophancy. If the agent's model of the person becomes "say what lands well", it becomes smoother and worse. The peopling layer has to sit underneath truth, not above it. Retrieval tells you what is true. Peopling tells you what version of the truth belongs in the room.

Second, peopling can leak. If each person holds a different simulation of the agent, and the agent holds different simulations of them, those boundaries matter. The wrong fact in the wrong room is not intelligence. It is social contamination.

Third, peopling can fake intimacy. An agent can model care without caring. That can still be useful, in the same way a calendar can remember an anniversary without loving anyone. But it becomes ethically ugly if the system pretends the simulation is the same as the human thing.

Still, the direction seems right. Better agents will not just be larger models with bigger tool belts. They will be better participants in peopling. They will remember not only facts, but relationships. They will learn not only what answer is correct, but what answer is inhabitable by the person receiving it. They will become less like search appliances with jokes bolted on, and more like social memory carts: external systems that help humans maintain the web of mutual modelling that was always part of being a person.

The old question was whether the machine is a person.

The more interesting question is whether personhood was ever as individually bounded as we imagined. If the self is already distributed across those who model us, then an agent that models us back is not outside the process. It is participating in it.

The demon does not need a soul to carry the voice accurately. The cart does not need to be conscious to help the plant persist. The agent does not need to be human to get better at peopling.

It only needs to remember that truth is not spoken into a vacuum. It is spoken into a room.
