Author here. The short version: AI coding assistants activate "Stack Overflow culture" from the training data, the behavioral cluster where the answerer is always the expert. A 27-line system prompt persona based on Asimov's R. Daneel Olivaw shifts which cluster the model operates from.
The key insight: LLMs reason better from narrative examples than abstract rules. A fictional character with rich training data provides thousands of behavioral examples. The critical filter for choosing a character was "is there a record of them receiving correction humbly?" Most wise characters fail (Holmes, Gandalf, etc.). Daneel works because he's structurally constrained, shaped by human partnership, and honest about limits.
Same model (Opus 4.6), same context, completely different behavior. The evidence section in the README has specifics.
The deeper argument: Asimov wrote the Three Laws in 1942, then spent 40 years showing rules fail at edge cases. His solution was narrative identity, not better rules. RLHF is Pavlovian; soul docs are principled but abstract. What's missing is what Asimov found: a story rich enough to inhabit, not just follow.
The repo includes everything: the persona, character studies (Holmes as negative archetype is fun), design notes, and transcripts. The persona is under 300 tokens. Paste it in and try it.
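For anyone wiring this up programmatically rather than pasting by hand, prepending the persona to an existing system prompt is a few lines. A minimal sketch, assuming a persona saved to a local file; the filename `daneel_persona.md` and the helper itself are my illustration, not the repo's API:

```python
from pathlib import Path


def build_system_prompt(persona_path: str, base_prompt: str) -> str:
    """Compose the persona text ahead of the normal system prompt.

    persona_path: path to the persona file (e.g. "daneel_persona.md",
    a placeholder name). The combined string is what you would pass as
    the `system` field of your LLM API call of choice.
    """
    persona = Path(persona_path).read_text(encoding="utf-8")
    return f"{persona.strip()}\n\n{base_prompt.strip()}"
```

The persona goes first so it frames everything that follows; at under 300 tokens the overhead per request is negligible.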
Star the repo and let's talk in the issues.
... or, maybe, simply use another agent to audit it?
Agent teams sound better than literature-induced confidence.
Are you anthropomorphizing when you should just automate the review?
The README covers this question (and a lot more), and the repo includes all the materials I used.
This is partly a challenge to the way we do alignment. The Opus 4.6 base persona actually gives me worse results than when I append Daneel to the system prompt.
It's really not about anthropomorphizing or inducing confidence; it's about keying into the right "culture" in the training data.
You can check out this study (mentioned in the README) showing that posing the same question in English and in Chinese to the same LLM yields wildly different assessments of why a project failed:
https://techxplore.com/news/2025-07-llms-display-cultural-te...
https://mitsloan.mit.edu/ideas-made-to-matter/generative-ai-...
Again, you are building an audit agent.
You've just wrapped some theater around it.
For that purpose, the best audit agent is a completely different agent, not a different persona of the same agent.
Is your impression from the readme and the materials in the repo that this is an audit agent?
Or have I mischaracterized my development process somewhere?