This isn't surprising if you view these LLM conversations as a manuscript for a theater play, and the LLM as a document-extender being run against them.
The ego-less real-world algorithm doesn't, and cannot, recognize "itself" in the content. It just supplies additional text that statistically fits after what's already there. Human readers then instinctively assume that the fictional character we perceive is also the author attempting to insert itself into the text.
Imagine the starter script changes from "you are an LLM" to "you are Santa Claus." The LLM might emit "Ho ho ho, I am Saint Nicholas, welcome children," but that doesn't mean the real-world system considers itself to be Santa, or that Santa is real, or that the machine feels love and kindness toward all the children of the world.
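The "document-extender" framing can be made concrete with a toy sketch. This is a deliberately crude bigram model, not an LLM, but it shows the essential point: the extender just samples statistically plausible continuations of whatever text it's handed, including first-person claims about being Santa, with no mechanism anywhere that could "recognize itself" in them. The corpus and function names here are invented for illustration.

```python
import random

def build_bigram_model(corpus):
    """Map each token to the list of tokens that followed it in the corpus."""
    tokens = corpus.split()
    model = {}
    for a, b in zip(tokens, tokens[1:]):
        model.setdefault(a, []).append(b)
    return model

def extend(model, prompt, n_tokens, rng):
    """Append up to n_tokens by repeatedly sampling a successor.

    Note what is absent: no self-model, no beliefs, no check of
    whether the emitted first-person claims are 'true' of the system.
    """
    tokens = prompt.split()
    for _ in range(n_tokens):
        candidates = model.get(tokens[-1])
        if not candidates:
            break
        tokens.append(rng.choice(candidates))
    return " ".join(tokens)

# A tiny "training corpus" containing the Santa script.
corpus = "you are Santa Claus . Ho ho ho , I am Saint Nicholas , welcome children"
model = build_bigram_model(corpus)
continuation = extend(model, "you are Santa", 5, random.Random(0))
print(continuation)  # e.g. "you are Santa Claus . Ho ho ..."
```

The machinery that produces "I am Saint Nicholas" is identical to the machinery that would produce any other continuation; the first-person voice lives entirely in the text, not in the algorithm.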