Leif Weatherby, Tyler Shoemaker, and I have a new essay out today in The Ideas Letter about the creation of reality through pseudoscience, religion, and psychosis. Of course, it’s mostly about AI. Starting with the bizarre appearance of Anthropic co-founder and mechanistic interpretability zealot Chris Olah at the side of Pope Leo as he read out his encyclical, we explore how commercial large language models automate the creation of reality.
The term of art for the process by which abstract concepts are made into material objects is reification. Marxist theorists think of the reification of social relations, where qualitative things like actions become commodified. This is the process by which labor becomes objectified and priced, and subsequently alienated from the person who does the work. Separately, historians and philosophers of science think of reification in terms of how correlational associations are named and become real through coordinated research programs. When your biggest companies all claim to be doing research, you can see how these two become one, with coordinated research becoming a commodity itself.
LLMs are not only generated by a technological oligarchy, but also introduce a new twist into the process of reality generation. LLMs automate reification. LLMs are designed to produce explanations for us. We skip the steps in the collective creation of scientific concepts (aka thinking and arguing) and let LLMs do it for us through their linguistic association. LLMs entice you to skip the step of interpretation of outputs. Why not just ask Claude what it thinks?
As an illustrative example, we discuss Stephen Jay Gould’s historical analysis of the g factor in psychometrics research. In the social sciences, everything is correlated with everything. If you run some sort of correlational analysis on any data set, you’ll find something. Gould describes the process through which psychometricians decided that the correlational quantity that linked different measures of intelligence, the g factor, was real, and how they created a whole scientific research program to reinforce the reality of the g factor.
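To make the "you'll find something" point concrete, here is a toy sketch of my own, not anything taken from the essay or from Gould. It generates six columns of pure noise, computes their correlation matrix, and extracts the largest eigenvalue by power iteration, which is the variance captured by the first principal component. The function names, the seed, and the data sizes are all hypothetical choices for illustration. Because the eigenvalues of a correlation matrix sum to the number of variables, the top one always explains more than an equal share unless the variables are perfectly uncorrelated in the sample, so a "leading factor" shows up even when there is nothing there.

```python
import math
import random


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        standardized.append([(x - mean) / sd for x in col])
    p = len(columns)
    return [
        [sum(a * b for a, b in zip(standardized[i], standardized[j])) / n
         for j in range(p)]
        for i in range(p)
    ]


def top_eigenvalue(matrix, iterations=500):
    """Largest eigenvalue of a symmetric positive semidefinite matrix,
    estimated by power iteration with a Rayleigh quotient."""
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p  # unit-length starting vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    return sum(a * b for a, b in zip(v, mv))


# Hypothetical data: six unrelated "measurements", 200 observations each.
random.seed(0)
n_rows, n_cols = 200, 6
noise = [[random.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_cols)]

corr = correlation_matrix(noise)
top = top_eigenvalue(corr)
share = top / n_cols  # fraction of total variance on the first component

print(f"equal share would be {1 / n_cols:.1%}")
print(f"first component explains {share:.1%}")
```

The first component beats the equal share by construction. Nothing in the arithmetic says what that component means, or whether it means anything; that part is supplied by whoever names it.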
You know who still loves factor analysis? Mechanistic interpretability researchers. AI researchers have found themselves enamored with psychometrics as they work to prove their LLM toys are conscious. But they have new technology that Charles Spearman and his colleagues didn’t have a century ago: the LLMs themselves! Whereas 20th-century psychometricians had to think about causal stories to explain their strong correlational findings, interpretability researchers can automate this process. They can give an LLM the output of a factor analysis and ask it to explain the factors. Since it returns answers in natural language, this feels like discovery, even though you have offloaded to a machine the work that thinking would otherwise have done. This process sets off a loop in which scientists and citizens alike give iterative credence to concepts generated by correlating text. We write:
“Models, in other words, kick off an associative chain of ideas by effectively auto-labeling queries. It’s like taking the principal components derived from that data about oak trees, boat speeds, cat whiskers, and Letterboxd reviews, and asking ChatGPT: “What do each of these artifacts mean?” ChatGPT will respond—and then keep the conversation going, bringing in more associations that more or less fit. But as we have already argued, this doesn’t only happen to amateurs who are easy to pathologize. How is this different from the standard methodology of interpretability research? In both the cases that might be dismissed as psychosis and the ones celebrated at AI conferences, interaction with LLMs induces mental friction. They create a feeling that discovery is there. By elaborating on what you put in through a context that the model has trained on, it is able to make connections that feel both correct and expansive, filling in the area around your thought—simulating the feeling that you are having a new thought. The model helps you refine the obscurity of your prompt through a chain of associations, and suddenly you have something. This is reification at work. And when the next link in a chain of thoughts comes along, it becomes hard to resist prompting the model again.”
LLMs are built to automate a complex web of reification. They are designed to tell us what we want to hear. As we write, “Trained on an enormous corpus of human writing, speech, and code, and tuned to refine responses around context and memory as user interactions unroll, a model of this kind is designed to provide the sense that one’s expectation is being exceeded.”
Artificial intelligence is a bizarre technology that participates in its own mythmaking. Anthropic is the most deeply invested in telling outrageous stories about its products, but no one in this space is innocent. As I’ve written before, I think that benchmarks get an unnecessarily bad rap. I certainly agree that you can’t “benchmark general purpose technology.” I agree that maxing out benchmark scores doesn’t mean a product will be better received by customers. But when we move away from benchmarking in artificial intelligence, we get stuck in storytelling. LLMs are designed and intended to convince us that their stories are real. When backed by a turbocharged capitalist engine, the reification becomes inevitable. Stories become commodities. Those commodities sell like hotcakes. And there is nothing more real than when the number goes up.
I’ll leave you there, as I need to go check on the value of my SpaceX shares. Read the whole article and let us know what you think.