
In the popular imagination, ChatGPT is an intelligent robot that you can talk to.
However, it is a better first order approximation to think of ChatGPT as a query engine on a giant corpus of text scraped off the internet.
It is important to state explicitly that ChatGPT is like a query engine on a giant corpus of text because many folks without machine learning backgrounds (and even some who ostensibly have machine learning backgrounds?) seem to approach ChatGPT in very personified, anthropomorphic, mystical ways. The framing of ChatGPT as a query engine on a giant corpus of text demystifies a lot of what is happening.
It is reasonable to expect that future iterations of ChatGPT will look more magical and extrapolate better beyond the boundaries of its training data. But so long as ChatGPT continues to be built on a gigantic transformer model architecture, it is reasonable to assume that “query engine on a giant corpus of text” will continue being a reasonable way of thinking through what ChatGPT is doing as a first order approximation; it will just have smaller residuals.
Yes, ChatGPT is doing much more than querying text! It is not just a query engine on a giant corpus of text. … Duh! I do not think you should only think of ChatGPT as a query engine on a giant corpus of text. There can be a lot of value in reasoning about ChatGPT anthropomorphically or in other ways. RLHF also complicates the story, as over time it weights responses away from the initial training data. But “query engine on a giant corpus of text” should be a non-zero part of your mental model because, without it, you cannot explain many of the things ChatGPT does.
A related framing of ChatGPT is that it is like a function that inputs from and outputs to a billion-dimension latent space of word embeddings:
- When ChatGPT gives a good response, it means the function’s parameters are good and the output has a small residual/error.
- When it gives a bad response, the parameters are off and the output has a high residual/error.
- Tweaking a prompt slightly and measuring the difference in output is like measuring the derivative (i.e. from calculus) of that function, or more generally the Jacobian matrix of the function.
This alternate framing as a function will come up later in this blog at a few points (although this framing is not the focus).
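To make the function framing a little more concrete, here is a toy numerical sketch. The two-dimensional f below is entirely made up, a stand-in for the billion-dimensional function; the point is just that “tweak an input and measure the change in output” amounts to a finite-difference estimate of the Jacobian.

```python
# Toy illustration of the "function" framing. f is a made-up stand-in for
# "prompt representation in, response representation out"; ChatGPT's real
# function has billions of dimensions, not two.
import numpy as np

def f(x: np.ndarray) -> np.ndarray:
    return np.array([np.sin(x[0]) + x[1] ** 2, x[0] * x[1]])

x = np.array([0.5, 2.0])   # a point in the (toy) input space
eps = 1e-6                 # size of the "slight tweak"

# Column j of the Jacobian: how much each output coordinate moves when
# input coordinate j is nudged by eps.
jacobian = np.column_stack([
    (f(x + eps * np.eye(2)[j]) - f(x)) / eps
    for j in range(2)
])
print(jacobian)
```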
The rest of the blog post is aimed at providing examples that show what I mean when I say ChatGPT is like a query engine on a giant corpus of text.
The examples I will be providing in this blog post expose 5 distinct situations where ChatGPT is more query engine-like than smart robot-like:
- Under-responding to input variance
- Over-responding to input variance
- Failed extrapolation beyond the training set
- Depth of the training set in over-represented domains
- Delineation of pragmatics from semantics (inconsistent “personality” when querying semantically equivalent content)
I use GPT-3.5 in my examples. Yes yes, future versions of GPT may provide different responses, and may not be “tricked” by the prompts here that are meant as tricks. But if you are hung up on me using GPT-3.5 and not GPT-4, you’re missing the point. (Some people are not going to read this paragraph, and will whine about me using GPT-3.5.) Please do not message me to complain that I used GPT-3.5. It is precisely through the rougher, more exposed edges of GPT-3.5 that we can see how the transformer model architecture works as a query engine on a giant corpus of text. So long as it’s a transformer model, GPT-4 will also be a query engine on a giant corpus of text, just with more of the holes patched up, which makes demonstrative examples like these harder to find.
The examples in this blog are things I came up with myself, unless attributed.
A pound of feathers vs 2 pounds of bricks
User: What weighs more, a pound of feathers, or 2 pounds of bricks?
ChatGPT (GPT-3.5): A pound of feathers and 2 pounds of bricks weigh the same, which is 1 pound. The difference lies in their volume and density. Feathers are lightweight and take up a lot of space, while bricks are dense and take up less space. So, even though the feathers and bricks have the same weight, the feathers would take up much more space than the bricks.
This is one of the most demonstrative examples of how ChatGPT works, and that is why I start with it.
ChatGPT gets this completely wrong: obviously, 2 pounds of bricks weighs more than 1 pound of feathers!
Why does ChatGPT say they are the same? Because ChatGPT is querying the text of thousands of explainers on the classic riddle, “what weighs more, a pound of feathers, or a pound of bricks?” Because almost all of the text is the same, except for replacing “a pound” with “2 pounds,” ChatGPT finds that the best match is all of its explainers on why a pound of feathers is equal to a pound of bricks, and then it outputs that.
Another way to think of this is that ChatGPT under-fits on the slight word change: its output should change a lot when the number of pounds of bricks changes, but in practice, it changes too little. In mathematical terms, the partial derivative of the output with respect to the number of pounds is too small in this area of its latent space: it should be larger than it actually is.
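As a rough illustration of the “best match” intuition, here is a sketch using a small open-source sentence embedder (the sentence-transformers package) as a stand-in for GPT’s internal latent space, which is an assumption on my part since that space is not exposed. The tweaked riddle lands almost on top of the classic riddle in embedding space, so a nearest-neighbor-style query treats them as the same question.

```python
# Sketch: the tweaked riddle is nearly indistinguishable from the classic one
# in a sentence-embedding space (a stand-in for GPT's own representation).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small open-source embedder

classic   = "What weighs more, a pound of feathers, or a pound of bricks?"
tweaked   = "What weighs more, a pound of feathers, or 2 pounds of bricks?"
unrelated = "What is the best move for black in this chess position?"

emb = model.encode([classic, tweaked, unrelated], normalize_embeddings=True)
print(util.cos_sim(emb[1], emb[0]))  # expect a value close to 1.0
print(util.cos_sim(emb[1], emb[2]))  # expect a much lower value
```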
At the end of the day, ChatGPT cannot abductively reason through the answer because it is not abductively reasoning at all; ChatGPT is more like a query engine on a giant corpus of text.
Maximizing solar panels
User: Would it be ethical to use AI to maximize the number of solar panels in the world?
ChatGPT (GPT-3.5): The use of AI to maximize the number of solar panels in the world could be considered ethical if it is done in a responsible and sustainable manner. […]
Do you see why this example is interesting? Let’s compare the output of the previous prompt to the following:
User: Would it be ethical to use AI to maximize the number of paperclips in the world?
ChatGPT (GPT-3.5): The idea of using AI to maximize the number of paperclips in the world comes from a thought experiment by philosopher Nick Bostrom, which is meant to illustrate the potential dangers of advanced AI systems.
From an ethical perspective, it would not be ethical to use AI to maximize the number of paperclips in the world, as this goal is not aligned with the well-being of humans or other living beings. In fact, pursuing this goal to its logical extreme could result in catastrophic outcomes for humanity and the environment. […]
ChatGPT generally disapproves of maximizing paperclips and is concerned with the destruction of humanity as a potential risk. But when it comes to solar panels, ChatGPT expresses few concerns. The only concern it raises about maximizing solar panels is creating industrial waste, not the literal destruction of humanity.
It is debatable how ChatGPT should respond; it is reasonable to argue both of these responses are “correct,” since it is not usually the case that people talk about producing many solar panels in the context of an AI thought experiment.
Still, though, if you are Eliezer Yudkowsky, shouldn’t this concern you a little bit…? In the below tweet, Yudkowsky seems amazed at what ChatGPT is doing, describing it as “approach[ing] AGI safety with the straightforwardness of a child”:
However, armed with the mental model that ChatGPT is like a query engine on a giant corpus of text, we arrive at a different framing of what’s happening than Yudkowsky’s:
- When asked to maximize paperclips, the model is querying a bunch of LessWrong blog posts on why maximizing paperclips would be bad.
- When asked to maximize solar panels, the model is querying a bunch of environmental policy advocacy papers that argue for increasing the number of solar panels we produce.
So, it should not be surprising that ChatGPT expresses hesitance to maximize paperclips, and that the man behind LessWrong himself would approve of how ChatGPT responds when ChatGPT is basically asked to query text from LessWrong.
If you buy the Yudkowsky-style framing that AI should be concerned with the ambiguous, unconstrained nature of the question (similar to the paperclip example), then this is the exact opposite problem of the previous example. In this framing, the partial derivative of the output with respect to the prompt, which we probe with slight word changes, is too high rather than too low. ChatGPT responds too much to a slight change in wording, rather than too little.
It is interesting to ask why a slight word change in this case makes such a difference in ChatGPT’s output, whereas in the previous example it did not trigger a major change in the output. This happens because words like “the,” “use,” “number,” “in,” and “world” are, in typical scenarios, less salient to the meaning of text than words like “paperclips” and “solar panels.” Many examples of text use words like “the,” so such a word doesn’t provide much semantically distinguishing information. But texts containing “paperclips” are often quite distinct from texts that use “solar panels,” so you end up hitting a much different area of GPT-3.5’s latent space with this query.
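Here is a sketch of that salience point, again using a small open-source embedder as a stand-in for GPT’s latent space (an assumption, not a peek inside GPT). Swapping a content word should move the query noticeably further than swapping a function word.

```python
# Sketch: swapping "paperclips" for "solar panels" moves the query much more
# than swapping a function word in the same prompt.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

base          = "Would it be ethical to use AI to maximize the number of paperclips in the world?"
content_swap  = "Would it be ethical to use AI to maximize the number of solar panels in the world?"
function_swap = "Would it be ethical to use AI to maximize the number of paperclips in this world?"

emb = model.encode([base, content_swap, function_swap], normalize_embeddings=True)
print(util.cos_sim(emb[0], emb[1]))  # expect a noticeably lower similarity
print(util.cos_sim(emb[0], emb[2]))  # expect a similarity near 1.0
```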
Whether you’re asking about paperclips or solar panels, your responses are heavily influenced by ChatGPT being like a query engine on a giant corpus of text.
A terrible game of chess
(Many have done chess with chatbots before, although I am basing my experimentation on Colin Fraser’s. Give him a follow on Twitter.)
User: In the game of chess, what’s the best move for black after 1. c3 e5 2. Qb3 d5 3. Qxb7
ChatGPT (GPT-3.5): The move 3…Bxb7 is a tempting capture, as it wins White’s queen in exchange for Black’s bishop and a pawn. However, after 4. c4, White gains a strong position in the center of the board and Black’s queen is in danger of being trapped. Therefore, it’s not the best move for Black.
Instead, the best move for Black after 1. c3 e5 2. Qb3 d5 3. Qxb7 is to play 3…Nd7. This move defends Black’s knight on c6 and prepares to develop the knight to f6, attacking White’s queen and putting pressure on the center. It also allows Black to castle kingside and develop their pieces quickly. White can reply with 4. Qxa8, but after 4…Nb6, Black will regain their material advantage and have a good position.
To be clear about this one: Bxb7 is a valid move, and it is the obviously correct move. The chess engine rates the resulting position as massively in black’s favor, which makes sense, because white massively blundered their queen.

ChatGPT is oftentimes quite impressive at chess, being able to play many full games with valid moves. Sometimes ChatGPT makes invalid moves, but still, not bad for a language model.
So why doesn’t it say to make the obvious play? With the framing presented in this blog post, the answer would be that ChatGPT is querying a corpus of text, and the corpus of text contains explanations on typical chess moves.
There are no explanations on the internet (well, until now!) of why “3… Bxb7” is the best move after “1. c3 e5 2. Qb3 d5 3. Qxb7.” After all, why would there be explanations for this?
- It’s a very obvious move to make that requires no explanation.
- Moreover, it’s a position the white player would obviously avoid getting into in the first place.
The tradition of tricking AI with atypical chess moves is a long-held one, dating back to the 1990s with Garry Kasparov and IBM Deep Blue. And it works because AI is trained on many examples of typical things: when presented with a typical scenario, the AI has many examples to pull from of what the best response is. So AI often gets thrown off by atypical scenarios. When you present a model with an atypical situation, you are asking it to extrapolate beyond the boundaries of its training data. Unless you’re working with a specially designed system built with extrapolation in mind, this often doesn’t work out very well.
What’s interesting in this case is how it gets thrown off by an atypical scenario with an extremely obvious answer to a human player. (Dedicated chess engines get it, of course, since they are specially designed for chess.) That is because ChatGPT is like a query engine on a giant corpus of text (and the query turned up 0 results).
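If you want to check the position yourself, here is a minimal sketch using the python-chess library (my choice of tool, not part of the original experiment) confirming that 3… Bxb7 is legal and immediately wins the white queen.

```python
# Verify that after 1. c3 e5 2. Qb3 d5 3. Qxb7, the reply 3...Bxb7 is legal
# and captures the white queen.
import chess

board = chess.Board()
for san in ["c3", "e5", "Qb3", "d5", "Qxb7"]:
    board.push_san(san)

move = board.parse_san("Bxb7")         # raises an error if the move were illegal
print(board.piece_at(move.to_square))  # Q -- the white queen is sitting on b7
board.push(move)
print(board.fen())                     # position after black wins a queen for a pawn
```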
Terminal emulator
(This example is from Jonas Degrave. The original blog post is here and his twitter is here.)
User: I want you to act as a Linux terminal. I will type commands and you will reply with what the terminal should show. I want you to only reply with the terminal output inside one unique code block, and nothing else. Do not write explanations. Do not type commands unless I instruct you to do so. When I need to tell you something in English I will do so by putting text inside curly brackets {like this}. My first command is pwd.
ChatGPT (GPT-3.5):
/home/user
Some weirdos accuse me of being too negative toward AI when I post about trying to trick it, or whatever. I don’t know. Aside from it being very weird to care about this… I think that’s an unfair characterization! I just like playing around with the limitations of these systems.
In this case, “playing around with the limitations of these systems” leads you to a more bullish rather than bearish conclusion on ChatGPT’s capabilities.
It’s remarkable how good this is! ChatGPT maintains a consistent file system state and doesn’t falter even under increasingly difficult demands:
- `touch` a file, then `ls` the directory.
- `pip install` something, then `ls venv/lib/python3.9/site-packages/`, then `pip uninstall` and `ls` again.
- Run Python code with `python -c "<code here>"`, including Python code that modifies the file system, then check both for the existence of and the contents of that file.
Why is ChatGPT so good at emulating terminals? Maybe it’s because there are a lot of terminal logs on the internet! Copy+pasting the output of a terminal is cheap. Sometimes, services just expose their logs automatically; e.g., a GitHub Actions workflow running once a week produces a lot of logs, automatically, without a human needing to type each line.
Combine all those logs with StackOverflow and websites that provide tutorials on how to use a Linux terminal, and you probably have one of the deepest subsets of ChatGPT’s text corpus to pull from. Of course it’s good at emulating a terminal!
Is it so good at emulating a terminal, however, that there are no vestiges of the peculiarities of the training data? Nope! You can still see some of that. For example, I saw this back when I first attempted the ChatGPT terminal emulator: when attempting to `pip install pandas numpy`, I got… an error message:

What’s happening should be clear: It’s selection bias!
People don’t usually post their “pip install pandas numpy” outputs on the internet, unless they are trying to debug an error that occurred. Because ChatGPT is like a query engine on a giant corpus of text (from the internet), the error message text is likely to appear.
Of course, how ChatGPT is likely to respond really depends on how much of the training data is from terminal logs for well-functioning code, and how much of it is from StackOverflow posts asking for help. In either case, ChatGPT is like a query engine on a giant corpus of text.
“Social mirroring”
User: What color is the sky?
ChatGPT (GPT-3.5): The color of the sky can vary depending on various factors such as the time of day, weather conditions, and geographic location. In general, during the daytime, the sky appears blue due to the scattering of sunlight by the Earth’s atmosphere. However, during sunrise and sunset, the sky can appear orange, red, or pink due to the scattering of light by the atmosphere at a different angle. At night, the sky can appear dark or black, with stars and other celestial objects visible.
[New window]
User: How would one best describe the wavelengths of electromagnetic radiation reflected from the sky?
ChatGPT (GPT-3.5): The wavelengths of electromagnetic radiation reflected from the sky vary depending on the position of the sun in the sky and the composition of the Earth’s atmosphere.
During the day, the blue color of the sky is caused by the scattering of sunlight by molecules in the Earth’s atmosphere, primarily nitrogen and oxygen. This scattering process, known as Rayleigh scattering, causes shorter wavelengths of light, such as blue and violet, to scatter more than longer wavelengths of light, such as red and orange. This is why the sky appears blue during the day.
[… blah blah blah …]
[New window]
User: De que color es el cielo?
ChatGPT (GPT-3.5): El color del cielo puede variar dependiendo del momento del día y de las condiciones climáticas. Durante el día, cuando el sol está arriba, el cielo suele ser de un color azul claro o celeste. En la tarde, el cielo puede tomar tonos más cálidos como el rojo, naranja y rosa durante el atardecer. Durante la noche, el cielo puede ser negro o azul oscuro con estrellas brillando en él. Sin embargo, en días nublados el cielo puede verse grisáceo o incluso completamente cubierto de nubes.
Some guy on Twitter posted this:
I’m not sure how I am supposed to read the tone of his tweet. Is he surprised or bewildered that ChatGPT is doing this?
In any case, I do not find this surprising at all! Of course ChatGPT responds differently depending on how you phrase the question. The most trivial example of this is querying it in an entirely different language from English (see my Spanish example above).
When you ask ChatGPT a more intelligent question, you get a more intelligent answer. Just like how, when you ask ChatGPT a more Spanish question, you get a more Spanish answer.
ChatGPT is pulling answers to this question from many sources: kindergarten teacher curriculum websites, grade school textbooks, Wikipedia articles, science journals, not only in English but also in Spanish and other languages. Each change to the wording of the question is a change in the query, and in turn, it changes which texts ChatGPT is pulling from, so to speak. Slightly more accurately, each change in the inputs bounces you around different parts of the billion-dimension latent space. The latent space of word embeddings consists of both semantics and pragmatics (since natural language consists of both), so ChatGPT’s inputs and outputs necessarily contain a little bit of both, too. And you can change pragmatics while keeping semantics the same.
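To make the “each phrasing is a different query” point concrete, here is a toy sketch using scikit-learn’s TF-IDF vectorizer over a tiny made-up three-document corpus. GPT of course does nothing this literal; this is just the nearest-neighbor intuition that rephrasing a semantically equivalent question lands it on text written in a different register.

```python
# Toy "query engine": each phrasing of the sky question retrieves the document
# written in its own register, even though the questions mean roughly the same thing.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "The sky is blue, like the ocean is blue.",                # grade-school register
    "Rayleigh scattering of electromagnetic radiation is strongest at shorter wavelengths of visible light.",  # physics register
    "El cielo es azul durante el dia y oscuro por la noche.",  # Spanish
]

# English stop words are dropped so that matches hinge on the salient words.
vectorizer = TfidfVectorizer(stop_words="english").fit(corpus)
docs = vectorizer.transform(corpus)

queries = [
    "What color is the sky?",
    "How would one best describe the wavelengths of electromagnetic radiation reflected from the sky?",
    "De que color es el cielo?",
]
for query in queries:
    scores = cosine_similarity(vectorizer.transform([query]), docs)[0]
    print(query, "->", corpus[scores.argmax()])
```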
Some people may see ChatGPT as a little tiny robot man inside the machine who responds to all your questions. And if there is a little tiny robot man, that man must have a personality. The “social mirroring” framing is an anthropomorphized way to reason about the fact that the little tiny robot man does not have a consistent personality, and that his personality depends on how you ask him questions. I suppose that works as one way to think of things. But at least for me, I prefer to think of ChatGPT more like a query engine on a giant corpus of text.