Mike Pope on the Gell-Mann Amnesia Effect/Knoll’s Law (“everything you read in the newspapers is absolutely true, except for the rare story of which you happen to have firsthand knowledge”) and ChatGPT.
Category: Explanations
-
Retrieval Augmented Generation
Article (from May 4) about feeding relevant information to a generative AI to reduce the likelihood of it making stuff up, an approach known as “RAG,” or “Retrieval Augmented Generation.”
“RAG can help reduce a model’s hallucinations — but it’s not the answer to all of AI’s hallucinatory problems. Beware of any vendor that tries to claim otherwise.”
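For concreteness, here is a minimal sketch of the RAG idea: retrieve text relevant to the question and prepend it to the prompt, so the model answers from supplied context rather than guessing. The keyword-overlap retriever and the generate() call below are illustrative placeholders, not any particular vendor's API.

```python
# Toy RAG sketch: retrieve relevant documents, then build a grounded prompt.
# score()/retrieve() are a deliberately simple keyword-overlap retriever;
# real systems typically use embedding similarity instead.

def score(query: str, doc: str) -> int:
    """Count how many query words appear in the document (toy relevance score)."""
    query_words = set(query.lower().split())
    return sum(1 for word in doc.lower().split() if word in query_words)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents that best match the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble a prompt that tells the model to answer only from the retrieved context."""
    context = "\n\n".join(retrieve(query, docs))
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

# prompt = build_prompt("When was the library founded?", my_documents)
# answer = generate(prompt)  # generate() stands in for whatever LLM call you use
```

Even in this toy form, the point of the quote above is visible: the model can still confabulate if the retrieved context is wrong or missing, so RAG reduces the problem rather than eliminating it.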
-
Who are we talking to?
Long but excellent article about generative AI (LLM) chatbots: “Who are we talking to when we talk to these bots?”, by Colin Fraser.
(Article from early 2023; I’m sure some details of LLM behavior have changed since then, but I think the core ideas here are still valid and relevant.)
Fraser’s central points are that LLMs are designed to create sequences of words, and that a chatbot persona is a fictional character overlaid (by means of training) on that word-sequence-creation software. And that our experience of having a conversation with an LLM chatbot is partly shaped by the user interface and by our expectations about how conversations go.
A few quotes:
“At least half of the reason that interacting with the bot feels like a conversation to the user is that the user actively participates as though it is one.”
“the first-person text generated by the computer is complete fiction. It’s the voice of a fictional ChatGPT character, one of the two protagonists in the fictional conversation that you are co-authoring with the LLM.”
“A strong demonstration that the LLM and the ChatGPT character are distinct is that it’s quite easy to switch roles with the LLM. All the LLM wants to do is produce a conversation transcript between a User character and the ChatGPT character, it will produce whatever text it needs to to get there.”
“my interlocutor is not really the ChatGPT character, but rather, an unfeeling robot hell-bent on generating dialogues.”
“The real story isn’t that the chat bot is bad at summarizing blog posts—it’s that the chat bot is completely lying to you”
“[AI companies] actually have very little control what text the LLM can and can’t generate—including text that describes its own capabilities or limitations.”
“The LLM [is never truthfully refusing to answer a question]. It’s always just trying to generate text similar to the text in its training data”
-
Reverse centaur
Cory Doctorow on how several recent AI failures illuminate the idea of a “reverse centaur”:
“This turns AI-‘assisted’ coders into reverse centaurs. The AI can churn out code at superhuman speed, and you, the human in the loop, must maintain perfect vigilance and attention as you review that code, spotting the cleverly disguised hooks for malicious code that the AI can’t be prevented from inserting into its code.”
-
Why LLMs make stuff up
Some interesting stuff about why Large Language Model AI systems make stuff up. The article also suggests using the word “confabulation” rather than “hallucination” for this behavior.
Some quotes from the article:
“In the case of ChatGPT, the input prompt is the entire conversation you’ve been having with ChatGPT[…]. Along the way, ChatGPT keeps a running short-term memory (called the “context window”) of everything it and you have written, and when it ‘talks’ to you, it is attempting to complete the transcript of a conversation as a text-completion task.”
“ChatGPT […] has also been trained on transcripts of conversations written by humans.”
“When ChatGPT confabulates, it is reaching for information or analysis that is not present in its data set and filling in the blanks with plausible-sounding words.”
“In some ways, ChatGPT is a mirror: It gives you back what you feed it. If you feed it falsehoods, it will tend to agree with you and ‘think’ along those lines. That’s why it’s important to start fresh with a new prompt when changing subjects or experiencing unwanted responses.”
One possible way to improve factuality “is retrieval augmentation—providing external documents to the model to use as sources and supporting context”
Other possible approaches include “more sophisticated data curation and the linking of the training data with ‘trust’ scores”
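The “text-completion task” framing in the first quote above is easy to sketch: the chat interface keeps a running transcript (the context window), flattens it into one long string, and asks the model to continue it. The role labels and the complete() function below are illustrative assumptions, not any specific product's implementation.

```python
# Rough sketch of chat-as-text-completion: the "conversation" is just a
# transcript string that the model is asked to continue.

MAX_CHARS = 8000  # stand-in for the model's context-window limit

def format_transcript(turns: list[tuple[str, str]]) -> str:
    """Flatten (speaker, text) turns into one prompt ending with the assistant's cue."""
    lines = [f"{speaker}: {text}" for speaker, text in turns]
    lines.append("Assistant:")
    prompt = "\n".join(lines)
    return prompt[-MAX_CHARS:]  # older turns fall out of the window first

def chat(turns: list[tuple[str, str]], user_message: str) -> list[tuple[str, str]]:
    """Append the user's message, have the model complete the transcript, store the reply."""
    turns.append(("User", user_message))
    reply = complete(format_transcript(turns))  # complete() = whatever completion API you use
    turns.append(("Assistant", reply))
    return turns
```

This also illustrates the “mirror” quote above: whatever is already in the transcript, true or false, becomes part of the text the model tries to plausibly continue.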
-
A blurry JPEG of the web
Ted Chiang suggests a metaphor for ChatGPT and other Large Language Models: you can think of them as a blurry JPEG of the web. (Which is to say, a form of lossy compression.)
A useful metaphor, and a good article.