Generative AI Lies

Examples of generative AI making stuff up

Category: OpenAI

  • We don’t like to talk about that

    If your ChatGPT prompt includes certain not-uncommon names of humans, ChatGPT says “I’m unable to produce a response” and ends the session.

    It turns out that those names belong to people who have prominently reported that ChatGPT was making up lies about them.

    So apparently, on learning that ChatGPT is lying about specific people, OpenAI has decided to prevent ChatGPT from responding to any prompt that mentions those people’s names.

    Of course, there’s usually more than one human with any given name, so OpenAI is also preventing ChatGPT from talking about anyone who shares a name with someone ChatGPT has prominently lied about in the past.

    (Original Facebook post.)


  • Racism

    “ChatGPT and Google’s Bard answer medical questions with racist, debunked theories that harm Black patients”

    (Article from October.)

    (Original Facebook post.)


  • Legal filings

    A plaintiff’s lawyer asked ChatGPT for relevant citations. ChatGPT made some up. The lawyer cited them in court filings.

    Defending lawyers expressed puzzlement. The plaintiff’s lawyer asked ChatGPT to provide more info about the cases. ChatGPT obligingly made up the decisions in these nonexistent cases. The plaintiff’s lawyer submitted that output to the court.

    At some point, the plaintiff’s lawyer asked ChatGPT whether the cases were real; ChatGPT said they were, and the lawyer didn’t bother to check beyond that.

    When confronted by the judge about all this made-up material, the plaintiff’s lawyer apologized and said he didn’t know that ChatGPT could make stuff up.

    (Original Facebook post.)


  • Wrong phone prices

    That thing we’ve been talking about lately, where an AI chat system gets incorporated into a search engine and then gives made-up answers to questions?

    Here’s a real example. Microsoft is now including ChatGPT (or some variation on it) as part of Bing, so Twitter user @GaelBreton tried doing some searches with it. They posted a (brief) thread that’s mostly about other aspects of the experience, but the part that interested me most is the final tweet in the thread, which shows a screenshot of Bing/GPT answering a question about phones. And it gives significantly wrong prices or specs for all three of the phones that it mentions.

    So I ask again, as I’m sure I’ll ask many times in the future: what good is a conversational AI interface for search results if it provides false answers?

    (Original Facebook post.)


  • No truth evaluation

    AI text generators like GPT-3 are really impressive. But there’s one fundamental principle that you should keep in mind whenever you’re looking at anything generated by such a system:

    It doesn’t evaluate the truth of what it’s saying.

    Sometimes the generated text says things that are true. Sometimes it doesn’t. The generator doesn’t distinguish between those situations.

    I know that I’ve said variations on that before, but I think it’s a point worth repeating.

    Today’s instance of this statement was inspired by the new ChatGPT chatbot. I just saw a tweet praising ChatGPT’s ability to explain a complicated regular expression; I agree that the explanation provided looks really impressive, but unfortunately, it’s wrong. But lots of people (including the person who posted the transcript of the chat) seemed to think that it was correct.

    The regex in question is really weird—it doesn’t at all do what it appears to have been intended to do. ChatGPT, impressively, gives a good explanation of what the regex was intended to do—but that explanation gets several details outright wrong, including saying that one part is optional when it’s really a different part that’s optional. (A small Python sketch at the end of this entry illustrates that kind of mix-up.)

    Again, there are lots of really impressive things about this answer. But if a human relies on this answer to be factually accurate, they’re going to run into problems.

    Another example: ChatGPT explains the factors of a specified polynomial, but gives the wrong answer.

    One of the replies to the regex tweet said something along the lines of ~“Who cares if it’s wrong? It’s 99% of the way there. A future version will be able to look impressive and give the right answer!”~

    (My tildes there indicate that that’s my paraphrase, not a quote.)

    And it may well be true that a future version will fact-check itself.

    But for now, don’t believe anything that an AI text-generator says, unless it’s been fact-checked by a reliable and knowledgeable human.

    (Original Facebook post.)
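
    To make the regex point above concrete, here’s a small hypothetical sketch in Python. It is not the regex from the tweet (which I’m not reproducing here); it just illustrates how an explanation can sound right while misplacing which part of a pattern is optional, and how a quick mechanical check exposes the error.

        import re

        # Hypothetical example: the area-code group here is optional,
        # and the extension ("x99") is required.
        phone = re.compile(r"(\(\d{3}\) )?\d{3}-\d{4} x\d+")

        # A confident-sounding but wrong explanation might claim the
        # extension is the optional part. Trying two inputs settles it:
        print(bool(phone.fullmatch("555-1234 x99")))    # True: no area code, still matches
        print(bool(phone.fullmatch("(212) 555-1234")))  # False: no extension, doesn't match

        # The same kind of spot-check works for the polynomial example
        # (again, a made-up stand-in, not the one from the transcript):
        # if someone claims x**2 - 5*x + 6 == (x - 1)*(x - 6), plugging
        # in a value shows the claimed factors are wrong.
        x = 3
        print(x**2 - 5*x + 6, (x - 1)*(x - 6))  # prints "0 -6": not equal

    The point isn’t that such checks are hard; it’s that the chatbot’s fluent explanation doesn’t do them for you, so a knowledgeable human still has to.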