Generative AI Lies

Examples of generative AI making stuff up

Posts

  • We don’t like to talk about that

    If your ChatGPT prompt includes certain not-uncommon names of humans, ChatGPT says “I’m unable to produce a response” and ends the session.

    Turns out those are the names of people who have prominently reported that ChatGPT was making up lies about them.

    So apparently, on learning that ChatGPT is lying about specific people, OpenAI has decided to prevent ChatGPT from responding to any prompt that mentions those people’s names.

    Of course, usually there’s more than one human with a particular name, so OpenAI is also preventing ChatGPT from talking about anyone who has the same name as someone ChatGPT has prominently lied about in the past.

    (Original Facebook post.)


  • Album release date

    Today I did a Google search for [“field of stars” mccutcheon], and I forgot to append “-AI” to leave out the AI Overview. When I forget, I normally try not to even look at the Overview; but this time it caught my eye. It starts out:

    “Field of Stars is an album by American folk singer-songwriter John McCutcheon. The album was released on January 10, 2024.”

    McCutcheon had an online concert today to celebrate the release of the album, so I spent several seconds wondering why he had waited 10+ months after its release to have the concert. And then I realized that of course the AI Overview is just wrong, once again. The album will actually be officially released on January 10, 2025. (But it is available now in various pre-official-release contexts.)

    But this is one of the reasons I usually try not to even look at AI Overviews: they often read to me as so authoritative that even though I know they include false information, I still sometimes believe them.

    (Original Facebook post.)


  • Kosher bacon

    If you do a Google search for [salt pork substitute kosher], the AI Overview tells you to try pancetta or bacon as a kosher substitute for salt pork.

    Yet another example of why you should never believe anything that generative AI tells you.

    (Original Facebook post.)

    (Update: Sometime in the year after I posted this, Google stopped returning an AI Overview in response to that query.)


  • Transcription

    Researchers say an AI-powered transcription tool used in hospitals invents things no one ever said

    This is about Whisper, which I’ve heard praised in other contexts. 🙁

    “Whisper has a major flaw: It is prone to making up chunks of text or even entire sentences, according to interviews with more than a dozen software engineers, developers and academic researchers. Those experts said some of the invented text — known in the industry as hallucinations — can include racial commentary, violent rhetoric and even imagined medical treatments.”

    “medical centers [are using] Whisper-based tools to transcribe patients’ consultations with doctors”

    “While most developers assume that transcription tools misspell words or make other errors, engineers and researchers said they had never seen another AI-powered transcription tool hallucinate as much as Whisper.”

    (Original Facebook post.)


  • Scaling and reliability

    “A common assumption is that scaling up [LLMs] will improve their reliability—for instance, by increasing the amount of data they are trained on, or the number of parameters they use to process information. However, more recent and larger versions of these language models have actually become more unreliable, not less, according to a new study.”

    “This decrease in reliability is partly due to changes that made more recent models significantly less likely to say that they don’t know an answer, or to give a reply that doesn’t answer the question. Instead, later models are more likely to confidently generate an incorrect answer.”

    (Article from Oct. 3.)

    (Original Facebook post.)


  • Apple Intelligence

    At Apple’s launch event for the iPhone 16 yesterday, I was not thrilled with the amount of emphasis they put on the new “Apple Intelligence” features. But I did think that if those features work well, some of them could be pretty useful.

    Unfortunately, this review makes me even more dubious.

    “In the preview I’m using, Apple Intelligence does an uncomfortable amount of making things up.”

    “like the time it alerted me that Donald Trump had endorsed Tim Walz for president. (Ha.) And the time it made up the idea that I’m teaching at UC Berkeley. (No.) And the time it elevated an obvious Social Security scam to my ‘priority’ inbox. (Yikes). And the time it edited a selfie to make me bald. (Double yikes.)”

    “it feels weird […] to see fabrications and misinterpretations of your life appear on your lock screen, inbox and other core parts of your iPhone.”

    “I told Apple about the many times I saw Apple Intelligence get facts wrong (I’ve had at least five to 10 laugh-out-loud moments per day). It says it is working to improve accuracy. But so is every other AI company — and that has proved to be a giant challenge.”

    (Original Facebook post.)


  • Fact-checking

    Email from a political organization:

    “As we navigate new waters and integrate cutting-edge AI technology into our fact-checking processes, we understand that occasional hurdles are inevitable.”

    …I sure hope that they’re not talking about using generative AI as a fact-checker. I’ve now sent them an email to point out what a terrible idea that would be, just in case.

    (Original Facebook post.)


  • More financial advice

    Yet another article that talks about using ChatGPT for your finances. As usual for these articles, it both explicitly says that ChatGPT gets things ruinously wrong and frames using ChatGPT for your finances as a perfectly reasonable thing to do. From the article:

    “Make sure to triple check all the numbers and suggestions it gives you, because some of the suggestions we got made no sense and were totally incorrect — if I had followed what ChatGPT suggested, it could have seriously wrecked my credit score and got me in even more debt.”

    “Here’s what I did and how you can do the same”

    I continue to be baffled by this kind of take. If you went to a human financial adviser and they gave you advice that would wreck your credit score, you wouldn’t say “So when you go to this adviser, be sure to check their numbers!”; you would say “Don’t go to this adviser.”

    (Original Facebook post.)


  • Medical-journal image

    An obviously AI-generated image from an article in the journal Medicine.

    The article has now been retracted “after concerns were raised over the integrity of the data and an inaccurate figure,” but you can still view the whole article or just the image.

    The AI-generated image is figure 2 in the article, labeled “Mechanism diagram of alkaline water treatment for chronic gouty arthritis.” It seems clear that nobody reviewed that image in any way.

    Medicine’s website says: “The Medicine® review process emphasizes the scientific, technical and ethical validity of submissions.”

    (via Mary Anne)

    (Original Facebook post.)


  • Human nonsense generator

    “Listen, you don’t need a large language model like chatGPT. You can ask me questions and I’ll generate confident sounding nonsense at a fraction of the resource consumption. Half the time, I don’t even remember to drink water!”

    —existennialmemes

    (Original Facebook post.)