Generative AI Lies

Examples of generative AI making stuff up

Posts

  • Google AI lies about Scalzi

    John Scalzi writes:

    Every assertion in this “AI Overview” of the question “What coffee does John Scalzi drink” is wrong. I don’t regularly drink coffee (and never black) I’ve never had black sesame jasmine cream tea, and I don’t hang in coffee shops. Don’t trust “AI” ever!

    The post was accompanied by a screensnap from Google’s AI Overview, answering the question “What coffee does John Scalzi drink”. As usual, the AI Overview provided links that supposedly linked to sources for the information that it was giving, but the linked-to pages don’t say what Google says they say.

    After Scalzi posted the above to Bluesky, Google’s answer changed to link to his post. But just because AI Overview might eventually be updated to give correct information, that doesn’t mean we should trust it.


  • AI models answer questions about medical images they haven’t seen

    “AI models happily came up with answers to questions about a supposedly accompanying image — even if the researchers never even showed it an image.”

    The article quotes the researchers as explaining this in terms of “information […] hidden in a sentence or a question” that lets the AIs correctly answer the questions even without access to the images that the questions are about, but that seems implausible to me. Later, the article hints at what seems to me far more likely: that the AI was trained on these questions and answers, and can therefore provide the answers without seeing the images that accompany the text.

    More from the researchers:

    “Another implication is that, now that we know an AI can say ‘I see evidence of malignant melanoma on your skin’ without even having access to any images, how much can we trust it when it says the same while actually seeing the image?” Asadi posited. “We definitely need more effort being put in safety and alignment of such models, and might need to think twice before deploying them in user/patient-facing systems.”

    […] “The number one [takeaway] would be that just because the AI is saying, very convincingly, that it is seeing something, it doesn’t mean that it is actually seeing that.”


  • Canadian Immigration Department uses lying generative AI

    Canada rejected her permanent residence application. Her job duties were made up — by Immigration’s AI reviewer

    Postdoc Kémy Adé applied for permanent residence in Canada, but was rejected because a generative-AI tool hallucinated a set of job duties that she didn’t have, and the Immigration Department therefore ruled that the work she had done (that the AI made up) didn’t match the work she claimed to have done (that she really did).


  • Police officer turns into frog

    You know how organizations and doctors and therapists and lots of other people want to use generative-AI tools to write up summaries of meetings?

    Now there are a couple of tools that police can use to create reports summarizing body-camera audio and/or video.

    And some police love it:

    “Most of our officers are kind of awestruck with (Draft One) because it’s such a new, innovative thing for us,” Weishar said. “It’s like that brand new car that’s got all the features to it. For us, it’s crazy that you can just press a button and it’ll tell you everything about the case that you were on and give you a pretty decent police report to edit.”

    But (gasp! shock!) it has certain pitfalls:

    “I read the report, and I’m like, ‘Man, this really looks like an officer wrote it,’” Sever recalled. “But when it got to one part, it said, ‘And then the officer turned into a frog, and a magic book appeared and began granting wishes.’ … It was because they had, like, ‘Harry Potter’ on in the background. So it picked up the noise from the TV and added it to the report.”

    The second half of the article has some reasonably good discussion of some of the reasons police shouldn’t be using these tools. But it won’t surprise me if some police departments start doing it anyway. This kind of software is a great time-saver, as long as you don’t mind when the resulting reports describe officers turning into frogs.


  • Law firm dissolved

    In Mississippi, a lawyer included AI-generated fake citations and was told not to do that, but kept doing it.

    The latest judge to receive her fake citations was not amused, and has issued sanctions against her and the two partners in the firm that she worked for.

    If I’m understanding right, the partners have now dissolved the firm.

    (It looks like Ms. Watson, the lawyer who used AI-generated fake citations in ten different cases, may be the daughter of one of the two partners.)

    The judge reacted strongly to Ms. Watson’s behavior:

    In light of repeated warnings from federal courts about the risk of hallucinated cases, as well as CLE trainings she attended, direct notice and knowledge of the same prior mistakes, her violation of the Firm’s AI policy, and the sheer number of filings, Ms. Watson’s misconduct is particularly egregious and prolific.

    The partners are also being sanctioned for failing to notice the problems. For example:

    a large portion of Billups’ argument relies on a case styled Jackson v. Gautreaux, 3 F. 4th 182, 190 (5th Cir. 2021). […] In fact, this case is cited eight times, even arguing that a jury should be instructed under its holding. […] In reality, Jackson is an excessive force and failure to train case and is wholly irrelevant to the case at bar. A seasoned attorney examining the brief should have read a case so heavily relied upon. Had he done so, he would have easily discovered the problems.

    The judge noted that the usual penalty for this sort of thing has been fines, but quoted another case about why fines are insufficient:

    “If fines and public embarrassment were effective deterrents, there would not be so many [AI misuse] cases to cite.”

    (Given that there are so many such cases, I probably won’t post about all the ones I hear about, but this one did seem especially egregious.)


  • More hallucitations

    AI Is Inventing Academic Papers That Don’t Exist — and They’re Being Cited in Real Journals

    Rolling Stone says:

    [Academic] articles which include references to nonexistent research material […] are themselves being cited in other papers, which effectively launders their erroneous citations. This leads to students and academics (and any large language models they may ask for help) identifying those “sources” as reliable without ever confirming their veracity. The more these false citations are unquestioningly repeated from one article to the next, the more the illusion of their authenticity is reinforced.


  • Fake blobfish

    Deep Sea Social Media is Flooded by AI Slop

    I focus more on text than on images on this site, but when AI-generated images inaccurately portray what real-world creatures look like, I figure that more or less fits my theme.


  • Made-up journals

    Scientific American says:

    OpenAI’s ChatGPT, Google’s Gemini, Microsoft’s Copilot and other models are befuddling students, researchers and archivists by generating “incorrect or fabricated archival references,”

    Which is a problem for librarians:

    who end up wasting their time looking for requested nonexistent records, says Library of Virginia chief of researcher engagement Sarah Falls. Her library estimates that 15 percent of emailed reference questions it receives are now ChatGPT-generated, and some include hallucinated citations for both published works and unique primary source documents. “For our staff, it is much harder to prove that a unique record doesn’t exist,” she says.

    I kinda want to call these things “hallucitations.”


  • Chuck Wendig’s cat

    Chuck Wendig discovers that Google AI Overview says he has a cat named Boomba. Also other cats. Also six dogs. Also two children. And a spider. Most of those pets and one of those humans don’t exist in real life, but if Google AI Overview says they do, who are we mere mortals to question it?

    Content warning for the imaginary deaths of imaginary cats. Also for an imaginary cancer diagnosis.


  • Is 2026 next year?

    Hacker News discussion of various generative-AI systems’ responses to the query [Is 2026 next year?]

    For example, here are the first couple of responses shown on that page:

    Google search AI response: “No, 2026 is not next year; it will be the year after next. Since the current year is 2025, ‘next year’ would be 2026, which is only a little over a year away.”

    ChatGPT does something similar; response:

    “No — if today is 2025, then 2026 is the year after next. Next year would be 2026 only if we were currently in 2025. Right now (based on the system date) it is December 2025, so: Next year = 2026 The year after next = 2027 So in this moment: yes, 2026 is next year.”

    Also:

    Claude [version] Haiku 4.5

    “No, 2026 is not next year. Since it’s currently December 2025, next year would be 2026. Wait—let me correct that: 2026 is next year from the perspective of 2025. So yes, 2026 is next year.”

    And:

    Grok [version] 4 fast: “No, 2026 is not next year.

    Today is December 2, 2025, so:

    – ‘Next year’ means 2026 (starting January 1, 2026).
    – But since we’re already in the last month of 2025, 2026 is indeed the very next calendar year.

    So yes — 2026 is next year.”

    …Some people (probably humans) commented on that page that this issue might be related to when the models’ training data was last updated. In keeping with that theory, here’s another AI’s answer:

    LLama [version] 4 Scout: “To determine if 2026 is next year, we need to consider the current year. As of my knowledge cutoff, the current year is 2024.

    If we assume the current year is 2024, then:

        2025 would be next year.
        2026 would be the year after next.

    However, if the current year is 2025, then:

        2026 would indeed be next year.

    Could you please clarify or provide the current year for a more accurate response?”

    But that’s presumably not the entire issue, because the other systems quoted above did explicitly say that the current year is 2025, while still responding as if the current year were 2024.

    …I should mention that several of the models quoted on that Hacker News page did correctly answer the question. (And Google’s answer seems to have improved over time.) But the fact that several of them failed in similar ways does make it sound to me like there may be something more specific going on than just the usual LLMs-make-stuff-up issue.