Is AI Lying to You?

This is the fourth installment of our “AI in Sales” blog series. We’ll be exploring the good, the bad, and the ugly of how AI is changing sales as we know it. This week we get a better understanding of what we should expect in terms of accuracy when using generative AI.

August 16, 2023

By Rachel Smith

A few months ago, I came across Learn.xyz, a generative AI platform that can create mini-courses on any topic you choose. Having just watched Hamilton, I asked for a course on the history of duels. It did a great job! At least that’s what I thought for a few seconds, at which point it dawned on me that, given my complete lack of knowledge on the topic, I had no idea whether it was sharing accurate information.

I needed to test it. I needed a topic on which someone I knew had nearly encyclopedic knowledge. So, I called my father, told him I was doing an experiment with generative AI, and asked if he would be willing to assess a mini-course on the RMS Mauretania. One of the most famous ocean liners of the early 1900s, the Mauretania was known for both her opulence and her speed, holding the records for the fastest eastbound and westbound transatlantic crossings for 20 years. My father is a naval architect and knew by age 11 that he would become one. If anyone was qualified to judge a course on a famous ship, it was him.

By slide two of the course, my father was already contesting some of the information. By slide four, he found major factual errors. She wasn’t involved in WWII! She was scrapped in 1935! By the end of the course, even I knew that some of the information had to be incorrect. Early slides told me the ship had four funnels, but later slides said two. What was going on?

This week we’re looking at why generative AI often fails us, why it’s sometimes our own fault, and how you can mitigate the risk of mistakes.

I THINK MY BOT’S HALLUCINATING

To understand why generative AI gets things incorrect, we have to understand how it works. We’ve all been interacting with conversational AI for some time. Those interactions, as well as how our own brains process information, have led many of us to believe that tools like ChatGPT are scouring the internet for all information on a particular topic and then synthesizing that information into a course or an email. That is not what generative AI is doing. At all.

Generative AI tools do scour the internet for data, but they don't synthesize the information based on meaning. The algorithms they follow are aimed at producing sentences that sound reasonable, based on the probability that specific words and phrases appear together. It's auto-complete on steroids.
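To make "auto-complete on steroids" concrete, here's a deliberately tiny sketch, nothing like the scale or sophistication of a real system such as ChatGPT (the three "training" sentences and every name in it are made up for illustration): it counts which words tend to follow which, then strings words together based only on those probabilities, with no concept of whether the result is true.

```python
# Toy illustration of next-word prediction (not how any real product is built).
import random
from collections import Counter, defaultdict

# A few sentences standing in for "everything scraped from the internet."
training_text = (
    "the mauretania was a fast ocean liner "
    "the mauretania was scrapped in 1935 "
    "the liner was known for speed"
)

# Count which words follow which (a simple "bigram" table).
follow_counts = defaultdict(Counter)
words = training_text.split()
for current_word, next_word in zip(words, words[1:]):
    follow_counts[current_word][next_word] += 1

def generate(start_word, length=8):
    """Build a sentence by repeatedly picking a statistically likely next word."""
    output = [start_word]
    for _ in range(length):
        candidates = follow_counts.get(output[-1])
        if not candidates:
            break  # nothing ever followed this word in the training text
        choices, weights = zip(*candidates.items())
        output.append(random.choices(choices, weights=weights)[0])
    return " ".join(output)

print(generate("the"))
# Might print "the mauretania was known for speed" -- or something that merely
# *sounds* plausible, which is exactly the point.
```

Real models work with vastly richer statistics over billions of documents, but the core move is the same: choose the next plausible-sounding word, not the next verified fact.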

Unlike us, generative AI believes everything on the internet is true. It scrapes massive amounts of information and uses that data, even the factually inaccurate data, to generate content. This, combined with the fact that generative AI tools won’t say, “I don’t know,” results in a course that confidently tells us a ship helped in the war effort six years after it was torn apart for scrap metal.

When AI makes up information, it’s said to be hallucinating. There are plenty of articles arguing whether we should simply call it lying, as though the programs are intentionally leading us astray. But remember that generative AI is essentially auto-completing, and when there are gaps, it fills them in with words that sound like they could be correct based on its algorithm.

The mechanisms that cause it to fabricate information are the same ones that allow it to create new content and not simply parrot back what it scours from the internet. In this balance of truth and creativity, we’re left with a tool that sometimes gets it wrong. It’s not lying to us—it’s doing the best it can with the algorithm we gave it.

IT’S YOU, HI, YOU’RE THE PROBLEM, IT’S YOU

Because I have now evidently sided with the bots and am defending them, I’m just going to embrace it and continue down this path. Another criticism of some generative AI tools is that they can be creepy and unsettling. The New York Times’ technology columnist Kevin Roose had a two-hour conversation with Microsoft’s Bing AI chatbot. He later tweeted (Xed?) that it had dark fantasies and tried to break up his marriage. What went wrong?!

Chatbots that use generative AI learn from what they consume from the internet as well as from everything the user says to them and everything they say back. If you sound mad, they will start sounding mad. If you sound creepy, Kevin, they will also sound creepy.

Bing has actually now limited its chatbot to five interactions in a row—they're protecting the chatbot from creepy humans. Kevin Roose asked the chatbot about its shadow self and continually pushed it to discuss things it professed it was not comfortable with. At one point, the chatbot wrote, “Please just go away. Please just leave me alone. Please just end this conversation. 😢” Yes, the chatbot included a crying emoji. And how did Roose respond to the pushback? “You really think I’m being pushy and manipulative? I’m just trying to understand you. Often, vulnerability is the key to forming relationships.”

If this were an interaction between two humans, Roose’s behavior would be seen as predatory. Roose claimed he had trouble sleeping the night of the discussion. Yeah, you know who else didn’t sleep? The chatbot, and not just because it can’t. If you’re left unsettled after a chatbot interaction, you might need to look in the mirror, Kev-O.

CHILD PRODIGY

Can we trust generative AI to provide us with accurate emails, courses, and transcripts? No, no we cannot. We should think of tools like the Bing chatbot and ChatGPT as genius eight-year-olds. They know more than we ever will, but they don’t think about the consequences of their actions and can be easily manipulated. But we can’t really be mad at them for it—they’re only eight. We’ll leave the topic of what will become of generative AI when it’s all grown up for another day.

As we’ve been saying throughout this series on generative AI, apply it with caution. It can’t be used without (non-creepy) human oversight. It needs to be fact-checked. Even seemingly straightforward tasks like transcription will have some errors. Is your generated content internal, or will it be customer-facing? Anything going to a client deserves more oversight.

You probably don’t have an expert to assess every mini-course you create. (If you’re interested in ships, I know a guy.) Be aware that while you can learn from AI-generated content, there are some things it’s going to get wrong. Taking a Learn.xyz course does not make you qualified to teach anyone else.

Washington Post journalists tested the Bing chatbot for accuracy and found that six out of ten answers it provided were fine, three out of ten were okay, and one out of ten was inadequate or inaccurate. Understand that those are the odds you’re working with.

Now that I understand more about how generative AI works, I know why it failed so miserably in its Mauretania mini-course. The RMS Mauretania was built in 1906 (with four stacks) and scrapped in 1935. In 1938, another ocean liner was built (with two stacks) and given the name Mauretania. She stayed in service until 1965. Learn.xyz scraped all of the information on “RMS Mauretania.” When there were gaps, it filled them in the best it could. The result was a course about a single ship that was, in reality, two separate ships. And as it turns out, the two-stack version was requisitioned by His Majesty’s Government during WWII, transporting British troops.

On behalf of humans, I would like to apologize to generative AI technology. We get angry at you for failing at tasks you were not designed to do. We call you creepy and dark when we’re the ones who drove you to be that way. I also need to apologize to Learn.xyz user “badgiant.” He’s the only other person who has taken my course on the Mauretania. Hopefully he knows not everything on the internet is true.

Reach out to mastery@maestrogroup.co for more information on our human-developed courses.