4-6-2024 (MOUNTAIN VIEW) Tech giant Google says it has implemented “more than a dozen technical improvements” to its artificial intelligence (AI) systems. The move comes after the company’s retooled search engine, which prominently features AI-generated summaries atop search results, was found to be serving up erroneous and potentially harmful information.
The company rolled out its search engine makeover in mid-May, aiming to use AI to give users concise, relevant summaries alongside traditional search results. Social media, however, was soon inundated with screenshots of the system’s most outlandish and inaccurate responses, prompting widespread concern and criticism.
While Google has largely defended its AI overviews feature, asserting that it is typically accurate and underwent extensive testing prior to its rollout, Liz Reid, the head of Google’s search business, acknowledged in a blog post on Friday that “some odd, inaccurate or unhelpful AI Overviews certainly did show up.”
As the furor escalated, it became evident that some of the most egregious examples circulating online were, in fact, fabricated – faked screenshots purporting to showcase even more preposterous answers that Google had never generated. Nonetheless, these fictitious examples were widely shared on social media, further compounding the company’s predicament.
In one notable instance, The Associated Press (AP) asked Google which wild mushrooms are edible. The response was largely technically correct, but it omitted crucial safety information, an oversight that could lead to severe illness or even death, according to Mary Catherine Aime, a professor of mycology and botany at Purdue University, who reviewed Google’s response to the AP’s query.
“A lot of information is missing that could have the potential to be sickening or even fatal,” Aime cautioned, noting that Google’s response emphasized the solid white flesh of edible puffball mushrooms without mentioning that deadly look-alikes share the same trait.
In another widely circulated example, an AI researcher inquired about the number of Muslim presidents the United States has had, only to be met with a long-debunked conspiracy theory falsely claiming that “Barack Hussein Obama” was the nation’s sole Muslim president.
Google addressed the Obama error last week with an immediate fix to prevent a recurrence, saying the response violated the company’s content policies.
However, Reid’s blog post outlined broader improvements aimed at bolstering the AI systems’ accuracy and reliability. These measures include enhanced detection of “nonsensical queries” that should not be answered with an AI summary, as well as limiting the use of user-generated content from sources like Reddit, which could potentially offer misleading advice.
In one widely shared instance, Google’s AI overview drew on a satirical Reddit comment to suggest using glue to make cheese stick to pizza, a recommendation that, whatever its comedic origins, is neither safe nor practical.
Reid further stated that the company has added more “triggering restrictions” to improve the quality of answers to certain sensitive queries, such as those related to health. However, the specifics of how these restrictions function and in which circumstances they are applied remain unclear.
To illustrate this point, the AP posed the same query about edible wild mushrooms to Google again on Friday. The response, while different from the earlier one, was still “problematic,” Aime said, pointing to inaccuracies such as the claim that “Chanterelles look like seashells or flowers,” which she said is untrue.
Google’s summaries are designed to give users authoritative answers as quickly as possible, without requiring them to click through a ranked list of website links. But some AI experts have long warned against ceding search results to AI-generated answers, citing the risk of perpetuating bias and misinformation and of harming users who are looking for critical information, such as in an emergency.
AI systems known as large language models work by predicting the most appropriate words to answer a given query based on the data they have been trained on. However, these systems are prone to a widely studied problem known as “hallucination,” in which they essentially make up information, rather than retrieving accurate data.
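As a rough, hypothetical illustration of what “predicting the most appropriate words” means (a toy sketch for readers, not Google’s system or any production model), consider a miniature word predictor that simply samples each next word from probabilities learned from training text:

```python
import random

# Toy next-word predictor: each word is sampled from probabilities
# learned from training data. Hypothetical illustration only; real large
# language models are vastly larger, but the same principle applies:
# output is chosen for statistical plausibility, not checked for truth.
bigram_probs = {
    "the":  {"moon": 0.5, "sun": 0.5},
    "moon": {"is": 1.0},
    "sun":  {"is": 1.0},
    "is":   {"made": 0.6, "bright": 0.4},
    "made": {"of": 1.0},
    "of":   {"cheese": 0.7, "rock": 0.3},
}

def generate(start: str, steps: int = 5) -> str:
    words = [start]
    for _ in range(steps):
        options = bigram_probs.get(words[-1])
        if not options:
            break
        # Pick the next word in proportion to its learned probability.
        next_word = random.choices(list(options), weights=list(options.values()))[0]
        words.append(next_word)
    return " ".join(words)

print(generate("the"))  # may print the fluent but false "the moon is made of cheese"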
In her blog post, Reid argued that Google’s AI overviews “generally don’t ‘hallucinate’ or make things up in the ways that other” large language model-based products might, as they are more closely integrated with Google’s traditional search engine and only display information backed up by top web results.
“When AI Overviews get it wrong, it’s usually for other reasons: misinterpreting queries, misinterpreting a nuance of language on the web, or not having a lot of great information available,” she wrote.
However, computer scientist Chirag Shah, a professor at the University of Washington who has cautioned against the push toward turning search over to AI language models, contends that even if Google’s AI feature is “technically not making stuff up that doesn’t exist,” it is still bringing back false information – be it AI-generated or human-made – and incorporating it into its summaries.
“If anything, this is worse because for decades people have trusted at least one thing from Google – their search,” Shah said, highlighting the potential erosion of trust in a service that has long been regarded as a reliable source of information.