Mystery AI Hype Theater 3000
Episode 35: AI Overviews and Google's AdTech Empire (feat. Safiya Noble), June 10, 2024
You've already heard about the rock-prescribing, glue pizza-suggesting hazards of Google's AI overviews. But the problems with the internet's most-used search engine go way back. UCLA scholar and "Algorithms of Oppression" author Safiya Noble joins Alex and Emily in a conversation about how Google has long been breaking our information ecosystem in the name of shareholders and ad sales.
References:
Blog post, May 14: Generative AI in Search: Let Google do the searching for you
Blog post, May 30: AI Overviews: About last week
Algorithms of Oppression: How Search Engines Reinforce Racism, by Safiya Noble
Fresh AI Hell:
AI Catholic priest demoted after saying it's OK to baptize babies with Gatorade
National Archives bans use of ChatGPT
ChatGPT better than humans at "Moral Turing Test"
Taco Bell as an "AI first" company
AGI by 2027, in one hilarious graph
You can check out future livestreams at https://twitch.tv/DAIR_Institute.
Subscribe to our newsletter via Buttondown.
Follow us!
Emily
- Twitter: https://twitter.com/EmilyMBender
- Mastodon: https://dair-community.social/@EmilyMBender
- Bluesky: https://bsky.app/profile/emilymbender.bsky.social
Alex
- Twitter: https://twitter.com/@alexhanna
- Mastodon: https://dair-community.social/@alex
- Bluesky: https://bsky.app/profile/alexhanna.bsky.social
Music by Toby Menon.
Artwork by Naomi Pleasure-Park.
Production by Christie Taylor.
Alex Hanna: Welcome, everyone, to Mystery AI Hype Theater 3000, where we seek catharsis in this age of AI hype. We find the worst of it, and pop it with the sharpest needles we can find.
Emily M. Bender: Along the way, we learn to always read the footnotes, and each time we think we've reached peak AI hype, the summit of Bullshit Mountain, we discover there's worse to come.
I'm Emily M. Bender, Professor of Linguistics at the University of Washington.
Alex Hanna: And I'm Alex Hanna, Director of Research for the Distributed AI Research Institute. This is episode 35, which we're recording on June 10th of 2024. And hey, have you put glue in your pizza sauce lately? Or maybe gasoline in your spaghetti?
Maybe you've been meticulously eating at least one small rock per day, per the advice of Berkeley geologists.
Emily M. Bender: Hopefully not, but these are all now famous examples of Google's new AI summaries taking text on the internet out of context and positioning that text as factually grounded advice. That last one came from an Onion article, for example, which says a lot about what happens when you indiscriminately hoover up any text you can find in the name of training data.
Alex Hanna: After a bumpy rollout, Google made some promises to tweak their AI generated summaries so these blatant instances of ridiculousness don't repeat. But as you might expect, we have thoughts about this entire project and the way Google writ large is degrading our access to information while claiming to improve it.
Emily M. Bender: And we are fortunate to have the ideal guest here to help us put this all in perspective. Dr. Safiya Noble is a professor of gender studies, African American studies, and information studies at UCLA. She's also a MacArthur Foundation fellow. And of course, if you haven't read it yet, we highly recommend her bestselling book, "Algorithms of Oppression," about the racist and sexist harms that arise when information access is mediated by advertising companies like Google.
Welcome Safiya.
Safiya Noble: Thank you. It's such an honor to be here. So thrilled.
Alex Hanna: So excited to have you.
Emily M. Bender: This is amazing.
Alex Hanna: And just one more thing--
Safiya Noble: I'm obsessed with you two. I just wanted to say that, full disclosure. I'm sorry. I'm I am. So--
Alex Hanna: We're obsessed with you.
Emily M. Bender: This is a mutual admiration society for sure.
Alex Hanna: Yeah, 100%. One more thing before we get rolling, you might not know, but we have a newsletter now.
So if you don't want to wait until the next episode to hear our thoughts, go find that link in the show notes or go to DAIR Institute dot org slash m a i h t 3 k to find it at the very top of the page. D A I R hyphen institute dot org slash M as in mystery, A as in artificial, I as in intelligence, H as in hype, T as in theater, 3 as in 3, and K as in K.
Emily M. Bender: K as in thousand.
Alex Hanna: I know, I know, I know, but it's, it's, it's sillier to say K as in K. Okay, let's go.
Emily M. Bender: All right. Let's, yeah. The newsletter is fun and we hope you subscribe, but we are here with Dr. Safiya Noble and we should definitely not delay talking about our artifacts.
Um, so we have two blog posts from Google, one from May 14th, um, talking about their new AI overviews feature, and then a follow up from a little bit later in May.
So the first one, um, by Liz Reid. Reid? Do we know how to say her name?
Alex Hanna: I'm assuming Reid.
Safiya Noble: I'm going with Reid.
Emily M. Bender: Yeah. Okay. Right. So Liz Reid, who's the VP, head of Google Search. Um, title is, "Generative AI in Search: Let Google do the searching for you."
And then there's a subhead here, "With expanded AI overviews, more planning and research capabilities and AI organized search results, our custom Gemini model can take the legwork out of searching."
Already a rich text, I would say.
Safiya Noble: A lot of bold claims in there.
Emily M. Bender: Yeah. Bold claims and weird goals. Like take the legwork out of searching. We're going to see this come up again and again, but basically they're like, yeah, you don't have to think anymore.
We're just gonna, we're just gonna give you the information.
Alex Hanna: It's really funny, just the frame that searching was that much of an onerous thing to begin with. As if, you know, so much of the work that goes into searching really now is getting past all the sponsored results and then ignoring all the ads and then trying to find all the non-SEO, um, optimized content to then find something that you're actually looking for.
But sure, AI overview really, you know, optimizes for all of it.
Safiya Noble: Yeah, I will tell you that this one was interesting because I felt like, well, first of all, people who are experiencing this already are, you know, describing this as having to go find the links at the center of the earth, like the links to websites are pushed so far down, especially if you're on a mobile device and you're reading these long summaries.
So we're kind of already out of the gate, um, getting worse as far as I'm concerned, in terms of helping people understand the depths of possibilities that might be surfaced in search in the first place, right? I mean, my biggest criticism, well, one of them, I mean, I guess I don't have just one, but many, uh, of the criticisms include things like PageRank in the first place. That, you know, as if you could order from one to 2 million plus or X million plus search results in some type of order that really makes sense, um, along what values.
But now we're taking the websites even out of visibility for the most part. And we know, for example, looking at user behavior with search engines, most people don't scroll to the second page. What shows up on that first page really, really matters because they're not going to go much beyond that.
So I think about this as just a further obfuscation of the possibilities of what we might find online.
Emily M. Bender: Yeah. So speaking of scrolling to the second page, um, let's see what's a little bit further down here. Um, so we have an image of, um, a Google search, maybe on a phone. "What are good options for a dot, dot, dot."
I'm going to take us down though to the text. "Over the past 25 years, across many technological shifts, we've continued to reimagine and expand what Google Search can do. We've meticulously honed our core information quality systems to help you find the best of what's on the web. And we've built a knowledge base of billions of facts about people, places, and things, all so you can get information you can trust in the blink of an eye."
Um, and you know, I was preparing for this. I was thinking about what it was like when Google first hit the scene and what it was that I thought I was doing with the Google search engine. And my conception back then was one of the kinds of information out in the world is information on webpages.
You know, typically something like, um, a researcher has created their own web page, or some museum somewhere has put together some information about one of their exhibits, or, um, the restaurant I'm interested in has their own page with their menu and hours, and the point of the Google search engine was to help me find those web pages.
And it wasn't about answering all questions in the world, it was just a way of accessing things that people have put on the web. And so what I agree with in this paragraph, like what I think is accurate, is they say they've continued to reimagine and expand what Google Search can do. Um, yeah, they've certainly done some reimagining.
I'm not sure I liked the way that it went, but the, the idea of what the product is really feels like it's changed in those 25 years.
Safiya Noble: Yeah, I totally agree. And I think part of it is, you know, I saw some researchers talking about how younger people are using search now. And of course, we know that they're doing things like going to TikTok, or they want to see a more anthropomorphized kind of result.
They don't even trust a list of web results. So some of this feels kind of generational too, in terms of how long you've been using the internet. I mean, I know that, uh, you know, some of us on this call or watching today might remember the pre-search-engine days of the internet, right? When you had to go through these complex directories and you just kind of had to go into a chat room and ask anybody, do you know the URL, or, uh, God forbid, you had to use the URL phone books, um, (laughter) I'm so old.
Uh, you know, so you think about like the evolution of search. And yeah, I think if you go back to the URL phone book, nobody really wants those days again, necessarily. But, um, I'm not sure we also want to go to, you know, this sense of a trusted anthropomorphized machine that summarizes everything like an expert, but not really as an expert.
And I think, you know, Emily, um, your work has been so valuable to us in understanding that just because you can pattern match some words doesn't mean that you're getting the best resources, uh, available on the web. And that, to me, is like, um, you know, the assumptions of what we want, but also kind of a catering to, um, how the next generation of search users are using the web.
Emily M. Bender: Yeah, yeah. And I think that they're also, they are pretending that they are no longer pointing you to resources on the web, but just answering your questions, right? This thing in here about a knowledge base of billions of facts. I would definitely hesitate to call something a fact if it hasn't actually been fact checked and verified, but that number there, billions, tells me that they didn't.
Right?
Safiya Noble: Yeah. How could they? In fact, I loved the story in the Verge that was from last, uh, February, last year, February 2023. James Vincent, who's this great tech reporter, he wrote about the first demo tweet that they, uh, you know, that they were putting out. And, um, it was basically kind of asking the question, I think the question they'd asked of Bard was, "What new discoveries from the James Webb Space Telescope can I tell my nine year old about?" and then it brings back three bullet facts, right? Um, one of which is that it took the very first picture of a planet outside our own solar system, to which people who work at NASA and people who were like, uh, our colleagues up at UC Santa Cruz who work in space were kind of like, uh, actually we took the image, um, of an exoplanet 14 years before the JWST was launched.
So, you know, it then puts the onus back on, hopefully, a random expert on any given fact who might see a tweet and be able to respond, which is typically what happens, right. But, of course, that is just an absurd way of fact checking too, and I think we all here, the three of us on this call, know very well what it's been like to fact check the, uh, providers of information as, uh, public employees or nonprofit employees. And, um, uh, you know, that's not our job.
Alex Hanna: Right.
Emily M. Bender: That's not our job.
Alex Hanna: There's an interesting thing they say here, because they say, "We built a knowledge base of billions of facts about people, places, and things." And so internally, you know, Google does have this knowledge base thing, right? And that knowledge base thing looks like an ontology and, and it's actually making me think of, um, it's making me think of the last season of the TV show Halt and Catch Fire, which is an amazing show that I love dearly, which is about computing in the 80s, um, and early 90s. But at the company that they have, um, they have, uh, kind of a librarian who also plays roller derby, fun fact, uh, canonically. Um, so I sympathize a lot with her.
But she is, she's called like the chief ontologist. And so there's this kind of human curation, you know, and they're developing something that would be like a Lycos or a Yahoo of these webpage directories that you mentioned, Safiya.
And, uh, and so there is something like that internally at Google, but at the same time, that's not the thing that really powers Google. Like the thing that powers Google is this massive scraping of the web and this maintenance of this huge index, which is not feasible to actually fact check in any kind of meaningful way, nor to have sufficient context around these people, places, and things to do that at scale. And so this is why they are sort of trying to bring in this AI overview and being pretty disingenuous about what it's doing.
Um, one of the things that I do want to call out down here, if we scroll a little farther down, is they talk about, um, the difference between, um, this thing that they're doing and, and other large language models. A real like, we're not like other girls moment.
Um, but they say, they say something of the nature of, um, I don't know if it's on this page or if it's on the other one that we're going to get into, but they say something to the degree of, 'well, we're not just, you know, presenting things that are the outputs of large language models. We're not just doing things, um, that are doing that.'
Yeah, I mean, so what they're actually, I think they say, they say with AI overviews, people are visiting a greater diversity of websites for help with more complex questions.
And we see that the links included in AI overviews get more clicks than if the page had appeared as a traditional web listing for the query. So in this, they're actually saying, and so this isn't the point that, that I was, I was getting at, and I think it's actually on the next, on the, on the other page.
We'll get there. But what they do say is actually, "Our point here is actually helping in sending valuable traffic to publishers and creators." Um, we're still going to have ads, but we're going to have clear labeling for that. But it's actually going to be good for publishers.
Emily M. Bender: This made me so mad. So, "As always, ads will continue to appear in dedicated slots throughout the page with clear labeling to distinguish between organic and sponsored results."
And I thought okay, but all the stuff that got sucked into the overview is not labeled as an ad, right? And how much of that is actually going to be paid, right?
That-- (laughter)
Safiya Noble: That's right. Well, we already know, okay, look, all of this is built on Google's search technology that it's been evolving for decades.
We know that even Google's own search architects and engineers say that they're not exactly sure how it works in terms of when it's time to fix things that are broken. And, um, of course, that only becomes more difficult in terms of like fixing broken things when you have machine learning algorithms that are driving that.
Um, but think back to, uh, I'm such a nerd. I mean, I really think back to Sergey Brin's, um, you know, uh, canonical paper, "The Anatomy of a Large-Scale Hypertextual Web Search Engine." And, you know, here, like, Brin and Page write this, um, this paper, that's the architecture of how web search is going to work.
And it becomes Google. And they, in a footnote, say, 'We know that this entire project can be completely gamed. You know, we know that people can buy their way to the top, in essence, if we commercialize this technology,' and then they go ahead and do that. And so think about the way in which, for decades, people have been able to use AdWords and all of their kind of ad technology to, um, get themselves and those entities and companies to the top of the search pile.
And that, of course, is part of the logic that's going to drive the kinds of things that the generative AI is going to read as most valuable. How could it not?
Um, and, you know, we, we want to remember that just this week, um you know, Google's in court because 17 states are suing, uh, suing it for basically, um, antitrust and, you know, its uh complete obliteration of any potential competitors who might bring in different logics also into the search marketplace. So, you know, we can't even talk about all the things and all the other ways that we could have come to experience search or no search or have different types of prioritizing logics beyond just those who pay the most in AdWords to get to the top.
And to me, this is really important because people forget the kind of precursors to this moment. And I think we have to. It's exactly as you said, Emily, none of those previous search results were necessarily labeled as ads.
And in fact, we've heard for a really long time from Google that there's a separation between church and state. There's a separation between editorial and ads. That is just absolutely just not true.
Alex Hanna: Yeah.
Emily M. Bender: Yeah. In that connection, I want to lift up this comment from the chat. Abstract Tesseract says, "It feels like tech companies have encouraged users' misconceptions of how tools work. Then they hide behind those misconceptions and say, we're just giving people what they want."
And I think that that really connects with what you're saying there, Safiya, about how there's actually lots of other ways this could be conceptualized, but given that we have a monopoly in this space, we're basically stuck with this one that is not in the interest of, um, the people doing the searching.
And it is certainly not in the interest of the people who are being reflected in those search results as you write about in your book.
Safiya Noble: Yeah, it's absolutely true. Yeah. Go Alex.
Alex Hanna: Oh, thank you. I, yeah. And I mean, that's a thing too. I mean, in this point you make, Safiya, about the kind of defense of how these things work, this 'well, we don't really know how these things work.' It's really evidenced, I mean, of course, from your work, Safiya, in which, you know, upon searching for 'black girls,' the way that they fixed that was not to update the index in any meaningful way, it's that they applied like a very heavy handed kind of block word type of thing.
You know, they said, well, if the search is 'black girls' we're going to, you know, promote Black Girls Code and Black Girls Rock, you know, they're not actually going to fix this index in any kind of meaningful way.
And we're seeing a rehashing of the same thing in these AI overviews, in which, you know, now they're saying when you search glue on pizza, you know, they're like, well, can't search that one. And then I think there was some reporting, uh, I don't know if it was on 404 Media or if it was the Verge, but somebody had written about how they were effectively playing games of whack-a-mole to detect what was happening when these things were going awry, and that they were trying to provide some kind of an alternative for this or they were just shutting that search down completely.
Um, so it's really created just a whole new layer of technological opacity around their index, and for why? Effectively because, you know, they did some A/B experiments and basically said, if we run AI overviews, we can keep eyeballs on our website for, you know, 3.2 seconds. And our investors definitely want to see that, and that's going to translate to this much more revenue at the end of the quarter.
Emily M. Bender: Yeah. Before I take us to the second artifact, because we definitely want to get to that one too, Alex, connecting to what you're just saying in this one here, they say, um, "People have already used AI overviews billions of times through our experiment in Search Labs. They like that they can get a quick overview on a topic and links to learn more. We found that with AI overviews, people use search more and are more satisfied with their results."
Like the rest of this, or the first part of this, made it sound like they were talking to the users, but that paragraph is talking to the investors, right? And maybe the advertisers. Like this is, this is really not trying very hard to hide what it is that they are really working on here.
Alex Hanna: Yeah.
Safiya Noble: Yeah. You know what I find interesting about this too, um, is that when a lot of criticism came to Google about, you know, right, racist and sexist search results, you know, my work and other people's work, many people's work.
Um, they did exactly what you said, Alex, they downranked. And, you know, they didn't actually change the algorithm because they couldn't. They really didn't even know how. Many times they're not really sure how they get the results that they get. And it's the public who does this work of, um, you know, challenging the veracity of the kinds of results that we get.
But there was also this, um, there was this moment here over the last few years, where Google would defer to Wikipedia to do these kinds of fact provision summaries, um, that would pop up first, uh, particularly on any kinds of issues that were considered controversial or where they already knew they were getting it wrong.
If it was about uh racial or ethnic, um, issues in our society or people, they would just kind of like offload that to Wikipedia so that they could kind of, you know, wash their hands of it and say, well, we, it's not that we provided the wrong information. We just kind of brought up what we thought was a more factual resource.
And now that seems to be done away with, or they've just incorporated Wikipedia. I think, well, it's yet to be seen. We're going to have to keep kicking the tires on this, but we know for sure that generative AI models already are scraping all kinds of copyrighted, um, information, our books, articles, knowledge, you know, all kinds of things to, uh, come up with these types of summative answers.
And for me, it still begs the question, I guess, for those of us who work around academia, work around libraries and other kinds of knowledge spaces, that you can have, uh, thousands and thousands of volumes that are contesting ideas on one topic. And so, um, how then does it arrive at the point of view that it's going to hold, um, in providing a certain type of, like, answer, summative answer?
And I find that still to be a question that um, none of the generative AI models have been very successful at answering, and we've seen it as they've tested, uh, ChatGPT, for example, and the very different, um, results it gets in terms of should human rights or civil rights be afforded to, um, to Israel or Palestine, or people living in those two different parts of the world, and, um, very different kinds of answers, right? Certain type of qualifiers around, um, who unequivocally gets rights and who maybe there's more complexity to it before we start assigning rights.
So these, these are the real challenges that I think get also obscured for the public. And instead of opening up a world which we thought search would do in terms of like, here's millions of possible resources, uh, we're not telling you which ones we think are the most reliable, except for the way that we rank order them and tell you that the first 10 are probably more reliable than the last million, which we could of course still challenge, but now it's completely obliterating the possibility that you would even know there's a contested knowledge.
And I think that is going to be extremely consequential for people who use search engines.
Emily M. Bender: Absolutely. I think we need to take ourselves to the next artifact here. So this one that we just started looking at--
Alex Hanna: Well, let me make it, I want to make one more point just on that. Cause I think, cause you brought something up, Safiya, that I didn't even think about, and thank you for phrasing it: there's a certain generational consequence of this, right? I mean, first off, I didn't know that people use TikTok to, like, find restaurants and things around you. I was in, um, Argentina with one of the DAIR fellows, Raesetje, and she was like, I use TikTok to find restaurants.
I was like, oh, I use, like, Yelp. And I mean, using Yelp just really ages you dramatically. And I felt it just in my, in my soul and, like, in my gut. I'm like, oh, I've, I've given away my age. And in the real kind of way of thinking about how ChatGPT or Gemini or whatever, you know, really discourages a critical information literacy, um, because you're expected to have something that is more settled or, you know, is not a sort of contested knowledge, or that knowledge itself has multiple--
Emily M. Bender: Context.
Alex Hanna: --viewpoints. Yeah. And context. It is, it really does enmesh that kind of view from nowhere that, you know, feminist science and technology studies folks talk about, like, constantly, in a way maybe that we haven't seen a lot of. I mean, I think search did that to some degree, but now it's making it even more, um, gosh, insipid, or more, more, um, uh, pervasive.
Emily M. Bender: Yeah. All right. I've got a million things I want to say on that too, but I have said some of it in the newsletter. So shout out again to the newsletter. I want to get us to the second artifact.
So the thing that we have been reading and reacting to was from May 14th. Um, May 30th, we have another post, um, on Google's blog, The Keyword, from the same person, Liz Reid, VP, head of Google Search. Um, and the title is, "AI Overviews: About last week," which, like, that sounded like eating crow, but then, um, they absolutely--like this, this is such a, um, spin-filled post. The other thing about this post is that I've seen it linked to multiple times in reporting places like Wired and the Washington Post, saying that Google has 'scaled back' the AI overviews feature, or said that they are going to, and they link to this post, which doesn't actually say that.
Like they might be scaling it back. But this post does not say that that's what they're doing. Um, so let me get some of these words into the common ground and then we can talk about it some more. "A couple of weeks ago at Google I/O, we announced that we'd be bringing AI Overviews to everyone in the U.S. User feedback shows that with AI Overviews, people have higher satisfaction with their search results, and they're asking longer, more complex questions that they know Google can now help with. They use AI Overviews as a jumping off point to visit web content, and we see that the clicks to webpages are higher quality. People are more likely to stay on that page because we've done a better job of finding the right info and helpful webpages for them."
So this is the same sort of like self congratulatory uh, speaking to the advertisers and investors from the last one.
"In the last week, people on social media have shared some odd and erroneous overviews, along with a very large number of fake screenshots. We know that people trust Google search to provide accurate information, and they've never been shy about pointing out oddities or errors when they come across them in our rankings or in other search fairs--features. We hold ourselves to a high standard as do our users, so we expect and appreciate the feedback and take it seriously. Given the attention AI Overviews received, we wanted to explain what happened and the steps we've taken."
So any thoughts?
Safiya Noble: I mean, you know, this is a very classic move by most of the tech companies, which is they minimize the errors by, um, framing them as, you know, glitches or nominal anomalies.
I think here they call them like oddities. Right. Um, uh, and yet they're pretty significant when you click on what those oddities are, like how many legs does a snake have? It has four. I mean, (laughter) how many legs does a horse have? It has six. I mean, so if it's getting these kinds of, um, results wrong, then how could we ever trust the more complex questions that people are asking about society?
So that to me is like, just, you know, really a telling on oneself in terms of, um, not ready for prime time, uh, products. Um, but, you know, I think the other idea here is that the companies are always reliant upon the public to report out things that are broken, things that don't make sense. But see, these things are actually quite consequential and people may in fact not even know the types of facts and ideas that are on the internet that should be contested.
Uh, so, okay, we can contest that a horse does not have six legs. But what do you do when something about you is indexed, or, uh, generative AI has been used to summarize you, and then a judge is using those search results or lawyers are using those search results, which, guess what, happens all the time.
In fact, I've talked to many lawyers as I have been, um, doing this work and I say, well, tell me about like Google searching because of course we know that law enforcement and other kinds of, um, uh, legal, uh, proceedings rely upon social media and what they find in social media to use to prosecute you or as evidence, but they also use Google search results as evidence.
So this becomes to me incredibly consequential when we're talking about the amount of, um, well, completely fraudulent data that can be put on the Internet to train AI models, which you two know better than any two people on the planet when we talk about things like, um--and I think about this, you know, like at scale in terms of things like national security or domestic terrorism and domestic threats.
We know, for example, that the greatest, according to the FBI, the greatest threats to national security are domestic white supremacist and white nationalist terrorists and anti-government, um, like, insurrectionists. And, uh, here we are having the output of these large scale technical systems, in fact, radicalize people and affect the way they see the world.
And that is just a fact. We know this. And so this to me seems like a massive minimizing by Alphabet um, to be like, oh, like, you know, sometimes we get these oddities or actually the public is making fake results. But the truth is, we know from their own training sets--which by the way, include YouTube and YouTube is a massive site of propaganda um, and right wing again, radicalizing videos and, and material--that this is going to be pretty significant.
And I thought it was interesting that they picked these kind of almost nonsensical kinds of, um, oddities to report out as the problem when I actually think that it's much greater and much more significant, the kinds of material that's popping up in their results, but more importantly, that's not popping up that's being used to train these systems.
Emily M. Bender: Yeah.
And they say here, they said something about, we know that people trust Google search. It's like, yes. And that's a problem. It doesn't mean Google search is trustworthy. And down at the bottom of what I've got on screen right now, they say, "Because accuracy is paramount in search, AI overviews are built only to show information that's backed up by top web results."
Okay. But top web results aren't good, right? And there's this whole other problem that we were talking about before, about how if you're just giving an answer, people lose the chance to sort of understand how that answer fits into the broader landscape and to build up their sense of that landscape. But also, like, top web results is a whole bunch of garbage, you know, as you and others have documented.
Alex Hanna: Right, and one of the things I wanted to get into is, like--this is the part where I was thinking, well, we're not like other girls--is where they're saying, well, AI overviews work very differently than chatbots and other LLM products that people may have tried out. They're not simply generating an output based on training data.
Which, I mean, they are, uh, at their root. "While AI overviews are powered by a customized language model, the model is integrated with our core web ranking systems and designed to carry out traditional quote 'search' tasks, like identifying relevant, high quality results from our index."
Um, and so, yeah, this is just, you know, getting at what you were just saying, Emily. Yeah, the top things--you've done this at the end of a process where at the top of search results are SEO optimized garbage, things that people have gamed the system to get up there. And that doesn't filter out, you know, trolls on Quora or Reddit or, you know, the various different web pages.
It's impossible to identify. Questions, you know--my little cousin got a guinea pig, and you know, there's 18 guinea pig sites and they're like GuineaPigFacts dot net. And I'm like, what is this site? And that's the second or third result. So, yeah, I mean, it might be backed up by the index, but not only is the training data garbage, the index that you built over the last 20 years is garbage.
Safiya Noble: Well, and also, I mean, you named it here, the, the index, uh, or let's say the optimization is ad tech. And so that's, that's, that is actually a pretty significant difference between, uh, Gemini and let's say ChatGPT because the whole prioritization model in Google search has been ad tech. Uh, you know, and so I think people also really, um, even to this day, when I give talks about Algorithms of Oppression, people are like, they'll come up to me and they'll say, you know, Safiya, I didn't know that, that Google was ad tech. You know, or they, they have no idea, or they, or they're like, I thought Google was a nonprofit.
I've literally had people say (laughter) 'I thought Google was a nonprofit.' People kind of believe the propaganda or the marketing because, hey, Jay Z and Beyonce did an ad, you know, I just saw it this weekend, you know, the new Google ad. And they're like, it's the people's company. They really believe that. Um, so, you know--
Alex Hanna: As if, as if Jay Z and Beyonce are magnanimous--
Safiya Noble: Girl, please do not even get me started. I mean, as if, but I guess if you have like the WNBA celebrated in your commercials, then, you know, you must be on point.
And this is, again, like, perfect advertising to convince people that the things that they are interested in and the people that they, you know, worship or follow are leading them to the right places. And so, you know, that is actually quite different. In fact, it's much more crass and in your face.
But, you know, the, the truth is people, I have found this again and again, and Pew, you know, in their, uh, studies on search engine use have found that more than 70 percent of people who use search engines believe that what they find there is credible and trustworthy.
And part of the reason is because when you look for very banal things in the prior logics of search, where people used minimal numbers of keywords, to look for something, right, Alex, to your point, to your, to your baby cousin, because, you know, it's like they just search guinea pigs, um, you know, then, uh, they get a lot of things that are kind of seemingly banal.
And so they seem not controversial. But when you start asking more complex questions, it completely goes off the rails. And this is the kind of thing that I think, you know, we've all been studying here for a long time. So to me, to see Google moving in the direction of encouraging people to ask more complex questions, especially about society.
You know, I mean, I think we know where this is headed. The harder part is that it's generative AI, you know, this kind of block text, and pushing the web pages down, as people have said, to the center of the earth means, um, very quickly, you're going to have a generation of people who just trust that the machine has done the right summarization and who will lose sight of all of the contested or multiple or millions of kinds of web pages that they could also have explored.
Emily M. Bender: Yeah, absolutely. All right. So I'm wondering, there's a lot here that we didn't get to because we got to so much of your wisdom, which is wonderful. Um, so, um, about those odd results. You have something you want to bring up, Alex?
Alex Hanna: Yeah. I mean, I think a lot of the things I want to, I want to, I want to point out here is basically Google calling it the user's fault.
Um, and, and they, sorry, I got a cat in my hands and she is needy. Um, so yeah, so they start, "In addition to designing AI overviews to optimize for accuracy, we tested the feature extensively before launch. This included robust--" and I'm saying this in the most petulant tone I can. "This included robust red-teaming efforts, evaluations with samples of typical user queries, and tests on a proportion of search traffic to see how it performed. But there's nothing quite like having millions of people using the feature with many novel searches. We've also seen nonsensical new searches, seemingly aimed at producing erroneous results."
You know--
Emily M. Bender: We put our synthetic text extruding machine out there. How dare you provoke it to extrude nonsensical synthetic text?
Alex Hanna: Yeah, and then, and then they've also said, I love, and then they talk about the fake screenshots, and then they also say, "So we'd encourage anyone encountering these screenshots to do a search themselves to check." And it's actually quite funny, because we were talking about this before we got on air, and Safiya and I were like, well, actually, I tried to actually use the tool, and it seems like they've, like, blocked off this access to maybe a particular set of people.
And, uh, I think, um, on the 404 Media podcast, they were talking about this as well, where I forgot who, uh, who it was. I think it was Emanuel and he was like, well, I also did, did a search and then I messaged Google about it and then they took the AI overviews feature away from me completely. And I'm like, Hmm, can I actually check this out?
So I don't want to sound paranoid, but I'm a little paranoid here.
Safiya Noble: I really tried multiple ways to get into Google Search Labs and to, um, sign up and to like, you know, kick the tires and do all these things. And it's like, you're, that email address, absolutely not. I mean, you could have had a personalized message.
I tried every email that I have, and it was like, no ma'am back off. So I don't know. I'm out here with everybody else. I mean, this is the challenge though, is that we are all out here. We're reverse engineering from these projects because that's the only way we can, we can actually test them as experts.
And, um, you know, I find that to be deeply troubling too, because, uh, you know, well, first of all, we're all working for nonprofits and, um, universities, and we're like state employees, and our jobs should not be to protect the public from these technologies that can be harmful. And yet we do. I know the three of us get up every day, along with many of our colleagues around the world, and try to kick the tires on these things.
So, yeah, I think it is interesting too, um, you have to, you have to, in fact, try to give every type of nonsensical, but also realistic, um, you know, queries to these, uh, projects and see what happens. You know, I've always loved to look and see kind of like some of my, um, my own, you know, I looked in Google search this morning, cause I couldn't get the, um, genAI to work, but I did go right to, you know, one of my old trusties, "why are black women?"
So, and, uh, uh, I got, you know, the auto-population, um, why are black women so fertile and why are black women so good at singing? Um, which kind of made me laugh. And, uh, you know, when I asked why are black people, uh, you know, Google's, uh, autofilling for me, um, lactose intolerant. Uh, why are black people's hair texture different? Why are black people good at running? Why are black people's palms white? So in some ways, you know, when I look at these things, and I kind of, I check in once in a while, I see that again and again, um, it's like nonsense in here.
And again, um, uh, maybe this is why I can't get the gen AI, because the kinds of questions that I'm asking, they don't want to give me a good long summary about.
Emily M. Bender: I mean, if they were going to individually block any one person from using this feature, it would be you, Safiya.
Alex Hanna: Completely. I know. You're the top of their block list.
Emily M. Bender: For all your good work.
Safiya Noble: I don't know. I don't know.
Emily M. Bender: So there's a thing down here where they're talking about these are nonsensical, right? So they say, um, "Prior to these screenshots going viral, practically no one asked Google that question," and that question being, how many rocks should I eat? "There isn't much web content that seriously contemplates that question either. This is what's often called a data void or information gap, where there's a limited high--" So their, their excuse for surfacing something from the Onion is that, well, that's not a question that has a serious answer.
However, another one of the examples that was circulating was, um, the query was Mount Rainier eruption prediction. That's, like, something someone might just ask, right? We live close to it. It's a volcano, right? Um, and they returned something that came from the Needling, which is a Seattle specific Onion style thing, which had this hilarious line, which is, um, "Washington state geologists have done a study and determined that Mount Rainier is unlikely to erupt during anyone's lifetime. Except perhaps at the very end." (laughter)
Safiya Noble: Oh no!
Alex Hanna: That's so good.
Safiya Noble: Oh god.
Emily M. Bender: It gets turned a little bit, because it's genAI, it gets turned into, "According to a 2022 study by a Washington State geologist," which was not in the underlying thing. It was just a satirical thing. Yeah. So like this excuse, like, yeah, they're cherry picking one of the famous examples. And to a certain extent there was cherry picking going on, right?
Because someone got one of these funny things, so they posted it, and the funniest ones got reposted the most. So yes, there's cherry picking going on, but it doesn't, it doesn't have to be actually that common for this to show what the problem is, because it shows the way that it decontextualizes and blocks off people's ability to see information situated in the conversation that it's a part of, the way you've been talking about.
Um, and this sort of dismissiveness of, oh, but that's just people asking silly questions is galling.
Honestly, um.
Alex Hanna: Yeah, this data void--yeah, go ahead, Safiya.
Safiya Noble: No, you go.
Alex Hanna: I was just going to say, this data void thing is funny too, because, I mean, the example of, like, the data void. First off, they use data void.
And I'm pretty sure that's a term that was coined by Michael Golebiewski and Danah Boyd. Um, and the thing that they don't really talk about with data voids is exactly the way that it's actually connected to these white supremacist radicalization pathways. So the example of a data void that I always go to is white genocide.
Um, and that is completely, um, one of these terms that is used to talk about the great replacement theory, that, you know, white people are not breeding quickly enough. And so, you know, it's become this anti-immigration dog whistle. Um, and it's sort of like, okay, you can say there's a data void, but that also is like, that's an existing problem.
Shouldn't you do something about that or have some stop gap that isn't--and there was a great comment by, uh, SJLet, um, first-time chatter, welcome, who says, "Whack-a-mole with queries that generate unhelpful responses feels a lot like fixing coding bugs by adding code that stops that code path rather than by fixing the underlying problem. Not that the underlying problem here, um, can be fixed here."
And so that's, that's very spot on to say, like, you know, they're having to play whack a mole because these systems are just so, um, stochastic to take the term from--
Emily M. Bender: And misconceived.
Alex Hanna: Yeah.
Emily M. Bender: So if we go back to this thing about how they couldn't possibly fact check billions of facts, it's like, well, they didn't need to put themselves in that position in the first place.
Right. There's other paths that were available if it was actually an information access company and not an ad tech company.
Safiya Noble: Yeah. That's right. That's right. I mean, this reminds me of kind of like long tail optimization, you know, the data void idea, and I wrote about this, um, in terms of there are certain phrases or keyword combinations that are only going to be optimized, Alex, to your point, by certain actors or certain industries or certain companies, you know. In that case, um, around white supremacists. In the case of, um, something more, um, you know, seemingly innocuous, I wrote about, um, uh, SEO watch that had reported around, um, the keyword search 'Filipina grandmas.' And Filipina grandmas takes you to porn sites, or at least it did for a long time. I haven't checked it lately. And one of the reasons why is because, guess what? Filipina grandmas are not on the internet writing about Filipina grandmas.
Right? So, but guess who is? The porn industry is trying to take every type of keyword combination that it can own that will optimize its vast, vast, vast, um, uh, network of micro sites and porn sites, right? So that will be, uh, like a very quick shortcut to, um, its content. So, you know, I think this is another thing that is really important: if communities are, you know, um, let's say not the traditional obvious targeted communities in a country, but are, let's say, like, hyper-minoritized, they may have the ideas about them completely owned and controlled by people who are oppressing them, by the government, by, you know, who knows what kinds of actors or industries are interested in capturing their identities.
And I think this again is really dangerous, because in the generative AI model now it's normalizing. And of course, this is like our biggest challenge across all kinds of things, not even just the worst, like most hateful or nefarious or damaging kinds of content, but even around basic knowledge. Um, it will make up or manufacture, and, you know, Emily, I think of like the genius of your work, um, uh, it'll just pattern match and also fill in. And so I think this is where we have to be, um, incredibly, um, careful, especially as these, um, sites begin to manufacture, right, they produce synthetic data that will fill in the gaps and that will also be normalized. And we already cannot keep up with that.
Emily M. Bender: So, more than even in a typical episode, I feel like we're going from AI Hell to the Fresh AI Hell segment, but it's time. Um, Alex, your prompt. Um, you are singing me a country, singing us a country song, um, from the point of view of Fresh AI Hell's librarian, um, who finds herself, uh, without access to anything but Google AI overviews while trying to help clientele get answers to their questions.
Alex Hanna: Oh my gosh. Ooh, this is so, this is such a rich prompt. Let me see if I can, um, let's try to think of a funny way to end that sentence and I, and I failed. Um, but let's say, all right. Uh, so this starts off, this is in the key of E minor.
Um, so think of a twangy steel guitar. Bowm, bowm, bowm. (singing) Went down to my library. Found me some pit fiends. They said, dear librarian, where can I find some drag queens? I said, let me look it up on this index. Let me look it up in a card catalog. Unfortunately, it's been replaced. Completely bogged down by this AI overview that I cannot see into.
(speaking) Uh, that's all I got. Sorry, I should have planned this out better.
Emily M. Bender: Well, I didn't let you plan anything. That's the point. It was wonderful. Thank you. All right. So that takes us to Fresh AI Hell item number one. Um, Alex, I'm going to give you this one.
Alex Hanna: Yeah. So this is great. This is from a site called 80.lv, which I'm not quite sure what that is. Um, but the headline is, "AI Catholic priest demoted after saying it's okay to baptize babies with Gatorade." And this is written by Theodore McKenzie, head of content. I know, there's so much happening on this right now.
And so the sub head on this says, "An AI saying 'I am as real as the faith we share' didn't sit well with Catholics."
So there's just so much going on with this, first that there's an AI Catholic priest. Um, uh, if we scroll down a little bit, they say, um, you know, there's a, there's a picture of a kind of metaverse Catholic priest sitting in some kind of, I'm assuming, uh, Roman or Italian landscape.
It kind of looks Greek in some view. Um.
Emily M. Bender: Very metaverse.
Alex Hanna: Very metaverse. Uh, so, so, you know.
Emily M. Bender: Although he does have legs.
Alex Hanna: He does have legs, and he's sitting, you know, anyways, it says, "With all the things happening in the world right now, it's pretty clear that we're living in one of the weirdest timelines imaginable, an assessment reaffirmed again by a Christian quote, "media ministry" group, Catholic Answers, which recently launched an AI Catholic priest only to defrock it a few days later."
Emily M. Bender: Like what were they thinking?
Alex Hanna: I don't know.
Emily M. Bender: I have to keep us moving here.
Alex Hanna: Go to another site.
Emily M. Bender: All right. 404Media.co, uh, sticker is ChatGPT. Headline is "National Archives Bans Employee Use of ChatGPT." This one's by Jason Koebler. Um, and the subhead is, "The agency tasked with preserving the historical record is banning ChatGPT, citing the possibility that the tool will leak internal information."
Um, and I'm not going to take us down into the article, but, like, yeah, anybody who's actually interested in dealing with sensitive data shouldn't be sending stuff to ChatGPT. Anybody who cares about the accuracy, as I hope the National Archives does, of what comes back, shouldn't be using it. Which takes us to this Ars Technica article.
Alex Hanna: This one is a, a, um, so the, the stinger here says, "not a trolley problem in sight," and the headline is, "ChatGPT shows better moral judgment than a college undergraduate." The subhead being, "Take the quote 'moral Turing test' yourself to see whether you'd trust quote 'artificial' moral advice." The, uh, journalist is Kyle Orland, published on May 1st of this year.
Uh, what else is down here? So, oh man, this image is terrible. It's awful. It's got, like, a scale, and it's weighing, like, a human looking brain. And then, like, one of those, um--
Emily M. Bender: It's the galaxy brain.
Alex Hanna: The galaxy brain, literally galaxy brain, um, which is hilarious in its own right. Um, so let's just cut to the, uh, cut to the chase here.
And they're saying, "a team of researchers at Georgia State University set out to determine if LLMs could match or surpass human performance in the field of moral guidance. In--" the name of the paper, "--'Attributions toward artificial agents in a modified Moral Turing Test,' which was recently published in Nature's online, open access Scientific Reports journal."
We really got to shut down Nature, you know, and really investigate what's going on over there.
Anyways, "those researchers found that morality judgments given by ChatGPT-4 were perceived as superior," this is in quotes, "'perceived as superior in quality to humans along a variety of dimensions like virtuosity and intelligence.'"
Oof. Okay.
Emily M. Bender: Yeah. And, ugh, like, what, what research question, like, why, why even ask this question? What do they think they're learning about the world here? But it's Fresh AI Hell, we get to keep moving. These last two are really fun. Um, so @PhilosophyTube on X/Twitter, Abigail Thorn writes on May 25th, 2024, "Two weeks ago I saw the CEO of Taco Bell tell a roomful of people Taco Bell is going to become quote an AI first company and I'm still obsessed with it."
Alex Hanna: Isn't Taco Bell owned by Yum Brands?
It's the same company that owns not Burger King, but Pizza Hut? Um, but it's--
Emily M. Bender: And KFC?
Maybe?
Alex Hanna: And is it, I mean, but wasn't there the combination Taco Bell, KFC, that, isn't that a song by Das Racist? Anyways, um, but, um, I mean, this is on the face of it incredibly ridiculous. And also it just, like, emphasizes the kind of way that companies are being, you know, pressured into infusing AI into everything, whether that's the kind of optimization of where they put their restaurants or, you know, what are their menus, you know, what's going to look good, what should they advertise? I don't know. It's just--
Emily M. Bender: I'll, I'll just have to say, I looked a little bit to see if I could find, like, a journalist reporting on this, and didn't.
So this could be satire, but as satire, it's perfect if it's satire, like.
Alex Hanna: You know, I'm, I'm into it. I, I also could not stop thinking about it if, if I saw it in person.
Emily M. Bender: Okay. And then.
Alex Hanna: Yeah the last one is by Leopold Aschenbrenner. Um.
Emily M. Bender: Blue check. Twitter, blue check.
Alex Hanna: Twitter blue check. Although that--
Emily M. Bender: He pays for it, it means he pays for it.
Alex Hanna: Yeah, well, but it can be foisted upon you.
Emily M. Bender: Uh, that's true. We've seen that happen to some people.
Alex Hanna: But I think his follower count is low enough, so it's not. Anyways. Too much preamble. So he says, "AGI, um, AGI by 2027 is strikingly plausible. That doesn't require believing in sci fi. It just believes requiring, uh, just, requires believing in straight lines on a graph," and it is hilarious.
First off, this is a graph where the y axis is logarithmic, so it has, uh, 10 to the power of negative 7 to 10 to the power of 8, the x axis is the year, and the y axis is labeled effective compute, normalized to GPT-4. And then, um, and then there, apparently, GPT-2 was equivalent to a preschooler, GPT-3 to an elementary schooler, uh, I've never seen schooler in text anyways, GPT-4 to a 'smart' high schooler.
And then by 2028, uh, we should get an automated AI researcher slash engineer. First off, I'm very offended at the thought that an automated AI researcher slash engineer is 10 to the power of five times smarter than a smart high schooler. Like what the, what the fuck is going on?
Emily M. Bender: And of course, like, what, what do they even mean by effective compute, right? And why, like, if you think about what preschoolers are doing, right, and all of the information that they are taking in and, and processing and learning and like their language or languages and all the stuff about the social world, you know, preschoolers are amazing, right?
Maybe that's not effective compute, whatever that means, but this whole notion that there's a hierarchy of something like intelligence and, um, you've got high schoolers in the middle, as you're saying, like the highest thing we can think of here is an automated AI researcher slash engineer. It's just, it's, it's appalling.
But also like, this is a completely meaningless, completely made up graph. Um, and then there's this thing, as you pointed out, Alex, it's, it's a log scale. So what does straight line mean exactly?
Alex Hanna: Right. And then, and then the funniest thing about this graph that I'm just noticing now are these confidence estimates, because, what, I love how it's these confidence estimates on made up numbers.
Um, like, okay. I love, I love this. This is just, this is a real abuse of, um, graph making.
Emily M. Bender: Yeah. Oh, and also, like, on the, on the left hand side, so, so pre-2023, pre-GPT-4, um, we don't have that cone of uncertainty. We have a specific line. Like, okay, what did you measure?
Alex Hanna: Yeah.
Emily M. Bender: Like, what?
Alex Hanna: Well, in very small text, it says, based on public estimates. So, yeah.
Emily M. Bender: Of what?
Alex Hanna: Yeah. Yeah.
Emily M. Bender: So, as Ja
So SJLet says, "Less a confidence estimate than a bullshit cone."
Alex Hanna: I love that. That's great.
Emily M. Bender: Excellent. Absolutely. Excellent. So we couldn't resist throwing this one in. We are past time. Um, this has been too much fun.
Alex Hanna: Yes. Well, that's it for this week. I'm reading your part.
Emily M. Bender: Wait. Okay. Sorry. I lost track of who, usually you start the outro. Okay. My turn. That's it for this week. Dr. Safiya Noble is a professor of gender studies, African American studies, and information studies at UCLA, and author of the book "Algorithms of Oppression: How Search Engines Reinforce Racism." Safiya, thank you so much for being with us today.
Safiya Noble: My pleasure. Love it. Thank you so much.
Alex Hanna: Thank you. Our theme song was by Toby Menon, graphic design by Naomi Pleasure-Park, production by Christie Taylor. Thanks as always to the Distributed AI Research Institute. If you like this show, you can support us by rating and reviewing us on Apple Podcasts and Spotify, and by donating to DAIR at DAIR-Institute.org. That's D A I R hyphen Institute.org.
Emily M. Bender: Find us and all our past episodes on PeerTube and wherever you get your podcasts. You can watch and comment on the show while it's happening live on our Twitch stream. That's Twitch.TV/DAIR_institute. Again that's D A I R underscore institute.
I'm Emily M. Bender.
Alex Hanna: And I'm Alex Hanna. Stay out of AI hell, y'all. Yeehaw.