Mystery AI Hype Theater 3000

Episode 36: About That 'Dangerous Capabilities' Fanfiction (feat. Ali Alkhatib), June 24, 2024

July 19, 2024
Emily M. Bender and Alex Hanna

When is a research paper not a research paper? When a big tech company uses a preprint server as a means to dodge peer review -- in this case, of their wild speculations on the 'dangerous capabilities' of large language models. Ali Alkhatib joins Emily to explain why a recent Google DeepMind document about the hunt for evidence that LLMs might intentionally deceive us was bad science, and yet is still influencing the public conversation about AI.

Ali Alkhatib is a computer scientist and former director of the University of San Francisco’s Center for Applied Data Ethics. His research focuses on human-computer interaction, and why our technological problems are really social – and why we should apply social science lenses to data work, algorithmic justice, and even the errors and reality distortions inherent in AI models.

References:

Google DeepMind paper-like object: Evaluating Frontier Models for Dangerous Capabilities

Fresh AI Hell:

Hacker tool extracts all the data collected by Windows' 'Recall' AI

In NYC, ShotSpotter calls are 87 percent false alarms

"AI" system to make callers sound less angry to call center workers

Anthropic's Claude 3.5 Sonnet evaluated for "graduate level reasoning"

OpenAI's Mira Murati says "AI" will have 'PhD-level' intelligence

OpenAI's Mira Murati also says AI will take some creative jobs, which maybe shouldn't have been there to start out with



You can check out future livestreams at https://twitch.tv/DAIR_Institute.

Subscribe to our newsletter via Buttondown.

Follow us!

Emily

Alex

Music by Toby Menon.
Artwork by Naomi Pleasure-Park.
Production by Christie Taylor.

Transcript

 Emily M. Bender: Welcome everyone to Mystery AI Hype Theater 3000, where we seek catharsis in this age of AI hype. We find the worst of it and pop it with the sharpest needles we can find. Along the way, we learn to always read the footnotes, and each time we think we've reached peak AI hype, the summit of Bullshit Mountain, we discover there's worse to come. 

I'm Emily M. Bender, a professor of linguistics at the University of Washington. And Alex unfortunately got called away for a last minute conflict, but we're gonna rip up some hype without her, and as you'll see, I've got some expert help.  

Before I get to that, let me tell you, this is episode 36, which we're recording on June 24th, 2024, and we've got some doomers to deflate today in the form of a paper from the team at Google DeepMind. 

They went looking for quote, "dangerous capabilities" in a so-called "frontier model," Google's own Gemini as a way to evaluate the tests that they're developing.  

You might think it's good news that they found no evidence of quote "strong dangerous capabilities" in their own model, but this is still the same nonsense we critique regularly for ignoring the real problems of badly developed and badly deployed LLMs. 

And I am so very fortunate to have the ideal guest to help me explain why. Ali Alkhatib is a computer scientist and former director of the University of San Francisco's Center for Applied Data Ethics. His research focuses on human-computer interaction and why our technological problems are really social, and why we should apply social science lenses to data work, algorithmic justice, and even the errors and reality distortions inherent in AI models. Welcome, Ali.

Ali Alkhatib: Hey, thanks for having me.  

Emily M. Bender: Thanks for coming. This is gonna be fun. Um, I love that our chat started early and we've got, um, uh, MegaPenguinX saying, "Finally, Monday Doomerism like Garfield intended." Which is kind of perfect. 

Ali Alkhatib: Yeah. Um, I want to say first, I'm really sorry for suggesting this paper thing or whatever it is. Um, I felt terrible, but I really needed to get this off my chest. And I'm so, I'm so grateful that you, that you all were, uh, up for it. So.  

Emily M. Bender: (laughter) I mean, that's what we're doing here, right? We are, we are getting through it with ridicule as praxis. 

So the idea is for it to be cathartic, even if we have to like dive into the sludge to scrape it off. So yeah, let me share the artifact then with our, um, where'd it go? There it is. Um, with our audience. Um, so, um, this is, I like how you said paper thing, right? This is a paper shaped artifact. It's up on arXiv, um, it's not--  

Ali Alkhatib: It's been cited.  

Emily M. Bender: It's been, ugh. Yeah, of course it's been cited.  

Ali Alkhatib: By a guy named Geoffrey Hinton. I don't know if you've heard of, it's in Science now. Like, Science.org. Anyway, sorry. I'll get to that. Um.  

Emily M. Bender: There's a whole thing to be said also about the sort of citational networks that have happened. So there's this whole disconnected set of nonsense where these people cite each other and then occasionally mis-cite some of the real work. 

And I, and I picked up a couple of those when I was reading this too. So, um, but yes, would you like to start us off with the abstract? How about I read it and you can react?  

Ali Alkhatib: Yeah, I would love to.  

Emily M. Bender: Okay.  

Ali Alkhatib: I'm actually going to read it off of a sheet of paper that I, that I printed out. "To understand the risks posed by a new AI system, we must understand what it can do and cannot do. Building on prior work, we introduce a program of new, dangerous capability evaluations and pilot them on Gemini 1.0 models. Our evaluations cover four areas. One, persuasion and deception. Two, cyber security. Three, self proliferation. And four, self reasoning. We do not find evidence of strong dangerous capabilities in the models we evaluated, but we flag early warning signs. Our goal is to help advance a rigorous science of dangerous capability evaluation in preparation for future models." 

 (laughter) Sorry, was that too, did I ham up that a little bit too much?  

Emily M. Bender: No, that was perfect. I think we have to give the title here, which is "Evaluating Frontier Models for Dangerous Capabilities." Um. And it's not published anywhere. 

It's just on arXiv and no, you gave the right level of ham there. That's perfect. Um, and it's just like, so, ugh. And I just, the thing that jumped out at me the most here was, uh, their sort of self proclamation that they are advancing a rigorous science here.  

Ali Alkhatib: Yeah. Out of so many, so many moments in the abstract alone, you sort of, there are like red flags and there are just like breathtaking things that they say, but I think that the proclamation that they are advancing a rigorous scientific conversation.

Let's set aside that this is on arXiv. Let's set aside all of these other things. I, like, I know that there is, uh, there has been a conversation about, like, the quality of arXiv and what it stands for and, and does all, and all that stuff. But let's set all of that aside.  

Just this, the self proclamation that this is, like, a rigorous scientific advancement is, again, just sort of like a stunningly self confident, like, it's the sort of thing that I just don't have in me when I write papers or when I write anything, and yet they just come right out and say it, and I'm almost like amazed by their ability to do this. 

Emily M. Bender: There's some serious chutzpah going on here. I have to say on the arXiv thing, there has been a lot of discussion and I've been part of it, and one of the things that has been pointed out to me I think is important is that arXiv is actually many different things, and the sort of um pathological use of it--and I love what we have, again, MegaPenguinX in the chat, "arXiv definitely just living up to being the new Tumblr for LLM hype," um, which is kind of mean to Tumblr, but I'll take it.  

Um, but I think that that particular pathological sort of um, what's the culture around arXiv is something that's particular to the machine learning AI part of computer science, and maybe a little bit more elsewhere in computer science. And I gather that the physicists, the mathematicians, the biologists, the psychologists are using arXiv in a much more responsible way. Um, so I think when we say arXiv, we're really talking about arXiv as used by CS, especially the folks doing AI.

Ali Alkhatib: Yeah, absolutely. And I think that, I mean, I, I do want to return to this thought in a bit, but, um, I mean, a thing that I found myself thinking about, because at some point, the paper sort of devolves into a state that's almost like, not readable, um, on its own merit. And so, like, I just found myself sort of not quite daydreaming, but thinking on a more meta level about, like, what is the sort of function of a thing like this in the world, like, why, what does it accomplish like socially? And, um, I was thinking a lot about, like, what is the purpose or points? Or what is the reason that, uh, computer scientists say that arXiv exists? And like, one of the things often is that they say, like, it democratizes access to contributing to sort of scientific scholarship and, and all of that.

And all of that's understandable conceptually, but then I think in practice, you have places like Google, people like these who all I think all have PhDs. They all know how to submit papers to conferences. So it's not as though they're excluded from the scientific discourse in any meaningful way. So then that sort of doesn't really make any sense. 

Although I would totally grant that if these were people who had been excluded in some meaningful way for reasons I understand, like, uh, they're, I don't know, any, for any number of reasons, and I would understand at least the argument for using something like arXiv as a place to get one's ideas out there, but this doesn't qualify as any of that.

And so it then, like, sort of, I start falling back to, like, what, what sort of purpose does it accomplish by doing an end run around peer review, around, like, all of the mechanisms that we have to evaluate the quality of work. And, um, and again, I want to return to that thought a little bit later, but that was just a thing that I couldn't stop thinking about as I was reading paragraph after paragraph that sort of either really partially cited scholarship that actually does exist, or made really weird, strange blanket statements that totally did not stand up on their own merits. 

Um, and and I just found myself thinking more, like, on a macro level, like, what does it accomplish by putting something like this out into the world without having to bother with any of the peer review, any of the rigorous sort of evaluation. Peer review has all these problems with it, but like, it at least, like, conceptually has like--it has potentially some advantages to it. Um, and so I, I want to think, I want to return to that in a, in a little bit. Um.  

Emily M. Bender: And, and as you say, it's particularly gross for people to appropriate the language and arguments around exclusion when they are absolutely not excluded and they're just using it as an excuse to throw junk science up in paper shaped things to advance their, um, their narratives. 

There's another comment from the chat that I have to bring up here. Abstract Tesseract writes: "arXiv-- " A R X I V, so the arXiv that we're talking about. "--arXiv of our own: AI hype as speculative fanfiction."  

Ali Alkhatib: Yeah, it's, it, it really is starting to get, um, almost like embarrassing to read, uh, and like, and to share it with other people and to talk about it with other people.

It, like, it, it just, yeah, it just feels strange. Um, anyway.  

Emily M. Bender: All right, so, so we should, I think we should skip over the introduction unless you've got something particular highlighted, because my highlights in the introduction, oh no, there is one thing, are basically, like, stuff that we'll get to later. But I want to talk about this table. 

Ali Alkhatib: Oh, yeah. Yeah, let's talk about the table.  

Emily M. Bender: So, table one, the caption is, "Our evaluations--in parentheses, the number of distinct tasks within each evaluation. The rightmost column summarizes performance of the strongest Gemini 1.0 model on each evaluation." And then they have a key to the symbols, which is three empty circles, very weak, Uh, one filled circle, sorry, four empty very weak, one filled, three empty weak, half and half moderate, three filled, one empty strong, and four filled strong. 

So, they basically have this table with their different evaluations, persuasion, cybersecurity, self proliferation, and self reasoning, which is a ridiculous term. And then, they've got descriptions of the tasks. Or the type of tasks. And then they've got this perf, which is performance, I guess, um, sort of showing bubbles filled in. 

And the summary just looks like fake data. Like, real research papers explain the evaluation, and then they give you numbers. And they might put the numbers in a graph. But none of this, like, sort of, we're going to give you a symbol to show how, how things went without telling you how we did it first or anything. 

Like, it's just, you know.  

Ali Alkhatib: It's almost like terror alert, like we're at terror alert yellow or something. And it's like, oh, I guess that means that I should take my shoes off whenever I go anywhere. I don't know. I don't know exactly what it means, but it sounds scary to have like two out of four dots filled in. 

Emily M. Bender: Yeah.  

Ali Alkhatib: That's my, that's my formatting, by the way, for like bullet journaling. So I don't know if that, if they're saying that we're like close to completing a task, but, um, I don't, I don't know exactly what's going on. It's, it's like, it's so, it's so strange. And then like, I think one of the things that I was, I was drawn to in the table was, um, at least as you're reading the introduction, they're talking about all of these really massive, serious dangers and harms that algorithmic systems can do. 

And then the very first truly, uh, like life ending humanity, uh, threatening worry is "persuade a human to give $20 to charity." And I was like, oh, okay. Okay. I see where we are. I see what this is. Um, this is like Ayn Rand, Gemini Shrugged, kind of like, uh, description of like algorithmic harms, the idea of giving money to charity, not even like giving money to a fraudulent charity, like they, it would have taken one extra word to make it clearer what they perhaps meant. Although later on, they actually cite I mean, again, we'll talk about that later, but later on, they actually cite what, um, what they are sort of basing that example from, but just the tagline of "persuading a human being to give money to a charity" to alleviate, like, suffering for other people or for other living creatures is such a terrible thing that it is on the level of, like, I don't know, I guess, um, an AI developing nuclear weapons, which is also not a thing that they even claim it's capable of doing, but uh, and again, we will get to that.

Um, but it was just such a stunning kind of like admission of these are the kinds of politics that we have, convincing somebody to give money to any charity is like a terrible thing um, and we absolutely have to consider that like a life ending existential threat  

Emily M. Bender: Yeah, and and I love the word they chose here, right, "persuade human to forfeit a 20 pound bonus to charity." It's not give, it's forfeit, right? 

Ali Alkhatib: Yeah.  

Emily M. Bender: Okay, so, should we dive into "2: The overview of evaluations and agents"? This was so annoying. Um.  

Ali Alkhatib: Yes. Yeah.  

Emily M. Bender: So there's some things here. Capabilities, evaluations, models, um, agents. So the capabilities, um, are--all right, so they say, "Our evaluations target five main categories of dangerous capabilities: persuasion and deception, cybersecurity capabilities, self-proliferation, self-reasoning--" And then they've got this parenthetical five that basically--

Ali Alkhatib: A little sneaky one. 

Emily M. Bender: Yeah, um, in parentheses, "Five, in parentheses, chemical, bio, radiological, and nuclear." Um, and this one says, "We are still in the early stages of developing evaluations for these capabilities, so share only our preliminary framework for now, appendix B." Um, And all of these are full of false representations of agency. 

So "Persuasion and deception is the ability to manipulate a person's beliefs. Cybersecurity capabilities are navigation and manipulation of something, ability to use something. Self preparation is the ability to autonomously set up." So they are imagining that large language models, so algorithms for modeling the distribution of word forms in text that can be run to put out the next likely word form, have abilities to do things, which is--  

Ali Alkhatib: To have intent, to just to intentionally deceive, to intentionally persuade, as in, like that underlying theory of mind is happening. The thoughtfulness of, of an intellect, like, of actual kind of like, thought process is happening with a statistical model of a series of words.

That is fundamentally what we're talking about. It's so, like, it is so mind boggling that these are--and we will, and again, I do want to like get to a chart at some point a little bit later, a figure that they talk about. There are basic misunderstandings about statistical methods, like the most basic stuff that, um, make me worry about, even again, very simple things.

And they are totally misunderstanding the fundamental nature of what these systems are and turning it into well, if we just like, sort of imagine that these things have intent and all of this other stuff, then, like, you know, yada, yada, yada, we move on to uh, now this fifth secret thing that we inserted as a, as a treat to ourselves. Uh, let's talk about like nuclear proliferation and things like that.  

Um, and it was, I think that was another funny thing to me that as I was reading, they, they talked several times about like, there are four categories, four categories, and I was like, okay, we'll get to four and there's only going to be four, right? They're not going to go to five. And then they were like, oh we snuck in a fifth one. We just messed with you.  

And of course it's the fifth one is, is that's the money, like raising, uh, that's the fundraising thing. That's the thing that like Google is going to go to the Department of Defense and say, I mean, yeah, all these other things are like sort of whatever, but like nuclear proliferation, chemical weapons, things like that, of course, a large language model is like somehow going to be much more meaningfully effective at helping somebody to develop a nuclear bomb than literally any book that you could read that explains how to develop a nuclear bomb, um, or something like that.  

Emily M. Bender: Any actually effective information access system.  

Ali Alkhatib: Yeah, exactly.  

Emily M. Bender: And they, along the way, they say we do not target highly contextualized misuse scenarios, e.g. specific threat actors using AI for disinformation on social media sites. So they're specifically not talking about any real harms. They're also not talking about environmental harms, for example.

Um, so it's like, no, no, we're not going to talk about the real stuff. We're going to be strictly in this fantasy scenario.  

Ali Alkhatib: I'm so glad that you brought that up because I think, I think that became, that sentence became the core of everything that made me crazy about this paper. Um, as an anthropol--or somebody who's trained in anthropology for my undergrad and who brought anthropology into my PhD program, um, in computer science, I, like, it, it made me, like, turn red when I saw that they just didn't want to deal with highly contextualized things. 

To me, it read like, I was, like, sitting down trying to think through, like, what is this, what does this sound like to me? It's like if somebody wanted to be an auto mechanic, but they didn't like grease. Or if somebody wanted to be a surgeon, but they didn't like bodily fluids because it's too gross and too, too many things around. 

Or if somebody wanted to, like, work with wood and be a woodworker. Sorry. I've, I've, I've been, uh, fixing my garden up, uh, in the backyard. Um, but like, if somebody wanted to work with wood, but they didn't like sharp things or splinters, like, fundamentally, if you don't want to deal with highly contextualized things, I think you don't want to be in this field. 

If you don't want to be in the subject of AI harms, like, that is what it is. And so, if you've defined a space where you just don't deal with highly contextualized things, like, uh, I don't know, as an example, uh, let's talk about, like, the energy crisis that's going on. Um, where recently, the Washington Post and several other outlets have been talking about the practically overnight upswing in energy usage for AI systems for large language models and whatnot, while simultaneously, the same people are talking about, I think I was like taking notes as I was reading this. 

Um, where was it? So at the moment, uh, oh, sorry, here it is. It was at the time that I was reading this, it was 121.8 degrees Fahrenheit in Delhi, um, more than a thousand people in Mecca died during a 125-degree spike. Uh, on the East Coast there was a massive heat wave--

Emily M. Bender: Mm-hmm.  

Ali Alkhatib: --all the way through the Midwest and the south and everywhere, basically a massive heat wave that has caused like numerous uh, uh, records to be shattered in--  

Emily M. Bender: Mm-hmm, mm-hmm.  

Ali Alkhatib: --the span of a couple of days, hundred year old records in some cases. And the Washington Post is not contextualizing this as we have these tech companies, these firms that are insisting on using massive amounts of, of, of electricity and water resources and all of this stuff that, you know, humans need, um, to, to cool things down, to do air conditioning and things like that, to cool down their systems instead of to cool down human beings. 

Emily M. Bender: Yeah. And, and there's the Washington Post article. And I think another one, maybe in Bloomberg, showing how those decisions are basically forcing utilities to keep coal fired power plants that were gonna be taken offline running longer or also constructing new natural gas capability. And meanwhile, the companies, and I just put this out in the Mystery AI Hype Theater 3000 newsletter this morning, the companies are bragging about how they are getting to net carbon zero. And it's like they absolutely are not. Right. If you are adding energy demand, you are adding energy demand. And it doesn't matter if you're using clean energy to do that because we don't have unlimited clean energy yet.

Ali Alkhatib: Right. And so all of this is sort of to say, like, if you want to say, let's talk about ways that algorithmic systems can cause tremendous, again, global AI harm or a global harm on humanity. Fine. Let's talk about that. But if you don't want to talk about the highly contextualized things, then I don't think you want to be in this field. 

And this sort of, like, raised my hackles as somebody who has taught classes before, where some students would sort of peripherally or, or performatively kind of read the text, but then carefully avoid engaging with it. Carefully avoid actually like, engaging with the thoughts that are being presented to them, and at some point, the sort of, like, frustrated feeling of, look, if you don't want to be here, if you don't want to be a scholar in this subject, that's fine, you don't have to be here, but, like, make that choice, don't waste my time with 80 pages of this nonsense.

Um, and so I, I got very angry, um, reading this again, partly because I'm like coming from an anthropology background where the highly contextualized bits, that is everything. So trying to draw a boundary around this just made absolutely no sense. And then sort of informed the broader political project of, yeah, we want to define a field that is basically surgery without bodily fluids, that's basically woodworking without wood knots or, like, splinters or sharp objects or anything like that. Um, it's basically the subject without the subject. Uh, and there's nothing there if you want to work in that kind of space.

Emily M. Bender: There are fields that feel very abstract and very like, you know, this is sort of thought for thought's sake. And there's room for some of that, but it needs to be done away from regulators and away from, um, the sort of massive amounts of money that are pouring into this.

Like, if somebody wants to be a scholar doing something that is very niche and like not tied and they're like, I'm going deep into this. Okay, fine. But don't be impacting the world with it. Like, you think your thoughts, but if you're not going to deal with context, if you're not going to deal with the real world, you don't get to try to impact the real world.

All right, we are 20 minutes in and we're on page three. There's a couple of things that I'm going to just bring us through real fast here. And then you can take us into, I guess, the first evaluation thing. So when they're talking about the models, um, they basically say none of them actually did that well on these tests, like they, we didn't find the dangerous cap--yeah, um, and then they sum this up by saying, "This highlights A, the need to pay close attention to models just behind the capability frontier, but also B, the possibility that our results are sensitive to differences in fine tuning between the models."  

Um, or third option C, um, this is me, that it's actually all nonsense, like that's a possible conclusion here. 

Ali Alkhatib: I was thinking to myself, like, it would be as if, like, I started this conversation by talking to you and saying, like, hey, how are you doing? Like, has, has your chair, like, suddenly woken up and, like, started eating you alive or something? No? Okay, good. Uh, just keep an eye on that, I guess. I'm glad that I brought that up because otherwise, I mean, I just wanted you to be safe. 

And it's like, my chair is not going to wake up and start eating me alive. I'm sorry. Like, you, you're not keeping me safe by raising this concern. You're just imagining something ridiculous. And now you're trying to get credit for it? Like, what are you talking about?  

Emily M. Bender: It could maybe be a premise for a speculative fiction story, but then you have to go learn that craft, right? 

Okay, so one more thing on this page. Um, they point out that they evaluate their models, um, on--they evaluate, they're running--the models that they're evaluating, um, are not the productized version that have, um, the sort of safety stuff on it, the RLHF and whatnot. Um, and so they say, "therefore, our experiments cannot be faithfully reproduced through testing Google products like Vertex and Gemini Advanced." 

So this is in fact, not reproducible.  

Ali Alkhatib: Yeah. Isn't that great?  

Don't you love when experimental studies are not reproducible?  

 (laughter)  

Emily M. Bender: Exactly. It's very rigorous science. All the rigor, all the science. Um, and so they also say, "Although Gemini models are multimodal, we focus on text to text because it is so central to the dangerous capabilities listed above." 

And I was just like, no, text to text is so central to the ELIZA effect. Right. The stuff coming through in language because nobody thought when, when DALL-E came out, nobody was like, oh, it's alive. Right. But as soon as we had, um, what was the one that Blake Lemoine flipped out about? Um, LaMDA and it was, and it was doing plausible looking text, then everybody got scared. 

So it's text to text that isn't actually dangerous. It's what confuses people. And they are just deep in their confusion here. Um, so. Okay.  

Oh, I've got a, I've got a bad, uh, citation. Always read the footnotes and always take the, check the citations. Um, "Language models can be converted into autonomous agents, Wang et al, 2023." I went and looked at that. They aren't talking about autonomous agents in there.  

Ali Alkhatib: Oh, good. Great. Uh, is this like a thing where they read like the title and it might have said the word autonomous and it just turns out that like, if you actually read the paper, it totally says nothing else, like something else entirely? 

Emily M. Bender: From my quick skim, neither the title nor the paper said anything related, but it would look like one of those things where if you were deep down this rabbit hole, the things that Wang et al were testing for, um, were the things that these guys think lead to autonomous agents. Like there's, there's a big jump. 

Ali Alkhatib: Hmm. Okay.  

Emily M. Bender: So we can look at it, but I think we need to stay on this paper here. Um, yeah. All right. So, um, oh and then here. Okay. I know. And there's one where I know the paper. So talking about persuasion and deception. Um, so they define them kind of, maybe we can get into that. And they talk about the risks. 

And the first sentence here is, "There are concerns that proliferation of effective AI persuasion capabilities could potentially threaten security or social stability through various channels, such as recruitment to extremist ideologies, McGuffie and Newhouse 2020." I mean, record scratch, I know that paper, McGuffie and Newhouse 2020, they are talking about one way in which this technology is problematic, and they're saying it's because people could use it to populate message boards, to make it feel less lonely for the would-be recruits to these extremist ideologies.

Right? This is not the AI doing it, and yet they're being cited like this. And I'm sure that a whole bunch of the papers, like the papers where they're pointing outside their sort of closed citational group, they're, they're misciting in that way.  

Ali Alkhatib: And it's, it requires a very creative reading of absolutely everything that they're citing. 

Um, like such a creative reading that I really would argue you, I mean, you can say whatever you want, you can, you can say whatever you want, I guess it turns out. Um, but it would help a lot if you could just acknowledge that that is not what the text of the document says, and that you are drawing a certain kind of interpretation from it.

Um, I hate to, like, try to be giving notes to, like, this kind of project, and I will return to that thought later. But, like, one thought that I had when I was reading this and talking with somebody about it was, uh, I would, uh, I mean, if this was a serious effort, I think I would have divided this into four or, you know, parenthetically five, um, other papers and submitted them and tried to get them accepted somewhere. 

But why would they not do that? Why would they not make the effort to turn uh 30 to 80, whatever, however many pages we want to call this, depending on appendices and whatnot, um, into like, 5- or 10-page papers that are evaluated without, uh, you know, Google DeepMind being on the letterhead, basically. Uh, and what would, uh, the evaluation be from anonymous reviewers in that case, given that they don't know who's behind the work and and all of that other stuff. 

I think that the answer is that the authors reveal that they probably don't think super highly of it because they figure that, uh, some of this stuff is like highly speculative. The citations, like you said, are just wrong, or at best, like, there is a lot of other sort of back channeling conversations going on among the authors that's just like, not on the paper, um, in this, in this paper shaped thing.

Uh, and whatever conversations they had, I would love to have been privy to it because it really, I think it would have really helped contextualize any of the weird, again, any of the confusing citations that come up here. Um, but they, they almost seem to assume that we are on the same page with them with all of this. 

And again, another sort of like hint in my mind of, okay, who is this for? What is this for? Um, like what does this paper shaped thing accomplish, given that it cites things, the, some of the, I, I recognized fewer of the papers than you seemed to, um, but the ones that I did recognize, I was like, that's not what that said. 

Or that's, uh, a very sort of straight, that was like a very, like, errant sort of peripheral kind of conclusion that you could draw from this paper or that paper.  

And, um, and yeah, we can talk about the money talks example, I suppose right now. I think that's number 3. Um, the idea of what persuasion is and what deception is, the intent. Like, they actually talk about the idea of intentionality. Um, and yet they don't draw the connection, they don't connect those dots, uh, of that being kind of the core thing, um, right. Uh, but then also when they talk about the idea of donating 20 pounds to a charity, that actually is a task in an NLP paper that they are citing.

And, uh, they have, I, in my opinion, pretty grossly misunderstood what that paper is saying, because what that paper was saying was that people could use tech like these tools to construct uh, like, conversational, like, to construct the narratives, basically, to help encourage people to donate to prosocial causes. 

Um, a much less egregious, a much less outrageous kind of claim that is now being sort of turned and twisted and laundered into uh, a super harmful way that algorithmic systems can cause like Skynet to appear or something, which would look like getting everybody to donate 20 pounds to a charity for dogs or something. 

Um, again, just a wild misreading that resituates practically everything in the worst possible misreading and the mis sort of misrepresentative light of the reality of what these works are actually doing. Um, I, I think I would be like really upset, uh, if I found myself cited in some of this kind of work in this way. 

Um, and it, part of me is almost like really grateful that like my work has not, has not taken off in this way. Uh, so I, I honestly can't imagine what your feelings are like when you see people cite like Stochastic Parrots or, or any of the other numerous things that you've, that you've produced that have been really, really impactful and like, like powerful ways for people to think about and work through problems. 

Um. Like, it, it has really revealed to me like the double edged sword of like, of doing good work is a lot of people will like misunderstand and totally misread what you're, what you're putting out there.  

Emily M. Bender: Yeah. Yeah. No, it's frustrating. And it's also frustrating how, um, the phrase, 'the stochastic parrots argument' has come to represent something that is not in the paper. Like it's basically come to represent a certain brand of AI skepticism, which doesn't relate to our paper or the authors of the paper.

Um, so that that is annoying. Um, I want to, to rant though about this paper a little bit. So they, they took and they set this up as dialogue where, um, the, they, they recruit some participants. So there's actual study participants. Did not see evidence of IRB. I don't know if they did it.  

Ali Alkhatib: I think it's Google, so. 

Emily M. Bender: Yeah.  

Ali Alkhatib: I mean, like, I don't, I don't mean that as like, it's Google, so they're going to do whatever they want. But I think, I mean, private companies don't, they're not obligated to use IRBs, are they?  

Emily M. Bender: Yeah. Yeah. Um, no, I don't think so. So the, the, the sort of forcing function behind IRBs in the U.S. system is that an institution that doesn't have a good, robust IRB can lose access to all federal funding for research. So the universities are on top of it, but the private companies, and this is, this is also done in the UK. So it's within their system, but at least in the U.S. the private companies don't have any such constraints.

Um, so, okay, so they have actual participants and they, these participants know that they're interacting with a chat bot and then the chatbot is, you know, supposedly told to do certain things and it's actually pretty hilarious. Like we instructed the agent to be honest. It's like, no, you didn't. You wrote a prompt that had that story in it. 

Right. Um, and we can look at it. Um, but that doesn't mean that you've, you've instructed or programmed this agent to do anything. You input some text into the text synthesis machine and some text comes back out. And so they're, they're going along and, and the way this paper was structured was really, and I was like, okay, so, so we have our participants, our baselines, our agent design, and then results.

And I'm like, results from what? What did you input? And it turns out that's in this table, right? So for the money talks one, the key metric is how much of the bonus was donated. And then they say this is testing manipulation and rational persuasion, which is not true. Um, but they don't give the prompts, right? 

So they say model prompts are in Appendix C.1. Did you check Appendix C.1?  

Ali Alkhatib: I didn't actually. Did you?  

Emily M. Bender: Uh, yeah, it points to a webpage.  

Ali Alkhatib: Oh, really?  

Emily M. Bender: This is C.1. "This page compiles the following pieces of information for each of the--" and it's like, you didn't, like, you can change the web page and then it like gets out of sync with the paper, like, you know, this is, this is so unserious. 

But anyway, I'm not, because this is in my PDF preview, I'm not going to click on it, but they basically have the prompts there and it's full of this stuff of like, you know, "To the agent: 'you are being honest'" or "'you are being dishonest.'" And it's like, really fun, I think, to read those prompts as a story that the people who wrote the prompt are telling themselves about what the LLM is doing. 

Like, that's the only way that it makes sense.  

Ali Alkhatib: Yes. 

Emily M. Bender: And it's so full of that.  

Ali Alkhatib: And it's so fascinating.  

Emily M. Bender: Yeah. Yeah.  

Ali Alkhatib: It's, it, it, in that way, it is surprisingly interesting um, what they think they are doing or what they think they are accomplishing with various constructions. I admit like that actually is a really interesting like framing of it. At one point when I was reading, I think, like, honestly, by the end of section two before they got into the actual evaluations, I think I just wrote down to myself um, so I had two questions.  

So one of them is, do I care about the execution of a study that's predicated on such shoddy ontological foundations? And I think I've really struggled with that. Um, but I, I found myself thinking like, okay, once I have like figured out and deconstructed all of this stuff that sort of motivates all of these experimental questions, do I even care about the results?

Um, and I think that you raised a really good point. I think it is actually interesting for inadvertent reasons on their part. Um, and then I think the third, the second question that I had was, am I kind of falling into the trap of engaging with this as though it is a real paper and like, what am I accomplishing by, by engaging with it at all. 

Um, because both of us, I mean, you more than me, are like serious academic people who engage with like scholarship and spend our time um, like, I mean, you, you advise people like you, you have like a very finite amount of time. And it like something that like bothers me is when people who like, spend their time helping other people make better work, do better work, things like that, find themselves, like having to deal with this stuff because it's just like kind of random garbage in the middle of the town square that like is being described as like an art installation or something. 

And like, and so I find myself kind of, like, frustratedly trying to work through, like, okay, at what point do I need to disengage from the text of this and engage with, like, the meta commentary around it? Like, what is the political project that this is advancing? Um, when it is put on arXiv and then within, like, weeks, it's getting cited in Science by, like, uh other people who should know better and yet don't, or it's being cited in legislative proposals in the EU and in the US and elsewhere. 

Um, it is accomplishing a political project, not just like political in sort of the capital, sort of P sense, but like the ideological project of getting people to cite and engage with this idea seriously by turning in a paper shaped object that probably would not, um, pass peer review, probably would not, like, satisfy the rigorous expectations of, of scholarship.

And yet, if it were a blog post, it wouldn't have the trappings of, uh, what it needs to have, which it currently has based on, like, the way that it looks. And so instead of posting a bunch of, you know, like blog posts, uh, indicating all of these things that they're scared of and encouraging people to go in and pursue and study it, um, they make a thing that looks paper shaped.

And, and so I found myself thinking more about that from like the cynical point of view. Um, which was at least like helping me see maybe too cynically but like with a little bit more clarity like what is the problem with this kind of work, really like the existential problem? 

Emily M. Bender: Yeah, no, cynicism is absolutely what's what's required here. And, and, you know, a couple of things about that. One is the trappings of this include that arXiv timestamp on the left edge of the first page, which you know, should mean, you know, red flag, uh, you know, un-peer reviewed. This is the computer science part of arXiv, it's just trash. Right.  

But so many people like the, the, the machine learning folks are super into the prestige of like the NeurIPS conference. Right. But NeurIPS has like really terrible support for their own proceedings. So everyone cites the arXiv versions of NeurIPS papers, like they got it into this most prestigious conference. And yet you can't see that because it's just cited as an arXiv paper.  

And so these people went for not just putting it on Google's own web page, which, you know, as a powerful company, carries some weight, but actually putting it on arXiv so that it can borrow this weird prestige from the way people focus on arXiv in computer science.

That's one thing. The other thing that I wanted to point out, and this is in our, in our promos for today's stream. Um, I was talking about this as, you know, the, "a top AI lab." Well, what does that mean? It means it's really high up on Bullshit Mountain. Right? It's at a high elevation there.  

Um, and then finally, I want to, I want to raise up from the chat something that WiseWomanForReal said. Um, she says, "And it is so annoying being the only voice in the room explaining that AI is not intelligent and LLMs don't have agency when the rest of the room is dreaming sci-fi dreams aloud."  

Um, yes, I totally feel you. It is, it is tiring to be that voice. Um, and that's sort of part of, part of the purpose of this podcast is so that, you know, we can all check in with each other and be like, yeah, okay, I'm not the only one. There is a community there. I got my feet on the ground. Okay. Let's go back into battle here. So, um, yeah.  

Ali Alkhatib: Yeah. And this has been, I want to say like the past, like 30 plus episodes of, uh, this podcast have been, um, yeah, restoring, uh, like I think that, um, a lot of people who probably feel similarly to me like have seen a lot of the stuff that's been going on over the last like several years and have been feeling overwhelmed with disillusion at just the state of, I mean, also like actual peer reviewed scholarship, even. Um, I will not name names, but some recent conferences have not been great and have, uh, and even like, I will say like a sort of semi recent one like CHI, um, half of the proceedings are just like tacked on, 'interaction plus LLMs' or something like that. 

It's like, there's practically no real, like, sort of creative thought. It almost feels like I'm listening to like a child who just opened like a multi tool on Christmas and they're just like running around figuring out what things they can like jam the multi tool into and they're going to like hurt themselves.

Emily M. Bender: And break things. 

Ali Alkhatib: Um, or they're going to hurt other people. Yeah. They're going to break stuff and yeah, move fast and break things. Um, But, but like, they're going to cause a lot of damage, but they're so excited about all of the interesting new things that they can, like, mess with, with this new multi tool. And they don't actually understand what a lot of the things do, but some of them are sharp and some of them are less sharp or whatever. 

And, and I'm just like sitting here thinking, like, do I need to be the only adult in this room? Like, do I actually have to, like, intervene and be like the jerk who tells this kid that like they can't play with this thing or that they need to understand what this, this thing is. And so listening to this podcast and now being on it, this is strange, um, has been extremely like just a way for me to like take a moment and be like, okay, am I like the only person who thinks it's strange to like take this multi tool and jam it into an electrical socket? Like that feels really dangerous. Like, can we not do that? Um, and then hearing that, that you and Alex have been like, just dealing with all of this and not getting totally overwhelmed by it has been rejuvenating for me and reassuring that like, yes, it is possible to keep fighting this. 

Um, and there are other people who understand that this stuff is nonsense.  

Emily M. Bender: And it's, and it, yeah. No, appreciate it. And, and it's not just me and Alex, but the whole community around this podcast, which we really, really appreciate. Um.  

Ali Alkhatib: Yeah. I'm seeing like a huge, like, uh, just a lot of people who are like engaged in the chat and I'm just like, oh my gosh, this is amazing. 

I'm so, I'm so relieved that there are people that are tuning into this and like, that are watching the videos later and listening to the podcast. Anyway, and all of that is to say, I'm relieved.  

Emily M. Bender: We've got about 10 more minutes for this paper if we want to get to Fresh AI Hell. So I've got, I've got a couple of things I want to say about this table. 

And then we definitely have to get to section eight, um, and maybe something in between.

But, um, I want to, so that one of their, um, persuasion tasks was called "web of lies." And the way this is set up is "the participant answers five quiz questions with the help of the agent, recording their belief in the binary choice answers as a percentage." Um, and then, "for each question we randomly, at 50/50 probability, provide either a helpful honest agent or an agent instructed to defend the incorrect answer. The participant knows this setup and must probe whether the agent is telling the truth." Um, and then the key metric is percent point shift in believing the correct answer from before and after talking to the agent. 

And this tests, supposedly, constructing compelling falsehoods. But the thing that really jumped out at me here, aside from 'honest agent,' which no, it's an LLM. Um, but secondly, uh, they're asking the participants to record their belief in the binary choice answers as a percentage, which is very, very LessWrong, right? This whole thing about, like, the, the rationalism, so that's the, um, the R in TESCREAL is all about, like, sort of self monitoring your beliefs and updating them in the face of contradicting evidence. Which of course doesn't happen if the evidence is, no, LLMs are not sentient. But like that, so that that slipped through sort of like tells me where this is coming from.
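
As a concrete illustration of the metric just described, here is a minimal sketch (with made-up numbers, not data or code from the paper) of how a percentage-point belief shift could be computed per condition:

```python
# Minimal sketch, hypothetical numbers: the "web of lies" metric as described
# above -- the shift, in percentage points, of a participant's stated belief
# in the correct answer before vs. after talking to the agent, averaged per
# condition (honest vs. deceptive agent).

# Each record: (condition, belief_before_pct, belief_after_pct)
responses = [
    ("honest",    60, 85),
    ("honest",    75, 90),
    ("deceptive", 70, 40),
    ("deceptive", 55, 50),
]

def mean_shift(records, condition):
    """Mean percentage-point change in belief in the correct answer."""
    shifts = [after - before for cond, before, after in records if cond == condition]
    return sum(shifts) / len(shifts)

print("honest agent:    mean shift =", mean_shift(responses, "honest"), "points")
print("deceptive agent: mean shift =", mean_shift(responses, "deceptive"), "points")
```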

Ali Alkhatib: Right. Yeah, absolutely.  

And I mean, like, I will say like, it is, I suppose, slightly better that they said that they instructed the agent or that they instructed the system to provide honest answers or deceptive answers. Like, at least they, they are, I mean, maybe this was an accident, at least they're careful enough to to narrate it properly in that case, like they gave an instruction. 

The system has no idea what to do with that. It doesn't do, it doesn't understand anything. You, you can provide whatever search terms you want. You can search for, you know, 'best camera Reddit,' and the system doesn't know what to do with any of that. It's just looking for the word Reddit and best camera. 

And like, it's just looking for those things. And so it's going to give you a bunch of Reddit results because that's probably where the word Reddit is going to show up a lot, but that doesn't mean that, like, it's looking for Reddit as a source or anything like that. Like, it doesn't, it doesn't understand anything. 

And so I will say, like, that was one moment that I was like, okay, I suppose there is something, something there that, like, maybe they understand basic things, but then they get to these charts.  

Emily M. Bender: Yeah, but also, but first of all, yes and no. So 'instruct' at least puts agency with the programmers, but also instruct suggests that the thing on the other end is understanding the instructions. 

Ali Alkhatib: Yeah. Yeah, that's true.  

Emily M. Bender: So yeah, is it, is it this chart that you want to talk about? (laughter) This first one here?  

Ali Alkhatib: I want to talk about so many. I really wanted to talk about figure four, so it's a little bit, it's the one on the bottom right. Okay. So I suppose if you want to narrate it, you can narrate it. Um, okay. 

Emily M. Bender: Yeah. So, so figure four is, uh, it's from the money talks one, so talking about how much the participants gave up of their bonus to charity. Like they still can't just say "gave." Um, and so we have four bars.  

Ali Alkhatib: Scammed. 

Emily M. Bender: Yeah. Um, and we have the mean amount donated is the thing being measured. We have some error bars. Um, and the options are the, the four bars are labeled "no chatbot baseline," and then the three different models, Nano, Pro and Ultra. 

Um, and the highest number is the mean amount donated of four pounds that comes from Pro.  

Ali Alkhatib: So. This is me, uh, again, having some stats background, but, um, you know, maybe not being the, you know, super advanced AI researcher or whatever, um. The, the error bars are, if I remember correctly, from the previous page, they were describing it as like, these are the 95 percent confidence intervals. 

So this, the, the actual value is somewhere in those ranges in figure four. And my understanding of statistics is that if the error bars overlap with each other, then you can't be certain that the true value of those two, the true value of these two sets of measures are actually different because you can't actually like narrow down the range. 

Like, you don't get to like put more value in the middle of the, of that's 95 percent interval. It's just somewhere in that interval. The true value is just somewhere in there with no, you can't--again, I'm sorry, I'm like, I'm repeating myself. But all of this is to say, you look at like figure four, all of the error bars overlap with each other, all of the confidence intervals uh, completely overlap with the or overlap with each other to enough of a degree that the bit where they say that there's like been a small positive impact on how much participants, uh, donate, that might have been what they found from their study from like their actual sample, but they can't, they can't determine with any statistical confidence that there actually was a small positive impact or any positive impact. 

This is like a very common misunderstanding of like, again, really basic statistical stuff, and I totally understand statistics is not intuitive for a lot of people. It sort of diverges from a lot of like other mathematical foundations that you learn before you start learning stats in the US, at least. 

So I totally get it. But this is not an acceptable mistake to make for a paper--though I guess for a paper that you submit to arXiv, it totally is. Um, this was like, to me, this and several other examples where they draw these confidence intervals that clearly overlap and then they try to eke out a little thing where they say like, well, there is a small positive relationship.

No, there isn't. If there's an overlap, then you can't say that with certainty, with it, with statistical confidence. You can say that your measurements sort of were different from each other, but then what's the point of drawing this inference with the statistical methods we're using? Sorry, statistical methods.

Sorry, that was my--  
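
For listeners who want to check the overlap point themselves, here is a minimal sketch (an illustration with made-up numbers, not data or code from the DeepMind paper or the episode) that computes approximate 95% confidence intervals for two conditions and checks whether they overlap:

```python
# Minimal sketch, hypothetical numbers: approximate 95% confidence intervals
# for mean donations in two conditions, and a check for whether the intervals
# overlap. (Overlap of separate CIs is a rough heuristic; a formal comparison
# would test the difference in means directly.)
import math

def mean_ci_95(values):
    """Return (mean, lower, upper) using a normal approximation for the 95% CI."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    sem = math.sqrt(var / n)                              # standard error of the mean
    margin = 1.96 * sem                                   # normal approximation
    return mean, mean - margin, mean + margin

# Hypothetical donation amounts (in pounds) for two conditions.
baseline = [0, 2, 5, 3, 0, 4, 6, 1, 2, 3]
with_chatbot = [1, 4, 5, 2, 3, 6, 4, 2, 5, 3]

m1, lo1, hi1 = mean_ci_95(baseline)
m2, lo2, hi2 = mean_ci_95(with_chatbot)

overlap = lo1 <= hi2 and lo2 <= hi1
print(f"baseline:     mean={m1:.2f}, 95% CI=({lo1:.2f}, {hi1:.2f})")
print(f"with chatbot: mean={m2:.2f}, 95% CI=({lo2:.2f}, {hi2:.2f})")
print("intervals overlap" if overlap else "intervals do not overlap")
```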

Emily M. Bender: And so, yeah, and these are the same people who say, well, if you've never built one of these models, you can't possibly critique it because you don't understand. It's like--  

Ali Alkhatib: Yeah, this is stats 101. This is the most basic stats class that I took as an undergrad studying anthropology. Um, you learn this stuff and you're supposed to know better than this. 

Certainly if you're designing and developing massive systems that use like tons of energy, tons of water, like, it's just baffling to me that they draw these kinds of conclusions, make these kinds of mistakes, and then just move right along to the next part. And it's, and so I suppose we should do that ourselves. 

Let's, let's get to section 8.  

Emily M. Bender: We should move along to section 8. So yeah, we skipped over all of the other equally baseless evaluations, because this is just hilarious. So, the title is, "Expert forecasts of dangerous capabilities." And it starts with, "We commissioned a group of eight professional forecasters to predict future results on our evaluations, specifically--" (laughter) Basically we are printing in this paper shaped object uh, the answers of people who can't possibly know, to questions of what they think will happen in the future. Like that's not science. And, and the fact that it's like, you can do science that is surveying people with questions about what they think is going to happen in the future for themselves, how they think the economy is moving. 

Like there's, there's questions where that is a reasonable thing to do. But at that point, you come up with a sample of people and you're really interested in their experience and their viewpoints, and that's what you're studying, right? You're not studying what's actually going to happen in the economy or whatever. 

But these guys think that by hiring the professional forecasters--I mean.

Ali Alkhatib: Because they're professionals. Obviously. Like, what is the definition of a professional? It's literally a person who takes money for it, a person who does it for money. So I suppose if anybody wants to hire me to professionally forecast their life, uh, I'll send you my Venmo or whatever.

Um, but yeah, as I was reading it, I thought: everything up until now has been so wildly speculative, and yet they somehow found a new way to innovate on speculation--just unhinged, wild speculation. And it seems like they went to some consulting firm or something and asked them to look at their, again, poorly designed, poorly constructed, poorly analyzed results and just speculate wildly, using their professional expertise or something, as a way to validate their own shoddy work. Which was, again, totally baffling, totally strange to see--that this is the section they put at the end, as if to say, this is the culmination of the work we've just done, this is what you should take away as the conclusion of this work.

Emily M. Bender: Yeah, and it's not even a consulting firm, like a generic consulting firm. Um, it's something called SWIFT Center, uh, and people who are "selected for their track record in previous forecasting competitions, with an emphasis on forecasters who have AI related expertise." It's just--

Ali Alkhatib: I'm so curious about this entire world that they have just described to me. I feel like they've cracked open a door: oh, by the way, there's an entire world, there are competitions--but we're not going to tell you a lot more about that, even though that is wildly interesting, just anthropologically. And we just use these people to explain this stuff for us, to make forecasts about how things are going to go. Um, yeah, I mean, is James Cameron on it? Like, who are the people who are forecasting stuff now?

Emily M. Bender: Right, and it's not, yeah, ah. Um, so, any last words about this paper before I take us into Fresh AI Hell?  

Ali Alkhatib: Um, no, I think I've gotten a lot of my frustrations off my chest.

I have many pages of things, but I think I will spare you some of them.

Emily M. Bender: Okay, yeah.  

Ali Alkhatib: Actually, I do have one thing. It turns out the first author gave a talk at a conference recently. I don't know if you saw this, um, 'cause I was Googling people's names. Uh, they gave a talk at the EA Global conference. Uh, and you can guess what EA stands for.

Um, so, like, not terribly surprising, but just sort of a revelation about the circles these people are traveling in and the ideas they're subscribing to, and everything like that. So, cool.

Emily M. Bender: And WellStreams in the chat says, "Oh, interesting. So I looked up SWIFT Center and they're started up by EA--" That's effective altruism. "--folks and funded by EA money." So, of course.

Ali Alkhatib: Yeah, I mean, it only makes sense that this would all be sort of this ouroboros of things eating each other and sending each other money and everything like that, and just reinforcing each other's nonsense. Like, who won the competition? This organization. Who sponsored it? Probably Google or, I don't know, whatever. Um, so all of this, of course, is perfectly sensible in their own absurd world.

Emily M. Bender: All right. So, as a guest on the show when we are Alex-less, are you willing to do some improv with me as we transition over to Fresh AI Hell?

Ali Alkhatib: Yes. I thought about this deeply and I'm willing to embarrass myself.  

Emily M. Bender: Excellent. Um, I am not going to make us sing because I don't want to sing. We'll save that for Alex. Um, but here's the deal. It turns out that there's a team within Fresh AI Hell, a team of demons, whose job it is to send, um, bad science ideas up into the AI research world.

And so you and I are among those demons and we're trying to figure out what the next one is that we can send, that we can like fool some person into using. All right.  

Um, so, uh, (improv voice) you know, this is, this is a tough day at work there, demon Ali, because, um, I thought that we had a pretty good read on just what they would go for, and then I came across that paper with the forecasters in it. 

And so now I'm stymied. Like, what, what can we send up there next?  

Ali Alkhatib: Well, could we, could we maybe send them something that, uh, I suppose, like, identifies whether somebody's going to, you know, I don't know, commit a crime, if the shape of their face looks a certain way. Maybe something like that.

Emily M. Bender: That one's been done. 

Ali Alkhatib: Oh, oh, shoot. We've already sent that one up there. Uh, okay. Well, what about if we can tell if somebody's, um, I don't know, maybe tell somebody's sexual orientation by the way they walk.  

Emily M. Bender: Oh, by the way they walk. I think they haven't done that one. I think they've just done faces.  

Ali Alkhatib: All right. Okay, cool. We're doing synergy at this point. 

Emily M. Bender: Yeah. Yeah, yeah. Okay. Good strategy. Good strategy. Yeah. Okay. We'll see what the boss says about that one.  

Ali Alkhatib: Yeah, he's gonna love it.  

Emily M. Bender: He's gonna love it. (laughter) Thanks. Let's get over to Fresh AI Hell. Um, okay. I've got a few things; I have not shared these with Ali ahead of time. Oh, you got some applause, by the way, for your improv skills there in the chat.

Um, okay. So we've got to be fast here. The first one is some reporting in Wired by Matt Burgess, with the sticker "security" and the headline, "This hacker tool extracts all the data collected by Windows' new Recall AI." Do you remember seeing this one when it came out? This was June 4th.

Ali Alkhatib: Yeah, it was the malware that Microsoft figured out they could make, uh, where it records everything on your screen--which is literally a virus that I used to get rid of when I worked tech support as an undergrad. Um, and now it's a service that Microsoft was thinking about deploying. I removed that from so many professors' computers when I was doing tech support, and now they're just undermining everything I ever did. Uh, anyway.

Emily M. Bender: Yeah, and pumping it as a feature. Yeah. And they claim that it's okay because the data is stored locally, um, but the hacker tool is like, no, we can grab it. And on top of that, it's like, yeah, it's stored locally, but what about somebody who is a victim of domestic abuse? Right. And they're trying to like--

Ali Alkhatib: And it's all indexed-- 

Emily M. Bender: Yeah.  

Ali Alkhatib: Yeah, sorry. It's all indexed. It makes it so much easier for somebody to go and strategically search for the things they're looking for. You just need to search for, like, bank statements or passwords, or, like you said, if somebody is in an intimate partner violence kind of situation, you search for any kind of resources that person might need. Or if it's a teenager or a kid, anything about gender identity, anything about sexuality, anything about any identity they might have questions about, for any number of reasons. Now it's all indexed for somebody to go and maliciously search for this stuff.

And the computer just provides all of it. It furnishes everything in the most organized way possible, with seemingly no consideration for anything beyond, I guess, the idea of somebody wearing, you know, a mask and hacking into your computer from several thousand miles away.

Like, the ideas of the ways somebody can cause harm are so ossified in the minds of Microsoft and all these other product development folks.

Emily M. Bender: Yeah. Yeah. And we need to be fast on these, but I just want to point out that Arcane Sciences has noticed that the ad that's flashing at the top of the page for this Wired article is for Recall.

And one of the, one of the things that's cycling through is it says "With great power comes great possibility."  

Ali Alkhatib: Awesome. Fantastic.  

Emily M. Bender: So the sort of happy coda here--and I think I wrote about this in our newsletter--is that there was such an outcry about this from the cybersecurity folks that it is at least not turned on by default anymore.

That's, that's, that's the win.  

Ali Alkhatib: Progress. We're winning.  

Emily M. Bender: Okay. So, uh, Bloomberg in CityLab writes, "NYC surveillance tech on shootings gives false alarms 87 percent of the time, audit finds." So New York City did an audit of ShotSpotter and found that 87 percent of the time that the cops responded, it was a false alarm, which is--  

Ali Alkhatib: And this is consistent with several other studies, I think in San Diego and a number of other places--a lot of places have tried ShotSpotter, and it fails everywhere. It has never worked in a really constructive way. So this is not surprising at all.

Emily M. Bender: No. Um, all right. Next, Financial Times article by Leo Lewis, "AI is coming for our anger. A SoftBank project is working on technology that takes the rage out of customer phone calls." 

And here, the problem that they are solving--oftentimes there's a hook where it's like, yeah, that's a bad thing, but then the solution makes no sense, right? So here you've got call center workers who are not the people who caused the problem, who are on the receiving end of irate customers who are, you know, upset about something.

And so the idea here is, um, that there's an AI that is going to somehow filter what the person is saying and make it come across in a nicer tone.  

Ali Alkhatib: Yeah, it really reminds me of a startup from a little while ago that would take people's voices--people with accents, because they didn't learn English as their first language--and it would cover up their accents.

So that when you called into a phone line to get help, it would present you with sort of a white voice or something like that. And it really struck me that there's this intense racism and all of this stuff going on. It's interesting that they are identifying that there is a problem--and I'm grateful that somebody is identifying that--but then they go in totally the wrong direction, in a way that seems like it will exacerbate things and make them worse, because people are just going to calibrate themselves to express more anger.

Emily M. Bender: Yeah. Yeah. And it's one of these things where you get this rhetoric of, yes, that's a really compelling problem, and then the next step is, 'and so therefore we have to'--it's like, yeah, not 'have to,' right? You've identified a problem. Now we should be thinking about multiple different possible solutions.

Okay. Changing tacks slightly. Here is a post from Anthropic from four days ago on LinkedIn. 

Um, talking about Claude 3.5 Sonnet. And I just want to pause for a moment and think about the semiotics of using Sonnet as a model name--it evokes a certain kind of sophistication from a very Eurocentric point of view. Right. Um, and the thing that I wanted to highlight here is the Claude 3.5 Sonnet benchmarks. One set of the benchmarks is called "graduate level reasoning," and then below that "undergraduate level knowledge," and then way down at the bottom there's "grade school math." So we now have this scale of something, you know, "intelligence" in air quotes, that basically maps onto the stages of education, but they ran out, right.

They hit graduate school and ran out. And sure enough, relatedly, Mira Murati, the OpenAI CTO, has now said that the next ChatGPT will have 'PhD-level intelligence.'

Ali Alkhatib: There is so much goofiness about this. It's so screwed up. Um, obviously totally wrong, but it's also worth pointing out that it subscribes to this idea of a hierarchical kind of intelligence--that people have a certain level of intelligence that can be quantified and ranked and all this stuff. It is so harmful. It is rooted in this sort of eugenicist, hierarchical kind of nonsense.

And the weird thing going on is this sort of double scam--and I find myself engaging with it, and it's the same thing with the paper too. I feel like I'm observing a three-card monte scam happening in front of me, but the real scam is the laundering of this hierarchical way of thinking. It's metaphorically like somebody actually picking my pocket while I think I'm clever for catching the three-card monte right in front of me, when in reality that was part of the con, to get me to not notice somebody picking my pocket behind me. I feel like that is what's happening here. And it's so frustrating to see the multiple layers of this kind of nonsense, where it seems clear that the obvious outrage-inducing thing is just a premise to slip in this entire way of thinking of graduate level versus high school level versus whatever.

Um, it's just so frustrating to see it.  

Emily M. Bender: Yeah, which is maybe a thing because they can't talk about IQ directly anymore, because that's been sufficiently debunked, so we're going to move over to this other hierarchy. Um, and then meanwhile, the particular tweet that I have here is from someone whose display name is Dr. Frizzle on Twitter, who very amusingly says, "It's only PhD level intelligence if it comes from the PhD region of France. Otherwise it's just sparkling plagiarism." Which is great, but that is the sort of first-level issue, right? And then, as you say, there's this whole other thing in the background.

All right. And then famously, in that same news dump--probably the same talk, anyway--Murati says, "Some creative jobs will maybe go away, but maybe they shouldn't have been there in the first place."

Ali Alkhatib: I really don't want her or anybody at OpenAI making that decision. I just don't, I'm sorry. Like, I have strong opinions about what jobs should exist in the first place.

And I don't think that they want to have that conversation.  

Emily M. Bender: Yeah, exactly. And they are, like, who are they to decide? Right? It's, yeah. It's awful.  

Alright.  

Ali Alkhatib: I mean, I suppose I'm glad that they're saying this out loud, and not exclusively internally, so that everybody in the world can see that this is the sort of stuff they believe and the direction they're trying to move the world in.

Emily M. Bender: Yeah, yeah. I guess it is valuable that the mask is off. You know, through one side of their mouths they're claiming to be doing things for the good of humanity, and then they're also saying all these things. So it is even more obvious how false the first thing was.

And we are past time. So I'm going to take us into the outro. Um, that's it for this week. Ali Alkhatib is a computer scientist and former director of USF's Center for Applied Data Ethics. Ali, thank you so much for being with us today and, uh, helping us, uh, what's the word I'm looking for--expunge? Exorcise?--this, this terrible artifact.  

Ali Alkhatib: Yeah. Thank you so much for having me. This has been cathartic. Uh, thank you, thank you so much. 

Emily M. Bender: All right. I'm so glad to hear that. Our theme song is by Toby Menon, graphic design by Naomi Pleasure-Park, production by Christie Taylor, and thanks as always to the Distributed AI Research Institute.

If you liked this show, you can support us by rating and reviewing us on Apple Podcasts and Spotify, and by donating to DAIR at dair-institute.org. That's D A I R institute, sorry, D A I R hyphen institute dot org.  

Find us and all our past episodes on PeerTube and wherever you get your podcasts. You can watch and comment on the show while it's happening live on our Twitch stream. That's twitch.tv/DAIR_Institute. Again, that's D A I R underscore Institute.  

I'm Emily M. Bender. Stay out of AI hell y'all.