Mystery AI Hype Theater 3000
Episode 23: AI Hell Freezes Over, December 22, 2023
AI Hell has frozen over for a single hour. Alex and Emily visit all seven circles in a tour of the worst in bite-sized BS.
References:
Pentagon moving toward letting AI weapons autonomously kill humans
NYC Mayor uses AI to make robocalls in languages he doesn’t speak
University of Michigan investing in OpenAI
Tesla: claims of “full self-driving” are free speech
LLMs may not "understand" output
LLMs can’t analyze an SEC filing
How GPT-4 can be used to create fake datasets
Paper thanking GPT-4 concludes LLMs are good for science
Will AI Improve Healthcare? Consumers Think So
US struggling to regulate AI in healthcare
Presenting the “Off-Grid AGI Safety Facility”
DropBox files now shared with OpenAI
Underline.io and ‘commercial exploitation’
Axel Springer, OpenAI strike "real-time news" deal
Adobe Stock selling AI-generated images of Israel-Hamas conflict
Sports Illustrated Published Articles by AI Writers
Cruise confirms robotaxis rely on human assistance every 4-5 miles
Underage workers training AI, exposed to traumatic content
Prisoners training AI in Finland
ChatGPT gives better output in response to emotional language
An explanation for bad AI journalism
UK judges now permitted to use ChatGPT in legal rulings
Michael Cohen's attorney apparently used generative AI in court petition
Brazilian city enacts ordinance secretly written by ChatGPT
The lawyers getting fired for using ChatGPT
Using sequences of life-events to predict human lives
Your palate-cleanser: Is my toddler a stochastic parrot?
You can check out future livestreams at https://twitch.tv/DAIR_Institute.
Subscribe to our newsletter via Buttondown.
Follow us!
Emily
- Twitter: https://twitter.com/EmilyMBender
- Mastodon: https://dair-community.social/@EmilyMBender
- Bluesky: https://bsky.app/profile/emilymbender.bsky.social
Alex
- Twitter: https://twitter.com/@alexhanna
- Mastodon: https://dair-community.social/@alex
- Bluesky: https://bsky.app/profile/alexhanna.bsky.social
Music by Toby Menon.
Artwork by Naomi Pleasure-Park.
Production by Christie Taylor.
ALEX HANNA: Welcome everyone to Mystery AI Hype Theater 3000, where we seek catharsis in this age of AI hype. We find the worst of it and pop it with the sharpest needles we can find.
EMILY M. BENDER: Along the way we learn to always read the footnotes and each time we think we've reached peak AI hype, the summit of Bullshit Mountain, we discover there's worse to come. I'm Emily M. Bender, a professor of linguistics at the University of Washington.
ALEX HANNA: And I'm Alex Hanna, Director of Research for the Distributed AI Research Institute. This is Episode 23, which we're recording on December 22nd, 2023. And we're going all in, digging deep, journer--journey--journeying through the seven circles of AI Hell itself. It is AI Hell all the way down.
EMILY M. BENDER: I know you're asking yourself why we're wearing these very warm hats and scarves if we're in Hell. Well luckily for us, thanks to climate change and the erratic movements of polar air, AI Hell has frozen over. The hype is covered in ice, there's a light dusting of snow on all the badly applied LLMs. For this brief period at least we have a chance of returning home unscathed.
ALEX HANNA: And to help us get through it, there is a bit of good news at the end. A breath of fresh air. I can feel it blowing through the sulfur, Emily. We can do this.
EMILY M. BENDER: We can do this. We better get started, because we got a lot of it. So let me share the first batch of them uh through the magic of Zoom here. All right, you want this one Alex?
ALEX HANNA: Yeah totally. So this is an article from Business Insider. The title is, "The Pentagon is moving to--is moving towards letting AI weapons autonomously decide to kill humans." So we're going right into the Hell, um, yeah and so the three points of this Business Insider, it says: "The US is among countries arguing against new laws to regulate AI-controlled killer drones.
The US, China, and others are developing so-called 'killer robots' in quotes. Critics are concerned about the development of machines that can decide to take human lives." Uh so this is pretty alarming, uh just off the bat. Um there's a Campaign to Stop Killer Robots, but they've been you know--they're nowhere to be found in this article. I would say one of the kinds of things that they put in this article, if you scroll down Emily, is that they talk about the way that these 'killer robots' are kind of a way for um uh for the US to catch up to China, which sounds like an absolutely terrible uh kind of kind of idea, uh and I think it's it's kind of down in the article.
EMILY M. BENDER: All right I'm gonna search for China to get us there quickly.
ALEX HANNA: Yeah yeah. Yeah so here, "In a speech in the--in August US Deputy sec--Secretary of Defense, Kathleen Hicks, said technology like AI-controlled drone swarms would enable to US--the the US to offset China's People--People's Liberation Army numerical advantage in weapons and people." And then the quote is, "'We'll counter the PLA's mass with a mass of our own, but ours will be harder to plan for, harder to hit and harder to beat.'" Uh just nightmarish stuff right here.
EMILY M. BENDER: Yeah I mean so like don't kill people, don't create automated systems that are designed to kill people, don't create automated systems that are designed to do something else and kill people as a side effect. Like these like just baseline um yeah--all right.
ALEX HANNA: Yeah.
EMILY M. BENDER: We got to keep moving before we--the ice cracks under us here, I think.
ALEX HANNA: I know, next.
EMILY M. BENDER: Yeah, next okay. This is a language one, I'm gonna take it. "NYC may--Mayor Eric Adams uses AI to make robocalls in languages he doesn't speak. The calls have already triggered alarm over whether the mayor is misleading people." Um so starts off with, "Thanks to artificial intelligence tools, New York City Mayor Eric Adams has spent a barrage of--has sent a barrage of robocalls to residents in languages he doesn't speak. That includes "thousands" of robocalls in Spanish, more than 250 in Yiddish, more than 160 in Mandarin, 89 in Cantonese and 23 in Haitian Creole."
Um and I just have to wonder like, did they stop and verify that the translations were actually good? You know, I'm thinking English-Spanish is maybe gonna sound a little awkward, but not too bad. Mandarin's gonna be robust, Cantonese, a little worried. I really doubt we've got great translation into Haitian Creole right now. That's just ugh. Um.
ALEX HANNA: Absolutely, yeah.
EMILY M. BENDER: I should say this is an article in The Verge um and there's a quote here from Albert Fox Cahn, who's the executive director of the STOP uh project, Surveillance Technology Oversight Project. Um and their statement is, "This is deeply unethical, especially on the taxpayers' dime. Using AI to convince New Yorkers that he speaks languages that he doesn't is deeply Orwellian."
ALEX HANNA: Yeah and and and Christie uh our producer lives in New York, uh remarking, "Eric Adams is so bad--so bad he turned on the flames." Uh you know the other types of things with with Adams--I mean I was in New York two weeks ago, I saw this stupid NYPD robot which requires two human cops literally to sit there and babysit so no one vandalizes it. Incredible stuff.
EMILY M. BENDER: Wonderful wonderful.
ALEX HANNA: Let's move on.
EMILY M. BENDER: All right, keep moving. Um okay, this one we just have the headline because it's pay-walled and I'm not going to subscribe to Fortune. But in Fortune, um, "Exclusive: Sam Altman quietly got $75 million from the University of Michigan for a new venture capital fund earlier this year." Um and the article, which somehow I was reading it on a device that did--didn't see it paywalled or maybe it wasn't paywalled earlier, I don't know, um basically University of Michigan has an endowment, like many universities do, and one of the places that they are investing some of it is in Sam Altman's basically personal VC fund.
And I thought is is this the same university whose generative AI plan that was very ra-ra, basically selling a product to their students, we panned that in the episode with Haley Lepp back in September? Sure enough, University of Michigan.
ALEX HANNA: Yeah.
EMILY M. BENDER: So.
ALEX HANNA: And it's definitely showing this you know I mean university endowments themselves being this kind of thing where you know universities just hoard money basically, but then Mich--you know this is one of our you know--one of our fellows at DAIR, Nathan Kim was mentioning on this article in our in in the DAIR Slack, you know there's kind of this philosophy of some endowment managers being pretty--being much more risky in putting stuff in AI, um just because endowments are so big they can kind of you know override some of this short-term volatility but still like really bad idea, why are you doing this?
EMILY M. BENDER: Yeah, yeah. And it looks like such a conflict of interest too when we look at that other uh piece of information we had of not UM but U-M.
ALEX HANNA: Right.
EMILY M. BENDER: Okay, sorry I thought I got rid of all the ads. So Gizmodo, you want this one?
ALEX HANNA: Yeah, let me take it. So Gizmodo uh headline, "Tesla Says It's Not Fraud, It's Free Speech." And then the uh subhead, "Elon Musk is dealing with even more legal headaches." And so this is a case in which Elon Musk uh has claimed that um Tesla uh uh--you know, Tesla has claimed that these uh can be fully self-driving um--and they're under fire by the California DMV, um uh over the company's ads that claim that the cars are completely self-driving, which they're not. Musk replied and said it's protected under free speech, according to court documents made public by the register on Monday.
Uh incredible incredible claims that's that this is defended under free speech, um yeah and so just to reiterate in the article, they say, "At no point does Tesla refute that its cars don't actually drive for themselves or the fact that Tesla advertises cars as being fully quote 'autonomous.' Tesla's argument hinges on the fact that the California DMV conducted three previous investigations on the company's claims of quote 'full self-driving capabilities' and took no action until two--until its 2022 legal accusations. Tesla is hoping the case gets thrown out altogether but if not Tesla could be risking a lot for the sake of lying."
EMILY M. BENDER: Great. For the sake of hype, basically, um and we have uh Astayonix in the chat saying, "What a manchild. Also RIP Hyperloop." And what's that over there in AI Hell? Is that the dead Hyperloop?
ALEX HANNA: Oh yeah, it's it's it's it's gone to AI Hell, you know, RIP Hyperloop, just preventing any hope of getting that San Francisco-LA train that we need.
EMILY M. BENDER: It's 2023, never mind the jetmack--jetpack, where's my high-speed rail?
ALEX HANNA: I want a damn high-speed rail that goes to LA from the Bay Area.
EMILY M. BENDER: I would love to be able to get to Portland or Vancouver quickly.
ALEX HANNA: Also very nice.
EMILY M. BENDER: Yeah. Okay. So we gotta stay gotta stay on track here um. This is a tweet by Aran Komatsuzaki um about a paper that they just published. Um this is--tweet's from November 1, by the way. Um, "The Generative AI Paradox: 'What It Can Create, It May Not Understand.'" Um and then a brief subhead here. "Proposes and tests the hypothesis that models acquire generative capabilities that exceed their ability to understand the outputs."
Which you know my reaction to that is just like duh, like this is--what do you mean--yes they can spit stuff out but they do not understand. And then in the um the figure from the paper that's in the tweet it says, "Figure 1. Generative AI in language and vision can produce high quality generations. Paradoxically, however, models have trouble demonstrating selective or interrogative understanding of these modalities." So I mean on the one hand, props for documenting that clearly. On the other hand, why did you think it would be otherwise?
Um and Alex there was a bit of the abstract that you liked um I think it was--
ALEX HANNA: Yeah yeah.
EMILY M. BENDER: --this one down here maybe?
ALEX HANNA: It was--it was the last one, which I thought was just very--I mean first off the discussion of understanding being silly, but then it was also this bit at the end where they said, "Our findings support the hypothesis that models' generative capability may not be contingent on understanding on--upon understanding capability and call for caution in interpreting artificial intelligence by analogy to human intelligence." Yeah, no kidding.
Stop calling it intelligence.
EMILY M. BENDER: Right. But you know welcome over to the the the good side, right?
ALEX HANNA: The dark side maybe? I don't know. Is this the dark side or the light side?
EMILY M. BENDER: Whatever we are.
ALEX HANNA: Yeah.
EMILY M. BENDER: All right keeping us moving along, we have some wonderful new um vocabulary proposed by Miss IG Geek on Twitter, um and this was uh quote tweeting uh Reid Southen's tweet about AI art, where they say, "AI art is a misnomer on many levels. I think maybe it's time to start calling it synthetic art or imagery."
Um and then there's what looks like a a um a well it's a it's a screen cap of a dictionary definition, but they don't say of what um but probably synthetic. Um so uh, definition 4a: "of, relating to, or produced by chemical or biochemical synthesis. Especially produced artificially. b: devised, arranged, or fabricated for special situations to imitate or replace usual realities and c: factitious or bogus." I didn't realize that was in the definition of synthetic and I love it, because you know we've been calling it synthetic media for a long time. Synthetic text extruding machines.
But Miss IG Geek here is fabulous, she says, "It's Data Fairy puke. Authentic content that's been appropriated, maths-ticated and regurgitated."
ALEX HANNA: Yeah math--mathsticated is great, that's just that's got to go on some merch.
EMILY M. BENDER: Yeah.
ALEX HANNA: But I love these synonyms, f-f-factitious or bogus. Just you know if we were going back to the '90s, the merch would just say 'totally bogus' you know in a in a kind of a pow, and then under 'synthetic media' or something, I don't know.
EMILY M. BENDER: With with a rainbow and very like retro 60s.
ALEX HANNA: Yeah.
EMILY M. BENDER: Yeah. Okay I'm with you on that. All right next. I've got a few in here that are about like, I think my heading in our prep doc was something like, 'ChatGPT is bad for science and other investigative things.' So you want to take this one?
ALEX HANNA: Yeah so CNBC reports--an article by Kif Leswing, "GPT and other AI models can't analyze an SEC final--uh filing, researchers find." And the three uh kind of key points: "Large language models similar to the one at the heart of ChatGPT frequently fail to answer questions derived from Securities and Exchange Commission filings, new research finds. The findings highlight some of the challenges facing artificial intelligence models as big companies, especially in regulated industries like finance, seek to incorporate cutting edge technology into their operations, whether for customer service or research." And there's a quote, "'That type of performance rate is just absolutely unacceptable,' unquote, Patronus--" Is that how you say Patronus? "--AI co-founder Anand Kannappan um--Kannappan said. 'It has to be much higher for it to really work in an automated and production-ready way.'"
So yeah I'm assuming that the research was conducted by this this company Patronus um--yes and it got--there was 79--"79 percent of answers right on Patronus's new AI test, the companies--company's founders told CNBC." So yeah don't use this for uh things that are tightly regulated markets, like fintech. Although there is a proliferation of this stuff in fintech.
EMILY M. BENDER: Right. So at any point where you actually care about the validity of the answer, don't use the synthetic media math-sticated data extruding machine. Totally. Um but you know the business um article I want to see? I want to see someone go through and calculate just how many resources were wasted on people saying, 'Can ChatGPT do this? Can ChatGPT do that?' Would like--we know it can't. What a what a stupid waste of time.
Okay. Um this is the acknowledgement section of um a paper with--here we go--list of contributors, you were asking Alex who it is. If we go to the top we'll see that it's um credited to basically Microsoft, and then they've got this list of contributors for each section. Um and it's all about using GPT-4 for science.
Um and the thing that I really wanted to to to lift up for ridicule here is in the first thing in acknowledgements is, "We extend our gratitude to OpenAI for developing such a remarkable tool. GPT-4 has not only served as the primary subject of investigation in this report but has also greatly assisted us in crafting and refining the text of this paper. The model's capabilities have undoubtedly facilitated a more streamlined and efficient writing process." Just, ugh.
ALEX HANNA: Yeah I mean it's it's I love--I love scientific papers that just thank the creator of the technology to begin with.
EMILY M. BENDER: Yeah.
ALEX HANNA: Clearly this is--clearly this is an unbiased assessment.
EMILY M. BENDER: Yeah. Title of the paper is, "The Impact of Large Language Models on Scientific Discovery: a Preliminary Study using GPT-4." And the um author is, "Microsoft Research AI4Science," all one word that last bit, and "Microsoft Azure Quantum." I don't know if that's the affiliation of Microsoft Research AI4Science, but anyway this went up on arXiv on December 8th. And I think we don't need to give it more time right now because we've got other stuff. All right, we have three in the healthcare space, you want this one?
ALEX HANNA: Yeah, so this is from Forbes, the contributor is Deb Gordon and the title is, "Will AI Improve Healthcare? Consumers Think So, New Surveys Show." So this is some reporting on some some some surveys, uh let's go down because I don't think I could have--I read this one early on because I don't think I had a Forbes--Forbes um membership.
EMILY M. BENDER: Yeah.
ALEX HANNA: Uh and so this is a survey from um Deloitte Center for Health Solutions. "...about half--half of consumers surveyed believe generative AI could improve access to healthcare. Another 46% said it could make healthcare more affordable. Uh these figures were even higher--69% and 63% respectively--amongst uh respondents who reported already using AI." This is really interesting because first off those aren't very high numbers--
EMILY M. BENDER: But also you wonder what the survey was, because it was half said gen-AI could improve access to health care and then another 46%--so was it this weird like forced choice question? Because it seems like those two beliefs could be held by the same people. Um and in fact it has to be because "the figures were even higher among respondents who reported already using AI" and then 69 and 63 adds up to more than 100. So it's just bad reporting. Um.
ALEX HANNA: Yeah.
EMILY M. BENDER: But what I get from this basically is that the hype is unfortunately seeping in and--
ALEX HANNA: Yeah.
EMILY M. BENDER: --people are buying it um when they shouldn't be, um because we've got a couple of you know warning stories here. Um so Politico reports, "AI has arrived in your doctor's office. Washington doesn't know what to do about it." Um, "AI is diagnosing diseases and recommending treatments, but the systems aren't always regulated like drugs or medical devices." Um and they--they totally should be. Um.
ALEX HANNA: Yeah.
EMILY M. BENDER: "...Congress is stalled. Schumer said this week that legislation was months away."
ALEX HANNA: Yeah and there's some some good quotes here from Suresh Venkatasubramanian um who uh was one of the co-authors of the of the the White House Office of Science and Technology Policy's Bill of Rights, uh and the quote says, "'There's no good testing going on, and then they're being used in patient facing situations and that's really bad.'"
Yes, 100%. Um and so but we're seeing folks press ahead um even when other kinds of medical products, medical devices um have pretty tight boundaries around what you can and can't do with them.
EMILY M. BENDER: Yeah. And you know just just to remember that that racism is never far away when we're talking about this stuff, there's a nice detail here um--nice in the sense I'm glad they reported it um--so so, "New York City has formed a coalition to end racism in clinical algorithms and is lobbying health systems to stop using AI that the coalition says relies on data sets that underestimate Black individuals' lung capacity and their ability to give birth vaginally after a C-section and that overestimate their muscle mass."
So you know we had um--
ALEX HANNA: Oh Lord.
EMILY M. BENDER: --Roxana Daneshjou on a couple episodes back and she was telling us about how um you see uh you know scientific racism coming out and medical racism coming out uh in this tech all the time. So I'm not surprised I'm glad um that uh the--this is interesting, looks like it must be the New York City government.
ALEX HANNA: Yeah, I think it's it is. I mean the New York City has--I mean there was some reports I mean this is a small tangent but I know there is a new New York City law basically requiring anybody who empl--any company who was employing people in New York City to meet a certain amount of compliance in how it was being used in hiring and reporting that out, uh some recent research by Jake Metcalf and some folks at Data & Society basically showed there was pretty abysmal compliance and so New York has luckily been ahead on a lot of this stuff, but it's been bad news all around.
EMILY M. BENDER: Yeah, all right. More bad news on the healthcare front, we talked about this briefly but it's worth talking about again. So this is reporting from November 16th in Ars Technica, uh the author is Beth Mole or Mole, uh "UnitedHealth uses AI model with 90% error rate to deny care, lawsuit alleges. For the largest health insurer in the US, AI's error rate is like a feature, not a bug." Um and you know we we called this, right we basically said you know episodes ago that people that the health insurance companies were going to be chomping at the bit to um basically take the output of these systems as an excuse to deny care.
ALEX HANNA: Yeah, completely yeah.
EMILY M. BENDER: Anything you want to add on this one?
ALEX HANNA: No, although it does remind me of something I think we're going to address later, so put a pin on that, but I think it was the 'life2vec' thing but we'll come back to that one.
EMILY M. BENDER: We'll come back to that. That one's at the at the very end, well second to last. I'm just going to get my um--we we're about a third of the way through there.
ALEX HANNA: We're a third of the way through. We should take a little AI Hell break and say you know while you're getting that up, um I don't know. I'm trying to think of my own prompts. Maybe I can give you a prompt.
EMILY M. BENDER: Oh no. Just don't make me sing.
ALEX HANNA: I want you to--you're going to a store in AI Hell and you're looking for the proper boots to purchase to go through AI Hell. Uh you're asking a sales associate. Go.
EMILY M. BENDER: Um excuse me, excuse me I need to get through this really dense bit of AI hell and I can't tell--it's frozen outside right now but then there's also lava like, can you do you have the right thing please? Because I've really got to get to the other side.
ALEX HANNA: Yes.
EMILY M. BENDER: I am no Alex.
ALEX HANNA: No no it's it's wonderful. All right we're back in it.
EMILY M. BENDER: We're back in it.
ALEX HANNA: I'll go and take this one. This is a this is a tweet from Andrew Ng, the um uh famous Stanford professor and uh constant AI cheerleader um and so um yeah so he said, "How likely are AI--" This is a long ass tweet um because he's a blue check.
EMILY M. BENDER: Because he pays.
ALEX HANNA: Yeah, he pays for Twitter, please laugh. Uh but I'll read the first graph where he says, "How likely are AI doomsday scenarios? I spoke at the US Senate AI Insight forums organized by Senators Chuck Schumer, Martin Heinrich and Mike Rounds and Todd Young and explained why I think they're very unlikely." So uh and then he says, "AI systems aligned with RL-- RLHF--" Which, uh, is reinforcement learning from human feedback. "--already know they--" 'Already know,' so ascribing kind of intention and knowledge. "--they should default to obeying the law and not harming people. It seems to me overwhelmingly unlikely that a misaligned, advanced AI will be given an innocent instruction, (like reduce CO2 emissions) and decide the best way to do it is wipe out humanity."
Uh and then he goes on and on. And so it's just like basically he's his you know--if you watched last week's show or listen to it uh, we talked to Justin Hendrix and we talked about this probability of doom, this p(doom) thing and so Ng's p(doom) is very low so yay but for very very silly reasons. So he basically thinks that you know we've we've done a good job at aligning and two, he actually tried to use GPT-4 to kill us all and it didn't work, yay. Abs--absolutely mind-numbing kind of kind of stuff coming from someone you'd think would approach this a bit smarter, given given given his familiarity with technologies.
EMILY M. BENDER: And I mean he's basically taking as given the idea that GPT-4 might combust into AGI, like the the place where you've got to to undercut the silly p(doom) conversation is not oh uh, RLHF is working, it's LLMs aren't AGI.
ALEX HANNA: Yeah.
EMILY M. BENDER: And and there's other stuff to worry about. Okay while we are in the AI safety region of AI Hell um [laughter] there's this LinkedIn post from a month ago which was hilarious, so I'll read it and then Alex I'll give you that first comment that you turned up.
So this is Clemens Mewald, who is the Head of Product at Instabase, whatever that is, um, "If you take the prospect of AGI (artificial general intelligence) seriously, you can't conduct research in the open, connected to the internet. Publishing papers like the "AI kill switch" on arXiv means that GPT-4 has already read it, and any future AGI will know how to circumvent the fail safes we put in place." Watch out, the AGIs are listening. All right, that was me, back to Clemens. "After thinking about this for quite some time, a few like-minded folks and I are creating the OGAGIS Facility: Off-Grid AGI Safety Facility. With private funding from wealthy and concerned benefactors, we are creating an off-grid AGI safety lab, disconnected from the internet. It will be constructed in an undisclosed location, protected by a faraday cage. This is the first and last time you will hear about it. I will only have conversations about it offline, in person, in a faraday cage with no electronics around." [Laughter]
ALEX HANNA: It's so good. Hold on, can I can we read the hashtags too?
EMILY M. BENDER: Okay, yeah.
ALEX HANNA: "#OpenAI #anthropic #inflection #AGI #AI #ArtificialIntelligence #AISafety #Paperclips #TheComingWave #GPT-4 #OGAGIS."
First off if you never want to talk about it again, why are you creating a hashtag for it?
EMILY M. BENDER: Right? He wants everybody else to talk about it.
ALEX HANNA: Yeah and it's--the the image--so the first comment that he says--uh oh yeah, AbstractTesseract--when you said the AI safety uh uh version of AI Hell, I was thinking about this and AbstractTesseract said this in the chat, "Has someone made a fantasy fiction style map of the regions of AI Hell, is this a thing?" and I automati--I was thinking about that as soon as you said the region. So if you're a listener to the pod and you want to make like a Dungeons and Dragons style world map of AI Hell, like--
EMILY M. BENDER: We are so there for that.
ALEX HANNA: We're so there, you absolutely cannot use an image generator for this, you have to draw this. Uh. [laughter]
EMILY M. BENDER: Yes, yes no image generators but yes. And and send it to us and we will we will gleefully promote that.
ALEX HANNA: We'll, we will post it on all the social medias. Um okay so so the first comment on this is, "Thanks to GPT-4 for helping rewrite this post and for generating this rendering of the OGAGIS facility." So very funny, just like okay. And and the image is like this house uh like in a Bob Ross-style painting uh with like this um this like uh kind of dome over it um, that's just yeah and I mean--
EMILY M. BENDER: It's like a glowing bucky ball kind of dome. I just have to laugh at one more thing about this. He's going on and on about how we already told GPT-4, and then in the comment it's like I used GPT-4 to help write this.
ALEX HANNA: So GPT-4, if you think that this is plausible, it's going to be searching every single map, it's going to be targeting you, it's going to have an agent that's going to infiltrate your your o OG--OG ASIS. Whatever.
EMILY M. BENDER: O-gag-sis?
ALEX HANNA: O-GAG-SIS, let's go. [laughter]
EMILY M. BENDER: Yeah. All right um so while we are still in the AI safety region of AI Hell, um I love this, this is illustrated with a picture of Clippy. So Business Insider: "OpenAI's offices were sent thousands of paperclips in an elaborate prank to warn about an AI apocalypse."
This is from November of this year. "An employee at rival Anthropic sent OpenAI thousands of paperclips in the shape of their logo. The prank was a subtle jibe suggesting OpenAI's approach to AI could lead to humanity's extinction. Anthropic was formed by ex-OpenAI employees who split from the company over AI safety concerns." And I forget where I saw this but it came through on some social media and the commentary was, 'These are deeply unserious people.'
And I think that speaks for itself. And in the interest of of not getting bogged down in the AI safety region of AI Hell, um we are now moving over to the training data part of AI Hell. Um, so I've got both tweets here but it's actually this one that I want us to start with. Um you want to read this one, because it's it's about your uh uh droso--damn it I lost the pronunciation. Your your favorite fruit fly.
ALEX HANNA: Yeah yeah dros-drosophilia.
EMILY M. BENDER: No it's not drosophilia, but that.
ALEX HANNA: Oh that's right yeah, that's--we were we were we were practicing this before show--the show, yeah. Uh so this is uh Charles Foster who um is looking at a a tweet um from Andrew Carr, and says, "That's a fun fact!" And the and it's a screenshot of an appendix. It says, "Chess Puzzles: Data pre-processing" And the highlighted section says, "The GPT-4 pretraining dataset included chess games in the format of move sequence known as Portable Game Notation (PGN)." And so Charles Foster says, "Wait, so then it's no mystery why OpenAI's new base models are good at chess. They explicitly crafted the pre-training data set to cover that. I presume whatever extra tuning they did to chat models wasn't focused on chess, so some of that was forgotten."
And then the quoting tweet is from David uh Pfau? It's spelled P-F-A-U. "It's almost like releasing information on the training set helps explain the capabilities of these models. We should do this more. We could even have a profession where people share information about how things work freely to demystify them." [laughter] It's incredible.
EMILY M. BENDER: So yeah, basically if it can do it it's probably because it's in the training data. Um or it only looks like it can do it. Okay, so. Continuing about training data, um now much less silly, so Dropbox has signed some sort of an agreement with OpenAI--this is also like a PSA, folks--um and so this is CNBC reporting uh by Hayden Field, uh one of the greats in the area. "How to stop Dropbox from sharing your personal files with OpenAI." So you have to like actually opt out, this is not opt-in this is opt-out if you're using Dropbox and you don't want your files to end up on OpenAI servers. Um so this is this is bad. But you want to you want to hit the key points there?
ALEX HANNA: Oh gosh it's it's just bad, just opt out. [laughter] I mean it's it's awful that they have this agreement yeah. Let's let's let's yeah yeah.
EMILY M. BENDER: I just want to point out from the second key point, even if you've opted out any files shared with another person who is using Dropbox AI could still be sent to open AI servers. Don't let friends--friends don't let friends share their data with OpenAI. I think is where we are. Um okay, so here's a little bit of an upside, um it starts off not so ah so the service Underline, which does the um AV for ACL conferences, um and the the online streaming, um sent email to everybody who had presented at an ACL conference um and this was my copy of it. "Dear Emily M; We are contacting you because you previously presented in conferences held in Underline. You can access your presentations here...um, Imagine that now your presentations can keep giving and help your fellow researchers further advance the state of AI and linguistics even while you are sleeping. We have been contacted by researchers who would like to use samples of your talk to help train their data models. Note: This is completely anonymous, with no identification of you or your talk to any outside party ever. This will help advance the state of open science and provide foundational models for others to use."
So anyway the upside here is there was a massive outcry, not just me, this is me posting about it on Twitter but lots of people pushed back and Underline--I think they haven't sent the follow-up yet but they heard us loud and clear and are um not actually going to be sharing data with this third party who was in stealth mode and didn't want to reveal who they are. And in my my tweet here--oh yeah, "Please click to indicate your acceptance." So it was an opt-in thing, but it was like this like they shouldn't have even asked in the first place, like they shouldn't be doing it. But, 'this this is completely anonymous'? It's video. That is like inherently non-anonymous.
ALEX HANNA: Right.
EMILY M. BENDER: And then YaynLeCun in the in the chat: "An exciting chance for science researchers to advance science." Okay.
ALEX HANNA: I know, 'even even when you sleep,' which is gosh, this is not part of the AI Hell thing but I think we saw something, there was something like uh this device that was uh letting people uh work in their sleep. And I was just like this incredible incredible torment nexus territory right there.
EMILY M. BENDER: It was lucid dreaming something something--okay we should keep going because now we are crossing over into the um information ecosystem misinformation region.
ALEX HANNA: We're kind of in the journalism awful things part of hell uh so this is from Axios by Sarah Fischer uh and it says, "Axel Springer, OpenAI strike a quote 'real-time news' deal for ChatGPT." Uh, so scrolling down on this um, "ChatGPT parent OpenAI has struck a deal with Axel Springer, uh parent to a slew of German and US media outlets, to quote 'help provide people with new ways to access quality real-time news content through our AI tools,' end quote, OpenAI COO Brad Lightcap announced Wednesday. Why it matters: The deal marks a new milestone in the relationship between journalism companies and artificial intelligence firms, one that involves not only just providing data--training data to train ChatGPT's models, but also using vetted journalism to bolster the accuracy of ChatGPT's responses."
Uh so this sounds like a bit of a nightmare, um just given that you know this is strengthening the bond between um what is actually going to be going into GPT models, and this is--includes licensed um licensed text um that you can't get from just scraping the web, but also it further imbricates AI's intrusion into journalism and the production of AI generated journalistic content.
EMILY M. BENDER: Right, and papier-mâché made of good journalism is not good journalism, it's still synthetic text that's been extruded and maths-sticated.
Okay we do have to pick up the pace.
ALEX HANNA: Yeah yeah. Well just to just to say one thing Axel Springer is is the owner of both um Politico and also Bild and Die Welt, uh magazines in Germany. Also one of those magazines has a great hit piece on DAIR, so yeah, fun stuff.
EMILY M. BENDER: Yeah. All right so talk about really bad credibility. Um all right here's something terrible, "Adobe Stock is Selling AI-Generated Images of the Israel-Hamas Conflict," um and this is PetaPixel, and um this is yeah so it says, "As fears rise over fake imagery generated by artificial intelligence flooding the internet, one of the world's leading stock photo websites is openly selling AI images of the Israel-Hamas war." Like, why would you do this? I--
ALEX HANNA: Because--because it's in the news and you need to basically portray you know Gaza as this horrible exploded place, and so all the images are kind of bombed out you know cars and a child looking solemnly at a minaret and just explosions. I mean there's enough horrible shit already happening in Gaza uh without it having to be recycled and you know reprocessed through images here. Um you know actually rely on you know the many Palestinian journalists who are fucking fighting for their lives actually taking pictures of this stuff.
EMILY M. BENDER: Yeah. Um and just to cap this one off they quote Futurism, saying, "Futurism notes that a handful of small publications have run these images without labeling them as AI," including here and here, um and I didn't look at where those were but those are sites that we should never trust again. Um.
ALEX HANNA: Yeah.
EMILY M. BENDER: Okay, uh this is further in the uh bad journalism part of AI Hell. Um Sports Illustrated was caught um doing uh sharing fake uh so--"Sports Illustrated appears to have published affiliate articles under the bylines of fake writers, complete with made up bios and AI generated headshots." This is from Maggie Harrison. "After we generated the--after we asked The Arena Group about it, the bylines were deleted entirely." And then Kevin Collier--and these are all Bluesky posts--says, "The VC media playbook really seems to be: Buy a struggling legacy media brand, fire most of the reporters, strip it for parts, expect to get similar page view numbers even though you're doing little to no free articles? Shit somebody saw us." Um.
ALEX HANNA: Yeah.
EMILY M. BENDER: --and then he corrects himself, saying, "As several people pointed out, I meant to say private equity, not VC, same story."
ALEX HANNA: Yeah.
EMILY M. BENDER: Um, okay, so then by the way this is the--Futurism reporting on that um--
ALEX HANNA: Yeah.
EMILY M. BENDER: --but I think we can continue.
ALEX HANNA: Yeah.
EMILY M. BENDER: Okay so now we are in the--
ALEX HANNA: Yeah we're now in the self-driving cars slash secret 'all AI is actually people' part of AI Hell.
EMILY M. BENDER: Yeah, and and Christie our producer is saying hurry the ice is cracking behind you.
ALEX HANNA: Eeeugh. I'm mimicking running now. So this is an article by Laura Kolodny uh at CNBC. The headline is, "Cruise confirms robotaxis rely on human assistance every four to five miles." Holy crap. It's--the key points, "GM-owned Cruise is responding to allegations that its cars are not really self-driving because they require frequent help from humans working as quote 'remote assistance' to get through tricky drives. Cruise tells CNBC it worked with roughly quote one uh 'remote assistant agent' per every 15 to 20 driverless vehicles in its fleet before grounding operations last month. Human advisors generally provide quote 'wayfinding intel' to the robotaxis, and do not drive them remotely, company spokesman said." Oh dear.
EMILY M. BENDER: Um there's a really damning thing here--yeah so "The CEO wrote, 'Cruise AVs are being remotely assisted 2-4% of the time on average in complex urban environments. This is low enough already that there isn't a huge cost benefit to optimizing much further, especially given how useful it is to have humans review things in certain situations.'"
So basically it's far too expensive to get to the thing they're actually selling, so they're going to keep underpaying people to clean up after them.
ALEX HANNA: Well the funny thing is that he wrote this on Hacker News, right, so there was actually--so if you scroll up there were uh you know they had an initial New York Times article and then the the first paragraph says, "Cruise CEO and founder Kyle Vogt posted comments on Hacker News on Sunday responding to allegations that his company's robotaxis aren't really self-driving, but instead require frequent help from humans working in a remote operation center." So it's just hilarious that it's like, they did an article about it, they made claims and he's like wait wait wait wait wait, that's only sort of true but the truth is still pretty damning.
EMILY M. BENDER: Yeah. Okay um so just quickly on these next two because it's the usual kind of awfulness, and I've got a million ads popping up here sorry. Um so this is Wired um Niamh Rowe reporting, "Underage Workers Are Training AI. Companies that provide big tech with I--AI data labeling services are inadvertently hiring young teens to work their platforms, often exposing them to traumatic content."
Um and I mean I guess it was not surprising that that would have happened, that teens would be doing some of this gig work, and of course nobody's verifying. Um so also in the mistreatment of of people, um here you want this one Alex?
ALEX HANNA: Well you should should talk about it because it was in Finland.
EMILY M. BENDER: In Finland where I just was, where this photo is actually from. Okay so this is actually an article from September 11th by Morgan Meaker in Wired. Um, "These Prisoners Are Training AI. In high-wage Finland, where clickworkers are rare, one company has discovered a novel labor force--prisoners." So basically, if you're trying to do data cleaning and all the other kind of labeling that that tends to be exported to low paid workers, but you're doing it in Finnish, it's going to be much harder to find those low paid workers to exploit. So what do they do? They go find the prisoners um and--
ALEX HANNA: Yeah this is--this is the uh I mean I guess you could call you know um--what is the sort of trope about prison labor making like stamping license plates or something?
EMILY M. BENDER: License plates, yeah.
ALEX HANNA: It seems like training AI is a new license plate stamper here.
EMILY M. BENDER: Yeah. All right, we've made it two-thirds of the way through, we have to pick up the pace a little bit. So Alex, your prompt is, "The ice is cracking underneath us and you are running away to some kind of relative safety."
ALEX HANNA: Okay, uh I don't know how to how to do this. [mimics panicked exertion] I'm jumping uh foot to foot--I uh I'm dodging this here--I've got my my Yaktrax on uh but I'm still slipping--uh it's heat behind me--I don't I I don't know how to do this. [laughter]
EMILY M. BENDER: All right we made it. Hey check it out, here's a little ledge where there's still some ice and we can look at--oh look around, we're in the AI psychology region now of AI Hell.
ALEX HANNA: Oh man we have one this is--so this is an article from Nature and it's from one of our favorite kind of AI hype researchers, Michal Kosinski, um which you may know from the 'gaydar' study and studies like predicting facial--uh political orientation from face structure, so our favorite um phrenologist. So the title of this paper is, "Human-like intu--intuitive behavior and reasoning biases emerged in large language models but disappeared in ChatGPT."
So first off, um interesting that they distinguish large language models from ChatGPT, and surprised that Nature would publish something with such a you know obvious kind of error, but anyways.
The abstract: "We design a battery of semantic illusions and cognitive reflection tests, aimed to elicit intuitive but yet erroneous responses. We administer these tasks, traditionally used to study reasoning and decision-making in humans, to" these generative models. Um, we find that they display "intuitive System 1 thinking and associated cognitive errors." But uh it doesn't actually work in ChatGPT. Uh okay this is this is these things where it's they're applying psych--psychological tests uh to language models and then imputing human reasoning in them. Ugh.
EMILY M. BENDER: And they're they're not even actually psychological tests, let me see if I can get down to it, it's these um uh the these CRT tasks, which are basically--there's too many things popping up here--basically like math problems that are designed to trip you up because there's a a sort of obvious intuitive answer that's wrong, and then so-called semantic illusions which is basically just questions that have a presupposition failure.
And does the system answer the question anyway or does it catch the presupposition? It's not psychological tests um and it certainly I mean uh 'intuitive behavior and reasoning biases' is a way overstatement of what that's actually testing.
Um so the abstract ends uh yeah, "Our findings highlight the value of applying psychological methodologies to study large language models, as this can uncover previously undetected emergent characteristics." Um it's like no, psychological methodologies need to be validated for construct validity for these things before you use them, um and emergence is bullshit and we'll get to that in a future episode.
ALEX HANNA: Yeah.
EMILY M. BENDER: All right, still in the AI psychology region of Fresh AI Hell, um a Business Insider article, "Getting emotional with ChatGPT could get you the best outputs," by Aaron Mok, November 12th, um, "Talking to large language models with emotion can help you get better responses, study finds. Researchers added phrases like, 'This is very important to my career,' and, 'You'd better be sure,' to prompts. These findings suggest that AI may be one step closer towards artificial general intelligence or AGI." [laughter]
ALEX HANNA: The claims--just because you say 'this is very important to me,' the claim, the jump that it's better, that you're getting closer to AGI--I'm just like what in the what in the heck.
EMILY M. BENDER: Yeah. So um Ali Alkhatib had wonderful commentary on this on BlueSky, um, so he quotes that um the article and then he says, "oh my god, are people...actually this dumb?" And he goes on and it unfolds like, "BI has two reporters. One is careful and cautious, the other is sloppy & credulous. R1 publishes less often but doesn't embarrass themselves; R2 takes every claim at face value, embarrasses themselves daily. BI looks at metrics and works with more prolific reporters/writers." Um, "like nobody has to be involved," he continues, "in a plan to prop up AI/tech co's, but the more compliant mouthpiece for well-funded tech interests is always going to be more quote 'productive' if they're writing up shit like this without even talking to another researcher about this paper."
Um and then shout out to Karen Hao, "I didn't appreciate how Karen Hao is so good at this until just now. She asked me to talk about some other researchers' paper a while back so she could get the perspective about it that other researchers have on it. That's clearly not obvious to some people."
ALEX HANNA: Yeah, I mean we could go on about so much about the journalistic practice on many of these things about--there's just so much hype in journalism, but yeah I mean it's going to get the clicks, especially when you translate from kind of the limited claims of a paper to something that sounds like 'we're getting closer to AGI.'
EMILY M. BENDER: Yeah. Yeah, all right, before the ice cracks under us, we are moving on over to the legal misapplications area.
ALEX HANNA: I know it feels like we're here almost every week. So Gizmodo writes, uh "Judges Given the Okay to Use ChatGPT in Legal Rulings." This is by Thomas Germain. And, "The UK now permits judges to use the quote 'jolly useful' AI chat--chat bot in court." I mean so it couldn't be more British unless you threw in some colonialism. Uh so the first two paragraphs of this say, "Robots may help determine your legal fate if you end up in a British court. The UK Judicial Office issued guidance Tuesday permitting judges to use ChatGPT and other AI tools to write legal rulings and perform several other tasks." And the quote from the guidance says, "'The state of AI throughout society continues to increase, and so does its relevance to the court and tribunal system," the Judicial Office, which oversees judges, magistrates--" Oh no an ad just popped up. "--and members of tribunal panels in England and Wales, said in a statement. 'The guidance is the first step in a proposed suite of future work to support the judiciary and their interactions with AI.'"
Uh so they're just allowing some usage of this to seep into this legal domain, even though so much of this has gone wrong. And they give some different citations of that. But we could go through and talk about new things--
EMILY M. BENDER: New examples of this. And this this is under the schadenfreude part of things. So JJ on BlueSky, "Y'all it happened again. Stop relying on generative AI for legal research. Or don't, but then I'm going to use it for my CLE." Which maybe is a law school term um.
ALEX HANNA: I don't know.
EMILY M. BENDER: So this is uh a filing in the United States District Court southern district of New York, in "United States of America versus Michael Cohen, Defendant." Um and basically it seems like Cohen's lawyer uh used generative AI maybe ChatGPT uh to generate a previous filing. Um so, "On November 29th, 2023, David M. Schwartz, counsel of record for Defendant Michael Cohen, filed a motion for early termination of supervised release." Um let's see, "As far as the court can tell" none of the cases cited in the paragraph I just skipped exist. Footnote: "The court is apparently not alone in being unable to find these cases. New counsel entered a notice of appearance on Mr Cohen's behalf after the government opposed his motion, and with the Court's leave filed a reply letter. Um, the reply letter asserts that many courts in this District have granted early termination of supervised release in similar circumstances and, after citing to various cases mostly by docket number, includes the following footnote. 'Such rulings rarely resort in--result in reported decisions. While several cases were cited in the initial motion filed by different counsel, undersigned counsel was not engaged at the time and must inform the court that it has been unable to verify those citations.'"
ALEX HANNA: I feel really bad for the the clerks at this court. They're just trying to do--they're wasting all this time trying to find these fake court cases and we're only hearing about this one--this is Michael Cohen, Donald Trump's former lawyer um who has you know been very public and doing a bunch of press spots.
EMILY M. BENDER: All right--
ALEX HANNA: Yeah more legal--yeah more legal hell. We're in, "Brazilian city--" We're on the AP. "Brazilian city enacts an ordinance that was secretly written by ChatGPT." By um Diane oh--the authors, the byline, is Diane Jeantet and Mauricio Savarese. And this is an--writing from Rio. Uh, "City lawmakers in Brazil have enacted what appears to be the nation's first legislation written entirely by AI, even if they didn't know it at the time. "The experimental ordinance was passed in October in the southern city of Porto Alegre, and city councilman Ramiro Rosário revealed this week that it was written by a chatbot, sparking objections and raising questions about the role of artificial intelligence in public policy.
Rosário told the Associated Press that he asked OpenAI's chatbot, ChatGPT, to craft a proposal to prevent the city--" Oh, an ad just popped up, we are so added up here. "--to prevent the city from charging taxpayers to replace water consumption meters if they are stolen. He then presented it to his 35 peers on the council without making a single change or even letting them know about its unprecedented origin." Oh so that's really rough practice, you're not even letting your your your your colleagues know this.
This is pretty disappointing because Porto Alegre has done some pretty interesting participatory projects in the past. They're pretty famous for a participatory budgeting exper--um experiment that they did. But this is really taking that experimenting to a really unfortunate place right.
EMILY M. BENDER: Right. And Rosário is being all defensive here. Uh quote, "If I had revealed it before, the proposal certainly wouldn't even have been taken to a vote." Dude, you could have written your own legislation based on that idea right. Um and he says, "'It would be unfair to the population to run the risk of the project not being approved simply because it was written by AI,' he added." It's like no, it's unfair to the population to not do the job that you were elected to do.
ALEX HANNA: Right.
EMILY M. BENDER: If it's a good idea, you can write the stuff yourself and make sure you're writing legislation that actually works. Okay, uh next, here is a really fun Washington Post article um that that goes unexpected places.
So, "These lawyers used ChatGPT to save time. They got fired and fined." By Pranshu Verma and Will Oremus, published November 16th of this year. Um, "Artificial intelligence is changing how law is practice but not always for the better." Um so it starts with, "Zachariah Crabill was two years out of law school, burned out and nervous, when his bosses added another case to his workload this May. He toiled for hours writing a motion until he had an idea: Maybe ChatGPT could help? Within seconds, the artificial intelligence chatbot had completed the document.
Crabill sent it to his boss for review and filed it in the Colorado court. 'I was over the moon excited for just the headache that it saved me,' he told the Washington Post. But his relief was short-lived. While surveying the brief he realized to his horror that the AI chatbot had made up several fake lawsuit citations. Crabill, 29, apologized to the judge, explaining that he'd used an AI chatbot. The judge reported him to a statewide office that handles attorney complaints."
And he was fired. So like okay, our legal system actually working as it should and catching these things, um but then the ending of this just like is so disappointing. Um it's long but uh just taking us down to the end, um okay uh. So back to Suresh Venkatasubramanian, "But even using AI for legal grunt work such as e-discovery comes with risks, says Venkatasubramanian, the Brown professor. 'If they've been subpoenaed and they produced some documents and not others because of a ChatGPT error -- I'm not a lawyer but that could be a problem.' End quote. Those warnings won't stop people like Crabill, whose misadventures with ChatGPT were first reported--" Blah blah blah. Um, "He says he still believes AI is the future of law. Now he has his own company and says he's likely to use AI tools designed specifically for lawyers to aid in his writing and his research, instead of ChatGPT. He says he doesn't want to be left behind. 'There's no point in being a naysayer,' Crabill said, 'or being against something that is invariably going to become the way of the future.'"
ALEX HANNA: Wow.
EMILY M. BENDER: Yeah.
ALEX HANNA: Yeah, so incredible incredible twist here. I didn't get to the end of this article so getting fired for something but saying, no I'm actually going to lean into it, and invariably probably being rewarded for it by lots of VCs and folks really looking for someone that has you know quote unquote some real world expertise on this but is doing it way wrong.
EMILY M. BENDER: Yeah, all right we've got five minutes Alex before this place melts entirely.
Um--
ALEX HANNA: Oh gosh, all right.
EMILY M. BENDER: So going out big [laughter].
ALEX HANNA: Yeah, so this one came out today uh so there was was published yeah--no actually it was published four days ago. So this is something published in um Nature and Science Computing against--Nature Computational Science again and it's uh the title of the article is, "Using sequences of life-events to predict human lives." Um so very you know kind of wild name. The abstract here um, it says, "Here we represent human lives in a way that shares structural similarity to language, and we exploit the similarity to adapt NLP techniques to examine the evolution of predictability of human lives based on detailed event sequences.
We do this by drawing on a comprehensive registry database, which is available for Denmark uh across several years, and that includes information about life events related to health, education, occupation, income, address, and working hours, recorded with day-to-day resolution."
So shout out to Scandinavian administrative data, the backbone of so much demographic research. "We create embeddings of life events in a single vector space, showing that this embedding space is robust and highly structured. Our models allow us to predict diverse outcomes ranging from early mortality to per--personality nuances--" Which, that's pretty scary. "--outperforming state-of-the-art models by a wide margin. Using methods for interpreting deep learning models we probe the algorithm to understand the factors that enable our predictions. Our framework allows researchers to discover potential mechanisms that impact life outcomes, as well as the associated possibilities for personalized interventions."
So this is yeah--so this gets called life2vec [laughter], so you know doing this kind of meme of word2vec, um basically taking life events and casting them into an embedding space, um which is pretty horrifying, but I would want to say there's a bit of a difference, because this kind of prediction has been used in certain kinds of demographic research. And you pointed out that one of the authors on this, Anna Rogers, is also an NLP person. Um, this research got picked up in the reporting in USA Today, uh and Sasha Luccioni posts in a tweet, "I don't care if it was published in Nature, this is a load of poop emoji. You can't predict people's deaths and frankly I don't see why we're wasting our time trying."
But the headline of that USA Today piece is, "When will you die? Meet the quote 'doom calculator', an artificial intelligence algorithm." And Anna in her followup tweet says, "Indeed, what in the fresh Hell. I contributed to the project while I was at um CPH_SODAS." Which is the Copenhagen Center for Social Data Science. "--I do not agree with all the framing in the final paper (one of the difficulties in cross-field collaboration) but it's nowhere close to how USA Today presented it." And uh basically what she says--she goes, "The paper uses prediction of registry life events to learn representations that could be used for social science research. It is not meant as any kind of 'doom calculator,' especially outside of a very specific Danish sample and especially for any kind of decision making." And I will say--I know this might be outside the purview here, but as a critique of many of these prediction models: I've been in rooms where there have been presentations of these kinds of life-event predictions. So for instance um I've been in a room where there's been uh you know Swedish, or actually Norwegian, administrative data being used to predict things like whether someone will be incarcerated in their lifetime, and kinds of criminality. And whatever methods you're using there, that's a problem, right. Uh you know, predicting from birth someone's own uh life outcomes.
EMILY M. BENDER: The prediction frame is entirely problematic here, right--you could do statistical analysis, like--the degree of data being collected on people sounds super creepy and surveillance-state, I have to say, um, but in the ethical and broader impacts uh section here they actually talk about how the data is safeguarded, which was reassuring to see. Um but you know, if you've got that data and you're asking questions like, 'Are we finding more adverse health events for people who work in this sector?' or whatever, like that's useful information that can guide public policy, but it shouldn't be framed as 'we're going to predict what's coming in this person's life next.' It just muddies the waters, and this looks to me like--maybe it's old, maybe the statistical hype was already doing this before we had the current AI hype--but it looks like social science is getting infected by AI hype, and then of course the newspaper is running with it in the worst possible way.
ALEX HANNA: Yeah, and I will say, I mean--for administrative data of this sort they tend to put it in a data clean room, you know, you can only analyze it on site. But predictive models in general, I mean, this could be its own episode, um, just focusing on that--but to call it a 'doom calculator' and to call it life2vec, no, let's let's not do that please.
EMILY M. BENDER: Let's not do that. Um so uh fun comment here from Abstract Tesseract: "The Game of Life: Vector Space Edition. The holiday favorite nobody asked for."
All right so our escalator out of um Fresh AI Hell, which seems to be shedding its layer of ice, um is this wonderful piece by Angie Wang in the New Yorker Sketchbook from November 15th. Um with the title, "Is My Toddler a Stochastic Parrot?"
And it's sort of presented as like a graphic novel um and tells the story of um you know thinking about ChatGPT while also watching her toddler grow, and she sort of goes deep deep into the 'maybe we are just stochastic parrots' thing. And there's this wonderful bit where it's represented as an abyss in the um yeah here. So she says um, "I had the dizzying sense of the ground giving way under our feet." And the toddler says, "Mama!" "I even thought, what's the point of us? Losing to AlphaGo caused Lee Sedol to quit Go. My student told me he turned in AI art for his final project because AI draws better than he does." And then she says to her child, "I'm here!" falling after him. Um, "Some media outlets are already firing their writers and replacing them with ChatGPT."
And on and on, there's all the Fresh AI Hell. Um but she catches the toddler um and eventually um falls through more of this, and she says, "For this to be true, we'd have to accept a kind of solipsism. What do--what do other people matter to us? I don't want to be alone in a sea of machines. But it does really matter what's real and what's counterfeit--" Oh, "but does it really matter what's real and what's counterfeit if the extruded product is convincing enough? What does it matter who created the art, human or machine, if it stirs something inside me? What does it matter whether I turn to my husband or to ChatGPT to voice my thoughts? When I was alone inside my body behind a screen in my house, I could have entertained that question. But when I was pregnant I felt my baby's presence inside me every single day." Um and so she goes through the rite of passage of birth. "He is alive to me and I am alive to him. Small children are the most helpless of creatures, the least able to communicate intelligently, the furthest from efficiency. They are the least useful, the least creative and the least likely to pass a bar exam."
But she goes through this and basically finds meaning in connection, um. "A toddler has a life, and learns language to describe it. An LLM learns language but has no life of its own to describe."
Um so anyway I want to encourage people to read this whole thing, it's beautiful um and I thought really affirming. So maybe that is our ticket for today Alex out of Fresh AI Hell.
ALEX HANNA: Oof. I'm gonna de-gear, take off the frozen uh--you know, the frost that is on my weirdly charred hands, uh somehow both frostbitten and also oddly warm. But yeah, thank you for joining us. Uh I really encourage you to look at this New Yorker piece, it's absolutely beautiful, um lovely lovely use of the web form.
Uh our theme song is by Toby Menon, graphic design by Naomi Pleasure-Park, production by Christie Taylor, and thanks as always to the Distributed AI Research Institute. If you like this show, you can support us by rating and reviewing us on Apple Podcasts and Spotify and by donating to DAIR at DAIR-Institute.org. That's D-A-I-R hyphen institute dot org.
EMILY M. BENDER: Find us and all our past episodes on PeerTube and wherever you get your podcasts. You can watch and comment on the show while it's happening live on our Twitch stream, that's Twitch.tv/DAIR_Institute, again that's D-A-I-R underscore Institute. I'm Emily M. Bender.
ALEX HANNA: And I'm Alex Hanna. Happy New Year, and stay out of AI Hell, y'all--frozen or not.