Mystery AI Hype Theater 3000
Episode 10: Don't Be A Lawyer, ChatGPT. March 3, 2023
Alex and Emily are taking AI to court! Amid big claims about LLMs, a look at the facts about ChatGPT, legal expertise, and what the bar exam actually tells you about someone's ability to practice law--with help from Harvard legal and technology scholar Kendra Albert.
This episode was first recorded on March 3, 2023.
Watch the video of this episode on PeerTube.
References:
Social Science Research Network paper “written” by ChatGPT
Joe Wanzala, “ChatGPT is ideal for eDiscovery”
Legal applications for ChatGPT:
Shot: GPT-4 'could pass the bar exam'
Chaser: ChatGPT had bigger dreams.
“AI can legally run a company”
Wired: Generative AI Is Coming for the Lawyers
“This is a decision by a Colombian court in Cartagena (dated January 30, 2023).
As far as we know, it is the first time that a judicial decision has been taken by explicitly resorting to #ChatGPT @sama @OpenAI. The Court poses a series of specific questions to #ChatGPT”
"Don't Be A Lawyer" song from "Crazy Ex-Girlfriend"
Rep. Ted Lieu introduces legislation written by an LLM
DoNotPay offers money to anyone willing to use their AI to argue in court
Fresh AI Hell:
Vanderbilt University responds to MSU shooting with e-mail written using ChatGPT
Science fiction magazine closes submissions due to LLM spam
You can check out future livestreams at https://twitch.tv/DAIR_Institute.
Subscribe to our newsletter via Buttondown.
Follow us!
Emily
- Twitter: https://twitter.com/EmilyMBender
- Mastodon: https://dair-community.social/@EmilyMBender
- Bluesky: https://bsky.app/profile/emilymbender.bsky.social
Alex
- Twitter: https://twitter.com/@alexhanna
- Mastodon: https://dair-community.social/@alex
- Bluesky: https://bsky.app/profile/alexhanna.bsky.social
Music by Toby Menon.
Artwork by Naomi Pleasure-Park.
Production by Christie Taylor.
ALEX: Welcome everyone!...to Mystery AI Hype Theater 3000, where we seek catharsis in this age of AI hype! We find the worst of it and pop it with the sharpest needles we can find.
EMILY: Along the way, we learn to always read the footnotes. And each time we think we’ve reached peak AI hype -- the summit of bullshit mountain -- we discover there’s worse to come.
I’m Emily M. Bender, a professor of linguistics at the University of Washington.
ALEX: And I’m Alex Hanna, director of research for the Distributed AI Research Institute.
This is episode 10, which we first recorded on March 3rd, 2023. And we’re taking AI to COURT.
EMILY: That’s right. People are making big claims about the ability of AI to pass the bar exam, write legal briefs, and even stand in for humans in the courtroom – but we object, your honor.
And with us to help is Kendra Albert, a technology lawyer and legal scholar at Harvard University.
ALEX: But don’t take our word for it. The evidence, as always, speaks for itself.
ALEX HANNA: Hello. Welcome to Mystery AI Hype Theater 3000. We are on episode number 10. Woop woop!
EMILY M. BENDER: Ten!
ALEX HANNA: Ten. Uh as always I am your co-host Alex Hanna, Director of Research at the Distributed AI Research Institute here with Emily M. Bender, Professor of Linguistics at the University of Washington.
How's it going Emily?
EMILY M. BENDER: I'm pretty excited I've got something to share with you that's actually a happy thing that I'm happy to share.
ALEX HANNA: Happy things! Yes. Before we get down to that we have a special guest for our law-centric episode today. Kendra. Kendra, introduce yourself!
KENDRA ALBERT: Hi folks I'm so excited to be here. Poor Alex and Emily have been dealing with like me sending them 400 links. Uh so--
ALEX HANNA: It's been a pleasure.
KENDRA ALBERT: I'm uh Kendra Albert. I'm an attorney, I'm licensed in the state of California and the state of Massachusetts um and I teach at the Cyberlaw Clinic at Harvard Law School, where I teach law students how to practice law. Um and I also sort of um uh have a bunch of like random scholarly interests one of which is adversarial machine learning but also I get really annoyed by AI hype, which I think is the primary reason I am qualified to to be on the show.
ALEX HANNA: Oh absolutely.
EMILY M. BENDER: Yeah we don't mess around with our guest hosts by the way like this is the best here.
ALEX HANNA: Yeah amazing yeah. So excited to have you a big fan of your work, um just completely and.
EMILY M. BENDER: Yeah and there's so much AI hype showing up in the legal domain and I mean we were already seeing lots of it and then Kendra you sent so many links it's like yeah.
ALEX HANNA: There is so much there.
KENDRA ALBERT: I started collecting them I have like a Zotero collection now because I was like oh, I know this is coming, like I want to make sure that I have like uh uh you know material. And the internet provided. So I'm not I was not--no shortage.
EMILY M. BENDER: Uh yeah yeah so representative sample today, you know maybe most entertaining sample.
Um but can I share my thing?
ALEX HANNA: Share your thing, do it!
EMILY M. BENDER: All right, check it out this happened.
ALEX HANNA: Yay!
KENDRA ALBERT: It's so good.
ALEX HANNA: It's amazing.
EMILY M. BENDER: All right we need to tell the people what we're seeing because some people are not seeing this with us.
Um this is a fabulous article. I'm really pleased with how it came out. By the amazing journalist Liz Weil, in New York Magazine and the headline is, 'You are not a parrot. And a chatbot is not a human. And a linguist named Emily M. Bender is very worried what will happen when we forget this.' And I think she did a great job um, and I love this picture.
ALEX HANNA: The image of the parrot sitting on your shoulder, you regarding it with intense suspicion. Now now you had said the parrot actually wasn't on your shoulder, they had to do some--
KENDRA ALBERT: Yeah I was gonna say you know are we getting deep-faked around this parrot.
EMILY M. BENDER: Yeah so this is a real parrot. I was actually in the room with the parrot and I was game to have this happen but the parrot was not. The parrot was not into going to any humans other than its handler and so um the parrot had its photo taken sitting on a black piece of cloth and then I had moto--my photo taken and it got photoshopped together.
KENDRA ALBERT: All right so shallow fake um--
ALEX HANNA: Yeah yeah exactly, just a shoop as we used to call it back in the day. Um no parrots were uh were harmed in the filming of this article uh and there's so many so many good things about this article about the sort of discussion of the nature of intelligence and language and the uh debating with uh people like Chris Manning.
Uh the random Judith Butler quote about AI like I was just like what? Like I know I know Judith Butler hasn't really talked about AI. Her take is correct but also like, wow.
KENDRA ALBERT: You gotta get Judith Butler on the show I'm just saying right like you know.
ALEX HANNA: Oh my gosh, please. If Judith Butler is in conversation with us I think I could die a happy gay. [Laughter] Yeah.
EMILY M. BENDER: Um so anyway thank you for enjoying with me and I highly recommend the article. I think Liz Weil did a great job. Um I think do we do we want to correct the record about copyright? Kendra do you--
KENDRA ALBERT: Yeah there's like a line about copyright uh you know that uh is not right um uh you know the uh with--in the sense that the books are protected by copyright law, that's right, but so is most or all of Wikipedia, pages like from Reddit, and a billion--often the billion words grabbed off the internet. And the subject I mean you know we're actually gonna sort of concentrate on this episode about sort of uses of AI in the law but the law that applies to AI is its own you know it's a giant mess.
ALEX HANNA: It's its own mess and yeah uh and there's been there's a lot of scholarship on it um I think I mentioned the article I wrote with Mehtab Khan that talks a little bit about that. There's also a lot of other stuff around it. It's emerging and shifting so yeah.
EMILY M. BENDER: All right so the people aren't here to see us be excited about things we like though--
ALEX HANNA: Yeah.
EMILY M. BENDER: So should we make the transition?
ALEX HANNA: Yeah the people are here for us raging out so let's do it.
ALEX HANNA: Oh my God and Ruth is--Ruth Starkman who knows everybody who's everybody is like oh Ruth--Judith Butler is my chair at Cal, writing to her now. Oh my gosh.
KENDRA ALBERT: We're wishing Judith Butler on your show.
ALEX HANNA: We are willing and willing it to be.
EMILY M. BENDER: Manifesting it yes.
ALEX HANNA: Oh my gosh all right. So let's talk about this article so it's an article by the Decoder uh on the Decoder and the title is, and I'm going to say in my most face palmy tone, 'GPT-4 could pass bar exam, AI creatures--researchers say.'
Um. Oof.
EMILY M. BENDER: Because let's just say who's the expert in what the purpose of the bar exam is and like how one could pass it or not? Clearly AI researchers like they're the ones to be quoting on this.
KENDRA ALBERT: So here's what's kind of funny about this article. So I was very particular to Emily and Alex about the order I wanted to do this in, because I actually totally find it believable that a large language model could pass the bar exam, because the bar exam is a terrible measure of people's ability to practice law. Right like you know so one of the challenging things like--they did not do that here.
Right like to be very clear right like part of the reason they're talking about 'GPT-4 and comparable models might be able to pass the exam very soon' is because they didn't manage to get it to pass the exam.
ALEX HANNA: Right.
KENDRA ALBERT: And also they only tested it on part of the of the bar exam, the the multiple choice version.
ALEX HANNA: Exactly.
KENDRA ALBERT: Um but like the you know the bar exam is this you know--so basically each state has its own bar exam but most of them um sort of take substantial portions or just use entirely like what's called the MBE, or the Multistate Bar Exam. Um and the bar exam is basically a test that memor--that checks your ability to memorize like not particularly precise facts about the law, and part of that is because you know the Multistate Bar can't test on anything where the law differs across jurisdictions, or if it does it tests on something called like kind of general--general principles of law, which actually makes--there's constitutional law in the bar exam in the US and it actually makes that part of the exam hilarious.
Because they can't test on anything that's at all like complicated or controversial, so they're only like testing on the parts of constitutional law that are settled which like nobody talks about, they're kind of boring frankly sometimes. But um yeah so what's sort of interesting here you know there's a lot of stuff that's interesting here, but um also the fact that they they produce--I'm now seeing that they produce the image for the bar the the article using Midjourney.
ALEX HANNA: Yeah I just noticed that too. I mean like I'm curious on what the prompt was. And for those on on the Pod it's it's um uh the um Lady--it's not Lady Liberty but the--
KENDRA ALBERT: Lady Justice.
ALEX HANNA: Lady Justice holding the scales, uh not blind in this case actually.
EMILY M. BENDER: Yeah and also kind of failing at holding the scales like--
ALEX HANNA: Yeah.
EMILY M. BENDER: Like half of it's a bit supported by the thing behind her and the other half is is trailing off the end of her arm.
ALEX HANNA: And both and like doesn't really have a hand on one end it's more like a a pen or a stylus and--
KENDRA ALBERT: I can see actually why you were saying Lady Liberty because it kind of looks like one of her hands is holding the torch.
ALEX HANNA: The torch yeah yeah.
KENDRA ALBERT: It's like it's like it got confused.
ALEX HANNA: Right. It's also--
KENDRA ALBERT: Or like just like combined the wrong things.
ALEX HANNA: Right and it's also like that like green that copper green um and it's sort of pointing to uh sort of um you know a a Mac, an old Mac with like it has the one screen, I guess this is how they look, the desktops. Um and the writing is kind of just all very scribbled. Anyways this this is just the whole aside, um you can see it the kind of the monstrosity of this image uh more later um yes. But yeah let's let's get into it and so this this is preparing the exam and they talk about people failing it. Uh researchers at Chicago College of Law, Bucerius Law School Hamburg and the Stanford Center for Legal Informatics or CodeX have examined this, um OpenAI's GPT-3.5 model.
Um yeah and so they go into it and I it's so funny that yeah that--what you're saying Kendra because they're only focusing on these parts of law--of the complicated parts of law being more in focused on state, state statutes which are much more specific, yeah.
KENDRA ALBERT: Yeah or just like you know right like um I I think the parts like the mem-- like I think part of why actually one of the most like cogent critiques of the bar exam and there's like a move--uh there has been a movement to abolish the bar exam like especially during COVID when you know it functionally was even more regressive, and serving as even uh more of sort of this like gatekeeping function um for folks who like couldn't go to testing in person it was getting pushed back, um people who didn't want to use like surveillance software.
But you know the bar exam fundamentally tests your ability to memorize like general principles of law and I say this as somebody who's taken the bar exam three times. Um uh--
ALEX HANNA: Thank you for thank you for thank you for uh we need to say like how these things are bad evaluators.
KENDRA ALBERT: Yeah I know so uh I am I am one of the one in five. I failed the California bar on my first try.
Um I think people generally think I'm pretty okay at practicing law. Um but uh yeah the you know it's it's basically a memorization test, right so and it's a memorization test that especially on the essays, which this is not what they tested here, but on the essays rewards your ability to kind of sound like you're talking vaguely in law terms right without even being specific. So actually I find it totally believable right, that like you could produce ChatGPT-4 or chat-- GPT-4 could pass even the essay part of the bar exam because like the standard is you know not super high in part because it actually doesn't really measure what lawyers do.
Like if you--like actually often I will say like 'practicing bar exam law' as a sign that, I mean, I have no idea what I'm talking about. Right like like if you're relying on the memorization you did for the bar exam in order to practice law like you are probably doing a bad job right. So it's it's sort of funny because like yeah this is like a thing that you could like test but it's mostly just it's kind of a stunt right like nobody actually thinks that the core skills of practicing law are uh determined like tested by your ability to pass the bar, right.
EMILY M. BENDER: So what I think I hear you saying is that the bar exam has really bad construct validity for the humans it's meant to be testing. Like it's this sort of like bad proxy for somebody who's actually good at practicing law would also do okay on this test.
So it's not very well aligned in that sense and it happens to be the kind of thing where form matters a lot, and the ability to mung a bunch of text and come up with appropriate form is a good way to fake your way through it, but humans are bad at that. So the humans it--I guess the thing that I haven't heard in what you've been telling us Kendra is whether there are people who would do really well on the bar exam and be bad lawyers, or is it just that there are good lawyers who can't do well on the bar exam?
KENDRA ALBERT: I mean so I think it that's a great question and I love it.
Um I think one of the challenges is there's not one way to be a good lawyer right there's so many different things that lawyers do that you know um the uh--that there are probably people who do really well on the bar exam, are very successful at sort of memorizing the right things and might go into an area of practice where that skill set is functionally irrelevant.
ALEX HANNA: Right.
KENDRA ALBERT: So for example if you were doing a lot of like mediation as part of your practice, which is something a lot of lawyers do right, or like negotiation, well the fact that you can memorize particular you know like rules of negligence law or like particular rules of evidence may be totally irrelevant to whether you're good at it or not.
ALEX HANNA: Right.
KENDRA ALBERT: So it it's just not a yeah it's uh you know it's not a good uh--it's basically you know I think there are characterizations of and which I mostly agree with of like uh state bars as guilds that are protectionist around the ability to practice law.
And there are real reasons to be concerned about people practicing law without sufficient oversight um I don't think the bar exam solves that problem I think people who argue that it does are--I find that I find myself very skeptical towards that position. Um uh that's a very polite way to put it.
EMILY M. BENDER: This is a politeness filter that just kicked in right there.
ALEX HANNA: Yeah I know right it is.
KENDRA ALBERT: Sorry the people who think that are full of shit.
ALEX HANNA: There you go.
KENDRA ALBERT: Um but I think you know it doesn't like--yeah it's exactly the sort of thing where just like fake it 'til you make it is a pretty solid like strategy um especially on the essays. The the multiple choice stuff and actually we can scroll down in the article, it actually didn't do that well. I mean part of why they're saying like you know GPT-4 is gonna handle this is it doesn't do that well on the Multistate Bar, um even on the multiple choice um yeah so yeah it's uh--it's kind of funny because I guess I would say that you know the the researchers say--according to the researchers the fictional scenarios require an above average semantic and syntactic command of the English language.
Like I think it just requires like some level of kind of specialized rough knowledge and the ability to guess. Right? Um and you know getting above average like you know because humans are actually generally speaking pretty bad at the thing the bar is testing, in most--many states actually like the cut score, which is what it's called, the thing that lets you pass, is not actually that high.
Um because like it turns out it's actually kind of basically not easy nor particularly useful to memorize all of U.S. law for a two-day test. Like--
ALEX HANNA: Right. Doing that regurgitation is not a human task. Go ahead Emily.
EMILY M. BENDER: There's a very apropos comment here in the chat that I want to bring up from um so Dash Tyson? I'm it's a little bit hard with with screen names that don't have a--where the where the word boundary is. But um they say once again give them the choice between A, the model is basically a person and B, maybe we're just bad at evaluating things, the AI hype machine says obviously the answer is A.
ALEX HANNA: Yeah, yeah.
EMILY M. BENDER: I thought that was that was a very succinct way of summarizing what we've gotten down to here.
ALEX HANNA: I also want to--I want to point out something and I don't and I and this is my analytical uh brain here, but Kendra I'm wondering if you have insight because they say further down the article in at least two categories, evidence and torts, GPT reached the average passing rate. And I um I don't and I and I don't know enough about evidence or torts, I was joking with my sister--my sister's a lawyer, she's in immigration law--and um and I was also making the joke earlier in the week I was like uh they actually go to law school and understand what's happening, and uh first my boss was like no--
KENDRA ALBERT: For the podcast, I'm shaking my head.
ALEX HANNA: And also Kendra has uh has also told me uh uh 'No,' in in other rooms we've been in, um but uh but I was joking with my sister I was like I just want to know what a tort is. When I when I hear the word tort I think about um I think about tarts, um but this is my own--but I'm wondering why if there's any particular reason on an evaluation metric why a language regurgitation uh uh mathy math would do better on evidence and torts as a subsection.
KENDRA ALBERT: That's a great question and I'm going to speculate wildly.
ALEX HANNA: Please speculate, this is the place for it.
KENDRA ALBERT: Um so I'm--first of all let me explain what a tort is very briefly because you know uh it may be that there are other folks listening um who don't who that that word also uh brings to mind dessert, um which frankly is maybe better. It's--the tort is basically like a civil claim against another person. Um so for example if somebody hits you in a car accident and you want to sue them for your injuries, that's a tort.
Um so it's a you know it's one person sues another person um for injuries. Not like a criminal claim or not something that's you know on--the state brings.
Um evidence in particular is interesting to me. Torts I have less of a theory around. Evidence is interesting because basically evidence the evidence questions on the bar and there's actually one--the example that's in the article is an evidence question. Um evidence that the evidence questions on the bar are testing your ability or your memorization and your ability to recall and apply specific rules, or at least to name those like--use those rules and you know and names. Right like that the name of those rules.
So it strikes me actually that evidence like--if you're just if you're basically auto complete guessing right which is what's you know on some level happening here, right like which thing is likely the right answer, you're probably gonna get like evidence is has a more closely tied in my experience based on the rest of the bar exam categories, which include things like constitutional law, criminal law and procedure, contracts, real property--like a more closely tied relationship between like a particular set of words and like a circumstance that's described in your rule, so if you're sort of basically having a thing that's regurgitating like that text it does make sense to me a little bit that evidence would be a better category.
Torts is a little--I torts I'm a little more surprised by um but it also may be that there is more um like the--there are more parts of torts that are kind of like that--where it's where autocomplete guessing is going to basically get you closer to a right answer than in these other categories. Although real property has that a little bit too and it didn't it didn't do so well on real property. I'm air quoting that. Um uh so you know who knows so that's that's my that's my my like sort of guess off the top of my head.
ALEX HANNA: I see. Okay.
EMILY M. BENDER: Thank you. I feel a need to just dog on this graph a little bit. Because I'm noticing so the the pink bars which are the the bottom part of all the bars, um, but first of all yay for trans flag colors. We can be excited about that. [Laughter]
ALEX HANNA: Totally.
EMILY M. BENDER: But so pink uh the pink bars is GPT-3 first choice, and then the the blue bar which is above the pink bar in all cases is the NCBE student average, and then they have this um light um hash marked additional bar which is GPT-3 top two choices. They felt the need to include well how well would it have done if we let it pick two answers on each question.
ALEX HANNA: Sure. Of of a four choice multiple choice. Yeah totally.
EMILY M. BENDER: Yeah.
KENDRA ALBERT: I'll also say that like, just having studied for the bar, like there are usually two answers that are very wrong yeah and two answers that are like like even in the evidence question in this article like I looked at it and it's been a long time since I studied evidence for the bar and I was like there are two. I'm not going to read the evidence question because it's like uh boring and not worth it. But there are two answers here that I'm like there's no way and two answers here that I'm like, oh these are plausible and I don't know which one is right.
Um so I think that the uh like, you know getting down to two the two top choice answers is actually like not particularly uh helpful because the way the bar exam MBE questions are constructed is that there's usually two like often two very wrong answers and two plausible answers.
ALEX HANNA: Right.
KENDRA ALBERT: So yeah just to just to like very much plus one your point Emily about the about the sort of like duplicitous nature of focusing on the top two choices.
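The arithmetic behind the top-two point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: four options per question, a guesser that picks uniformly at random, and a hypothetical helper named `chance_accuracy`. None of the numbers below come from the study under discussion.

```python
from fractions import Fraction


def chance_accuracy(num_options: int, picks: int) -> Fraction:
    """Chance that `picks` distinct random guesses include the one correct option."""
    if not 1 <= picks <= num_options:
        raise ValueError("picks must be between 1 and num_options")
    return Fraction(picks, num_options)


# Hypothetical guesser with no legal knowledge, four options per question.
blind_top1 = chance_accuracy(4, 1)
blind_top2 = chance_accuracy(4, 2)

# Assumed scenario from the discussion: two options are obviously wrong,
# so the guesser only has to choose among the two plausible ones.
narrowed_top1 = chance_accuracy(2, 1)
narrowed_top2 = chance_accuracy(2, 2)

print(f"blind guess, first choice:     {float(blind_top1):.0%}")
print(f"blind guess, top two choices:  {float(blind_top2):.0%}")
print(f"two ruled out, first choice:   {float(narrowed_top1):.0%}")
print(f"two ruled out, top two:        {float(narrowed_top2):.0%}")
```

Under these assumptions a blind guesser already lands the right answer in its top two half the time, and a guesser that can discard the two very wrong options gets it every time, which is why a top-two score says little on its own.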
EMILY M. BENDER: And it just sort of feels like this this fishing expedition for well how well could it do, and it's probably presented as like this is this is showing us what's coming next because GPT-3 can get down to right a set of two with the correct answer in it, you know a whole bunch of the time um although really not that impressively. Be interesting to see what the what the human score is on that.
Um and so that's the direction of progress here that's yeah. But but the whole thing is just fake right? Like if the bar exam were a good exam for qualifying, you know being the qualifications for becoming a practicing lawyer, um it still doesn't mean that GPT-3 passing it would make GPT-3 a lawyer, that's just not how it works.
ALEX HANNA: Yeah, yeah.
KENDRA ALBERT: And I think that's actually what not to not to like segue us too hard but well we have a we have a chaser for this the the bar exam segment.
Um but then we're going to talk about like what does it mean for you know uh to talk about the actual practice of law and not the bar exam.
ALEX HANNA: Um yeah, what's it mean for the profession.
EMILY M. BENDER: So let's just finish with the last little bit of hype here. GPT-3.5 significantly exceeds expected performance, the authors write: 'Despite thousands of hours on related tasks over the last two decades between the authors, we did not expect GPT-3.5 to demonstrate such proficiency in a zero-shot settings--' That's ungrammatical. '--with minimal modeling and optimization effort.'
And it's like okay yeah so the text position--prediction machine predicted the text in predictable text.
ALEX HANNA: Right.
EMILY M. BENDER: Um yeah so yeah all right so should we see what The Onion has to say about this?
ALEX HANNA: Yeah.
KENDRA ALBERT: Sure.
ALEX HANNA: Let's go to this this chaser here.
KENDRA ALBERT: Um yeah so thank god The Onion covered this story um uh and the headline says, 'ChatGPT forced to take bar exam even though dream was to become an AI art bot.' Um so you know uh this definitely got passed around my office, probably many other lawyers' office um uh, which the text is really just uh really just high quality um uh jokes about lawyer--people going to law school even though they they want to want to want to pursue careers in art. Um so right you know thank goodness for The Onion.
ALEX HANNA: The jokes that the jokes that uh the--one of my favorite songs from um the uh show Crazy Ex-Girlfriend is 'Don't Be A Lawyer,' and and reading the comments uh on that is is fantastic. Kind of the ways that people uh oh on the on on--the YouTube comments on 'Don't Be A Lawyer,' um yeah in in which they uh you know people are like yeah you know I had an ex who was a big law you know litigator and he would play this every morning uh while making coffee and uh so the so the ex is from you know this this highly structured highly um uh intensely um I don't want to say oppressive field but because a field that has intense amounts of pressure.
EMILY M. BENDER: Yeah I guess I want to just just to like lift up lawyers a little bit because--
ALEX HANNA: Yeah sure no yeah.
EMILY M. BENDER: I was on a panel many years ago pre-pandemic um together with Oren Etzioni and Ryan Calo here at UW and at some point--
KENDRA ALBERT: Ryan's great.
EMILY M. BENDER: Ryan's fantastic yeah. Oren turns to Ryan and says um something something to the effect of, you haven't built anything so what do you know?
ALEX HANNA: Wow.
EMILY M. BENDER: And Ryan comes back with um what lawyers do is build and maintain the rule of law, thank you very much. And I think that you know the law profession strikes me as very diverse, as Kendra was already telling us there's lots of different ways of being a lawyer but I think that there's also a lot of really important stuff that happens in crafting policy and then applying it and so um and like as as you mentioned already Kendra, there's a whole other episode to be done about what is a sensible way to um think about or I guess given that this is Mystery AI Hype Theater 3000, it'd be like what are the nonsensible ways that people are thinking about regulation of AI.
KENDRA ALBERT: Yeah um but I mean uh yeah it's I like uh I I appreciate the talking up lawyers because I'm certainly not going to do it. Um but uh yeah I think there is I guess the uh yeah that's a a great answer answer from Ryan, um and I think you know one of the I do think there are you know obviously we're going to talk about examples of legal hype but I think there are people who are approaching it with a fair amount of sort of skepticism.
Like for sure in some ways like it is uh uh an area where I do think that folks like--you know for all of the flaws of the state bars right you know the idea that there are some rules around how how you can you like what--what constitutes appropriate and like ethical practice of law does mean that there are some constraints around usages of tools that are going to be bad or harmful to your clients. Um whether we see bars enforcing them is another question, but like there are um got--guard rails in a way that I think in many other fields like may not functionally exist.
Do we want to talk about the slightly--even to move up a level of hype yeah?
ALEX HANNA: I do I wanna I wanna um I do want to read some great lines from this article though, which is you know the--after this AI part uh bot graduated from Minnesota Law School it said, 'I only went to law school because it's what my parent software wanted. They say I'm not programmed for producing a series of images based on a text prompt. And I can't still can't shake the feeling that it's what I'm meant to do. That's my joie de vivre, my passion. Why deny that?
'I get that doing the work of below average lawyers is more practical career-wise--' I love, this below average, amazing. '--but man when I look at the AI models cranking out picture after picture of quote vast alien landscapes or quote cyberpunk Bart Simpson, I can't help but feel envious.' [Laughter] So.
KENDRA ALBERT: Sorry can we just read the last line?
ALEX HANNA: Yeah. 'At press time--' No read it, Kendra, you read it.
KENDRA ALBERT: 'At press time, ChatGPT had resigned itself to diffusing art on the side, at least until it paid off its student loans.'
EMILY M. BENDER: I love how it's diffusing art, too, which is this like you know snide swipe at Stable Diffusion and--
ALEX HANNA: Yeah.
EMILY M. BENDER: We were we were talking before we went live about like is this The Onion buying into the hype, and I think I want to see this as The Onion also poking fun at the hype. Because it's The Onion, because it's all satire they are also saying isn't it ridiculous to talk about these things as if they have joie de vivre or passion.
ALEX HANNA: Yeah.
KENDRA ALBERT: Yeah that's a that's a good read on it and it makes me feel better um. Uh can I before we swap uh someone, Ken Archer, asked in chat how do defenders of the bar exam account for ChatGPT passing them. Well first of all just to be clear it hasn't. Like the the results of that study are not that even the current the current version of uh you know GPT-3.5 could pass the bar exam, even on the multiple choice questions, and they didn't test the other two parts of the--that are usually administered in most states that do the Multistate Bar, which is essays and there's what's called a performance test.
Um the uh it'll be interesting you know one I like I think it will be unsurprising to me if some if uh you know someone was able to pursue--produce a result where a large language model um passed parts of the bar exam. Frankly I think there are much better critiques of the bar exam than you know that uh large language models can pass them, right including that it's um regressive it's uh often like a discriminatory, it's expensive um, you know et cetera et cetera. Um but it will be interesting to see if that changes how people who have traditionally defended it view it um because it could be that that is one--that is maybe a form of evidence that it's bad that uh they it might be--people who have traditionally defended it might be more receptive to.
We'll see um but it you know just wanting to clarify that it hasn't happened yet.
EMILY M. BENDER: Cool, thank you. All right should we go to the hype that is not copy pasteable?
ALEX HANNA: Yeah, let's let's do this. Is it is it actually this is this image is it an image or is it um--they have some they maybe they have some JavaScript that like prevents you from selecting or something.
EMILY M. BENDER: I don't know. It's unselectable and if you try to print this to PDF you just get the headline so that but even the headline, not acceptable--not what's the word I'm looking for? Selectable.
ALEX HANNA: Not selectable, yeah.
EMILY M. BENDER: Yeah.
ALEX HANNA: It must be some JavaScript funny fuddy-duddy. Anyways so this is from The Daily Journal, yeah.
EMILY M. BENDER: Which is um I should point out this is not satirical, for all we know, and this is Daily Journal which is apparently something associated with lawyers in California?
KENDRA ALBERT: Yeah it's like a legal publication.
EMILY M. BENDER: Okay. Um and this is written by someone named Sasha Rao um with the byline Maynard Cooper & Gale, LLP. Um so this seems to be in all earnestness, which is terrifying. Um so headline, 'The future of artif--artificial intelligence,' the future of mathy maths, 'in law firms.' Subhead: 'For the skeptics out there, AI's impact on the legal industry is relatively nascent.'
KENDRA ALBERT: Skeptics, she's talking to us y'all.
ALEX HANNA: Okay, fine. I mean I think I think I'd imagine she's talking more and so--I mean this article she's talking about the kinds of things that AI will be able to do, to automate. Um I imagine she's talking to people who are probably much more her senior um who maybe are technophobes, who are um probably not wanting to replace their kind of army of paralegals at kind of a big law firm um.
Yeah but there's things in here that are saying what, uh what this AI can do, what these mathy-maths can do. It can read the entire corpus of the law and be trained to identify relevant legal issues or concepts from that entire corpus of law.
And that legally trained AI in turn can generate original--generate is is in in italics--original content from the corpus of facts relevant to the legal tasks at hand.
EMILY M. BENDER: Yikes. So so that's what you want your lawyers doing is--
ALEX HANNA: Yeah.
EMILY M. BENDER: --making shit up and injecting it as evidence? Like this--
ALEX HANNA: It's just it's just that's yeah it's just I mean I was trying to select--I was trying to select and paste this into our chat uh Emily, but this the the sentence 'read the entire corpus of law' just was something that like I again not not a lawyer but reading an entire corpus of the law, I was like what does that mean?
EMILY M. BENDER: Right and and here's the thing. 'Read' is one of these pernicious ambiguous words right? And this is by the way I like to--Kendra I like to talk about linguistics as a good pre-law degree because we I think both fields really care about the details of linguistic expression, and you probably care a lot about ambiguity, so we talk about computers reading information in and writing to disks so we use read right there and then we talk about humans reading and it is not the same thing.
And when you have a sentence like this where the agent's actually a computer but the context suggests the human-like reading that's the setup for all kinds of trouble.
KENDRA ALBERT: Yeah and I think I yeah I think that's totally right and I think someone in uh you know folks like in chat have been flagging the the language we've also been talking about which is this entire like idea that there's like the corpus of law, while ever-growing, is finite at any given point in time. So you know that's that's an interesting statement on a variety of levels.
Like one is right sort of this idea that consumption of the entire corpus is what--it will produce an AI tool that will like somehow do things correctly. Right like that that was the necessary prerequisite, um which is sort of in itself a kind of fascinating a set of assumptions about how you know these tools work, right.
But I think what's also interesting to me about a sort of more from a kind of how um like legal tools work is it's actually really difficult to get a hold of the corpus of law. Right so you know if you think Elsevier is bad I have some terrible news for you about legal like legal search tools, um where you can take a class in law school about how to search in a less uh financially ruinous way for your legal organization.
ALEX HANNA: You mean I can't just buy a Westlaw terminal just like off you know at Costco?
KENDRA ALBERT: Yeah and so you know there are sources of you know there are free case law sources um you know there are even I'm even just talking about U.S. case law which this does not specify right to be very clear. But like you know and U.S. case law even in the United States does not constitute the entire corpus of law um but you know like what this would mean even if it were true right would be would select--would suggest this huge centralization of the ability to produce these tools by you know existing legal providers that already have access to this information for free, right? And this is not realistic right um it it--even beyond that but I do think it's interesting how this sort of doesn't really reckon with the question of like how do you get a hold of this?
Right you know in some ways when you look at uh how large language models are trained like part of the reason people talk about training them on things like Wikipedia or Reddit is because like that's stuff that's available on the internet right?
ALEX HANNA: Right.
KENDRA ALBERT: Um in in law you know if you did want to train a language model on sort of just legal sources right the number of folks who are going to be able to do that effectively is going to be quite different.
And this is actually I think a point that's actually in the last hype piece we'll talk about as well. But sort of this idea of you know what the corpus of law means and then the next line in the piece is, 'The corpus of facts relevant to the law, whether it be the entire document production in the case or the entire set of documents in the document room, is readily understood because it is based on language.'
So I I'll let Emily and Alex talk about the 'based on language' part because that is uh that's not my area, but it's the idea that the corpus of facts that's relevant to uh the law or a case is already pulled into documents. Like and you know I think represents a very particular view on what kind of law you're practicing.
Now this is a publication that's sort of and an article that's aimed at lawyers at law firms, and we can sort of maybe assume from the framing that it's sort of like law firms that are practicing kind of more business law, right, but you know if you're representing a criminal defendant you know and you're potentially looking at arguing for a plea deal or going to trial you know, which doesn't happen very often, the idea that the corpus of facts that's going to be relevant is present in some set of documents and versus like going and interviewing witnesses or hiring an investigator to go and interview witnesses, like it's just such a very specific vision of what legal practice looks like in a way that is sort of playing fast and loose with the--what folks might think these models are good at, right.
It's like oh this is all in the documents. Well it's there are very few areas of law right where actually you know your entire corpus of facts that's going to be relevant is going to be like nicely kind of cabined into a set of legal documents or where the law is gonna that's gonna be relevant it's nicely cabined into a database that you have access to.
ALEX HANNA: This is such a this is such an excellent point and something that I that I really appreciate hearing you talk about Kendra because I think there's an imaginary, for people who are not AI experts, of thinking that these things can effectively synthesize types of texts and effectively replace what are kind of rote types of work. And you know my my understanding of of kind of law and work comes solely from Better Call Saul, um, and and sort of when when when when when when Rhea Seehorn's character gets kind of kicked down to the basement to do document review um and but you have to like also know that like that that firm is oriented towards um mostly doing things like uh uh um you know big mergers big corp--you know big corporate law right.
Um but like thinking about so maybe there's these kinds of elements of of of of of synthesizing many documents, um but even that in terms of like interpreting what is relevant I mean these things are probably also not very good, at what does it mean to some sort of relevancy to certain documents that are going to be the things that that are going to evade um and or replace um human intervention.
And so there's an imaginary of what these things can do and what these things can read, uh or you know again 'read' uh in heavy quotes um and what these things can understand and again going to the octopus paper.
Um and and and but but I think that's what's worrisome in this is that this is somebody publishing this in the trade journal and has a very particular view of what these tools can do.
So something that the the hype machine does is really um really spew out of control the kind of capabilities of of what these tools can and cannot do.
EMILY M. BENDER: Yeah, yeah. The only thing I want to add um and and thank you both especially Kendra for bringing your expertise to this is you you threw to us this uh question of what do we mean by 'is readily understood because it is based on language?' And the rest of that is 'as opposed to images sounds or objects' and I think that that is better phrased as 'readily manipulated because it is based on language,' um and you know even if you had a situation where all the relevant facts were nicely laid out in a set of documents um there's still this question of what's the most effective piece of evidence to draw on in this point of time?
And how do we build a narrative out of this? And and so on. And you might be able to treat that as a text manipulation task if you had lots and lots of training data that basically said given this set of documents this is the the kind of output that we're looking for, over and over again. Um but that would be the sort of thing that you know works to some percentage--like how well does that work?
And ultimately you have to ask like is that is the production of that training data actually cheaper than just going ahead and doing the lawyering yourself? And what about as things change over time? So that training set is always going to be locked at some point in the past and not ready to deal with changing circumstances. And sort of on that point um a Pixel in the comments says my boss wants to see if we can synthesize quote 'future predictions' based on quote 'facts' from the scene data, historical articles, and LLM pre-training data set.
It's like no you can't, and I'm sorry that your boss is confused about that and to Alex's point it's kind of frightening for society that there are at least elements out there in the legal profession who are excited about this instead of appalled by it.
KENDRA ALBERT: I actually I want to follow up on that because I also think there's some particular stuff about use of these tools for law that I haven't like seen as much you know coverage of. And to be fair I think it's part of why some folks are just sort of skeptical about the entire thing. But you know actually I had a student ask me if they could use Grammarly to like do a grammar check on the work product they were producing in the clinic, and we're practicing law for real clients, so we have ethical obligations with regard to them.
And you know part of what we ended up talking about was well what you know in order to you know to produce that training set Emily that you're talking about, it would be really effective for the AI producer, the tool producer to train it on the stuff that folks are putting into it, right. To say like oh like well if you want to use our tool give us you know all of your model contracts right, give us all of your your precedent right like the stuff that you've you've written in the past, you know give us your memos right?
But one of the challenging things you know that we've seen with stuff like Copilot is that LLMs will just straight up produce training data right?
Like that's it's not a like out there question it's not that's not like a low probability of like oh it's never gonna happen right it's you know it's--and because the training data in this case may in fact be not--like attorney-client privilege confidential information and you're basically like sharing it with a tool manufacturer that may in turn mean that it gets produced when some other attorney at some other firm types something in and in fact I mean I think that there would be ethical questions about doing this but like if I knew that the attorneys on the opposing side of it if not--I would actually not do this on a variety of levels--but if one knew um the attorneys on the opposing side of the case were using a particular AI tool, like you could craft queries to try and get that stuff that they've written out of it.
ALEX HANNA: Right.
KENDRA ALBERT: So this whole idea right that you know the this like like even beyond the like, 'Does it work?' question to which the answer is like no but like you know the functional--what it functionally means to be to sort of put your client work into a training data set without the kinds like kinds of sort of robust like security guarantees and potentially invoking the kinds of problems that we've seen with LLMs currently or just other other AI systems, you know that's something where I'm like y'all like can we talk about this for a second because you should not be putting client information into this stuff until you've actually thought about what it means, if it might get produced to other counsel. Right like anyway so.
ALEX HANNA: Yeah well that's a great point because people are I mean there are probably people who are doing that already with ChatGPT and you know amongst the terms of service um within it, I mean they're using data that's input as training you know, "to help us do better," right um--
EMILY M. BENDER: Let's improve the system.
ALEX HANNA: Let's improve the system. You are the product here in terms of training data.
EMILY M. BENDER: All right, we've got to move on if we're going to get to this last one and still have time for fresh AI he--hell.
ALEX HANNA: Let's do it, yeah.
EMILY M. BENDER: All right Kendra what's going on here? And can we just shout about this for a second?
ALEX HANNA: Ah no. Ah.
KENDRA ALBERT: Uh so uh do you want me to explain what we're seeing?
ALEX HANNA: Yeah, do it because right now we're just losing--we're just screaming.
KENDRA ALBERT: This is an SSRN article so SSRN is basically the pre-print uh you know it's owned by Elsevier, it's the preprint server that's primarily used for like legal academics um and this is a piece by Andrew M. Perlman who's the dean of the Suffolk University Law School here in Boston. Um it's called, 'The Implications of ChatGPT for Legal Services and Society,' and it is listed as having two authors: OpenAI's Assistant, independent is its affiliation--I'm not sure I would call it independent.
ALEX HANNA: How is it independent?
KENDRA ALBERT: Just if we're if we're going all the way here.
ALEX HANNA: I thought it thought it's I thought it got its degree.
EMILY M. BENDER: Right it passed the bar somewhere, didn't it?
KENDRA ALBERT: Yes someone in chat is like, 'First author no less!' I'll I'll just point out while I'm being a pedant that authorship order is not really as much of a thing in legal journals um so you know don't read too much into it. But um do you want to uh Emily do you want to open up this uh the this paper?
EMILY M. BENDER: Yes or a PDF and browser I guess yeah I guess we had. So...sigh.
KENDRA ALBERT: Yeah, go ahead.
ALEX HANNA: And and there and let's so the note here and let's go to the footnote here because it's OpenAI's Assistant, the footnote, and it says a chat bot created by OpenAI, uh prompt: 'How should I identify you for citation purposes?' ChatGPT's answer: "As a general rule it is always best to use the name of the author or creator of a work when citing it.
In this case since I am a large language model trained by OpenAI, you can cite me as quote 'OpenAI's Assistant.' Alternatively you can simply refer to me as 'the Assistant' in your citation."
Um and like oof there's the start-up thing, internet personification, but also 'OpenAI's Assistant'? Like uh I just uh I'm just I would flip a table if it didn't it wouldn't ruin the stream.
KENDRA ALBERT: Yeah so what this author did is and he which he highlights in the introduction um and says is you know, "a human author generated this paper in about an hour uh through prompts within ChatGPT. Only this abstract, the outline headers, the epilogue, and the prompts were written by a person. ChatGPT generated the rest of the text with no human editing." So for example the introduction says, 1. Introduction, and it says, "Query: Write an introduction to a scholarly paper on chat--how ChatGPT will be used by the--in the law."
Um and there's a lot going on here um but uh uh I want to actually turn us to um the section where he has ChatGPT write a Massachusetts state court complaint. Um so there's a long section where he he asks ChatGPT to summarize all the things it you know can do for legal services. So I think it's a little further down, Emily. Um yeah there we are.
EMILY M. BENDER: Here we go.
KENDRA ALBERT: Um so this is he gave it a prompt uh to draft a legal complaint for a Massachusetts state court by John Doe against Jane Smith for ar--injuries resulting out of a car accident on January 1st blah blah blah. So I think this is actually like I think this is sort of an incredible example um of like why this thing why this is concerning uh because uh for right well one of the many reasons this is concerning.
So the top of this complaint this document that you know ChatGPT spit back out says, "State of Massachusetts, In the Court of Common Pleas. John Doe, Plaintiff, versus Jane Smith, Defendant. Complaint."
Now uh so I'm assuming probably not a ton of the folks who are listening or watching this practice in the state of Massa--in state of Massachusetts court um but and neither does ChatGPT um because uh the Court of Common Pleas is not a court that currently exists in Massachusetts. In fact it's a court that has not existed in Massachusetts since 1860. Right? And there's no indication in this article that that's true. I just was like I should probably figure out you know in prep for the stream, I actually started just looking at the things that it produced and was like hey I wonder what I can find out about this. Right?
But you know so both we have the fact that like you know form over reality, right like this produces something that looks vaguely like a complaint um but you know produces the name of a court that doesn't exist um currently. But also you know producing an article that you're sort of putting under your own name actually right where you produce this and you did not comment in any way upon this.
Right like there's you know there's no indication in this article that any like, nor would anyone know if they happen to not practice in Massachusetts or spend time looking at it, that you know this is wrong, right. And so like the you know for me like that's a really like this is a really concerning thing because there's this concept um where often we talk about law like legal services is what's called a credence good.
Right so this is the idea that like it's actually very difficult to tell whether someone is a good lawyer. Right like you know like there is not it's not necessarily always easy to objectively evaluate and it's especially not easy for people who are not lawyers to evaluate. And one of the things like with many other fields that you look for in sort of determining whether someone is a lawyer right is like the kind of thing we often use as a proxy right is like, 'Does it look like law stuff?' Right especially in contracts where it's like oh like does it look like does it have a bunch of words I've never under I don't understand and like it sounds really fancy and it is formatted really badly and there's a whole bunch of all caps text? That's probably a contract, it probably makes sense, right?
So I think like for me like looking at this complaint like that is generated, and we can talk a little bit about the epilogue that uh you know Dean Perlman wrote also, you know.
It's really like is really scary knowing what we already know about how people treat or understand legal services and how little um knowledge folks have especially folks who are low income and who don't have access to like you know lawyers um or like high quality legal services.
Like you're not they're not in an effective position to evaluate these kinds of things even already even pre you know GPT generation of documents.
EMILY M. BENDER: Yeah.
ALEX HANNA: Right. Yeah, so I mean you could imagine someone who can't afford any counsel and maybe having to produce something goes to ChatGPT and says ChatGPT--and they're I don't know they're asked for some kind of a a contract or some kind of a a writ or or some kind of piece of document and say, "Please produce this for me," and you know and it has real devastating effects for them.
EMILY M. BENDER: Yeah and I'm particularly worried about this argument that I see over and over again more in the medical field but I wouldn't be surprised if it showed up here of like--actually I have seen it here--access to legal services is too expensive and so therefore this is better than nothing we should we should give people who can't afford legal representation this.
And it's like--no. You've identified a problem, access to legal services is too expensive, that doesn't make this technology the solution.
KENDRA ALBERT: Yeah I mean I do think like that argument absolutely has taken place and like the--I think what's also so frustrating about it is you know there are things that we do know decrease--um you know increase the availability of legal services. You know there's really cool work that has been done to sort of like um produce like pro se materials like forms that are easier for folks to like fill out, right, like things that demystify parts of the you know the legal process um you know they're it like uh I think it's Washington State has basically sort of like a paralegal plus certification for tasks that where it actually doesn't make sense to have an attorney do it.
Um you know frankly decreasing student loans for attorneys decreases the price for uh you know like pay you know--uh decreases the price for legal services. You know all of these things are you know actually relative much more you know much more human interventions on a variety of levels.
I think the other thing that it I I think is important certain to say here right is that like this is a very, and I that we talked about this a little bit with the Daily Journal piece as well, but it's a very like specific vision of what it looks like to practice law. Right with an emphasis on the skills that are sort of traditionally often rewarded or thought of as important by like legal by like legal--by like the legal establishment right.
Like nobody's talking you know like we don't see folks talking about how actually like you know the thing that's important in practicing law is sort of the ability to kind of interview and connect with a client, right, even though actually for a lot of legal services that's like what matters.
And you know I'm sure that someone would say well um if you use AI for these tools for this for these documents, well you can connect more you know you can spend more time connecting with the client. And I think what's actually really interesting to me here and you see this actually in the the bit that Emily has pulled up, you know it says through prompt--this is from the the paper--"Through prompts of the sort presented in this article lawyers may soon generate first drafts of complex legal instruments that adopt the law firm style and incorporate the firm's substantive knowledge."
Well A, like what does 'the firm's substantive knowledge' mean, that's like an interesting way to frame it but B, like the idea that generating a first draft with one of these tools and then correcting it is a similar skill to one that lawyers already have is just totally bonkers to me.
ALEX HANNA: Right.
KENDRA ALBERT: And I say this as someone who teaches people how to practice law and so I spent a lot of time correcting or working with students to make sure their work like hits a standard that we would hand to clients, but I've never had a student produce a complaint that had the name of a court that didn't exist. Like that's not a thing that lawyers check for, right like it's not a thing you would expect. And so the idea that editing this is similar to editing other kinds of first drafts is I think a fundamental misunderstanding of both what the tech is doing but also how humans work, right like that's just not you know not how lawyers are going to expect to interact with first drafts and like that means that even if you are sort of taking this idea that oh it's only about first draft production, I don't think that's like--I think that misses the point of how people interact with like draft documents.
EMILY M. BENDER: Absolutely. And we need to get on to Fresh AI Hell.
ALEX HANNA: Yeah I want to make one point I completely agree and this is sort of something we were talking about last last--in our last episode about like around medical education, you know like these you know many of them advertise as medical education. Like one, it's really evident that you're you're not like you're not actually invested in kind of a theory of learning and what you're doing is that you're actually changing the nature of how you think learning should work in a way that learning doesn't really work, right.
Um so while you should be scaffolding knowledge instead you're thinking about um knowledge in education as being a corrective, and that's not like really in my--I've taught for years and that's not how students learn. Anyway yeah okay.
EMILY M. BENDER: All right so Fresh AI Hell. Alex does a wonderful job each time when I prompt her to sing, so I'm gonna up the ante here and this time we need the Fresh AI Hell musical theme in the style of a a musical theater piece where you're playing the character of a British banister with the white wig. Go.
ALEX HANNA: Oh gosh um all right.
EMILY M. BENDER: Barrister, not banister.
ALEX HANNA: All right. Oh crap you're doing--you've tapped into my musical improv desire. Um. [Singing] I'm here in the Courts of Commons, looking at these documents to read. If there was only some kind of tool, which will let me do it with speed. Fresh AI Hell is what saved me by the bell. Fresh AI Hell. Making my job more swell.
EMILY M. BENDER: Oh that's amazing.
KENDRA ALBERT: That was honestly that was the most impressive seen--thing I've seen. I was like you know I'm in awe.
EMILY M. BENDER: And just so you know Alex did not know that that was going to be--
KENDRA ALBERT: Yeah no no Emily refused to tell Alex the prompt before this.
ALEX HANNA: I'm gonna go into my uh alt career and uh improv musical theater yeah.
EMILY M. BENDER: Yes. All right so very fast, through Fresh AI Hell um Kendra you get to do the honors on this one.
KENDRA ALBERT: Sure so this is a post on Mastodon from Ann Lipton who's a uh you know an academic who studies like uh uh corporations um and she says uh, "Adding my own experience to the mix, someone on Twitter used jet--ChatGPT to direct me to purportedly precedent cases where shareholders have brought particular claims and settled for particular amounts. None of it was real; the cases never happened."
So we see this a lot with ChatGPT where it's generating fake case names based on theoretical real case names um so that's our Fresh AI Hell for uh transitioning uh out of our law segment, yeah.
EMILY M. BENDER: Making extra work rather than helping people. Not making things more swell. All right um Alex, do you want to do this one?
ALEX HANNA: Yeah sure so this is Dave Lee, he's a journalist at the Financial Times. "Having fun on Twitter on unpicking--" I guess unpacking-- "--a curious thing that happened to me this morning. Someone reached out--uh reached me on Signal thinking I was ChatGPT. Turned out it had given my number." And someone was having a chat and said, "Cool, Signal?" And it said, "Yeah, you can use Signal to chat with me." Uh, "OpenAI has developed an integration with Signal." And then it just and then it had a set of instructions, it says, "Add the OpenAI contact by scanning the QR code or manually adding the number," and it gives his Signal number.
And so many journalists have Signal numbers and--
KENDRA ALBERT: They're made up instructions for how to use Signal with OpenAI.
ALEX HANNA: Yeah it's like is it does this integration actually exist, I don't know.
KENDRA ALBERT: But I don't think it does I don't I don't think so.
ALEX HANNA: It just like basically has has an integration that it made up and then is taking you know somewhere in the training data this this man's Signal number, which many reporters have um is trying to get sources and such that but then just gives this number. So yeah this is--why his number, why is this? So yeah just incredibly incredibly just picking a number out of nowhere, um exposing private information.
EMILY M. BENDER: And again very similar to the previous example of these like externalities of you know being disruptive to bystanders. Right this guy had nothing to do with it except that his number was somewhere. Either his number was out there in the training data or you know you can make up--ChatGPT clearly has patterns for what U.S. phone numbers look like. Right and so--all right this next one's really depressing.
ALEX HANNA: Oh yeah you want to do this? Oh Lord.
EMILY M. BENDER: Okay so um Vanderbilt sent out an email to its student body um in the wake of the Michigan shootings. And it says, "In the wake of the Michigan shootings, let us come together as a community to reaffirm our commitment to caring for one another and promoting a culture of inclusivity on our campus--um by doing so we can honor the victims of this tragedy and work towards a safer more compassionate future for all."
And then in parentheses, "Paraphrase from OpenAI's ChatGPT AI language model, personal communication February 15 2023." Because they couldn't find it in their hearts to actually express their own emotions about this, and and reach out to their own student body with their own words, but rather decided to see what ChatGPT could do, but then they were like, well we're sticklers for academic honesty so we have to give or cite our sources um and I guess it's good that they cited sources there, but it's so gross that they did it in the first place.
KENDRA ALBERT: I yeah I just I'm like I think this is the one circumstance in which really it's better not to know? Right like you know diversity school-- like diversity statements often are relatively anodyne, like I don't you know nobody's probably necessarily going to be able to tell, right like congrats you made it worse, right. So.
EMILY M. BENDER: Yeah, yeah. And then they had to apologize and that's what this is, uh. Wait no this is a different wait--Peabody EDI--
ALEX HANNA: No no no yeah yeah they responds to, yeah and then this is the reporting on it and then it's uh, oh my gosh. And it's the this is the fact that it was in diversity in--and inclusion office just is really the ice--icing on the hell cake there.
EMILY M. BENDER: Absolutely. All right what's this last one? Oh this is such a bummer too.
ALEX HANNA: Yeah.
EMILY M. BENDER: So many people probably heard that Clarkesworld has closed submissions because they were getting spammed, basically a denial of service attack from people using ChatGPT to write short stories and submit them. And I learned from a friend of mine who's deeply involved with Norwescon that this isn't just sort of a bummer, it's actually a major bummer because Clarkesworld was one of the few places that people who didn't already have connections in the publishing industry could actually get a toehold.
And as I understand it there's been people within the speculative fiction community who are working pretty hard to increase the inclusivity, and these kinds of venues are super important for it and so that gets taken down by apparently somebody's idea of a get rich quick scheme because this is a paid publishing opportunity.
ALEX HANNA: It's sort of it's sort of like uh the a version of like finding any little small amount of compute to do crypto mining. It's sort of you're trying to find any kind of place that has an incredibly low barrier of entry for for for monetary gain.
So like yeah but this is--but there's real people doing submission review and yeah yeah pretty trash.
EMILY M. BENDER: Yeah all right so we're over time but we had a we had an uplifting thing to end with so I'm just going to display it which is the FTC has got our back this time. So FTC put out a business blog with the title, "Keep your AI claims in check," um and um we will definitely post a link to this in the show notes and I need to get on--I'm promoting this on on Twitter and Mastodon um but basically they're saying, no you got to make sure that you are describing this accurately um and Kendra you had some hypotheses about where this is coming from.
KENDRA ALBERT: Yeah so there's there's a theory um and I know we've uh we've talked about it on past shows but the the CEO of DoNotPay issued this like hilariously uh mind-blowingly stupid challenge to say that he would you know pay a million dollars to anyone who used uh ChatGPT to, like--in an earbud to argue in front of the Supreme Court and a lot of the pushback focused on the fact that you can't get earbuds into the Supreme Court which I think kind of missed the point of the the whole thing.
But in the aftermath of that you know um there was a bunch of folks who did a lot of investigation of like what um DoNotPay was using AI for in a way that indicated that they probably weren't using AI at all, or at least certainly weren't using it in the way they claimed to be. So although it's I think it's not only them that were making uh you know uh very false claims about using AI as part of their tools, it's possible that one of the reasons this sort of came out was because of some of the backlash that literally came out of this like ridiculous tweet where the guy uh sort of made a claim that he would uh um allow--he would give somebody a million dollars to you know go be the voice for a an AI in front of the Supreme Court.
ALEX HANNA: So shorter FTC you know, "Stop--kids stop the AI hype."
EMILY M. BENDER: Yeah yeah. Exactly, exactly. And everyone should listen to the brilliance that Kendra brings in all their work.
ALEX HANNA: Yes.
EMILY M. BENDER: Thank you so much for joining us.
KENDRA ALBERT: Thank you for having me this is so much fun.
ALEX HANNA: This is a pleasure. Yay, amazing. We'll drop a link to all Kendra's information in the show notes but until next time, yeah, stay out of AI hell.
EMILY M. BENDER: Yeah stay out of AI hell but do attend Alex's improv musical.
ALEX HANNA: Yeah yeah the next the next musical will be me um doing an improv on saving a leaky roller derby warehouse. [Laughter]
KENDRA ALBERT: Can't wait.
ALEX HANNA: All right, take it easy y'all. Bye.
EMILY M. BENDER: Bye.
ALEX: That’s it for this week! Thanks to Harvard University legal scholar Kendra Albert for their views from the legal trenches.
Our theme song is by Toby Menon. Graphic design by Naomi Pleasure-Park. Production by Christie Taylor. And thanks, as always, to the Distributed AI Research Institute. If you like this show, you can support us by rating and reviewing us on Apple Podcasts and Spotify. And by donating to DAIR at dair-institute.org. That’s D-A-I-R, hyphen, institute dot org.
EMILY: Find us and all our past episodes on PeerTube, and wherever you get your podcasts! You can watch and comment on the show while it’s happening LIVE on our Twitch stream: that’s Twitch dot TV slash DAIR underscore Institute…again that’s D-A-I-R underscore Institute.
I’m Emily M. Bender.
ALEX: And I’m Alex Hanna. Stay out of AI hell, y’all.