QAA Membership Podcast

Professor Phillip Dawson: Assessment Design for a time of AI

QAA Membership

In this podcast, we share Professor Phillip Dawson's keynote address on Assessment Design for a time of AI, which he presented at our 2025 Quality Insights Conference last month. 

Phillip is co-director of the Centre for Research in Assessment and Digital Learning (CRADLE) at Deakin University, Australia. Having initially studied AI and cyber security before his PhD in Higher Education, he's now a leading expert on assessment validity in the digital age.
 
Artificial intelligence is capable of producing outputs that satisfy the requirements of some high-stakes assessments across a range of disciplines including law, medicine and engineering. This has driven concerns about a new wave of artificial-intelligence-enabled cheating, as well as questions about the sustainability and authenticity of current assessment practices. 

This presentation explores how assessment needs to change for a time of artificial intelligence. It draws upon work Phillip has done as one of the leaders of the major Australian project Assessment Reform for the Age of Artificial Intelligence, which was funded by the Australian higher education regulator. The presentation's main focus is resolving the tension between preparing students for a world pervaded by artificial intelligence, and ensuring the integrity and security of assessment.

Key points covered are:

1. Assessment matters, but so does what is assessed (does it need to change in a time of AI?)
2. Validity matters more than cheating (is AI panic more of a threat to validity than AI itself?)
3. Future-authentic assessment (prepare for their future, not our past)
4. Reverse scaffolding (use AI once you can do it yourself)
5. Zone of Proximal Development (tools for production vs tools for learning)
6. Cognitive offloading (extraneous vs intrinsic)
7. Evaluative judgement (but it can’t be the only thing)
8. Make structural not discursive changes (no bogus rules)
9. No such thing as AI-proof assessment (beware of anybody who says they have one)
10. Swiss cheese programmatic assessment - layers of imperfect assessments tell us more than one good one

If you would like to view Phillip's slides, go to the video version of this podcast: https://youtu.be/8ANOlnypHFw 

Don’t forget, if you’re a QAA Member, you can book your place now for our in-person Member Network Conference on 2 April and access a range of thought-provoking and topical presentations and discussions from across the sector just like this one. Click the link to find out more: https://events.qaa.ac.uk/event/abc3576c-0f95-491f-af9d-8ee5cca86eec/summary

Dr Kerr Castle:

Welcome to the QAA podcast.

Dr Kerr Castle:

I'm Dr Kerr Castle and I'm very happy to introduce this latest episode focusing on assessment design for a time of artificial intelligence.

Dr Kerr Castle:

Generative artificial intelligence has had a seismic impact on learning, teaching and assessment in higher education, and is a topic that QAA has been supporting members on over the last couple of years. A critical dimension of this is sector concern over assessment validity and academic integrity, raising a fundamental question: how do we assess in an age of generative artificial intelligence? To help us answer this question, we invited Professor Phillip Dawson, co-director of the Centre for Research in Assessment and Digital Learning, or CRADLE, at Deakin University, to be the keynote speaker at our 2025 Quality Insights Conference and share his views. Having initially studied AI and cybersecurity before his PhD in higher education, Phil is now an internationally renowned expert in assessment, artificial intelligence and academic integrity. This podcast features Phil's full keynote address, which proved to be one of the standout moments from Quality Insights this year, and if you would like to view Phil's presentation slides for additional context, you can find a link to them in the show notes. So let's hand over to Phil.

Professor Phillip Dawson:

Thank you so much for such a warm welcome. It's a joy to be here. We in Australia admire what you at the QAA do. It's a wonderful thing to have a body that's so concerned with quality in higher education and supporting a sector to do a good job of it. I'm joining you from Australia. I'm in Melbourne on the stolen lands of the Wurundjeri people and pay my respects to their elders past, present and emerging. So I'm going to start with some really boring stuff. I'm starting with a disclaimer. This is the most boring bit of the presentation.

Professor Phillip Dawson:

I am what you could call a standards-based assessment person. So you may have grown up wanting to be Batman or fight fires or go to the moon. I didn't grow up wanting to be a standards-based assessment person. But let me tell you it's a really exciting thing, because my primary concern is how can we tell if people have met some predetermined outcome that we think is really important for them educationally? How can we know? And then how can we warrant to society that this person can do this thing? So that's where I'm coming from.

Professor Phillip Dawson:

The UK is very much aligned with Australia in that, but there are some locations in the world where standards-based assessment is not the common way of doing things. In some places, people are marked on a curve against their peers to say, well, you did better than your peers. That's not me. I'm about how can we know if someone's met the standard. But I also support AFL. If we were in Australia, I'd make a joke about our football codes, and there's a secret message on the slides, but I'm not going to do that. The AFL here is assessment for learning. I think assessment's learning purpose really matters, but I'm mostly going to be talking about assessment of learning today, because that's something that's really under threat at the moment. How can we really know what people are capable of? There are many other conversations we could be having about AI. They're all worth having. The one we're going to have is around assessment of learning.

Professor Phillip Dawson:

I receive research funding from educational technology companies. Most notable and most relevant in this space would be Turnitin, which is to say they've paid the direct costs of some of my research, and Inspera has funded the production of some work as well. If I were a Formula One driver or something, though, these logos would be very small, because the money amounts are quite small. And finally, my mum helped me contract cheat in year four. We had to do a poster comparing the similarities and differences between Canada and Australia, and I was terrible at this, and mum helped me too much. She basically did the task for me. I have come to understand now that that crossed the line, but I just kind of wasn't able to do this task. Later in life I discovered I have a thing called aphantasia, which is not having a mind's eye, so I can't picture a duck, a square, my children's faces or a sunset or anything like that. This made doing any of these visual tasks really challenging.

Professor Phillip Dawson:

If you'd like a hard-hitting documentary on this issue of contract cheating in the early years, I recommend the Peppa Pig episode 'School Project'. In it, Madame Gazelle sets the students a task and they all go home and get their parents to do it for them. And they do that Peppa Pig laughing thing at the end, where they all lie down. It's funny. But I raise this to say that all of our understandings of integrity are socially constructed and they vary. What's acceptable in one context is different to another context. What's acceptable with AI? We're making that up, all of us together, as we go along. So let's do some make-em-ups.

Professor Phillip Dawson:

Okay, that was the boring part. The rest of the talk is much better. I want you to take four things from this presentation. Firstly, that assessment needs to change for a time of AI. Secondly, how we might get there, and when I'm talking about that, I'm going to talk about a project that I worked on with some colleagues for the Australian higher education regulator, TEQSA, where we tried to sketch out what that might look like. Then I'll give you some key considerations to keep in mind as you do the work of assessment change. And finally, I'm going to advocate and push for a particular type of assessment change: the necessity for structural changes. Okay, let's get on with it. We need to change assessment. What we've got here is a couple of rather low-res GIFs. I'm sorry about this. These are the best quality I could get.

Professor Phillip Dawson:

On the top right, we have a browser plugin that is designed to do your online quizzes for you. That's what it does. That's the thing. It's specifically designed to go in and auto do your quizzes. It uses generative AI to go and solve them for you. There's no interaction by you. There's no learning. There may be assessment, but it's certainly not assessment of what you can do. Now, that's been around for a little while. I think that sort of browser plug-in has been around for at least a year.

Professor Phillip Dawson:

But recently OpenAI released a thing called Operator. Now, we're familiar with ChatGPT: you type to it, it tells you stuff, it might draw you a picture, it might give you a file to download, whatever else. What Operator does is run a web browser, and it clicks the buttons and does stuff within the web browser. And there's a really interesting video up on YouTube from a channel called Cursive, and it shows how someone was able to basically say, hey, log into the learning management system and just do the course. Just do the course. And it keeps asking for permission, Operator does, to continue to do the course, and the person says no, no, just do it, stop talking to me, just get in there and do the course. And it types out the answers and it clicks the quiz buttons and the whole thing.

Professor Phillip Dawson:

And it's not necessarily great work, but it's passable work, and that's a real problem. That is the problem that we have: that AI can do good enough work. You know, there have been sort of arguments about exactly how good it is, and I think this is a great paper from early on: GPT-4 passes the bar exam. This is about the bar exam that exists to practise law in a lot of states in America. And we saw very rapidly with the advent of GPT-4 that it was able to exceed the passing range for becoming a lawyer in those states. And yes, there were people who went and critiqued the findings of the paper and said, oh, but it's not exactly right, maybe it's closer to the passing range for an average student. But the problem for me isn't how good it is, it's: is it good enough? Does it achieve the passable standard? Because the passable standard is: you can go and build a bridge, you can go and operate in surgery, you can go and teach my kids. That's the pass mark.

Professor Phillip Dawson:

Your various degree classifications and all of that, which I know you love so much in the UK and which we don't have in Australia: yes, those matter too, but whether someone gets a degree or not is the thing that really, really matters. And AI can do that level for many types of tasks. For a while there, there were conversations that were basically 'AI can't do X, Y, Z', and look, I'm not going to go through this one in depth. I think this book is a great time capsule of what some clever AI people thought in 2021 was completely solved all the way through to nowhere near being solved. Things like AI writing interesting stories or interpreting a work of art were considered nowhere near solved, and in just the space of a few years we've gotten to the point where some people would argue those are solved problems, or problems with real progress. So there is a need for assessment to change in a world where AI can do the things that we used to assess. So how are we going to do it?

Professor Phillip Dawson:

I'd like to talk now about some work where we got together a whole bunch of people in Australia. We got together 18 experts in assessment, higher education and AI to talk through this really difficult problem. I'll do a small piece of patriotism now. You may think: why Australian experts? Why would they matter? Australia is the number one country in higher education research, and specifically in higher education assessment research. You can go and look at bibliometric studies of assessment and evaluation in higher education, or your big research metrics groups; they find Australia is where it's at for higher education assessment research, with big thanks to our wonderful assessment colleagues in the UK as well and around the world. Australians do have stuff to say on this.

Professor Phillip Dawson:

So we got together, 18 of us, and we had this challenge of, well, what are we going to do? And we produced this document, Assessment Reform for the Age of Artificial Intelligence. We reached what we called 75% consensus, which is a bit of a contradiction in terms, but it was basically: what could we as a bunch of various people agree to? And here's where we got to. We have two guiding principles for what assessment needs to do. The first one is: assessment and learning experiences equip students to participate ethically and actively in a society where AI is ubiquitous. These two principles are kind of our non-negotiables. There's a lot to this one. You could consider this the sort of assessment-for-learning principle that we've got going on. Equipping students to be in a world where AI is there means they probably have to use it, but not just as sort of operators pushing the button; thinking, participating ethically and actively.

Professor Phillip Dawson:

We had a bit of a chat last week with our higher education regulator in a closed-door session, and I don't want to reveal all the things, but I think we might be adding something around sort of 'critically' in there as well, because something that's really come through since we did the work is a need for that sort of critical capability. People talk about AI literacy or critical digital literacies or those sorts of ideas. So it's more than just using the tech. This thing of a society where AI is ubiquitous is an acceptance that the world our graduates go to is going to have AI. There's a comment in the chat getting a few thumbs up.

Professor Phillip Dawson:

'Personally, I'm not sure if the problem is the AI. I think the problem lies in how/what we assess, although this is not a black and white situation and I think there is no easy answer to this at the moment.' And look, I really agree with that. I think a lot of this is just assessment stuff, isn't it? A lot of this is: we need to do decent assessment. If the concern is people are just going to go off and get AI to do it for them, well, contract cheating has existed for ages. There are studies analysing the provenance of various works, and if you go back to sacred texts like the Bible, or Donald Trump's The Art of the Deal, those are often claimed to not be written by the person they're said to be written by. So not doing the work yourself has always been a thing. I do think this first principle, though, really does have something in it: we might need to do assessment for the world our students go to, where they will be using these tools, hopefully actively, ethically, critically. Our second guiding principle is more of an assessment-of-learning principle: forming trustworthy judgements about student learning in a time of AI requires multiple, inclusive and contextualised approaches to assessment.

Professor Phillip Dawson:

There's a lot in this one as well.

Professor Phillip Dawson:

There's, firstly, trustworthy judgements. We need to be able to make judgements that matter, that we trust, that society can trust. In Australia there's a lot of talk at the moment about social licence for universities: the public having confidence and faith and belief in the mission that we do, and that we do a good job of it. Social licence depends on this. If we don't have trustworthy judgements about what our graduates can do, then society is let down. And to do that requires multiple moments of assessment. There's no such thing as one perfect assessment that can let you know if the person can do the thing. It's always a patchwork. Inclusive: there are real concerns about inclusive assessment, and assessment's less good at being assessment if it doesn't work for everyone. And contextualised approaches to assessment: there's no such thing as an assessment mode that works everywhere.

Professor Phillip Dawson:

Now, in addition to those, we kind of take them down a level to five propositions. These are a bit more practical, but they're not like a field guide on exactly what to do. They are a compass but not a map. These all start with 'assessment should emphasise'. The first one: assessment should emphasise appropriate, authentic engagement with AI. So assessment needs to involve students using AI in some ways at some times if we want to graduate them for that world, and it should be authentic in some way. I think whenever we set a task that has elements that aren't authentic, that don't represent the world of practice beyond the university, we should justify why we are doing something inauthentic. That's not to say all assessment should be authentic. There are good arguments why it shouldn't be sometimes, or why not all things in the world of practice should be at university. There's no 'spending four hours a day answering emails' assignment that I've seen, but that's very authentic to my professional practice. Now, 'appropriate' is in there as well. We copped a bit of flak on that one, and it is a bit of a weasel word, I get it, but we're saying not all uses that are authentic are appropriate for the higher ed space. Okay, second proposition: assessment should emphasise a systemic approach to program assessment, aligned with disciplines and qualifications.

Professor Phillip Dawson:

So this is saying: zoom out from focusing on a single assessment task. Think about the program of study, the degree qualification; think about assessment at that level. How does assessment work across the program? How do things feed from first year to second year to third year and onwards? What does the whole say that's greater than the sum of its parts? Because if we want to have those real moments of 'yes, we're confident the person can do the thing', there are problems with every single task that we could possibly do. Okay, so that's thinking systemically. You may have heard of a thing called programmatic assessment. That's where you do a really robust look across a whole degree, map all the assessment out and all that. We don't use the term programmatic assessment here. It's scary, and it's largely regarded as infeasible outside of very well-funded medical courses. A lot of what's taken off in Australia has been sort of course-wide or systemic approaches, where we say, hey, let's pick the three tasks that really tell us if someone should graduate, if they're really at that level. Okay.

Professor Phillip Dawson:

The third one: assessment should emphasise the process of learning. There's been a real focus on product traditionally. We need assessment that uncovers what students are actually doing. Some of that is because some people are conceding assessing product is problematic now: people could use AI for the product, et cetera. It doesn't tell us very much. We need to focus on process, but it's also where the learning happens. If assessment truly is for learning, we need to see those check-ins, those moments along the way of how people are producing. The fourth: assessment should emphasise opportunities for students to work appropriately (appropriately again) with each other and AI. So assessment should actually involve students using AI, for them to function in a world where AI is ubiquitous, but also not losing working with each other.

Professor Phillip Dawson:

And finally, assessment should emphasise security at meaningful points across a program to inform decisions about progression and completion. So this is saying: don't have some high-security moment of assessment for every task. Don't turn every assessment task into an exam, for instance. Find the key moments across the whole qualification, and there might only be half a dozen of them, and invest in securing those moments: perhaps with an exam, perhaps with a placement, perhaps with interactive oral assessment where you talk to a student about their work. None of these are new approaches. I was doing interactive oral assessment in computer science 20 years ago. These are standard things that we do in higher ed, but we don't deploy them meaningfully across programs. Often we deploy them haphazardly, because the person who is an integrity zealot has an exam or something, and those that don't really care about integrity so much and care a lot about learning, maybe they don't use them. It's really about trying to find those moments across the program that matter and securing those, and maybe focusing more on learning for the other moments. And we have progression and completion in there because it's not fair to a student if they get through first year and they can't do the things. So we want to be helping secure progression and completion.

Professor Phillip Dawson:

Okay, I have a little addendum. The previous stuff was consensus between the 18 experts and all that; this one is me. Firstly, we don't mention cheating in that, and I don't think it's a useful framing for AI. AI is there. AI is an authentic tool of production. AI gets in the way of our ability to judge what students can do, so we need to address it. But cheating is not necessarily the most useful framing. Secondly, we also don't use the word validity, but validity is what underpins all of that conversation. I'll get into validity a bit more. It's not a scary word, don't switch off. We will all be using the word validity by the end of this.

Professor Phillip Dawson:

And thirdly, we can't substitute all of our assessments of disciplinary outcomes with assessments of AI use or critique. We don't want to be giving people kind of bad AI degrees, where you just used AI and then you were assessed on how well you critiqued its output. Those are fundamentally different things to what we say our graduates can do right now. So, yes, AI use and critique is important, but the good old stuff that we thought mattered might still matter. Okay, that's how assessment might need to change for a time of AI. I've now got half a dozen-ish things that I'd love you to keep in mind as we go through the work of assessment change. All right, I want to start off with this great quote. It's from the Challenging Assessment special issue of Assessment and Evaluation in Higher Education, in the editorial from Liz McDowell (I think it's Liz).

Professor Phillip Dawson:

'Many of the dilemmas we face are not about assessment per se but are, at heart, debates about what should be assessed.' I really want to flag that: as we have this great assessment conversation that I really want to have, because I'm an assessment person, I really want to acknowledge that we should question whether we are assessing the right things. Are the things we used to assess, that we think maybe we can't assess anymore, still worth assessing? Do we need to assess some different things? When we're having these conversations, let's ask ourselves: are we having an assessment conversation or a 'what should be assessed' conversation? And I acknowledge they're often quite enmeshed.

Professor Phillip Dawson:

Secondly, I want to plug a philosophy, a line of thinking, a paper that we've been pushing from my centre, which is this: validity matters more than cheating. Cheating and the whole AI thing have really taken over the assessment debate. In this paper in Assessment and Evaluation in Higher Education, we argue that, yeah, cheating and all that matters, but assessing what we mean to assess is the thing that matters the most, and that's really what validity is: are we assessing the thing that we think we're assessing? Now, in our efforts to address cheating and AI, we often change assessment, and that's really to try and support validity. Quite often, if someone uses AI, we can't make a valid judgement of whether they've met the outcomes we were trying to assess. So we change assessments to try and address that, to try and promote assessment validity, but in doing so we sometimes hurt validity as well, because we make the assessment worse at being an assessment of the thing.

Professor Phillip Dawson:

I'll give you an example: remote proctored exams, a very contested area. I've got a chapter in the Handbook of Academic Integrity that really tries to look at the pros and cons of the arguments in that space and the evidence. Let's say we deploy a remote proctored exam because we are concerned that students are going to cheat. That's an attempt to support validity, to address the cheating threat to validity. The problem is, some students have got test anxiety, so they underperform in this test. The test becomes less valid for those students because it's remote proctored now. So we've got kind of a supporter of validity over here and a threat to validity over there. It's complicated.

Professor Phillip Dawson:

I'd love us to move beyond chats about cheating and AI as if they are the main game, to focusing on: are we assessing what we mean to assess? Validity is the main thing. We have another question: how can you engage professional and regulatory bodies, PSRBs, in those types of conversations? That is what is driving a lot of the validity and cheating arguments.

Professor Phillip Dawson:

Changing assessment to allow validity in the world of AI is often in contrast to the long-held thoughts of the PSRB agencies. Gee, we have this conversation in Australia as well. Okay, a few thoughts. Firstly, we have more power in this conversation as a collective group of educational institutions and discipline-based academics than we might think when we group together, and in Australia that's often done through the deans' councils. So we have these organisations, the deans of education or whatever else, deans of business, those sorts of groups. They actually have a fair bit of sway with these PSRBs. So you have more power than you think, firstly. Secondly, I think it's about having conversations with the right people in them. In the Australian context, a lot of these bodies have education people working for them, who I've found are a lot more progressive in their views than I expected. I've talked with our accounting regulation bodies and thought, gee, those accountants are going to be very austere and conservative, but they're actually a lot more open than I thought they'd be. A lot of the success in Australia has been through sort of missions from the disciplines to the PSRBs, where the disciplines get together, maybe even get the deans' councils to put in a little bit of cash to develop a report or a submission, and then try and collaborate together.

Professor Phillip Dawson:

It's been really hard. There are some of them in Australia that I find incredibly frustrating. They have some anachronistic views. But I'm with you. It's a struggle. I'm sorry I don't have a magic wand to answer you, but very much if you can get onto the validity conversation with them, that's really good.

Professor Phillip Dawson:

In the paper we talk about how validity is a God term. No one's anti-validity; we're all pro-validity in the assessment world, and it's a very powerful term to bring things into conversations. Elsewhere I argue that validity is a great way to bring inclusive assessment into the conversation, because assessment that's not inclusive has real validity problems. Okay, I'm going to make a push for future-authentic assessment. Now, you're familiar with authentic assessment: assessment that represents the world outside of the university. It's contested, and there's a great paper by Tim Thorns recently around how authenticity is often presented as a panacea, and it's not, but it's still something we do want to embrace in assessment where appropriate. And we might even want to think about future authenticity: not just assessment that represents our past experiences of the discipline, but what the future of the discipline is or, at the very least, current AI practices in the discipline. As an example, my university is actually getting academics to go out and spend time with the disciplines to understand and have conversations about AI. We have an incredibly modest, meagre funding scheme that has supported many academics to go out there and find out where AI is at, and in our experience the disciplines and industry are crying out for that sort of engagement with academia around AI. They don't want us to tell them what to do. We don't want them to tell us what to do. It's a real meeting of minds to understand each other, because only through doing that can we really have authentic assessment that represents our students' futures.

Professor Phillip Dawson:

I next want to talk a little bit conceptually. I'm going to talk about scaffolding. We're familiar with scaffolding, possibly from that Education 101 class many years ago. It's this idea that to teach my kids how to stack the dishwasher, I first show them how to do it, then I might guide their hands into it, giving them assistance from a more capable other, guiding them into doing this thing and gradually providing them feedback. And I'm withdrawing these supports as they go, and eventually they can stack the dishwasher all by themselves, supposedly. Scaffolding, I don't think, is a great idea for AI. AI will not withdraw scaffolds.

Professor Phillip Dawson:

If you use AI to learn how to do something in your university studies and you're not really careful with it, if you just get it to do the thing for you, it will do the thing. It won't develop your ability to do it yourself. I'm going to push for reverse scaffolding, which is: you can only use AI to do it after you've shown us you can do it yourself. This is the calculator licence kind of way of thinking: once you know your times tables, you can use the calculator to do it for you. I know the calculator analogy is limited. I worked on a paper with Jason Lodge where we really take the calculator analogy apart. But if you want a good use of it here, I think it's: you can do it with AI once you've shown us you can do it yourself. To do this, we've got to be thinking across degrees systemically, and kind of zoom in a little bit further and go a little bit more Vygotsky.

Professor Phillip Dawson:

You may be familiar with this idea of the zone of proximal development. There's stuff you can do by yourself, there's stuff you can do with the assistance of someone else, and the stuff in between is the zone of proximal development. Now, AI tools for production, this is your ChatGPT, don't respect that. AI tools that are there to do stuff for people just do the stuff. They don't develop your ability to work through the zone of proximal development to the point where you can do it yourself. So really, when we're allowing students to use AI, we need to be concerned about how it may wither their ability to do things.

Professor Phillip Dawson:

I want to talk briefly about cognitive offloading. This is the use of some sort of tool or physical action to get something done in an easier way, something that takes less mental burden. People have been complaining about technology and cognitive offloading for a very long time. You go back to Socrates: Socrates complains about writing, that people will appear to know things but really they just read it or wrote it down, and that this is a real problem for education. Socrates was not a fan of that particular technology. All the way through with many other technologies, we've had these same concerns about cognitive offloading.

Professor Phillip Dawson:

My pitch to you around cognitive offloading is to think about the learning outcomes you are setting for students and assessing, and ask yourself which of these are intrinsic and really matter, and which are extraneous busy work. An example of this is referencing. I don't know how to do referencing by hand and I don't care. EndNote has always done it for me. I never learned it, and that's fine; it was never the outcome being assessed. We need to take that same sort of lens to what we do now and really be honest with ourselves: what's busy work? Can we allow students to use AI for the busy work, to do the cognitive offloading? Let's not allow them to do it for what's intrinsic, though. And I think this is the last concept in this segment: evaluative judgement. This is a pitch I want to make to you.

Professor Phillip Dawson:

Evaluative judgement is your ability to make decisions about quality: about the quality of your own work, the quality of the work of other people and the quality of the work of AI. It really matters that you know if AI is giving you garbage. If AI is giving you garbage work, you need to know that it's not good enough, that you shouldn't use it. We had some academics using AI to produce submissions to a parliamentary inquiry in Australia and submitting them thinking they were good enough. It made up all these accusations about the big four consulting firms that were completely untrue. They had very poor evaluative judgement, these academics. You need to be able to judge quality work in your field, and students need that as well. Now, this includes judgements about AI products, but also about your processes of using AI: are they good enough? You also need to develop your evaluative judgement because AI is increasingly judging our work.

Professor Phillip Dawson:

Just one final plea, and it's about the necessity for structural changes to assessment. My colleague Thomas Corbin and I are currently working on a paper about this, so you are getting thoughts in progress. There are two sorts of changes you can make to assessment to address AI. There are discursive changes, where you're addressing AI in assessment through the instructions you give to students. You say something like: you can use AI for editing but you can't use it for writing. Contrasting with that, there are structural changes: addressing AI in assessment through changes to the task that are unavoidable. For example, we will have a conversation with you about your work; or this take-home task is now an exam; or we have chopped this task up into pieces and we want you to have a conversation with us in between them; or many other different ways we can fundamentally change what students do. Our pitch is basically that discursive changes are not the way to go. You can't address this problem of AI purely through talk. You need action. You need structural changes to assessment. Traffic lights that tell students 'this is an orange task, so you can use AI to edit but not to write it'?

Professor Phillip Dawson:

We have no way of stopping people from using AI if we aren't in some way supervising them. We need to accept that we can't pretend some sort of guidance to students is going to be effective at securing assessment, because if you aren't supervising, you can't be sure how AI was or wasn't used. So don't set restrictions that can't be enforced; those just hurt validity. Secure some tasks that really matter for the program of study, and accept that others will see significant AI use, and that might be a fantastic thing. All right.

Professor Phillip Dawson:

My final plea is: think systemically. Think at the level of the program. Don't look for one perfect AI-proof task. Assemble many imperfect tasks, like layers of Swiss cheese, across a program. Someone can use AI in your take-home assignment, but when they get to the exam they won't be able to do the thing. But the exam's not perfect, so maybe we might want to have a chat with them, maybe we might want a placement as well. We might want to layer many imperfect things across a whole program of study so we can make a really high-quality judgement of whether this person is ready to go out into society and do the things we say they can do.

Dr Kerr Castle:

Thank you to Phil for such a thought-provoking keynote. As mentioned at the top of the podcast, you can view Phil's presentation slides via the link in the show notes. Thanks again to Phil for joining us and to you for listening. We really hope you enjoyed the presentation and look forward to sharing more content like this with you soon.