Assessment Design and Artificial Intelligence webinar

The QCAA developed the Assessment Design and Artificial Intelligence webinar to help teachers and school leaders deepen their understanding of assessment design while considering the recent developments in AI. The webinar focuses on assessment design and its practice in secondary schools that use the Australian Curriculum and QCE system. The webinar provides school leaders with further approaches to complement the guidance in the Australian Curriculum v9.0 and QCE and QCIA policy and procedures handbook v4.0.

The webinar includes presentations from:

Associate Professor Jason Lodge from the University of Queensland, an expert in the ways technologies, such as AI, are influencing learning and teaching
Mahoney Archer, Deputy Principal at Albany Creek State High School.

The topics in the webinar include:

the principles of assessment design
the effect of generative AI on assessment practice
a school-based response.

Download video

Assessment Design and Artificial Intelligence webinar (MP4, 93.4 MB; 1 hour 7 min)

Video transcript

Introduction

Anthony Barnett
Principal Project Officer, Academic Integrity and Research, QCAA

Okay, it’s three-thirty and I’d like to begin by welcoming everybody and saying thank you for attending this webinar on assessment design and AI.

Before we begin, I would also like to acknowledge the traditional custodians of the lands on which we meet today. We pay our respects to their elders and their descendants who continue cultural and spiritual connections to Country and we extend that respect to Aboriginal and Torres Strait Islander people here today. We thank them for sharing their cultures and spiritualities and recognise the important contribution of this knowledge to our understanding of this place we call home.

Once again, thank you for joining us today. I’m Anthony Barnett, Principal Project Officer, Academic Integrity and Research, here at the QCAA. I’m joined today by the QCAA officers who are supporting webinars, Shannyn McSweeney and Scott Zadravec. This webinar will be recorded, and an edited version will be available on our website. The chat function, as previously stated, has been enabled to allow you to post questions for the presenters during the webinar if you have them. Once we’ve heard from our presenters, there will be time to address a few questions, but the chat box will not be included in the recording. The questions and comments you provide are also useful for us at QCAA, as they provide insight into ways that we can support your needs going into the future.

So, this is the second in a three-part series on the theme of artificial intelligence and its relevance to academic integrity, assessment design and digital literacy. This webinar aims to explore the current research on assessment design and artificial intelligence and how it can inform your educational decisions. You will learn about the purpose and qualities of assessments, as well as the opportunities and challenges that AI presents for educators.

Today, we have two expert presenters. We’re delighted to have Jason Lodge, Associate Professor at The University of Queensland’s School of Education, who’s focused on the ways technologies — such as artificial intelligence — are influencing learning and teaching. We will also hear from Mahoney Archer, Deputy Principal at Albany Creek State High School, who has a keen interest in assessment design in the junior and senior phases.

So, to remind ourselves of the context for this webinar. Generative artificial intelligence, or gen AI, in the form of ChatGPT — as I’m sure many of us are aware — took the world by storm in November 2022. It sparked concern, interest, transformation and a deluge of questions about the place of gen AI and emerging technologies in education. As educators around the world continue to grapple with the implications of these emerging technologies, the opportunities and risks they present are becoming clearer. So to support schools to implement the Australian Curriculum and the QCE in this landscape punctuated by gen AI, the QCAA has produced a range of resources, including the updated academic integrity courses for students and teachers, and AI factsheets. In addition, we’ve created this series of three AI-focused webinars, of which, as I mentioned, this is the second.

The first webinar in the series discussed academic integrity, ethical scholarship and AI. And at that webinar, Associate Professor Christine Slade, from The University of Queensland, explained why students act with integrity or at times, unfortunately, cheat, and how generative AI can make the possibility of cheating easy. She suggested ways to reduce this misconduct, such as creating positive norms, talking about ethics, building trust and using generative AI responsibly. In that webinar, we also heard from Scott Adamson from All Hallows School in Brisbane, where he shared his school’s experience of dealing with gen AI, preparing students for a technology-filled world. He emphasised the need for school academic integrity guidelines, teacher supports and critical use of gen AI. So, the main takeaway that we had from that webinar was that students want to learn and that, as educators, we can help them make positive choices about gen AI.

The goal of today’s webinar, then, is to connect the principles that shape academic integrity and assessment design. We will explore how gen AI can support or challenge these principles and attributes, such as equity and validity. These have been described by the QCAA in the range of resources, which I’m sure you’re familiar with, including the assessment literacy and the QCE course, and the assessment literacy course P–6, as well as several factsheets and guidance documents, all of which are available on our website. And following this webinar, you may want to return to view these again.

Now, it’s my pleasure to hand it over to Jason Lodge, who will share the current state of research into assessment design and AI, and he’ll talk about the purpose of assessment, assessment literacy and sharing why emerging technologies may challenge schools in developing quality assessments. So, over to you, Jason.

Presentation 1

Jason Lodge
Associate Professor, School of Education, The University of Queensland

Thank you, Anthony, and good afternoon, everybody. I too would like to start by acknowledging the traditional owners and custodians of these lands and pay my respects to elders past and present.

We find ourselves in an interesting time, I think. Anthony, next slide, please.

I wanted to show you an example of an image that I created in literally seconds. I have no artistic talent whatsoever. I have no ability in this area. But using the tools that are available now, this was generated by a tool called Midjourney that you might or might not be familiar with. I was able to produce this image that perhaps partly captures the current concerns that we might have about the evolution of artificial intelligence and robots. I think that this partly captures some of the concerns that we have more broadly about these tools. I’m going to focus somewhat on those concerns. But I also want to talk a little bit about some of the opportunities that run alongside that. Next slide, please, Anthony.

At the core of the issue that I think we face — and this follows on from the previous webinar about academic integrity — is that we have a situation where tools are now available where all sorts of different artefacts, including the image that I just showed you, can be generated with no additional work, no background work, and — of concern for us — there can be potentially no learning occurring. I’m going to give you some examples of this further on, but I think that it’s important that we’re kind of clear on what we think the sort of key problem is that we’re dealing with. And part of this is that we obviously use different types of artefacts, whether that’s a report or an essay or some other kind of written submission that students make, to get a sense of how they’re progressing in their learning. So if those sorts of artefacts can be produced without that learning occurring, then we have something that we probably need to pay attention to. Next slide, please, Anthony.

Part of the discussion that’s happened this year has obviously led to — again many of you will be familiar with this — the Australian Framework for Generative AI in Schools. There are very important factors in there that have been addressed. And these are things like privacy, equity, transparency, bias in the large language models that are out there, like the one that sits underneath ChatGPT. And I think that this sets a very good foundation that we can now build on to think about how we might meaningfully use these tools and help students learn with these tools and, importantly for the discussion that we’re having today, how we might make sure that our assessment processes are robust to the existence of these tools now becoming available. Next slide please. I’ve got a few slides here, sorry Anthony, a bit of skipping forward.

So, turning our attention much more to the assessment design piece in this. A few months ago, some colleagues and I put together what we thought was our best guess of what the options were on the table and what their viability is over the short, medium and long term. And you can see here that we started thinking about what does it mean if we were to ignore these tools? What does it mean if we try to ban them? What role does invigilation play? Do we need to think about embracing these tools? How do we design around them? Or do we need to rethink some of our assessment practices overall? We can’t ignore this. I think that it’s fair to say that the … certainly the global mood around this and a lot of the research that’s happening is suggesting that the genie’s out of the bottle, that the sorts of large language models like ChatGPT and Bing and Bard are increasingly becoming part of the applications and productivity tools that we all use day to day. So Microsoft are embedding these large language models within Word and PowerPoint and Excel. So the idea that we can ignore these tools, that it’s going to be some fad that’s going to disappear, I think is looking less likely. And I’ve heard commentary from people who have been working in these areas for a long time who are saying that the relationship that we have with knowledge and information and the change that is represented by generative AI is as fundamental as it was when the internet first became available and widely used. Obviously, this has happened a little faster though.

Can we ban these tools? Well, given that they are going to be embedded within and are already becoming embedded within many of the major productivity tools that we use day to day, that’s already looking quite problematic. And in the long term, we probably need to think a bit more about what that might mean. So banning these things looks like being a bit of a problem, but that’s a longer conversation that I think we will continue to have.

By invigilation, we’re talking here about how do we run things like pen and paper exams so that we can be absolutely certain that the students that we are working with are meeting the outcomes that we’re looking for. There’s always been a case for those kinds of assessments. And there will continue to be, we think, into the foreseeable future. Embracing these technologies — and again, given how fundamental these are to the way that we will potentially work and learn on into the future — we do need to be mindful of the issues that these technologies bring, including equity issues and the bias that I mentioned before, where the data is being held and whose data is being used for what. Those are serious questions that we need to consider. And that is, I think, why there is such an emphasis on that in the national framework. But over the longer term, we need to think about how we embrace this in our assessment practices and in our learning.

The fifth one there is about how we design around these tools. And early on, when we had GPT 3.5 at the beginning of the year, for anybody who was playing with these tools back then, there were some — I think it’s fair to say — significant holes in the data that was available. And it looked at that stage as though we could design assessment tasks that meant that using these tools was not going to provide students with a good response. So I tested this out with some of the assessments that I assigned to my students here at the university. And the response it gave was pretty poor because the information that it was drawing on was not as robust as it could have been. That’s changed already, as these models are being upgraded. And it’s now looking like if we’re trying to design assessment to target the weaknesses of these tools, that is becoming something that is not looking particularly viable as well. They’re able to actually produce quite good responses to a lot of questions.

So that leaves us with the sixth option there, and one that we’ve been spending a lot of our time and effort on here at the university and nationally — and I’ll talk about this in a second — is to rethink some of the assessment processes that we’ve got. Now that might mean things that we have been doing traditionally for a long time. But it also means to think about how do we build on good assessment practice as we know, as it currently exists? So that’s been our effort, and I’ll give you a bit of a sense of how we’ve progressed with that in the last few months. Next slide please Anthony.

So, I think one of the things that has caught our attention, and something that we’ve needed to consider as far as the assessment design piece goes, is that on one side of this, we are learning more and more about what these tools can actually do. So our starting point at the beginning of the year was that we were concerned that students would take an assigned task, put it into a tool like ChatGPT, ChatGPT would generate a response that students would then submit as an artefact for that assessment piece. In the meantime, and we’ve collected some data around this with some students here at the university, we’re seeing that there are all sorts of possibilities for these tools as part of the learning process that include working as a study buddy or a personal tutor for them to use the tool to come up with ideas for a brainstorming process, and so on. So, it’s a little bit more complicated than just this idea that students will submit work that has been taken out of a generative AI tool. There is a bit more to it, and we’ve needed to incorporate that into our thinking about what assessment design might look like.

What we hope that we don’t end up with is a situation where the sophistication of these tools leads us to a future that might look something like this, where we really rely very heavily on pen and paper exams for all of the assessments that we carry out. Now that, as I said before, is absolutely suitable and appropriate in certain circumstances and will continue to be a feature of the way that we assess, I think. However, there are going to be many circumstances where we want to get a different sense of our students’ learning, and we need a different kind of assessment approach. This is certainly the case in the courses that I teach with my students. It’s not just about their ability to be able to respond to a set of exam questions. I want to see different ways that they’re working with knowledge and are able to use that knowledge. So this is one part of the equation, but we hope that it’s not the only part of the kind of assessment design that we’re considering. Next slide, please Anthony.

So, separating those things out. So we have some challenges. We have some more ideas about how these tools can be used. And we don’t want to be in a situation where we’re just relying wholly on external and high-stakes exams for all of our assessment approaches. So what does it mean to rethink the assessment that we currently have, and how do we need to adapt that to the fact that these new tools exist and are available? And these are some of the things we’ve been thinking about — so authenticity, contextual information, that the assessment has some sort of purpose and that’s clear to students. There are elements of critical and creative thinking, which I’m going to pick up on again shortly, that there is some ownership that students can feel of the assessment activities that they’re engaged in. That it starts to get at the process of learning as it occurs over time, rather than a kind of snapshot approach that a lot of assessment we know takes, and it builds on the idea that the students are involved in a relationship with us. There is a community that we hope we develop in our classrooms and that somehow the assessment, we hope, would capture that. For us, we come back often to this idea that education is not a transaction. It’s about a relationship. It’s a relational activity. And the more that we can think about that as part of our assessment processes, we hope that that will then allow us to come up with really effective assessment practices. Next slide please, Anthony.

I think it’s fair to say that one of the key things that we’ve, I guess, realised or gone back to through the process of rethinking assessment is that there are many components of good assessment design that are robust to the situation we find ourselves in now. So nothing that I’m hopefully saying here is giving you any impression that what good assessment design looks like should change. The core still relates to these things. It should be aligned with the curriculum, it should be equitable for all students, it should be evidence based. We should be thinking about it as something that occurs over time and is ongoing, and it’s transparent so that it allows us to have some confidence in the assessment process. And it should be informative to students about their learning, so feedback should be part of this situation as well. So I think the QCAA guidelines here about assessment hold. And I think that these are things that we still need to pay attention to, and give us good guidance about where we might head with all of this. From there, I think, at a higher level — Next slide please, Anthony. Are you getting good at predicting what I, when I need the next slide?

We also need to be thinking about the core attributes of assessment — validity, accessibility, reliability. And so for those who like this sort of psychometric view of the world, these are also things that I think, again, provide us a solid foundation. Do we need to change some things that we do? I think that that’s becoming clear. But we’re not in a situation here where we’re throwing the baby out with the bathwater. Good assessment design is good assessment design. And that should continue to be the foundation for what we do.

That being said, moving on to the next part. We were tasked by the Tertiary Education Quality and Standards Agency back in August to do some work about what rethinking assessment design looks like with AI in the mix. Now I know that this is coming from a tertiary perspective, but we feel as though the work that we did is hopefully of some use. And that’s why I thought I’d give you a sense of what we arrived at through this work. So how this formed and the process that we went through is that we brought some of the leading experts in the country together. And these are experts in assessment design and assessment with artificial intelligence and experts who look at various different educational systems. What we wanted to do in that process is to come up with some guidance. We can’t develop a map of what the future might look like because things are changing very rapidly. But we felt what we could do is at least provide a sense of the direction that we need to head in. And we feel as though we’ve tried to produce something here that could serve as a compass per se, rather than a map, and the things that we landed on are on the next slide.

We have two guiding principles and three, sorry, five propositions beyond that. Now, this builds on what I’ve mentioned previously about good assessment design, and about the things that we know as a foundation and are important to think about with AI that are in the national framework.

So the first thing here is that we feel as though it’s really important for us to be thinking very carefully about the world that our students will go out into and how artificial intelligence is going to interact with that world. So we want to try and equip students to be able to thrive in the world that we now find ourselves in. So that was one of our first guiding principles here. Secondly, we think that — and this feeds on from what we’ve just talked about in terms of the QCAA documentation and guidance — we want to have reliable judgments about student learning, and that we want to think about multiple, inclusive and contextualised assessment approaches. We felt that this also was a very good and important guiding principle for the ways in which we might rethink assessment from this point.

The five propositions that we then also came up with as a result of this collaboration are on the next slide. We need to think about what an authentic engagement with AI looks like as part of the assessment process. We need to think about assessment across not just one item or one unit, but about what that looks like across, for us that means a degree program. In your context, that could be what’s it look like over a term or over a whole year or over a larger kind of volume of learning. We need to emphasise the learning process as part of all of this, as I’ve alluded to before. We need to think about how students are working with each other and how they work with us and what happens with that relative to artificial intelligence when that gets mixed into the equation there. And we also need to think about the kinds of things that Christine was talking about in the last webinar, like how can we ensure academic integrity at critical points during a student’s learning journey? So those are the key foundational principles that we’re working with at the tertiary level, and the propositions that we’re thinking about in terms of the direction that we need to take to rethink the assessment processes.

It raises the question from here though, about how do we take a bottom-up approach that will align with the sort of top-down approach that some of this guidance and these frameworks give us as a way that we might progress? And from here, we then need to start thinking about what does it look like for individual kinds of tasks? How do we help empower all of our teachers — who are already stressed, who already have a high workload, who already have a lot of other things that they need to think about — to be able to make good decisions about individual kinds of tasks that they need to adapt, and perhaps change, for this world that we find ourselves in?

There are a couple of key challenges here, I think. The first on the next slide is that there are a lot of great ideas out there about some of the things that we need to consider. When Bill Gates, for example, heralded in this new world of reality of artificial intelligence, he talked about learning styles as being a key thing that we might need to adapt to. So a lot of the global ideas around what we need to do with assessment and what we need to do in education are not founded on ideas that have the strongest evidence behind them. I know many of you will know that the idea of learning styles — visual learners, auditory learners — has very little, if any, evidence sitting behind it. So a lot of the discussion that’s coming from the technical side of all of this doesn’t align with the sorts of things that we need to think about and the sorts of evidence that we’ve gained from our educational environments. So this is also making this idea of how we might change individual tasks a bit challenging.

There are also some elements in this where we haven’t quite got all of the pieces together. We need to do a lot of research to understand how the learning is working with these tools. How does that align with the way that we understand learning? What do we see in our classrooms? And how do all these things come together through what we call a consilience that might inform our action? So there is a challenge here that is about how we might work together across sectors to figure out what that might look like, given a lot of these tools are so new, and that we’re still all collectively trying to figure out how they might best work in the learning process. We’ve got some work to do to line everything up so that we know how best to do it. So again, I think that provides a little bit of a challenge to how we might implement this.

The third component in this, I think, is that we don’t want the technology to be driving what we do in assessment. Partly, we want to make sure that the pedagogy does, and certainly technology and pedagogy need to feature in what our considerations are. But as my colleague here Tim Fawns from Monash University talks about, these things are really entangled together. So if we think about how we might change our assessment design, we need to think about the technology, we need to think about the curriculum, we need to think about the syllabus. We need to think about the context we find ourselves in. So what all of this, I think, points to is that we have a complex set of factors that we need to consider if we’re going to change and rethink what assessment might look like. That doesn’t mean that it’s insurmountable. And I think one thing that gives me a lot of hope here is that we’re working together to try and figure this out. So it’s complicated, but I think that we’re making some fairly significant strides forward to think about what this might look like so that we can, again, empower our teachers to be able to make good decisions about individual assessment tasks within a broader context of the sorts of changes that we’ve talked about already.

So what does that look like in a very concrete way? And what I wanted to give you a sense of here is what this might look like for particular kinds of examples. And one thing that we’re using here to think about the changes in assessment tasks from that ground-up perspective — so next slide please, Anthony — is what’s called the SAMR model. So across the different types of tasks that we might be assigning students to do, do we need to think about substituting that task or do we need to think about augmenting it? Do we need to think about modifying it in some way? Or do we need to kind of redefine what that task is and how it works?

So I wanted to give you a bit of a case study here to give you a sense of the way that we’re approaching these kinds of questions that we have about individual sorts of tasks. And we wanted to focus here on a particular kind of written assessment. Now this will vary depending on your context. This could be an essay or report, or some other sort of written artefact. So I’m going to refer to essays a little bit here, but really the kinds of changes that we’re talking about here are the sorts of things that we might see across the kinds of submitted written tasks that we might ask students to do.

So if we think about our SAMR model here, the first of these considerations is to substitute. Right, so if machines are able to do some of these kinds of tasks, do we need to think about substituting completely, and — aligned with the way that generative AI might do the task — to reimagine what that task might look like? So an example of this might be, okay, if generative AI can produce the kind of artefact that I’m looking for. For example, I ask my students to generate a lesson plan, would a better way to go here be to substitute that task with an AI-generated lesson plan and then get the students to work on what they think is good or bad about that generated artefact, how they would improve it, what they might do differently, and so on and so on. So rather than getting them to do that work, I’m substituting that with what the machine might do. And then we build on that through the assessment.

The second approach here would be to augment. And by this we mean that we still have the task in a way that it was previously. We skipped one there. Sorry, Anthony. What does that look like when we augment it? And, for this kind of activity, we’re really thinking about, okay, if we’ve got this written task, be it an essay or something else, can we take that as a basis and can we use the technology to build on it? So can we potentially allow the students to integrate the way that they’re going about producing this task and doing the learning that’s required to get to that end point, and integrate the technology into that? And on the next slide, we can think about different levels at which we might be prepared to allow students to do that. So am I going to say to students, okay, we would like you to produce an essay yourself, but we’re going to allow you to do some editing and some grammar checking and those sorts of activities that might allow them to take something that they’ve produced and make it even more polished? And are we prepared to allow that to happen? So in that way, we’re allowing a level of augmentation to occur there.

The third level, and we’ll start to sort of get towards the end here and wrap this up, do we need to modify the essay? Do we need to make it something different? Do we need to add, for example, an oral component to that?

Some colleagues of mine at The University of Melbourne thought about the ways that we could modify essays at the beginning of the year. And despite the changes and updates in the technology, most of these still hold. So there are ways that we could potentially modify these written tasks. They could focus more on critical thinking, for example, or they could draw on specific aspects of the syllabus that are very contextualised. They could focus on things that occur in a local environment, that a large language model that’s sitting on a server somewhere in North America is not necessarily going to capture the nuances of. Could we make it more about kind of value judgments that are more about students themselves and what they value in the world? So the evidence is suggesting that there are some really effective ways that we are able to modify written tasks like essays that make them pretty robust to this.

Beyond that, we could then redefine what the essay means as part of the student’s educational experience, and there is some work out there thinking about creative writing, in particular. On the next slide, there are some ways that we can consider, and there was an article about this in Harvard Business Review towards the end of last year. As we think about things like automation and the quality and variety of things that artificial intelligence can do, do we need to reconsider what the role of the essay is more broadly? Is it the kind of thing that people are going to be doing in the same way that we have thought of it traditionally, when we didn’t have generative AI in the mix, and do we need to adapt our assessment in regards to that?

So, from there, I think we can start to see that there are some ways that we could potentially adapt what we’re doing. We could also think about the kinds of standards that we implement. So if we’re going to allow students to work more with artificial intelligence, do we set our expectations in a different way so that we would expect, for example, grammar to be pretty well spot on because they’ve got the opportunity to be able to use the tool to help them with that aspect of their learning? We can also, on the next slide, think about the role of critical thinking and what kinds of mental skills are going to be of more value than others as generative AI and other AI tools are going to be able to do more of that kind of work.

So for us, we’re starting to, on the next slide, separate out what are the sorts of things that we know that generative AI as a technology is going to be pretty good at and what are the sorts of things that students are going to continue to be good at and, us as humans more broadly, are going to be good at? Do we need to focus our assessment more on the human components and think about the way that we might build written tasks, for example, that require more evaluation or meaning-making or give us some more insight into the way that students are regulating their own learning? So those are the sorts of things that we’re thinking about. And as I — I think where this leaves us, is that we have some really good foundations through the national framework.

We hope that we’ve been able to provide some guidance through the work that we’ve been doing with TEQSA, and we’re now in a situation where I think we can start to think quite carefully and in a nuanced way about how we might adapt and update the various different kinds of assessment activities that we assign for our students. We’re working on a suite of resources here, and I know that there is work going on around the country and indeed around the world, to make that process as straightforward for teachers as possible because, as I said before, we know that we’re already facing a lot of things that we need to balance as teachers day to day in our work. How can we help teachers to make good decisions about what rethinking assessment might look like on into the future as we have these new tools and help us to come up with the best assessment design that we can that is robust to the existence of these tools and how they’re going to be integrated into our lives?

So hopefully, that was useful to give you a sense about where that’s going. We’ve still got a lot of work to do. And, yeah, we look forward to working together to figure out what this might look like. Anthony, back to you.

Transition

Anthony Barnett

Thank you very much, Jason. That was fascinating and thought provoking, your presentation about those sort of assessment ideas in this gen AI age. I found it particularly sort of fascinating, your focus there on the different types of written responses, which might apply across all subject areas, whether they’re essays, short responses, reports, science, maths, humanities, arts, languages. And really that connection you made between those pillars of assessment design that we already have and the fact that those hold us in a strong place, and that we should then think about how gen AI fits into that rather than the technology coming in first. So thank you very much for that.

And I think that sets us up really well now as we welcome Mahoney Archer, and she’s going to share with us the work that she has led in her school to prepare guidelines, and that will support teachers to adjust to the emergence of gen AI, and share some examples of the work that she’s [background noise] today. Thanks very much, and over to you Mahoney.

Presentation 2

Mahoney Archer
Deputy Principal, Albany Creek State High School

Thanks Anthony. I’d also like to acknowledge the traditional owners of all of the lands that we meet on today and acknowledge elders past, present and emerging, given that we’re chiming in from all over Queensland.

I guess, there are about eight slides that I’ve got to talk to, so Anthony can keep pace with along with me. But I suppose a couple of the things is to provide some context for where we found ourselves and — I mean, as Jason outlined, AI kind of began as an exciting little fire and rapidly became a fire storm that impacted us in schools pretty quickly on assessments.

And so as it, as it emerged, we really quickly noticed that it was going to have, or was almost immediately having, a pretty significant impact, specifically on extended written assessment. And so at our school, we have very tight committee structures. And part of the — one of the committees that I chair is the Curriculum, Teaching and Learning committee. So obviously, that became a feature of conversation in that committee, and then more broadly in our extended leadership teams and in our faculties. And for different faculties it was impacting more immediately than others. So certain quarters, in the assessments — and I’ll talk about it a little bit later — sort of features that, particularly in the humanities, it became an almost instant concern. So around that we had a few key questions that came out of that professional conversation in those different forums about what did the use of generative AI in assessment constitute for us? What did it mean? Did our existing assessment policy have the capacity to respond to its use, or was it going to be a bit of bush lawyering around on a technicality. I’m going to use this and this; you don’t have anything written down that might make this possible. And what should our collective response as a school community be to managing that, and not just from a staffing-only sense of management, but about us as a collective, as a school community? And that being all of our stakeholders being involved in what that process might look like and how we could make it work for us, I guess, and with us. And I think some of the stuff that Jason was saying before is part of that process of the rethink approach. And the SAMR model, which is part of our professional conversations here too. He made a mention before about the genie being out of the bottle, and we well and truly understood that, so there was no really sort of hiding in the corner, pretending that it wasn’t going to be something that we had to respond to.
So we sort of tried to move on the front foot about what we could do, and how we could respond and actually use it as a superpower, really, in our teaching and learning. And a way to then re-interrogate our assessment culture. The next slide Anthony, please.

So, in that first question about what did it constitute for us, we realised that, as we came to understand more and more… I’d also like to point out, Anthony, I don’t know if I’ve got a keen interest in AI. I think that perhaps it’s out of necessity I’ve developed a very comprehensive interest in AI. Because we were then sort of interrogating our assessment policy and our existing assessment culture, which I’ll talk about a little bit as well, we considered that because generative AI was training itself on material sourced from across the internet, if students were using that without appropriate or correct referencing, then they weren’t behaving in an ethical way as a scholar. But also, we did feel as though it had a provision in our existing assessment policy around it being plagiarism or even to a degree contract cheating, where they were basically utilising someone to — ‘someone’, in inverted commas — to generate the material that they were then saying was their own. So we sort of had that first, that first sort of blush of okay, well what does this mean for us? Is it okay for people to use, is it not? And then, we fairly quickly came to the idea that no, it’s not okay in the way that it was. We were seeing it being evidenced by kids and, and saw that more as okay, this is starting to become about ethical use, and then starting to lead us down a very specific kind of direction. Next slide, please.

So for us, our existing policy had very explicit sections about managing academic misconduct, and we talk very clearly about what our expectations are of being an ethical scholar, and what all the different possibilities or examples of academic misconduct might be and, in that, plagiarism is clearly one. And so for us, we really were locating and labelling it very explicitly as an issue of academic integrity, and that it could be considered either plagiarism or contract cheating. And then we realised, yes, it does, it does do that already. But we also thought okay, we need to be more on the front foot here about having a conversation with our students and, more broadly, in our school community with parents and carers around what AI is and does and what its impact is on student learning and their outcomes. So we then sort of went back to our assessment policy to consider how we needed to frame that up so that it not only was more comprehensive and explicit about AI in the actual policy document, but it was then the sort of leaping off point for us to have those conversations more broadly in our school community. Can I have the next one, please?

So from that then, it was really around us looking at these are the sort of four key — and they’re not insignificant, and I should also have said right at the beginning, we as a school are at the beginning of this process, and we see so many opportunities around the application of AI — that it’s, it’s a process that we’re really orienting ourselves to and sort of responding to rather than reacting to. We’re trying to sort of come to understand a lot of the things of how it’s impacting on how we have previously done business and how we need to go forward and do business differently, which I think speaks to the work that Jason has been doing and what he articulated earlier. So I think the big things for us were that we needed to amend our assessment policy, to be very explicit around that. We also needed to consider our assessment instruments and we — we have a very strong assessment culture at our school, a professional assessment culture around our school, where we’ve worked very hard to have a consistent approach to the presentation of assessments, the use of language and expression, the explicit nature of cognitions. So we have worked very hard with that. I would feel very confident about the level of confidence and capacity in our staff to design quality assessment.

So, but we also recognise that AI was presenting a factor that previously we hadn’t really considered in how it might impact our need to redesign or re-interrogate our assessment instruments that we, that we currently have. And so in that, we realised that while we have a strong assessment culture, we really wanted to and needed to put ethical scholarship at the front of the story, and so that it wasn’t a punitive conversation around ‘you’ve cheated’ or framing it in that way, but trying to interrogate that and build that community understanding around what ethical scholarship is, and academic citizenship or even just a system really, about being part of a school community and why people might do that, behave in a way that’s not with integrity in terms of the academic work. So we realised there was, there was work to do around that, and that also we needed to teach the principles of that and demonstrate the principles of that to our Years 7 to 12 students. And from that, I think, is the story around, you know, why, why AI can be fabulous and have so many opportunities and possibilities, that yes, there’s a space for it in assessment. The space really is around pedagogy, and that’s its massive opportunity is around our focus being really shifted to an application of AI in that space and using it like we would have considered any, any tool previously. And teaching people how to use that tool in effective, respectful and appropriate ways. And for us, in that sort of assessment culture, assessment, academic conduct kind of space and assessment design space, we were really having those conversations with kids and with staff around we want for them to be using it responsibly and trying to tie that to the principles and values that we have in the school community that we talk about quite a lot. So that it was, everything was running through that idea of how do we redesign things if we need to, to act responsibly to this advent of new technology? 
But how do we also work to build a want to do that, and a revision of our practices to make sure that we’re getting the best out of what’s available to us in terms of teaching and learning resources and tools, but also getting the best out of our students both in their academic performance, but also in their demonstration of being ethical and honest humans? Next slide, please.

So, specifically, and this is more just for information really about how explicit we went. We emphasise that ethical scholarship, assessment review and assessment cultural principles that I’ve talked about so far. And we, we made a decision, at least at this point, around saying that the use of AI for assessment is prohibited. And that’s as, as we currently have, the use for it to generate your assessment response is prohibited. We did come up with some different provisions for us to, again as part of our assessment culture and sort of looking at our assessment instruments over time, to consider the use of AI — which I think Jason was talking about as well — how might we use AI in the context of a piece of assessment, and be completely hand-on-heart happy to do so, and feel like it could be in fact a very useful or powerful tool to use to move a student from point A to point B, whatever that might be, in whatever subject that might be? So I haven’t included that here. But we do have that as part of our next sort of stage of our assessment design and assessment cultural development, I guess. So we’ve talked around that, about sort of emphasising in our current sort of space, the use of it without an acknowledgement or a referencing. Generally, this is for the future, would be explicitly stated. But at the moment, we’ve said, look we need to think more on this. Currently, it is around we were seeing evidence of students using it in a way without saying ‘I have used this’, instead rather saying ‘this is my words and my work’. And so we were just having a response to that specifically. And then talk to other things that might be able to be included about, say, for example, if there was permissible use that it would need to be attributed correctly. And we showed them how they might do that in terms of how we would teach appropriate referencing. So, can I have the next slide please?

Okay, so, in that, I’ve spoken to some of this already. That we in our conversations, in those professional conversations and forums I was talking about earlier, what became clear was that the reliability story of assessment was that if we could — and I know I’m using shorthand there — but if you could AI a task and get an adequate response from it, then we needed to revise and review the task or its conditions or some of the other aspects, which I think some of that is what Jason was speaking to before around reliability in assessment. And I think it is a really critical piece, because I was also cheering quietly over here on my muted microphone when he said it. We want, we don’t want a situation where we sort of go to 100% standardised assessments or we go to this exclusive sort of pen and paper kind of context. It would be unfortunate, I think, in the 21st century to do that for a start, but I think it would also have the potential to be quite adversely impactful on probably some KLAs and subject areas more than others. So I think the thing for us too, is that we were looking at our assessment practices and feeling that we had quite a high level of confidence in that design culture in our school and felt quite confident that some of the features of timed assessment were what we needed to bring into our assessment design re-interrogations and bring those sort of concepts or qualities of timed assessment into non-timed techniques. That idea of observable evidencing from students and having a staged assessment, so a component. They might do Part A and be able to evidence that, need to have you be able to observe that and provide feedback to that before they progress to Part B, rather than producing what might have been previously Parts A, B or C or whatever, but all presented as one task that would have been input into a generative AI engine that could have created a response. 
Certainly that was some of the concerns raised by our history teachers, that they had very good assessment instruments and task statements, and they were playing with sort of putting those tasks into generative AI and actually getting responses. Where early on, there was like, not, not as alarmed, and then very quickly, I guess, sort of more fire-bell alarm around it was just rapidly improving its capacity. So we had to consider what that might look like.

And then of course, the assessment culture story, which I was talking about before about ethical scholarship, and then re-embracing AI as part of teaching and learning and using it as a functional, useful tool in the classroom as a pedagogical strategy. Can I have the next slide please?

So our collective response to managing that was, as we said, you know, the idea around educating ourselves and so, sharing professional readings or discussing particular things that different members of the community had learned or considered or were ruminating on in those different forums. We talked to that, our assessment policy conversations, and then also that idea of leaning into the idea of ethical scholarship and the development of a junior academic integrity course that we have in the draft form currently, that’s a bit of a, modelled on the senior academic integrity course, but obviously for a junior audience, so that we can feel confident that we’re speaking to those other kind of cultural aspects of why students are drawn to AI. Not necessarily around the sort of desire to deceive I guess, but more some of the factors around why, why the students do that sort of stuff even before it was AI, that there could be a whole lot of factors like peer pressure or there could be concerns or pressures from home about student performance and things like that. So there are features of that, that we’re trying to lock into some of our school values around being responsible, respectful and resilient students, and understanding how that works in that way. So that’s certainly something that we’re very motivated to try and continue to build that in our school community. Can I have the next slide please.

So the last piece that I wanted to talk about was an example from our humanities faculty, which just gives you a little bit of a sense of — I probably want to draw you most specifically to the table that you can see on that second page for the task. Where, as I was saying earlier, some of the principles of timed assessment where there’s incremental junctures at which teachers are observing evidence being demonstrated, either quite literally in front of them, or they’re things that are necessary in order to progress to the next part of the piece of assessment in some way. That’s certainly been the first and most immediate response that’s been met with a pretty effective sort of management of that original, initial issue. And then obviously, going forward, being very explicit around those checkpoints and how we’re authenticating student work in that way. And also sort of the future conversations around what are we asking our students to do? And interrogating, as Jason was saying, what are the ways that we might be able to use AI in a way that means that we need to re-interrogate the way that we’re asking kids to do things for us? Are there things that we just no longer need to ask them to do? And they may be those lower-order cognitions, and more sort of on those slides that I think Jason showed towards the end around the robot and the brain, around those cognitive processes that we, we want to retain, we want to retain seeing the students’ evidence in those. And what are the contexts or forms or means or modes that we might be able to see that from them, and then use AI more proactively in assessment? As I said at the beginning, we’re a work in progress in that way. So that’s probably our story sort of in a nutshell, and I think I’ve come in pretty close to my 10-minute allocation Anthony. I’d like that minuted, please. That’s probably all we need. Thank you.

Q&A

Anthony Barnett

Thank you very much Mahoney and absolutely, I’ll make sure that’s minuted.

You also said something that was really useful to hear, is that many of us out there have come to have a keen interest in AI out of necessity and thinking about how it’s developed and impacting and affecting us in the education world. It was also really nice to hear there, the insights you were giving us around the way in which you put values at the centre of what you’re doing in terms of reevaluating assessments. It’s being driven by the values of learning that you’re prioritising and the values of assessment that you then want to be able to see, and very much focusing on right now, but also mindful that what happens in the future may well be something different. So that has been great, but then to be able to tie that all the way through to applying that to a real example in your own school there with that Year 8 assessment task, and how that might also have those same features that can be applied to other subject areas.

So the next thing that we’re going to do is, in the remaining time, we have, we’re going to have just a few questions for Jason and Mahoney. And I know that there’s been the sort of chat function and we’ve got two or three things going there. So just looking at those questions, and we’ve only got a few minutes. We’re just going to go to something that was already asked before the assessment, before the webinar began — is that obviously in the documents we’ve already been seeing from QCAA, and that Mahoney’s referred to as well, there’s talk about quality assessment needs to be valid, reliable, accessible and authentic. Which of those elements do we think is being most affected by generative AI?

I don’t know if Mahoney, you might want to, or Jason you might want to begin with that first, if I hand it over to you first, Jason and then maybe Mahoney next. So, Jason, what do you think about which attribute of quality assessment is being affected most?

Jason Lodge

I think, for us, what we’ve grappled with most is problems around validity, I think. Ultimately, I think what it boils down to is that we have a situation where machines can now kind of circumvent our ability to be able to make good inferences on the basis of submitted work. So to me, I think it starts to get at that, that kind of being a valid inference for us. So from there, we then need to think about all the changes that I’ve talked about. How do we, you know, take account of the fact that machines can circumvent our ability to be able to make those kinds of inferences? So, then, I think that means you probably need to lean a little heavier on the other components that you’ve just talked about. So how do we rebalance things on the basis of that one being slightly problematic?

Anthony Barnett

Mahoney, would you like to sort of just weigh in there? Sorry, your microphone is just muted.

Mahoney Archer

So, I would like to say what he said. [Laughing] I agree. Mark him correct.

Anthony Barnett

Brilliant. Well, okay, well, if we’re in agreement, what we’ll do is just sort of, looking at another question that’s also been posed, quickly.

When we, one of the things that’s been talked about is addressing assessment and the impact of gen AI by looking at the process of learning rather than that end product or artefact and that’s sort of been referenced by both Jason and Mahoney. So just thinking, maybe starting with you Mahoney, is there a sort of particular way you’re thinking about addressing that process of learning, so you can capture things in that way?

Mahoney Archer

Yeah, I think I spoke a little bit to that before, where it’s really about us sort of leaning into that idea that AI is not something for our profession to be scared of, but rather to be excited by it as a tool to be used effectively, and maybe not — to really lean into it actually and consider its use pedagogically and how might we use it as another piece of technology as we’ve used over the last however many decades, when there’s been a new piece of technology. Granted, having said that, it is a very special and specific type of technology that’s currently getting smarter and smarter every day. I think the, the opportunity really is about shifting the process back a bit into that teaching and learning space and using it and demonstrating its use in ethical and honest ways, and exciting and fun, engaging ways as well. So that there’s not this sort of abrupt endpoint at the assessment part of the story where then there’s this high degree of panic around, can we believe what we’re reading or can we believe what is being evidenced by our students, because that just feels icky to me.

Anthony Barnett

Okay, great. Jason, do you want to add anything more?

Jason Lodge

I think it’s going to be an agree-a-thon here, Anthony. [Laughing] Fully, fully agree with that. I think one of the things I always come back to is that I think that there are parts of this that are complex and there are reasons why we have assessed the way that we have traditionally. But this is where I also think that AI is as much part of the solution perhaps as it is part of the problem, which I think builds on, on Mahoney’s point that she’s already made that there are probably lots of ways that we could get some more insight into learning processes. And over time, I think we’ll start to be able to use these technologies to help give some of those insights. But we’ve got a little bit of work to do, I think, to figure out how best to do that in an ethical and appropriate way.

Mahoney Archer

Can I add something to that, Anthony, too? I just, I just realised another thing too, just as Jason was saying that, that the opportunity I think, too, is around the real power of AI as a supportive tool, a supportive learning tool for students. In terms of learning support, you know, it is a really powerful device in that way, to build capability and capacity and confidence for students with additional needs. A whole raft of things where there’s, just immediately that’s an amazing superpower of AI. So I think it is definitely something for us to lean into. Sorry, I just wanted to say that.

Anthony Barnett

That’s really useful, really positive. And I suppose just conscious of sort of our time as we’re going through here. There’s a technical question which has come up, which might be something that Jason might be able to respond to, and I know it’s sort of one out there that’s sort of still a work in progress, is that idea of who owns the outputs generated by sort of gen AI tools, because obviously, if we’re going to be using them in an ethical way, we’re going to have to think very clearly about that. So I don’t know Jason, if there’s a few words, quickly, you want to mention about that?

Jason Lodge

I can talk about that. I’m going to talk about it from a slightly different perspective as a journal editor. So we’ve had to think about this very carefully with people submitting work to a journal and what level of responsibility we have to give to various agents in the equation. For us, the bottom line is we need to be able to point to a human. So there is no referencing artificial intelligence, there is no putting artificial intelligence on as an author. It is a tool, because ultimately, as journal editors, we have to be clear that there is a level of responsibility for what is published in the journal. That responsibility is with the authors, and it’s not with a machine. Now, maybe that will change in the future. But for us, it was very clear that we could not treat generative AI as an agent, and therefore we had to be able to point to a person who was responsible for what’s in it.

Anthony Barnett

That’s great. Thank you very much for that perspective on, on sort of that thorny issue that we’re all starting to grapple with.

Look, at this stage as sort of, we just have a few minutes left in the webinar, I just want to say a big thank you for those responses to the questions and from the presentations. Certainly, it’s something which, I hope you in the audience have gained a great deal from Jason and Mahoney’s responses and from your participation in this webinar on assessment design and artificial intelligence.

I can certainly say in terms of having sat here and listened and gone through, you know, the takeaways that I have around this are really sort of quite reassuring and confidence building that we’ve got quality principles and strong attributes of assessment design that we can come back to, and they remain foundational. And regardless of whether we’re in the Australian Curriculum space or the QCE space, we will still strive for those features such as reliable, equitable and valid assessments. And also that, you know, despite the challenges of gen AI, there are, as Mahoney sort of really led us to, sort of things that are already happening in the school space. And as Jason has alluded to, research going on and thinking going on around how we sort of holistically think about this sort of new technology to maintain our assessment principles. And also that while this webinar’s part of a series of responding to emerging technology, to gen AI, remembering that the technology is just one part of what we’re doing. Technology is there to support the learning assessment that we’ve always wanted to do, and the importance remains with the relations, the relationships that take place within schools, within the education community, that we support our students and that we also support each other even seeing within the chat function there people responding around, for example, how to reference, sort of, there are ways that are being produced that we can turn to and so as a community of practice being able to share that.

So, all that remains is for me to say a big thank you to both Jason and Mahoney for their wonderful insights and presentations, and thank you to you for making the time to join us. I hope you found this a useful experience.

Three-minute read

Assessment design and artificial intelligence (AI) (PDF, 146.4 KB)

Last updated: 9 November 2023
