02/27/2024
Artificial Intelligence v. Academic Integrity: Musings on the Future of Assessment
Written by Tricia Bertram Gallant
On February 23rd, the International Center for Academic Integrity and the National College Testing Association co-hosted a webinar in which I discussed the critical roles that testing and academic integrity professionals will play in the future of assessment in the GenAI era. In the webinar, I suggested that higher education institutions use the opportunity GenAI presents to rethink both what teaching, learning, and assessment look like, and the roles that faculty and professionals each play in ensuring integrity in those processes.
There were a lot of questions raised during the webinar that I did not have time to address. So, I am grouping them here according to theme and then answering them in hopes of furthering the conversation beyond just those who attended the webinar live.
Ethics of GenAI Use
Q: When uploading work to AI for checking original work or helping to phrase things better, my understanding is that it then becomes part of the algorithm at that point. Has the person lost control of their original work at that point?
A: I guess it depends on your definition of “control”. I think control of original work is a concept that probably needs to be rethought. But, yes, my understanding is that some tools don’t give you the option of keeping your work out of the training data set. However, it’s not like giving your work to a human who could then turn it in as their own. GenAI doesn’t work like that: it doesn’t reuse your work in its complete form. Think about it this way – it breaks your work down into “more examples of words that can be predicted to follow other words that I will learn from”.
Q: Thinking about your point regarding the moral obligation of instructors, do you have thoughts on faculty using AI-generated assignments, etc.?
A: I think faculty should be using GenAI to help them generate assignment ideas, come up with lesson plans, generate assessment questions, etc. I don’t buy the argument that if we’re not allowing students to use GenAI, then faculty shouldn’t be allowed to use it either. Or that if faculty are using it, students should be allowed to. The purposes are different. The purpose of the instructor is to facilitate and assess learning, and they should use whatever tools they have available to help them do that. (NOTE: I think this is better than faculty just using assessment questions out of the textbook’s instructor’s manual, or reusing the same questions over and over again. At least the instructor is involved and, ideally, critiquing the suggestions by GenAI.) The student’s role is to learn and demonstrate their learning; if the student hands in something generated by GenAI, then they’re not doing that. And it’s not just their purposes or roles that are different; students are novices in the discipline and faculty are experts. Thus, faculty can critique chatbot output more robustly and critically than students can; we just can’t assume that novices and experts can interface with these tools in the same way. Regardless, though, both faculty and students can be transparent about their processes and their use of these tools.
The Future Role of Testing Centers
Q: What precedent is there for testing centers within California that broadly serve students in all three systems of higher education, and what are your thoughts on staffing centers with educational specialists who partner with instructors across the systems on the design of their assessments?
A: There is no precedent for that yet. That is something I’m working on. But yes, I would love testing centers to transform into assessment centers that have educational specialists and maybe even psychometricians on staff to help instructors create valid assessments of learning.
Q: What do you recommend for testing centers to upskill to become GenAI assessment specialists?
A: I think current testing center professionals can carve out time to learn how to use GenAI to do simple things like generating multiple exam questions from one set of questions, raising questions from low taxonomy levels to high levels, and revamping assessment questions to make them clearer. Also, I think professionals can learn how to use existing mastery-based testing platforms like PrairieLearn to help faculty design their assessments.
Best Assessment & Integrity Practices
Q: I'm very curious about your advice to large institutions (in the U.S.), where we are teaching 150+ students, online. Ideas for navigating challenges with AI, outside of the typical best practices for assessment?
A: I think that online classes have a particular challenge that is not easy to overcome given the strenuous objections to remote proctoring services. However, generally the advice is the same for online and in-person classes: decide which assessments are for learning and which are of learning, and then secure the assessments that are OF learning. Security would come in the form of either in-person testing at a computer-based testing center or online proctoring. And, of course, instructors should use all of the other pedagogical, assessment, and class design techniques that enhance intrinsic motivation, increase self-efficacy, and reduce barriers to learning and honest demonstration of that learning (this is what my book with David Rettinger will be about – hopefully released early 2025 by University of Oklahoma Press).
Q: What are your thoughts on balancing the monitoring testing centers do against student privacy concerns and the protection of sensitive student data? I’m thinking specifically about student fingerprints or facial ID as a means of identity confirmation before an assessment is given.
A: The privacy conversation is interesting to me. How many of us already give our facial ID over to our smartphones or the government, yet balk at doing it for school? Of course, we should be sensitive to the data we have and protect it. However, we also should consider the tools we have at our disposal to make sure that we’re certifying the person who demonstrated their knowledge and abilities. We owe it to society to not give our degrees away without this certification. While it’s not always easy to resolve tensions between two values that are both “right” or “good” (e.g., privacy and integrity), I think it’s possible if we engage in thoughtful discussions rather than resort to heated and divisive rhetoric.
Q: I have used UDL for years, but find that its flexibility works well if you are in the "light it up" world, but not so much in the "lock it down" world. Many of the best practices seem to be a step backwards (e.g., oral exams). Curious therefore what you meant when you were emphasizing UDL.
A: I’m curious about your statement that oral exams are a step backwards! I see them as a step forward, toward remembering that we should be graduating people who have the skills to write AND speak (in some format) about their knowledge. So, I see oral assessment as a critical feature that was eliminated because it didn’t scale with the industrialization of higher education, not because it wasn’t good for learning or evaluating. There are a lot of innovative things happening with oral assessments right now – see this article 'Oral exams improve engineering student performance, motivation' for one example. I am hoping that GenAI tools will help offload background or administrative tasks for faculty and teaching assistants so that they will have time to engage in this way with their students. By emphasizing UDL, I mean that testing centers should think beyond the traditional notion of testing, which gets tagged as not UDL-friendly. We know that frequent testing and mastery-based testing can improve learning, so how can testing centers evolve to facilitate that while being UDL-friendly? Computers and GenAI are key to this. For example, students can choose the day and time they test. Students can test over and over again until they master the material. With computers, we can offer more options for assessment (one student could write their answers, while another could speak them). We can reduce logistical rigor while focusing on intellectual rigor. These are my musings at the moment (with the caveat that I am not a UDL expert).
Q: I currently work in a Writing Center, and in having conversations with our tutors, there is a level of resistance around engaging with GenAI tools. They want to have peer-to-peer conversations and feel that these tools will hinder the rapport building they thrive from. What would your response to this be? Should we be “forcing” engagement with these tools given that they are not going away? How do we strike a balance between engagement with these tools without losing what comes from human-human connection?
A: Such a great question! I think writing tutors should definitely be engaged with these tools and learning how to use them because their tutees are using them. They should learn how to identify when a writer might be over-relying on a tool so they can engage with them in a conversation about what is being lost in that process and, hopefully, convince them to do more of the writing themselves. I guess I would step back and ask “what is the purpose of our tutoring sessions?” or “what is the learning goal?” and then work from there. Once we’ve defined those, we can determine if engaging with these tools would undermine or amplify those goals/purposes, and how. Then, perhaps we might even decide to structure our tutoring sessions differently, in a way that could be better for all involved. But, I’m not a writing tutor expert so I could be way off-base here. 😊
Q: Would you include the ability to let students share their writing process (like revision history) with the likes of oral exams and presentations as a way to further invigilate mastery and summative assessments?
A: Yes and no. Anything "remotely" proctored, which would include Google Docs version history and virtual presentations, is not secure. A student could fake that version history (e.g., retype ChatGPT output instead of copying and pasting it, or have another person write the assignment while logged in as them). And, of course, a student could fake a virtual presentation with the AI that now exists. Having said that, I would put those two examples in the bucket of attempting to assess process over product. They are still not in the same invigilated bucket as proctored assessments, but they do raise the barrier for cheating and therefore make it less likely than in assessments that are external and completely unmonitored (e.g., homework).
If you would like to see the slides from the webinar, follow the links on the ICAI webinar page. You can access the recording on YouTube. 
Thank you for being a member of ICAI. Not a member of ICAI yet? Check out the benefits of membership and consider joining us by visiting our membership page. Be part of something great!