Beyond Detection: Designing AI-Resilient Assessments with Automated Feedback Tool to Foster Critical Thinking
- URL: http://arxiv.org/abs/2503.23622v1
- Date: Sun, 30 Mar 2025 23:13:00 GMT
- Title: Beyond Detection: Designing AI-Resilient Assessments with Automated Feedback Tool to Foster Critical Thinking
- Authors: Muhammad Sajjad Akbar
- Abstract summary: This research proposes a proactive, AI-resilient solution based on assessment design rather than detection. It introduces a web-based Python tool that integrates Bloom's taxonomy with advanced natural language processing techniques. It helps educators determine whether a task targets lower-order thinking such as recall and summarization or higher-order skills such as analysis, evaluation, and creation.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The growing use of generative AI tools like ChatGPT has raised urgent concerns about their impact on student learning, particularly the potential erosion of critical thinking and creativity. As students increasingly turn to these tools to complete assessments, foundational cognitive skills are at risk of being bypassed, challenging the integrity of higher education and the authenticity of student work. Existing AI-generated text detection tools are inadequate; they produce unreliable outputs and are prone to both false positives and false negatives, especially when students apply paraphrasing, translation, or rewording. These systems rely on shallow statistical patterns rather than true contextual or semantic understanding, making them unsuitable as definitive indicators of AI misuse. In response, this research proposes a proactive, AI-resilient solution based on assessment design rather than detection. It introduces a web-based Python tool that integrates Bloom's Taxonomy with advanced natural language processing techniques including GPT-3.5 Turbo, BERT-based semantic similarity, and TF-IDF metrics to evaluate the AI-solvability of assessment tasks. By analyzing surface-level and semantic features, the tool helps educators determine whether a task targets lower-order thinking such as recall and summarization or higher-order skills such as analysis, evaluation, and creation, which are more resistant to AI automation. This framework empowers educators to design cognitively demanding, AI-resistant assessments that promote originality, critical thinking, and fairness. It offers a sustainable, pedagogically sound strategy to foster authentic learning and uphold academic standards in the age of AI.
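The abstract describes combining Bloom's Taxonomy with surface-level text metrics such as TF-IDF. The following is a minimal sketch of that general idea, not the author's tool: the verb lists, the function names (`bloom_level`, `tfidf_vectors`, `cosine`), and the smoothed IDF formula are all assumptions chosen for illustration. The BERT-based similarity and the GPT-3.5 Turbo step mentioned in the abstract are omitted.

```python
import math
import re
from collections import Counter

# Hypothetical verb lists per Bloom level (illustrative, not from the paper).
# Levels are ordered from lowest to highest cognitive demand.
BLOOM_VERBS = {
    "remember": {"define", "list", "recall", "name", "state"},
    "understand": {"summarize", "explain", "describe", "classify"},
    "apply": {"apply", "use", "solve", "demonstrate", "implement"},
    "analyze": {"analyze", "compare", "contrast", "differentiate"},
    "evaluate": {"evaluate", "critique", "justify", "assess", "defend"},
    "create": {"design", "create", "propose", "compose", "formulate"},
}
HIGHER_ORDER = {"analyze", "evaluate", "create"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bloom_level(task):
    """Return the highest Bloom level whose verb list matches the task, or None."""
    words = set(tokenize(task))
    matched = [level for level, verbs in BLOOM_VERBS.items() if verbs & words]
    return matched[-1] if matched else None


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) using an assumed smoothed IDF."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in doc_freq.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    task = "Critique the design of this experiment and propose an alternative."
    level = bloom_level(task)
    print(level, level in HIGHER_ORDER)

    # Compare a hypothetical AI-generated answer with a reference answer.
    vecs = tfidf_vectors([
        "Entropy measures the disorder of a system.",
        "Entropy is a measure of disorder in a system.",
    ])
    print(round(cosine(vecs[0], vecs[1]), 3))
```

Keyword matching of this kind is crude: "design" used as a noun still counts as a "create" verb, for example. A high TF-IDF similarity between a generated answer and a reference answer would only be a surface-level hint that the task may be AI-solvable.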
Related papers
- Computational Safety for Generative AI: A Signal Processing Perspective [65.268245109828]
Computational safety is a mathematical framework that enables the quantitative assessment, formulation, and study of safety challenges in GenAI. We show how sensitivity analysis and loss landscape analysis can be used to detect malicious prompts with jailbreak attempts. We discuss key open research challenges, opportunities, and the essential role of signal processing in computational AI safety.
arXiv Detail & Related papers (2025-02-18T02:26:50Z)
- Chatbots im Schulunterricht: Wir testen das Fobizz-Tool zur automatischen Bewertung von Hausaufgaben [0.0]
This study examines the AI-powered grading tool "AI Grading Assistant" by the German company Fobizz. The tool's numerical grades and qualitative feedback are often random and do not improve even when its suggestions are incorporated. The study critiques the broader trend of adopting AI as a quick fix for systemic problems in education.
arXiv Detail & Related papers (2024-12-09T16:50:02Z)
- AI in Education: Rationale, Principles, and Instructional Implications [0.0]
Generative AI, like ChatGPT, can create human-like content, prompting questions about its educational role.
The study emphasizes deliberate strategies to ensure AI complements, not replaces, genuine cognitive effort.
arXiv Detail & Related papers (2024-12-02T14:08:07Z)
- Could ChatGPT get an Engineering Degree? Evaluating Higher Education Vulnerability to AI Assistants [176.39275404745098]
We evaluate whether two AI assistants, GPT-3.5 and GPT-4, can adequately answer assessment questions. GPT-4 answers an average of 65.8% of questions correctly, and can even produce the correct answer across at least one prompting strategy for 85.1% of questions. Our results call for revising program-level assessment design in higher education in light of advances in generative AI.
arXiv Detail & Related papers (2024-08-07T12:11:49Z)
- The Rise of Artificial Intelligence in Educational Measurement: Opportunities and Ethical Challenges [2.569083526579529]
AI in education raises ethical concerns regarding validity, reliability, transparency, fairness, and equity.
Various stakeholders, including educators, policymakers, and organizations, have developed guidelines to ensure ethical AI use in education.
In this paper, a diverse group of AIME members examines the ethical implications of AI-powered tools in educational measurement.
arXiv Detail & Related papers (2024-06-27T05:28:40Z)
- How critically can an AI think? A framework for evaluating the quality of thinking of generative artificial intelligence [0.9671462473115854]
Generative AI systems, such as those built on large language models, have created opportunities for innovative assessment design practices.
This paper presents a framework that explores the capabilities of the LLM ChatGPT4 application, which is the current industry benchmark.
This critique will provide specific and targeted indications of their questions' vulnerabilities in terms of critical thinking skills.
arXiv Detail & Related papers (2024-06-20T22:46:56Z)
- Student Mastery or AI Deception? Analyzing ChatGPT's Assessment Proficiency and Evaluating Detection Strategies [1.633179643849375]
Generative AI systems such as ChatGPT have a disruptive effect on learning and assessment.
This work investigates the performance of ChatGPT by evaluating it across three courses.
arXiv Detail & Related papers (2023-11-27T20:10:13Z)
- Exploration with Principles for Diverse AI Supervision [88.61687950039662]
Training large transformers using next-token prediction has given rise to groundbreaking advancements in AI.
While this generative AI approach has produced impressive results, it heavily leans on human supervision.
This strong reliance on human oversight poses a significant hurdle to the advancement of AI innovation.
We propose a novel paradigm termed Exploratory AI (EAI) aimed at autonomously generating high-quality training data.
arXiv Detail & Related papers (2023-10-13T07:03:39Z)
- On the Robustness of Aspect-based Sentiment Analysis: Rethinking Model, Data, and Training [109.9218185711916]
Aspect-based sentiment analysis (ABSA) aims at automatically inferring the specific sentiment polarities toward certain aspects of products or services behind social media texts or reviews.
We propose to enhance the ABSA robustness by systematically rethinking the bottlenecks from all possible angles, including model, data, and training.
arXiv Detail & Related papers (2023-04-19T11:07:43Z)
- The Role of AI in Drug Discovery: Challenges, Opportunities, and Strategies [97.5153823429076]
The benefits, challenges and drawbacks of AI in this field are reviewed.
The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods are also discussed.
arXiv Detail & Related papers (2022-12-08T23:23:39Z)
- Modelling Assessment Rubrics through Bayesian Networks: a Pragmatic Approach [40.06500618820166]
This paper presents an approach to deriving a learner model directly from an assessment rubric.
We illustrate how the approach can be applied to automatize the human assessment of an activity developed for testing computational thinking skills.
arXiv Detail & Related papers (2022-09-07T10:09:12Z)
- An interdisciplinary conceptual study of Artificial Intelligence (AI) for helping benefit-risk assessment practices: Towards a comprehensive qualification matrix of AI programs and devices (pre-print 2020) [55.41644538483948]
This paper proposes a comprehensive analysis of existing concepts coming from different disciplines tackling the notion of intelligence.
The aim is to identify shared notions or discrepancies to consider for qualifying AI systems.
arXiv Detail & Related papers (2021-05-07T12:01:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.