Evaluating Machine Expertise: How Graduate Students Develop Frameworks for Assessing GenAI Content
- URL: http://arxiv.org/abs/2504.17964v1
- Date: Thu, 24 Apr 2025 22:24:14 GMT
- Title: Evaluating Machine Expertise: How Graduate Students Develop Frameworks for Assessing GenAI Content
- Authors: Celia Chen, Alex Leitch,
- Abstract summary: This paper examines how graduate students develop frameworks for evaluating machine-generated expertise in web-based interactions with large language models (LLMs)<n>Our findings reveal that students construct evaluation frameworks shaped by three main factors: professional identity, verification capabilities, and system navigation experience.
- Score: 1.967444231154626
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper examines how graduate students develop frameworks for evaluating machine-generated expertise in web-based interactions with large language models (LLMs). Through a qualitative study combining surveys, LLM interaction transcripts, and in-depth interviews with 14 graduate students, we identify patterns in how these emerging professionals assess and engage with AI-generated content. Our findings reveal that students construct evaluation frameworks shaped by three main factors: professional identity, verification capabilities, and system navigation experience. Rather than uniformly accepting or rejecting LLM outputs, students protect domains central to their professional identities while delegating others--with managers preserving conceptual work, designers safeguarding creative processes, and programmers maintaining control over core technical expertise. These evaluation frameworks are further influenced by students' ability to verify different types of content and their experience navigating complex systems. This research contributes to web science by highlighting emerging human-genAI interaction patterns and suggesting how platforms might better support users in developing effective frameworks for evaluating machine-generated expertise signals in AI-mediated web environments.
Related papers
- Beyond the Classroom: Bridging the Gap Between Academia and Industry with a Hands-on Learning Approach [2.8123958518740544]
Self-adaptive software systems have emerged as a critical focus in software design and operation.<n>A survey among practitioners identified that the lack of knowledgeable individuals has hindered its adoption in the industry.<n>We present our experience teaching a course on self-adaptive software systems that integrates theoretical knowledge and hands-on learning with industry-relevant technologies.
arXiv Detail & Related papers (2025-04-14T21:32:25Z) - A Survey of Frontiers in LLM Reasoning: Inference Scaling, Learning to Reason, and Agentic Systems [93.8285345915925]
Reasoning is a fundamental cognitive process that enables logical inference, problem-solving, and decision-making.<n>With the rapid advancement of large language models (LLMs), reasoning has emerged as a key capability that distinguishes advanced AI systems.<n>We categorize existing methods along two dimensions: (1) Regimes, which define the stage at which reasoning is achieved; and (2) Architectures, which determine the components involved in the reasoning process.
arXiv Detail & Related papers (2025-04-12T01:27:49Z) - Level Up Peer Review in Education: Investigating genAI-driven Gamification system and its influence on Peer Feedback Effectiveness [0.8087870525861938]
This paper introduces Socratique, a gamified peer-assessment platform integrated with Generative AI (GenAI) assistance.<n>By incorporating game elements, Socratique aims to motivate students to provide more feedback.<n>Students in the treatment group provided significantly more voluntary feedback, with higher scores on clarity, relevance, and specificity.
arXiv Detail & Related papers (2025-04-03T18:30:25Z) - A Survey on (M)LLM-Based GUI Agents [62.57899977018417]
Graphical User Interface (GUI) Agents have emerged as a transformative paradigm in human-computer interaction.<n>Recent advances in large language models and multimodal learning have revolutionized GUI automation across desktop, mobile, and web platforms.<n>This survey identifies key technical challenges, including accurate element localization, effective knowledge retrieval, long-horizon planning, and safety-aware execution control.
arXiv Detail & Related papers (2025-03-27T17:58:31Z) - LLM Agents for Education: Advances and Applications [49.3663528354802]
Large Language Model (LLM) agents have demonstrated remarkable capabilities in automating tasks and driving innovation across diverse educational applications.<n>This survey aims to provide a comprehensive technological overview of LLM agents for education, fostering further research and collaboration to enhance their impact for the greater good of learners and educators alike.
arXiv Detail & Related papers (2025-03-14T11:53:44Z) - Human-Centered Design for AI-based Automatically Generated Assessment Reports: A Systematic Review [4.974197456441281]
This study emphasizes the importance of reducing teachers' cognitive demands through user-centered and intuitive designs.<n>It highlights the potential of diverse information presentation formats such as text, visual aids, and plots and advanced functionalities such as live and interactive features to enhance usability.<n>The framework aims to address challenges in engaging teachers with technology-enhanced assessment results, facilitating data-driven decision-making, and providing personalized feedback to improve the teaching and learning process.
arXiv Detail & Related papers (2024-12-30T16:20:07Z) - GUI Agents: A Survey [129.94551809688377]
Graphical User Interface (GUI) agents, powered by Large Foundation Models, have emerged as a transformative approach to automating human-computer interaction.<n>Motivated by the growing interest and fundamental importance of GUI agents, we provide a comprehensive survey that categorizes their benchmarks, evaluation metrics, architectures, and training methods.
arXiv Detail & Related papers (2024-12-18T04:48:28Z) - User-centric evaluation of explainability of AI with and for humans: a comprehensive empirical study [5.775094401949666]
This study is located in the Human-Centered Artificial Intelligence (HCAI)
It focuses on the results of a user-centered assessment of commonly used eXplainable Artificial Intelligence (XAI) algorithms.
arXiv Detail & Related papers (2024-10-21T12:32:39Z) - Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges.
We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow.
We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z) - Constraining Participation: Affordances of Feedback Features in Interfaces to Large Language Models [49.74265453289855]
Large language models (LLMs) are now accessible to anyone with a computer, a web browser, and an internet connection via browser-based interfaces.
This paper examines the affordances of interactive feedback features in ChatGPT's interface, analysing how they shape user input and participation in iteration.
arXiv Detail & Related papers (2024-08-27T13:50:37Z) - Enhancing LLM-Based Feedback: Insights from Intelligent Tutoring Systems and the Learning Sciences [0.0]
This work advocates careful and caring AIED research by going through previous research on feedback generation in ITS.
The main contributions of this paper include: an avocation of applying more cautious, theoretically grounded methods in feedback generation in the era of generative AI.
arXiv Detail & Related papers (2024-05-07T20:09:18Z) - Prototyping with Prompts: Emerging Approaches and Challenges in Generative AI Design for Collaborative Software Teams [2.237039275844699]
Generative AI models are increasingly being integrated into human task, enabling the production of expressive content.<n>Unlike traditional human-AI design methods, the new approach to designing generative capabilities focuses heavily on prompt engineering strategies.<n>Our findings highlight emerging practices and role shifts in AI system prototyping among multistakeholder teams.
arXiv Detail & Related papers (2024-02-27T17:56:10Z) - Modelling Assessment Rubrics through Bayesian Networks: a Pragmatic Approach [40.06500618820166]
This paper presents an approach to deriving a learner model directly from an assessment rubric.
We illustrate how the approach can be applied to automatize the human assessment of an activity developed for testing computational thinking skills.
arXiv Detail & Related papers (2022-09-07T10:09:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.