Evaluating the Fitness of Ontologies for the Task of Question Generation
- URL: http://arxiv.org/abs/2504.07994v1
- Date: Tue, 08 Apr 2025 17:10:04 GMT
- Title: Evaluating the Fitness of Ontologies for the Task of Question Generation
- Authors: Samah Alkhuzaey, Floriana Grasso, Terry R. Payne, Valentina Tamma
- Abstract summary: This paper proposes a set of requirements and task-specific metrics for evaluating the fitness of ontologies for question generation tasks in pedagogical settings. An expert-based approach is employed to assess the performance of various ontologies in Automatic Question Generation (AQG) tasks. Our results demonstrate that ontology characteristics significantly impact the effectiveness of question generation, with different ontologies exhibiting varying performance levels.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ontology-based question generation is an important application of semantic-aware systems that enables the creation of large question banks for diverse learning environments. The effectiveness of these systems, both in terms of the calibre and cognitive difficulty of the resulting questions, depends heavily on the quality and modelling approach of the underlying ontologies, making it crucial to assess their fitness for this task. To date, there has been no comprehensive investigation into the specific ontology aspects or characteristics that affect the question generation process. Therefore, this paper proposes a set of requirements and task-specific metrics for evaluating the fitness of ontologies for question generation tasks in pedagogical settings. Using the ROMEO methodology, a structured framework for deriving task-specific metrics, an expert-based approach is employed to assess the performance of various ontologies in Automatic Question Generation (AQG) tasks, which is then evaluated over a set of ontologies. Our results demonstrate that ontology characteristics significantly impact the effectiveness of question generation, with different ontologies exhibiting varying performance levels. This highlights the importance of assessing ontology quality with respect to AQG tasks.
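As a rough illustration of what task-specific ontology metrics can look like in practice, the sketch below computes a few generic structural characteristics of an OWL ontology with rdflib. The characteristic names and the example file path are assumptions made for demonstration only; the paper's actual ROMEO-derived metrics and expert-based assessment are not reproduced here.

```python
# Illustrative sketch only (not the paper's ROMEO-derived metrics): compute a few
# generic structural characteristics of an OWL ontology that could plausibly feed
# into a fitness assessment for Automatic Question Generation (AQG).
from rdflib import Graph, RDF, RDFS, OWL


def structural_profile(ontology_path: str) -> dict:
    g = Graph()
    g.parse(ontology_path)  # rdflib infers the serialization from the file extension

    classes = set(g.subjects(RDF.type, OWL.Class))
    subclass_axioms = list(g.triples((None, RDFS.subClassOf, None)))
    labelled = set(g.subjects(RDFS.label, None))

    return {
        "num_classes": len(classes),
        "num_subclass_axioms": len(subclass_axioms),
        # Questions are hard to verbalise for entities lacking human-readable labels.
        "label_coverage": len(labelled & classes) / len(classes) if classes else 0.0,
        # subClassOf axioms per class: a rough proxy for hierarchy density.
        "hierarchy_density": len(subclass_axioms) / len(classes) if classes else 0.0,
    }


if __name__ == "__main__":
    # "example.owl" is a placeholder path, not a resource referenced by the paper.
    print(structural_profile("example.owl"))
```

Counts like these are only a starting point; the paper's argument is that fitness metrics should be derived for the AQG task itself (via the ROMEO methodology) rather than reused wholesale from generic ontology-quality frameworks.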
Related papers
- Pitfalls of topology-aware image segmentation [81.19923502845441]
We identify critical pitfalls in model evaluation that include inadequate connectivity choices, overlooked topological artifacts, and inappropriate use of evaluation metrics.
We propose a set of actionable recommendations to establish fair and robust evaluation standards for topology-aware medical image segmentation methods.
arXiv Detail & Related papers (2024-12-19T08:11:42Z) - UKTF: Unified Knowledge Tracing Framework for Subjective and Objective Assessments [3.378008889662775]
Knowledge tracing technology can establish knowledge state models based on learners' historical answer data.
This study proposes a unified knowledge tracing model that integrates both objective and subjective test questions.
arXiv Detail & Related papers (2024-11-08T04:58:19Z) - AGENT-CQ: Automatic Generation and Evaluation of Clarifying Questions for Conversational Search with LLMs [53.6200736559742]
AGENT-CQ consists of two stages: a generation stage and an evaluation stage.
CrowdLLM simulates human crowdsourcing judgments to assess generated questions and answers.
Experiments on the ClariQ dataset demonstrate CrowdLLM's effectiveness in evaluating question and answer quality.
arXiv Detail & Related papers (2024-10-25T17:06:27Z) - An Adaptive Framework for Generating Systematic Explanatory Answer in Online Q&A Platforms [62.878616839799776]
We propose SynthRAG, an innovative framework designed to enhance Question Answering (QA) performance.
SynthRAG improves on conventional models by employing adaptive outlines for dynamic content structuring.
An online deployment on the Zhihu platform revealed that SynthRAG's answers achieved notable user engagement.
arXiv Detail & Related papers (2024-10-23T09:14:57Z) - StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization [94.31508613367296]
Retrieval-augmented generation (RAG) is a key means to effectively enhance large language models (LLMs).
We propose StructRAG, which can identify the optimal structure type for the task at hand, reconstruct original documents into this structured format, and infer answers based on the resulting structure.
Experiments show that StructRAG achieves state-of-the-art performance, particularly excelling in challenging scenarios.
arXiv Detail & Related papers (2024-10-11T13:52:44Z) - A RAG Approach for Generating Competency Questions in Ontology Engineering [1.0044270899550196]
With the emergence of Large Language Models (LLMs), there arises the possibility to automate and enhance this process.
We present a retrieval-augmented generation (RAG) approach that uses LLMs for the automatic generation of competency questions (CQs).
We conduct experiments using GPT-4 on two domain ontology engineering tasks and compare results against ground-truth CQs constructed by domain experts.
arXiv Detail & Related papers (2024-09-13T13:34:32Z) - Multi-Faceted Question Complexity Estimation Targeting Topic Domain-Specificity [0.0]
This paper presents a novel framework for domain-specific question difficulty estimation, leveraging a suite of NLP techniques and knowledge graph analysis.
We introduce four key parameters: Topic Retrieval Cost, Topic Salience, Topic Coherence, and Topic Superficiality.
A model trained on these features demonstrates the efficacy of our approach in predicting question difficulty (a hedged illustrative sketch of such a feature-based model appears after this list).
arXiv Detail & Related papers (2024-08-23T05:40:35Z) - Application of Large Language Models in Automated Question Generation: A Case Study on ChatGLM's Structured Questions for National Teacher Certification Exams [2.7363336723930756]
This study explores the application potential of the large language model (LLM) ChatGLM in the automatic generation of structured questions for National Teacher Certification Exams (NTCE).
We guided ChatGLM to generate a series of simulated questions and conducted a comprehensive comparison with questions recalled by past examinees.
The research results indicate that the questions generated by ChatGLM exhibit a high level of rationality, scientificity, and practicality similar to those of the real exam questions.
arXiv Detail & Related papers (2024-08-19T13:32:14Z) - Evaluating General-Purpose AI with Psychometrics [43.85432514910491]
We discuss the need for a comprehensive and accurate evaluation of general-purpose AI systems such as large language models.
Current evaluation methodology, mostly based on benchmarks of specific tasks, falls short of adequately assessing these versatile AI systems.
To tackle these challenges, we suggest transitioning from task-oriented evaluation to construct-oriented evaluation.
arXiv Detail & Related papers (2023-10-25T05:38:38Z) - Latent Properties of Lifelong Learning Systems [59.50307752165016]
We introduce an algorithm-agnostic explainable surrogate-modeling approach to estimate latent properties of lifelong learning algorithms.
We validate the approach for estimating these properties via experiments on synthetic data.
arXiv Detail & Related papers (2022-07-28T20:58:13Z)
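Relating to the Multi-Faceted Question Complexity Estimation entry above, the toy sketch below shows what a model trained on its four named features might look like. The synthetic data, the choice of a linear model, and the resulting coefficients are illustrative assumptions, not the cited paper's setup.

```python
# Hedged toy example: fit a difficulty model on the four features named in the
# "Multi-Faceted Question Complexity Estimation" summary. All data here is
# synthetic; the real framework's feature extraction and model are not reproduced.
import numpy as np
from sklearn.linear_model import LinearRegression

FEATURES = ["topic_retrieval_cost", "topic_salience",
            "topic_coherence", "topic_superficiality"]

# Placeholder feature matrix (one row per question) and synthetic difficulty labels.
rng = np.random.default_rng(0)
X = rng.random((200, len(FEATURES)))
y = 0.5 * X[:, 0] - 0.3 * X[:, 1] + 0.2 * X[:, 2] + rng.normal(0.0, 0.05, 200)

model = LinearRegression().fit(X, y)
print(dict(zip(FEATURES, model.coef_.round(3))))  # per-feature contribution to difficulty
```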