Extracting Topics from Open Educational Resources
- URL: http://arxiv.org/abs/2006.11109v1
- Date: Fri, 19 Jun 2020 12:50:55 GMT
- Title: Extracting Topics from Open Educational Resources
- Authors: Mohammadreza Molavi, Mohammadreza Tavakoli, and Gábor Kismihók
- Abstract summary: We propose an OER topic extraction approach, applying text mining techniques, to generate high-quality OER metadata about topic distribution.
This is done by: 1) collecting 123 lectures from Coursera and Khan Academy in the area of data science related skills, 2) applying Latent Dirichlet Allocation (LDA) on the collected resources in order to extract existing topics related to these skills, and 3) defining topic distributions covered by a particular OER.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In recent years, Open Educational Resources (OERs) were earmarked as critical
when mitigating the increasing need for education globally. Obviously, OERs
have high potential to satisfy learners in many different circumstances, as
they are available in a wide range of contexts. However, the low quality of OER
metadata, in general, is one of the main reasons behind the lack of
personalised services such as search and recommendation. As a result, the
applicability of OERs remains limited. Nevertheless, OER metadata about covered
topics (subjects) is essentially required by learners to build effective
learning pathways towards their individual learning objectives. Therefore, in
this paper, we report on a work-in-progress project proposing an OER topic
extraction approach, applying text mining techniques, to generate high-quality
OER metadata about topic distribution. This is done by: 1) collecting 123
lectures from Coursera and Khan Academy in the area of data science related
skills, 2) applying Latent Dirichlet Allocation (LDA) on the collected
resources in order to extract existing topics related to these skills, and 3)
defining topic distributions covered by a particular OER. To evaluate our
model, we used a dataset of educational resources from YouTube, and compared
our topic distribution results with their manually defined target topics with
the help of 3 experts in the area of data science. As a result, our model
extracted topics with an F1-score of 79%.
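
The evaluation described in the abstract compares extracted topics against expert-defined target topics and reports an F1-score. The abstract does not say how a topic distribution is turned into a set of topics or how the F1-score is aggregated, so the following is only a minimal sketch under stated assumptions: topics are selected with a probability threshold (the `threshold` value, the function names, and the example data are all hypothetical, not taken from the paper), and the F1-score is computed per resource on topic sets.

```python
from typing import Dict, Set


def select_topics(distribution: Dict[str, float], threshold: float = 0.2) -> Set[str]:
    """Keep topics whose probability is at least `threshold`.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return {topic for topic, p in distribution.items() if p >= threshold}


def f1_score(predicted: Set[str], target: Set[str]) -> float:
    """Set-based F1: harmonic mean of precision and recall."""
    true_positives = len(predicted & target)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(target)
    return 2 * precision * recall / (precision + recall)


# Made-up topic distribution for one resource (not data from the paper).
oer_distribution = {
    "regression": 0.5,
    "probability": 0.3,
    "sql": 0.15,
    "visualization": 0.05,
}
# Made-up expert labels for the same resource.
expert_topics = {"regression", "probability", "visualization"}

predicted_topics = select_topics(oer_distribution)
score = f1_score(predicted_topics, expert_topics)
print(sorted(predicted_topics), round(score, 2))
```

In this toy case both selected topics are correct (precision 1.0) but one expert topic is missed (recall 2/3), which gives an F1-score of 0.8. Lowering the threshold would trade precision for recall.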
Related papers
- Enriched BERT Embeddings for Scholarly Publication Classification [0.13654846342364302]
The NSLP 2024 FoRC Task I addresses this challenge organized as a competition.
The goal is to develop a classifier capable of predicting one of 123 predefined classes from the Open Research Knowledge Graph (ORKG) taxonomy of research fields for a given article.
arXiv Detail & Related papers (2024-05-07T09:05:20Z)
- ExaRanker-Open: Synthetic Explanation for IR using Open-Source LLMs [60.81649785463651]
We introduce ExaRanker-Open, where we adapt and explore the use of open-source language models to generate explanations.
Our findings reveal that incorporating explanations consistently enhances neural rankers, with benefits escalating as the LLM size increases.
arXiv Detail & Related papers (2024-02-09T11:23:14Z)
- Query of CC: Unearthing Large Scale Domain-Specific Knowledge from Public Corpora [104.16648246740543]
We propose an efficient data collection method based on large language models.
The method bootstraps seed information through a large language model and retrieves related data from public corpora.
It not only collects knowledge-related data for specific domains but unearths the data with potential reasoning procedures.
arXiv Detail & Related papers (2024-01-26T03:38:23Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- Low-Resource Dense Retrieval for Open-Domain Question Answering: A Comprehensive Survey [23.854086903936647]
We provide a structured overview of mainstream techniques for low-resource DR.
We divide the techniques into three main categories: (1) only documents are needed; (2) documents and questions are needed; and (3) documents and question-answer pairs are needed.
For every technique, we introduce its general-form algorithm, highlight the open issues and pros and cons. Promising directions are outlined for future research.
arXiv Detail & Related papers (2022-08-05T14:35:03Z)
- Information Extraction in Low-Resource Scenarios: Survey and Perspective [56.5556523013924]
Information Extraction seeks to derive structured information from unstructured texts.
This paper presents a review of neural approaches to low-resource IE from traditional and LLM-based perspectives.
arXiv Detail & Related papers (2022-02-16T13:44:00Z)
- A Transfer Learning Pipeline for Educational Resource Discovery with Application in Leading Paragraph Generation [71.92338855383238]
We propose a pipeline that automates web resource discovery for novel domains.
The pipeline achieves F1 scores of 0.94 and 0.82 when evaluated on two similar but novel target domains.
This is the first study that considers various web resources for survey generation.
arXiv Detail & Related papers (2022-01-07T03:35:40Z)
- Developing Open Source Educational Resources for Machine Learning and Data Science [0.0]
We describe the specific requirements for Open Educational Resources (OER) in Machine Learning (ML) and Data Science (DS).
We argue that it is especially important for these fields to make source files publicly available, leading to Open Source Educational Resources (OSER).
We outline how OSER can be used for blended learning scenarios and share our experiences in university education.
arXiv Detail & Related papers (2021-07-28T10:20:20Z)
- Domain Generalization: A Survey [146.68420112164577]
Domain generalization (DG) aims to achieve OOD generalization by only using source domain data for model learning.
For the first time, a comprehensive literature review is provided to summarize the ten-year development in DG.
arXiv Detail & Related papers (2021-03-03T16:12:22Z)
- OER Recommendations to Support Career Development [0.0]
Open Educational Resources (OERs) have potential to contribute to the mitigation of problems, as they are available in a wide range of learning and occupational contexts globally.
We suggest a novel, personalised OER recommendation method to match skill development targets with open learning content.
This is done by: 1) using an OER quality prediction model based on metadata, OER properties, and content; 2) supporting learners to set individual skill targets based on actual labour market information; and 3) building a personalized OER recommender to help learners to master their skill targets.
arXiv Detail & Related papers (2020-05-30T21:01:54Z)