Evolutionary Neural Architecture Search for Transformer in Knowledge
Tracing
- URL: http://arxiv.org/abs/2310.01180v1
- Date: Mon, 2 Oct 2023 13:19:33 GMT
- Title: Evolutionary Neural Architecture Search for Transformer in Knowledge
Tracing
- Authors: Shangshang Yang, Xiaoshan Yu, Ye Tian, Xueming Yan, Haiping Ma, and
Xingyi Zhang
- Abstract summary: This paper proposes an evolutionary neural architecture search approach to automate the input feature selection and automatically determine where to apply which operation for achieving the balancing of the local/global context modelling.
Experimental results on the two largest and most challenging education datasets demonstrate the effectiveness of the architecture found by the proposed approach.
- Score: 8.779571123401185
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge tracing (KT) aims to trace students' knowledge states by predicting
whether students answer correctly on exercises. Despite the excellent
performance of existing Transformer-based KT approaches, they are criticized
for the manually selected input features for fusion and the defect of single
global context modelling to directly capture students' forgetting behavior in
KT, when the related records are distant from the current record in terms of
time. To address the issues, this paper first considers adding convolution
operations to the Transformer to enhance its local context modelling ability
used for students' forgetting behavior, then proposes an evolutionary neural
architecture search approach to automate the input feature selection and
automatically determine where to apply which operation for achieving the
balancing of the local/global context modelling. In the search space, the
original global path containing the attention module in Transformer is replaced
with the sum of a global path and a local path that could contain different
convolutions, and the selection of input features is also considered. To search
the best architecture, we employ an effective evolutionary algorithm to explore
the search space and also suggest a search space reduction strategy to
accelerate the convergence of the algorithm. Experimental results on the two
largest and most challenging education datasets demonstrate the effectiveness
of the architecture found by the proposed approach.
Related papers
- Locally Adaptive One-Class Classifier Fusion with Dynamic $\ell$p-Norm Constraints for Robust Anomaly Detection [17.93058599783703]
We introduce a framework that dynamically adjusts fusion weights based on local data characteristics.
Our method incorporates an interior-point optimization technique that significantly improves computational efficiency.
The framework's ability to adapt to local data patterns while maintaining computational efficiency makes it particularly valuable for real-time applications.
arXiv Detail & Related papers (2024-11-10T09:57:13Z) - ETO:Efficient Transformer-based Local Feature Matching by Organizing Multiple Homography Hypotheses [35.31588965060201]
We propose an efficient transformer-based network architecture for local feature matching.
On the YFCC100M dataset, our matching accuracy is competitive with LoFTR, a state-of-the-art transformer-based architecture.
arXiv Detail & Related papers (2024-10-30T06:39:27Z) - Learning Where to Look: Self-supervised Viewpoint Selection for Active Localization using Geometrical Information [68.10033984296247]
This paper explores the domain of active localization, emphasizing the importance of viewpoint selection to enhance localization accuracy.
Our contributions involve using a data-driven approach with a simple architecture designed for real-time operation, a self-supervised data training method, and the capability to consistently integrate our map into a planning framework tailored for real-world robotics applications.
arXiv Detail & Related papers (2024-07-22T12:32:09Z) - ECToNAS: Evolutionary Cross-Topology Neural Architecture Search [0.0]
ECToNAS is a cost-efficient evolutionary cross-topology neural architecture search algorithm.
It fuses training and topology optimisation together into one lightweight, resource-friendly process.
arXiv Detail & Related papers (2024-03-08T07:36:46Z) - Transferability Metrics for Object Detection [0.0]
Transfer learning aims to make the most of existing pre-trained models to achieve better performance on a new task in limited data scenarios.
We extend transferability metrics to object detection using ROI-Align and TLogME.
We show that TLogME provides a robust correlation with transfer performance and outperforms other transferability metrics on local and global level features.
arXiv Detail & Related papers (2023-06-27T08:49:31Z) - DAAS: Differentiable Architecture and Augmentation Policy Search [107.53318939844422]
This work considers the possible coupling between neural architectures and data augmentation and proposes an effective algorithm jointly searching for them.
Our approach achieves 97.91% accuracy on CIFAR-10 and 76.6% Top-1 accuracy on ImageNet dataset, showing the outstanding performance of our search algorithm.
arXiv Detail & Related papers (2021-09-30T17:15:17Z) - Navigating the Kaleidoscope of COVID-19 Misinformation Using Deep
Learning [0.76146285961466]
We propose an effective model to capture both the local and global context of the target domain.
We show that: (i) the deep Transformer-based pre-trained models, utilized via the mixed-domain transfer learning, are only good at capturing the local context, thus exhibits poor generalization.
A combination of shallow network-based domain-specific models and convolutional neural networks can efficiently extract local as well as global context directly from the target data in a hierarchical fashion, enabling it to offer a more generalizable solution.
arXiv Detail & Related papers (2021-09-19T15:49:25Z) - GLiT: Neural Architecture Search for Global and Local Image Transformer [114.8051035856023]
We introduce the first Neural Architecture Search (NAS) method to find a better transformer architecture for image recognition.
Our method can find more discriminative and efficient transformer variants than the ResNet family and the baseline ViT for image classification.
arXiv Detail & Related papers (2021-07-07T00:48:09Z) - Localized active learning of Gaussian process state space models [63.97366815968177]
A globally accurate model is not required to achieve good performance in many common control applications.
We propose an active learning strategy for Gaussian process state space models that aims to obtain an accurate model on a bounded subset of the state-action space.
By employing model predictive control, the proposed technique integrates information collected during exploration and adaptively improves its exploration strategy.
arXiv Detail & Related papers (2020-05-04T05:35:02Z) - Pairwise Similarity Knowledge Transfer for Weakly Supervised Object
Localization [53.99850033746663]
We study the problem of learning localization model on target classes with weakly supervised image labels.
In this work, we argue that learning only an objectness function is a weak form of knowledge transfer.
Experiments on the COCO and ILSVRC 2013 detection datasets show that the performance of the localization model improves significantly with the inclusion of pairwise similarity function.
arXiv Detail & Related papers (2020-03-18T17:53:33Z) - Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.