Sampling Approach Matters: Active Learning for Robotic Language
Acquisition
- URL: http://arxiv.org/abs/2011.08021v1
- Date: Mon, 16 Nov 2020 15:18:10 GMT
- Title: Sampling Approach Matters: Active Learning for Robotic Language
Acquisition
- Authors: Nisha Pillai, Edward Raff, Francis Ferraro, Cynthia Matuszek
- Abstract summary: We present an exploration of active learning approaches applied to three grounded language problems of varying complexity.
We report on how characteristics of the underlying task, along with design decisions such as feature selection and classification model, drive the results.
- Score: 42.69529080098759
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ordering the selection of training data using active learning can lead to
improvements in learning efficiently from smaller corpora. We present an
exploration of active learning approaches applied to three grounded language
problems of varying complexity in order to analyze what methods are suitable
for improving data efficiency in learning. We present a method for analyzing
the complexity of data in this joint problem space, and report on how
characteristics of the underlying task, along with design decisions such as
feature selection and classification model, drive the results. We observe that
representativeness, along with diversity, is crucial in selecting data samples.
Related papers
- The Why, When, and How to Use Active Learning in Large-Data-Driven 3D
Object Detection for Safe Autonomous Driving: An Empirical Exploration [1.2815904071470705]
entropy querying is a promising strategy for selecting data that enhances model learning in resource-constrained environments.
Our findings suggest that entropy querying is a promising strategy for selecting data that enhances model learning in resource-constrained environments.
arXiv Detail & Related papers (2024-01-30T00:14:13Z) - A Contrast Based Feature Selection Algorithm for High-dimensional Data
set in Machine Learning [9.596923373834093]
We propose a novel filter feature selection method, ContrastFS, which selects discriminative features based on the discrepancies features shown between different classes.
We validate effectiveness and efficiency of our approach on several widely studied benchmark datasets, results show that the new method performs favorably with negligible computation.
arXiv Detail & Related papers (2024-01-15T05:32:35Z) - Towards Robust Dataset Learning [90.2590325441068]
We propose a principled, tri-level optimization to formulate the robust dataset learning problem.
Under an abstraction model that characterizes robust vs. non-robust features, the proposed method provably learns a robust dataset.
arXiv Detail & Related papers (2022-11-19T17:06:10Z) - An Empirical Investigation of Commonsense Self-Supervision with
Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z) - ALLSH: Active Learning Guided by Local Sensitivity and Hardness [98.61023158378407]
We propose to retrieve unlabeled samples with a local sensitivity and hardness-aware acquisition function.
Our method achieves consistent gains over the commonly used active learning strategies in various classification tasks.
arXiv Detail & Related papers (2022-05-10T15:39:11Z) - What Matters in Learning from Offline Human Demonstrations for Robot
Manipulation [64.43440450794495]
We conduct an extensive study of six offline learning algorithms for robot manipulation.
Our study analyzes the most critical challenges when learning from offline human data.
We highlight opportunities for learning from human datasets.
arXiv Detail & Related papers (2021-08-06T20:48:30Z) - Finding the Homology of Decision Boundaries with Active Learning [26.31885403636642]
We propose an active learning algorithm to recover the homology of decision boundaries.
Our algorithm sequentially and adaptively selects which samples it requires the labels of.
Experiments on several datasets show the sample complexity improvement in recovering the homology.
arXiv Detail & Related papers (2020-11-19T04:22:06Z) - Clustering Analysis of Interactive Learning Activities Based on Improved
BIRCH Algorithm [0.0]
The construction of good learning behavior is of great significance to learners' learning process and learning effect, and is the key basis of data-driven education decision-making.
It is necessary to obtain the online learning behavior big data set of multi period and multi course, and describe the learning behavior as multi-dimensional learning interaction activities.
We design an improved algorithm of BIRCH clustering based on random walking strategy, which realizes the retrieval evaluation and data of key learning interaction activities.
arXiv Detail & Related papers (2020-10-08T07:46:46Z) - On the Robustness of Active Learning [0.7340017786387767]
Active Learning is concerned with how to identify the most useful samples for a Machine Learning algorithm to be trained with.
We find that it is often applied with not enough care and domain knowledge.
We propose the new "Sum of Squared Logits" method based on the Simpson diversity index and investigate the effect of using the confusion matrix for balancing in sample selection.
arXiv Detail & Related papers (2020-06-18T09:07:23Z) - Improving Multi-Turn Response Selection Models with Complementary
Last-Utterance Selection by Instance Weighting [84.9716460244444]
We consider utilizing the underlying correlation in the data resource itself to derive different kinds of supervision signals.
We conduct extensive experiments in two public datasets and obtain significant improvement in both datasets.
arXiv Detail & Related papers (2020-02-18T06:29:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.