Theory and Algorithms for Shapelet-based Multiple-Instance Learning
- URL: http://arxiv.org/abs/2006.01130v3
- Date: Tue, 13 Oct 2020 06:57:57 GMT
- Title: Theory and Algorithms for Shapelet-based Multiple-Instance Learning
- Authors: Daiki Suehiro, Kohei Hatano, Eiji Takimoto, Shuji Yamamoto, Kenichi
Bannai, Akiko Takeda
- Abstract summary: We propose a new formulation of Multiple-Instance Learning (MIL), in which a unit of data consists of a set of instances called a bag.
The goal is to find a good classifier of bags based on the similarity with a "shapelet" (or pattern).
In our formulation, we use all possible, and thus infinitely many, shapelets, resulting in a richer class of classifiers.
- Score: 5.08418565337126
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new formulation of Multiple-Instance Learning (MIL), in which a
unit of data consists of a set of instances called a bag. The goal is to find a
good classifier of bags based on the similarity with a "shapelet" (or pattern),
where the similarity of a bag with a shapelet is the maximum similarity of
instances in the bag. In previous work, some of the training instances are
chosen as shapelets with no theoretical justification. In our formulation, we
use all possible, and thus infinitely many shapelets, resulting in a richer
class of classifiers. We show that the formulation is tractable, that is, it
can be reduced through Linear Programming Boosting (LPBoost) to Difference of
Convex (DC) programs of finite (actually polynomial) size. Our theoretical
result also gives justification to the heuristics of some of the previous work.
The time complexity of the proposed algorithm depends heavily on the size of the
set of all instances in the training sample. To handle data containing a large
number of instances, we also propose a heuristic variant of the algorithm that
preserves the theoretical guarantee. Our empirical study demonstrates that our
algorithm works uniformly well for Shapelet Learning tasks on time-series
classification and for various MIL tasks, with accuracy comparable to existing
methods. Moreover, we show that the proposed heuristics achieve these results
in reasonable computational time.
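To make the formulation concrete, the following is a minimal Python sketch (ours, not the authors' code) of the bag-level classifier the abstract describes: the similarity of a bag with a shapelet is the maximum similarity over the bag's instances, and a bag is classified by a weighted sum of such similarities. We assume a negative-squared-distance similarity and a small fixed set of shapelets; the function names are ours.

```python
import numpy as np

def bag_similarity(bag, shapelet):
    # Similarity of a bag with a shapelet: the maximum similarity over the
    # bag's instances (here, negative squared Euclidean distance).
    return max(-np.sum((x - shapelet) ** 2) for x in bag)

def classify_bag(bag, shapelets, weights, bias):
    # Linear classifier over bag-shapelet similarities:
    # sign( sum_k w_k * max_{x in B} s(x, u_k) + b ).
    features = np.array([bag_similarity(bag, u) for u in shapelets])
    return np.sign(weights @ features + bias)

# Toy usage: a bag of three 2-d instances and two hypothetical shapelets.
bag = [np.array([0.0, 1.0]), np.array([2.0, 0.5]), np.array([1.0, 1.0])]
shapelets = [np.array([1.0, 1.0]), np.array([3.0, 3.0])]
print(classify_bag(bag, shapelets, weights=np.array([1.0, -0.5]), bias=0.1))
```

Here the shapelet set is finite only for illustration; in the paper, searching the infinite space of all possible shapelets is what is reduced, via LPBoost, to DC programs of polynomial size.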
Related papers
- Derandomizing Multi-Distribution Learning [28.514129340758938]
Multi-distribution learning involves learning a single predictor that works well across multiple data distributions.
Recent research has shown that near-optimal sample complexity can be achieved with oracle-efficient algorithms.
This raises the question: can these algorithms be derandomized to produce a deterministic predictor for multiple distributions?
- A General Online Algorithm for Optimizing Complex Performance Metrics [5.726378955570775]
We introduce and analyze a general online algorithm that can be used in a straightforward way with a variety of complex performance metrics in binary, multi-class, and multi-label classification problems.
The algorithm's update and prediction rules are appealingly simple and computationally efficient without the need to store any past data.
- FastGAS: Fast Graph-based Annotation Selection for In-Context Learning [53.17606395275021]
In-context learning (ICL) empowers large language models (LLMs) to tackle new tasks by using a series of training instances as prompts.
Existing methods have proposed to select a subset of unlabeled examples for annotation.
We propose a graph-based selection method, FastGAS, designed to efficiently identify high-quality instances.
- Collaborative Learning with Different Labeling Functions [7.228285747845779]
We study a variant of Collaborative PAC Learning, in which we aim to learn an accurate classifier for each of the $n$ data distributions.
We show that, when the data distributions satisfy a weaker realizability assumption, sample-efficient learning is still feasible.
- PriorBoost: An Adaptive Algorithm for Learning from Aggregate Responses [18.944561572423726]
We focus on the construction of aggregation sets (called bags in the literature) for event-level loss functions.
We propose the PriorBoost algorithm, which adaptively forms bags of samples that are increasingly homogeneous.
- An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices.
- Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP [81.00800920928621]
We study representation learning in partially observable Markov Decision Processes (POMDPs).
We first present an algorithm for decodable POMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU).
We then show how to adapt this algorithm to also work in the broader class of $\gamma$-observable POMDPs.
- Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation [110.09855163856326]
This paper addresses the problem of learning a policy for generating an object through a sequence of actions.
We propose GFlowNet, based on a view of the generative process as a flow network.
We prove that any global minimum of the proposed objectives yields a policy which samples from the desired distribution.
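As a rough illustration of this flow-network view (a toy sketch under our own assumptions, not the paper's objective or training code): if the flow into every non-terminal state equals the flow out of it, and the flow into each terminal state equals its reward, the induced policy samples terminal states with probability proportional to reward. The graph, reward, and loss below are hypothetical.

```python
# Toy DAG: s0 -> {s1, s2}, s1 -> s3, s2 -> s3; s3 is terminal with reward 2.0.
edges = [("s0", "s1"), ("s0", "s2"), ("s1", "s3"), ("s2", "s3")]
reward = {"s3": 2.0}

def flow_matching_loss(edge_flow):
    # Squared flow-matching residuals: for every non-source state, inflow
    # must equal outflow (or equal the reward, if the state is terminal).
    states = {s for e in edges for s in e} - {"s0"}
    loss = 0.0
    for s in states:
        inflow = sum(f for (u, v), f in edge_flow.items() if v == s)
        outflow = reward.get(s, sum(f for (u, v), f in edge_flow.items() if u == s))
        loss += (inflow - outflow) ** 2
    return loss

# A consistent flow: inflow equals outflow everywhere, terminal inflow equals reward.
flows = {("s0", "s1"): 1.0, ("s0", "s2"): 1.0, ("s1", "s3"): 1.0, ("s2", "s3"): 1.0}
print(flow_matching_loss(flows))  # 0.0
```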
- An Empirical Comparison of Instance Attribution Methods for NLP [62.63504976810927]
We evaluate the degree to which different potential instance attribution methods agree with respect to the importance of training samples.
We find that simple retrieval methods yield training instances that differ from those identified via gradient-based methods.
- Clustering with Penalty for Joint Occurrence of Objects: Computational Aspects [0.0]
The method of Holý, Sokol and Černý clusters objects based on their incidence in a large number of given sets.
The idea is to minimize the occurrence of multiple objects from the same cluster in the same set.
In the current paper, we study computational aspects of the method.
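As a hypothetical sketch of the objective just described (the paper's exact penalty may differ), one can count, for every given set, the pairs of co-occurring objects that a clustering places in the same cluster, and seek an assignment minimizing this count:

```python
from itertools import combinations

def joint_occurrence_penalty(sets, assignment):
    # Count, over all given sets, the pairs of co-occurring objects that the
    # assignment places in the same cluster; clustering aims to minimize this.
    penalty = 0
    for s in sets:
        for a, b in combinations(sorted(s), 2):
            if assignment[a] == assignment[b]:
                penalty += 1
    return penalty

sets = [{"a", "b", "c"}, {"b", "d"}]           # hypothetical incidence data
assignment = {"a": 0, "b": 1, "c": 0, "d": 1}  # hypothetical clustering
print(joint_occurrence_penalty(sets, assignment))  # 2: (a, c) and (b, d) collide
```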
- Learning Gaussian Graphical Models via Multiplicative Weights [54.252053139374205]
We adapt an algorithm of Klivans and Meka based on the method of multiplicative weight updates.
The algorithm enjoys a sample complexity bound that is qualitatively similar to others in the literature.
It has a low runtime $O(mp^2)$ in the case of $m$ samples and $p$ nodes, and can trivially be implemented in an online manner.
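A rough, hypothetical sketch of a multiplicative-weights (Hedge-style) update in this spirit follows; the actual Klivans-Meka-based algorithm differs in detail (for instance, it handles signed weights and carries formal guarantees), but the O(p) per-sample cost per node, hence O(mp^2) overall, is visible here. All names below are ours.

```python
import numpy as np

def mw_neighborhood_sketch(X, y, eta=0.1, scale=1.0):
    # Regress one node (y) on the remaining p nodes (columns of X) with a
    # Hedge-style update. Each sample costs O(p); running this for all p
    # nodes therefore costs O(mp^2), matching the stated runtime.
    m, p = X.shape
    w = np.ones(p)
    for t in range(m):
        pred = scale * (w / w.sum()) @ X[t]
        err = pred - y[t]
        # Features aligned with the error direction lose weight exponentially.
        w *= np.exp(-eta * err * X[t])
    return scale * w / w.sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 0.6 * X[:, 0] + 0.4 * X[:, 2]  # toy target: a sparse nonnegative combination
print(np.round(mw_neighborhood_sketch(X, y), 3))
```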