Related papers: A Near-Linear Time Algorithm for the Chamfer Distance

A Near-Linear Time Algorithm for the Chamfer Distance

URL: http://arxiv.org/abs/2307.03043v1
Date: Thu, 6 Jul 2023 15:07:48 GMT
Title: A Near-Linear Time Algorithm for the Chamfer Distance
Authors: Ainesh Bakshi, Piotr Indyk, Rajesh Jayaram, Sandeep Silwal, Erik Waingarten
Abstract summary: Chamfer distance is a popular measure of dissimilarity between point clouds. We present the first $(1+epsilon)$-approximate algorithm for estimating the Chamfer distance with a near-linear running time. Our experiments demonstrate that it is both accurate and fast on large high-dimensional datasets.
Score: 21.018781726524946
License: http://creativecommons.org/licenses/by/4.0/
Abstract: For any two point sets $A,B \subset \mathbb{R}^d$ of size up to $n$, the Chamfer distance from $A$ to $B$ is defined as $\text{CH}(A,B)=\sum_{a \in A} \min_{b \in B} d_X(a,b)$, where $d_X$ is the underlying distance measure (e.g., the Euclidean or Manhattan distance). The Chamfer distance is a popular measure of dissimilarity between point clouds, used in many machine learning, computer vision, and graphics applications, and admits a straightforward $O(d n^2)$-time brute force algorithm. Further, the Chamfer distance is often used as a proxy for the more computationally demanding Earth-Mover (Optimal Transport) Distance. However, the \emph{quadratic} dependence on $n$ in the running time makes the naive approach intractable for large datasets. We overcome this bottleneck and present the first $(1+\epsilon)$-approximate algorithm for estimating the Chamfer distance with a near-linear running time. Specifically, our algorithm runs in time $O(nd \log (n)/\varepsilon^2)$ and is implementable. Our experiments demonstrate that it is both accurate and fast on large high-dimensional datasets. We believe that our algorithm will open new avenues for analyzing large high-dimensional point clouds. We also give evidence that if the goal is to \emph{report} a $(1+\varepsilon)$-approximate mapping from $A$ to $B$ (as opposed to just its value), then any sub-quadratic time algorithm is unlikely to exist.

Related papers

Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
We study the problem of residual error estimation for matrix and vector norms using a linear sketch. We demonstrate that this gives a substantial advantage empirically, for roughly the same sketch size and accuracy as in previous work. We also show an $Omega(k2/pn1-2/p)$ lower bound for the sparse recovery problem, which is tight up to a $mathrmpoly(log n)$ factor.
arXiv Detail & Related papers (2024-08-16T02:33:07Z)
A Scalable Algorithm for Individually Fair K-means Clustering [77.93955971520549]
We present a scalable algorithm for the individually fair ($p$, $k$)-clustering problem introduced by Jung et al. and Mahabadi et al. A clustering is then called individually fair if it has centers within distance $delta(x)$ of $x$ for each $xin P$. We show empirically that not only is our algorithm much faster than prior work, but it also produces lower-cost solutions.
arXiv Detail & Related papers (2024-02-09T19:01:48Z)
Metricizing the Euclidean Space towards Desired Distance Relations in Point Clouds [1.2366208723499545]
We attack unsupervised learning algorithms, specifically $k$-Means and density-based (DBSCAN) clustering algorithms. We show that the results of clustering algorithms may not generally be trustworthy, unless there is a standardized and fixed prescription to use a specific distance function.
arXiv Detail & Related papers (2022-11-07T16:37:29Z)
Sketching Algorithms and Lower Bounds for Ridge Regression [65.0720777731368]
We give a sketching-based iterative algorithm that computes $1+varepsilon$ approximate solutions for the ridge regression problem. We also show that this algorithm can be used to give faster algorithms for kernel ridge regression.
arXiv Detail & Related papers (2022-04-13T22:18:47Z)
TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $ell$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d_$. We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z)
Learning a Latent Simplex in Input-Sparsity Time [58.30321592603066]
We consider the problem of learning a latent $k$-vertex simplex $KsubsetmathbbRdtimes n$, given access to $AinmathbbRdtimes n$. We show that the dependence on $k$ in the running time is unnecessary given a natural assumption about the mass of the top $k$ singular values of $A$.
arXiv Detail & Related papers (2021-05-17T16:40:48Z)
Provably Approximated ICP [40.349822671753024]
We prove that there emphalways exists a "witness" set of $3$ pairs in $P times Q$ that, via novel alignment algorithm, defines a constant factor approximation to this global optimum. Our approximation constants are, in practice, close to $1$, and up to x$10$ times smaller than state-of-the-art algorithms.
arXiv Detail & Related papers (2021-01-10T18:09:29Z)
A Linear Transportation $\mathrm{L}^p$ Distance for Pattern Recognition [4.991212094743681]
The transportation $mathrmLp$ distance has been proposed as a generalisation of Wasserstein $mathrmWp$ distances. We show that the linear $mathrmTLp$ distance significantly improves over the linear $mathrmWp$ distance on signal processing tasks.
arXiv Detail & Related papers (2020-09-23T17:19:19Z)
On Efficient Low Distortion Ultrametric Embedding [18.227854382422112]
A widely-used method to preserve the underlying hierarchical structure of the data is to find an embedding of the data into a tree or an ultrametric. In this paper, we provide a new algorithm which takes as input a set of points isometric in $mathbbRd2 (for universal constant $rho>1$) to output an ultrametric $Delta. We show that the output of our algorithm is comparable to the output of the linkage algorithms while achieving a much faster running time.
arXiv Detail & Related papers (2020-08-15T11:06:45Z)
An $\tilde{O}(n^{5/4})$ Time $\varepsilon$-Approximation Algorithm for RMS Matching in a Plane [3.9596068699962315]
The 2-Wasserstein distance (or RMS distance) is a useful measure of similarity between probability distributions. We present a new $varepsilon$-approximation algorithm that runs in $O(n5/4mathrmpolylog n,1/varepsilon)$ time.
arXiv Detail & Related papers (2020-07-15T14:47:25Z)
Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model. We show that for both problems, for dimensions of $frac1lambdaepsilon$, one can obtain streaming algorithms with spacely smaller than $frac1lambdaepsilon$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.