Related papers: Improving Address Matching using Siamese Transformer Networks

URL: http://arxiv.org/abs/2307.02300v1
Date: Wed, 5 Jul 2023 13:58:26 GMT
Title: Improving Address Matching using Siamese Transformer Networks
Authors: André V. Duarte and Arlindo L. Oliveira
Abstract summary: This research introduces a deep learning-based model designed to increase the efficiency of address matching for Portuguese addresses. The model has been tested on a real-case scenario of Portuguese addresses and exhibits a high degree of accuracy, exceeding 95% at the door level.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Matching addresses is a critical task for companies and post offices involved in the processing and delivery of packages. The ramifications of incorrectly delivering a package to the wrong recipient are numerous, ranging from harm to the company's reputation to economic and environmental costs. This research introduces a deep learning-based model designed to increase the efficiency of address matching for Portuguese addresses. The model comprises two parts: (i) a bi-encoder, which is fine-tuned to create meaningful embeddings of Portuguese postal addresses, utilized to retrieve the top 10 likely matches of the un-normalized target address from a normalized database, and (ii) a cross-encoder, which is fine-tuned to accurately rerank the 10 addresses obtained by the bi-encoder. The model has been tested on a real-case scenario of Portuguese addresses and exhibits a high degree of accuracy, exceeding 95% at the door level. When utilized with GPU computations, the inference speed is about 4.5 times quicker than other traditional approaches such as BM25. An implementation of this system in a real-world scenario would substantially increase the effectiveness of the distribution process. Such an implementation is currently under investigation.
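The two-stage design described in the abstract (retrieve the top candidates with a bi-encoder, then rerank them with a cross-encoder) can be sketched roughly as below. This is only an illustration of the retrieve-then-rerank control flow, not the authors' implementation: the fine-tuned transformer models are replaced by standard-library stand-ins (character trigram vectors with cosine similarity in place of the bi-encoder, difflib's SequenceMatcher in place of the cross-encoder), and all function names and example addresses are hypothetical.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def embed(text, n=3):
    """Stand-in for the bi-encoder: a bag of character n-grams (assumption)."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, database, db_vectors, k=10):
    """Stage 1: rank the whole database by vector similarity, keep the top k."""
    q = embed(query)
    order = sorted(range(len(database)),
                   key=lambda i: cosine(q, db_vectors[i]),
                   reverse=True)
    return [database[i] for i in order[:k]]


def pair_score(query, candidate):
    """Stand-in for the cross-encoder: scores the (query, candidate) pair jointly."""
    return SequenceMatcher(None, query.lower(), candidate.lower()).ratio()


def match_address(query, database, db_vectors, k=10):
    """Stage 2: rerank only the k retrieved candidates and return the best one."""
    candidates = retrieve(query, database, db_vectors, k)
    return max(candidates, key=lambda c: pair_score(query, c))


# Hypothetical normalized database; the vectors are computed once, up front,
# which is what makes the first stage cheap at query time.
database = [
    "Rua Augusta 100, 1100-053 Lisboa",
    "Rua Augusta 10, 1100-053 Lisboa",
    "Avenida da Liberdade 100, 1250-096 Lisboa",
    "Rua de Santa Catarina 100, 4000-442 Porto",
]
db_vectors = [embed(address) for address in database]

best = match_address("r. augusta 100 lisboa", database, db_vectors, k=3)
print(best)
```

The point of the split is cost: the first stage compares the query against precomputed vectors for every database entry, while the more expensive pairwise scorer only ever sees the k survivors (k = 10 in the abstract).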

Related papers

Towards Optimal Multi-draft Speculative Decoding [102.67837141152232]
Multi-Draft Speculative Decoding (MDSD) is a recent approach where, when generating each token, a small draft model generates multiple drafts. This paper discusses the dual of the optimal transport problem, providing a way to efficiently compute the optimal acceptance rate.
arXiv Detail & Related papers (2025-02-26T03:22:44Z)
AddrLLM: Address Rewriting via Large Language Model on Nationwide Logistics Data [15.64626282181379]
We introduce AddrLLM, an innovative framework for address rewriting built upon a retrieval augmented large language model. It overcomes aforementioned limitations through a meticulously designed Supervised Fine-Tuning module, an Address-centric Retrieval Augmented Generation module and a Bias-free Objective Alignment module. It has significantly decreased the rate of parcel re-routing by approximately 43%, underscoring its exceptional efficacy in real-world applications.
arXiv Detail & Related papers (2024-11-17T07:32:46Z)
Improvement in Semantic Address Matching using Natural Language Processing [16.09672533759915]
Address matching is an important task for many businesses, especially delivery and take-out companies. Existing solutions use string similarity and edit-distance algorithms to find similar addresses in an address database. This paper discusses a semantic address matching technique that identifies a particular address from a list of possible addresses.
arXiv Detail & Related papers (2024-04-17T18:42:36Z)
Global Point Cloud Registration Network for Large Transformations [46.7301374772952]
We present ReLaTo, an architecture that faces the cases where large transformations happen while maintaining good performance for local transformations. This paper uses a novel Softmax pooling layer to find correspondences in a bilateral consensus manner between two point sets, sampling the most confident matches. A target-guided denoising step is then applied to both the obtained matches and latent features, estimating the final fine registration.
arXiv Detail & Related papers (2024-03-26T18:52:48Z)
Efficient k-NN Search with Cross-Encoders using Adaptive Multi-Round CUR Decomposition [77.4863142882136]
Cross-encoder models are prohibitively expensive for direct k-nearest neighbor (k-NN) search. We propose ADACUR, a method that adaptively, iteratively, and efficiently minimizes the approximation error for the practically important top-k neighbors.
arXiv Detail & Related papers (2023-05-04T17:01:17Z)
GoRela: Go Relative for Viewpoint-Invariant Motion Forecasting [121.42898228997538]
We propose an efficient shared encoding for all agents and the map without sacrificing accuracy or generalization. We leverage pair-wise relative positional encodings to represent geometric relationships between the agents and the map elements in a heterogeneous spatial graph. Our decoder is also viewpoint agnostic, predicting agent goals on the lane graph to enable diverse and context-aware multimodal prediction.
arXiv Detail & Related papers (2022-11-04T16:10:50Z)
Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization [60.91600465922932]
We present an approach that avoids the use of a dual-encoder for retrieval, relying solely on the cross-encoder. Our approach provides test-time recall-vs-computational cost trade-offs superior to the current widely-used methods.
arXiv Detail & Related papers (2022-10-23T00:32:04Z)
ECO-TR: Efficient Correspondences Finding Via Coarse-to-Fine Refinement [80.94378602238432]
We propose an efficient structure named Correspondence Efficient Transformer (ECO-TR) that finds correspondences in a coarse-to-fine manner. To achieve this, multiple transformer blocks are connected stage by stage to gradually refine the predicted coordinates. Experiments on various sparse and dense matching tasks demonstrate the superiority of our method in both efficiency and effectiveness over existing state-of-the-art methods.
arXiv Detail & Related papers (2022-09-25T13:05:33Z)
Should All Proposals be Treated Equally in Object Detection? [110.27485090952385]
The complexity-precision trade-off of an object detector is a critical problem for resource constrained vision tasks. It is hypothesized that improved detection efficiency requires a paradigm shift, towards the unequal processing of proposals. This results in better utilization of available computational budget, enabling higher accuracy for the same FLOPS.
arXiv Detail & Related papers (2022-07-07T18:26:32Z)
Multinational Address Parsing: A Zero-Shot Evaluation [0.3211619859724084]
Address parsing consists of identifying the segments that make up an address, such as a street name or a postal code. Previous work on neural networks has only focused on parsing addresses from a single source country. This paper explores the possibility of transferring the address parsing knowledge acquired by training deep learning models on some countries' addresses to others.
arXiv Detail & Related papers (2021-12-07T21:40:43Z)
Deep Contextual Embeddings for Address Classification in E-commerce [0.03222802562733786]
E-commerce customers in developing nations like India tend to follow no fixed format when entering shipping addresses. It is imperative to understand the language of addresses so that shipments can be routed without delays. We propose a novel approach to understanding customer addresses, motivated by recent advances in Natural Language Processing (NLP).
arXiv Detail & Related papers (2020-07-06T19:06:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.