Unified Embedding: Battle-Tested Feature Representations for Web-Scale
ML Systems
- URL: http://arxiv.org/abs/2305.12102v3
- Date: Wed, 15 Nov 2023 00:22:44 GMT
- Title: Unified Embedding: Battle-Tested Feature Representations for Web-Scale
ML Systems
- Authors: Benjamin Coleman, Wang-Cheng Kang, Matthew Fahrbach, Ruoxi Wang,
Lichan Hong, Ed H. Chi, Derek Zhiyuan Cheng
- Abstract summary: Learning high-quality feature embeddings efficiently and effectively is critical for the performance of web-scale machine learning systems.
This work introduces a simple yet highly effective framework, Feature Multiplexing, where one single representation space is used across many different categorical features.
We propose a highly practical approach called Unified Embedding with three major benefits: simplified feature configuration, strong adaptation to dynamic data distributions, and compatibility with modern hardware.
- Score: 29.53535556926066
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning high-quality feature embeddings efficiently and effectively is
critical for the performance of web-scale machine learning systems. A typical
model ingests hundreds of features with vocabularies on the order of millions
to billions of tokens. The standard approach is to represent each feature value
as a d-dimensional embedding, introducing hundreds of billions of parameters
for extremely high-cardinality features. This bottleneck has led to substantial
progress in alternative embedding algorithms. Many of these methods, however,
make the assumption that each feature uses an independent embedding table. This
work introduces a simple yet highly effective framework, Feature Multiplexing,
where one single representation space is used across many different categorical
features. Our theoretical and empirical analysis reveals that multiplexed
embeddings can be decomposed into components from each constituent feature,
allowing models to distinguish between features. We show that multiplexed
representations lead to Pareto-optimal parameter-accuracy tradeoffs for three
public benchmark datasets. Further, we propose a highly practical approach
called Unified Embedding with three major benefits: simplified feature
configuration, strong adaptation to dynamic data distributions, and
compatibility with modern hardware. Unified embedding gives significant
improvements in offline and online metrics compared to highly competitive
baselines across five web-scale search, ads, and recommender systems, where it
serves billions of users across the world in industry-leading products.
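To make the scale concrete: with, say, 200 categorical features, 10-million-token vocabularies, and d = 64, independent per-feature tables alone would hold 200 × 10^7 × 64 ≈ 1.3 × 10^11 parameters, the hundreds-of-billions regime described above. The sketch below is an illustrative Python rendering of the multiplexing idea, not the paper's implementation: a single shared table serves every categorical feature, and the feature name keys the hash so each feature gets its own view of the shared rows. The table size, dimensionality, hash function, and feature names are assumptions chosen for illustration.

import hashlib

import numpy as np

# Illustrative sketch of a multiplexed ("unified") embedding lookup: every
# categorical feature shares one table instead of owning an independent one.
# Sizes, hashing scheme, and feature names are assumptions for illustration,
# not the paper's production configuration.

NUM_ROWS = 1_000_000   # one shared representation space for all features
EMBED_DIM = 64

rng = np.random.default_rng(seed=0)
shared_table = rng.normal(scale=0.01, size=(NUM_ROWS, EMBED_DIM))


def multiplexed_lookup(feature_name: str, token: str) -> np.ndarray:
    """Map a (feature, value) pair to a row of the shared table.

    Keying the hash with the feature name gives each feature its own view
    of the table, so collisions across features are scattered rather than
    systematic and the model can still tell the features apart.
    """
    digest = hashlib.blake2b(
        token.encode("utf-8"),
        key=feature_name.encode("utf-8"),
        digest_size=8,
    ).digest()
    row = int.from_bytes(digest, "little") % NUM_ROWS
    return shared_table[row]


# Three different features, one table, no per-feature vocabulary sizing.
query_emb = multiplexed_lookup("query_term", "running shoes")
item_emb = multiplexed_lookup("item_id", "item_48210")
country_emb = multiplexed_lookup("user_country", "US")

model_input = np.concatenate([query_emb, item_emb, country_emb])
print(model_input.shape)  # (192,)

Adding a new feature in this scheme requires no new table or vocabulary sizing, which is the kind of simplified feature configuration the abstract highlights.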
Related papers
- Enhancing Few-Shot Image Classification through Learnable Multi-Scale Embedding and Attention Mechanisms [1.1557852082644071]
In the context of few-shot classification, the goal is to train a classifier using a limited number of samples.
Traditional metric-based methods exhibit certain limitations in achieving this objective.
Our approach utilizes a multi-output embedding network that maps samples into distinct feature spaces.
arXiv Detail & Related papers (2024-09-12T12:34:29Z)
- Retain, Blend, and Exchange: A Quality-aware Spatial-Stereo Fusion Approach for Event Stream Recognition [57.74076383449153]
We propose a novel dual-stream framework for event stream-based pattern recognition via differentiated fusion, termed EFV++.
It models two common event representations simultaneously, i.e., event images and event voxels.
We achieve new state-of-the-art performance on the Bullying10k dataset, i.e., 90.51%, which exceeds the second place by +2.21%.
arXiv Detail & Related papers (2024-06-27T02:32:46Z)
- U3M: Unbiased Multiscale Modal Fusion Model for Multimodal Semantic Segmentation [63.31007867379312]
We introduce U3M: An Unbiased Multiscale Modal Fusion Model for Multimodal Semantic Segmentation.
We employ feature fusion at multiple scales to ensure the effective extraction and integration of both global and local features.
Experimental results demonstrate that our approach achieves superior performance across multiple datasets.
arXiv Detail & Related papers (2024-05-24T08:58:48Z)
- The Effectiveness of a Simplified Model Structure for Crowd Counting [11.640020969258101]
This paper discusses how to construct high-performance crowd counting models using only simple structures.
We propose the Fuss-Free Network (FFNet), which is characterized by its simple and efficient structure, consisting of only a backbone network and a multi-scale feature fusion structure.
Our proposed crowd counting model is trained and evaluated on four widely used public datasets, and it achieves accuracy that is comparable to that of existing complex models.
arXiv Detail & Related papers (2024-04-11T15:42:53Z)
- CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion [58.15403987979496]
CREMA is a generalizable, highly efficient, and modular modality-fusion framework for video reasoning.
We propose a novel progressive multimodal fusion design supported by a lightweight fusion module and modality-sequential training strategy.
We validate our method on 7 video-language reasoning tasks assisted by diverse modalities, including VideoQA and Video-Audio/3D/Touch/Thermal QA.
arXiv Detail & Related papers (2024-02-08T18:27:22Z)
- Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks.
Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment.
We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z)
- Unifying Voxel-based Representation with Transformer for 3D Object Detection [143.91910747605107]
We present a unified framework for multi-modality 3D object detection, named UVTR.
The proposed method aims to unify multi-modality representations in the voxel space for accurate and robust single- or cross-modality 3D detection.
UVTR achieves leading performance in the nuScenes test set with 69.7%, 55.1%, and 71.1% NDS for LiDAR, camera, and multi-modality inputs, respectively.
arXiv Detail & Related papers (2022-06-01T17:02:40Z)
- Multi-scale and Cross-scale Contrastive Learning for Semantic Segmentation [5.281694565226513]
We apply contrastive learning to enhance the discriminative power of the multi-scale features extracted by semantic segmentation networks.
By first mapping the encoder's multi-scale representations to a common feature space, we instantiate a novel form of supervised local-global constraint.
arXiv Detail & Related papers (2022-03-25T01:24:24Z)
- Mapping the Internet: Modelling Entity Interactions in Complex Heterogeneous Networks [0.0]
We propose a versatile, unified framework called HMill for sample representation, model definition, and training.
We show an extension of the universal approximation theorem to the set of all functions realized by models implemented in the framework.
We solve three different problems from the cybersecurity domain using the framework.
arXiv Detail & Related papers (2021-04-19T21:32:44Z)
- StackGenVis: Alignment of Data, Algorithms, and Models for Stacking Ensemble Learning Using Performance Metrics [4.237343083490243]
In machine learning (ML), ensemble methods such as bagging, boosting, and stacking are widely established approaches.
StackGenVis is a visual analytics system for stacked generalization.
arXiv Detail & Related papers (2020-05-04T15:43:55Z)
- ResNeSt: Split-Attention Networks [86.25490825631763]
We present a modularized architecture, which applies the channel-wise attention on different network branches to leverage their success in capturing cross-feature interactions and learning diverse representations.
Our model, named ResNeSt, outperforms EfficientNet in accuracy and latency trade-off on image classification.
arXiv Detail & Related papers (2020-04-19T20:40:31Z)