Unified Work Embeddings: Contrastive Learning of a Bidirectional Multi-task Ranker
- URL: http://arxiv.org/abs/2511.07969v1
- Date: Wed, 12 Nov 2025 01:31:37 GMT
- Title: Unified Work Embeddings: Contrastive Learning of a Bidirectional Multi-task Ranker
- Authors: Matthias De Lange, Jens-Joris Decorte, Jeroen Van Hautte
- Abstract summary: We introduce WorkBench, the first unified evaluation suite spanning six work-related tasks formulated explicitly as ranking problems. We use this insight to compose task-specific bipartite graphs from real-world data, synthetically enriched through grounding. This leads to Unified Work Embeddings (UWE), a task-agnostic bi-encoder that exploits our training-data structure with a many-to-many InfoNCE objective.
- Score: 3.4204762278595346
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Workforce transformation across diverse industries has driven an increased demand for specialized natural language processing capabilities. Nevertheless, tasks derived from work-related contexts inherently reflect real-world complexities, characterized by long-tailed distributions, extreme multi-label target spaces, and scarce data availability. The rise of generalist embedding models prompts the question of their performance in the work domain, especially as progress in the field has focused mainly on individual tasks. To this end, we introduce WorkBench, the first unified evaluation suite spanning six work-related tasks formulated explicitly as ranking problems, establishing a common ground for multi-task progress. Based on this benchmark, we find significant positive cross-task transfer, and use this insight to compose task-specific bipartite graphs from real-world data, synthetically enriched through grounding. This leads to Unified Work Embeddings (UWE), a task-agnostic bi-encoder that exploits our training-data structure with a many-to-many InfoNCE objective, and leverages token-level embeddings with task-agnostic soft late interaction. UWE demonstrates zero-shot ranking performance on unseen target spaces in the work domain, enables low-latency inference by caching the task target space embeddings, and shows significant gains in macro-averaged MAP and RP@10 over generalist embedding models.
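The abstract describes a many-to-many InfoNCE objective: unlike standard one-to-one contrastive learning, each query in the bipartite training graph may have several positive targets, and vice versa. A minimal sketch of such a loss, assuming precomputed L2-normalized query and target embeddings and a binary match matrix (all names and details here are illustrative, not the paper's actual implementation):

```python
import numpy as np

def many_to_many_infonce(q, t, match, tau=0.07):
    """Contrastive loss over a bipartite graph where each query may have
    several positive targets (a sketch, not the paper's implementation).

    q:     (n_q, d) L2-normalized query embeddings
    t:     (n_t, d) L2-normalized target embeddings
    match: (n_q, n_t) binary matrix, 1 where (query, target) is an edge
    """
    logits = q @ t.T / tau  # cosine similarities scaled by temperature
    # log-softmax over all targets for each query
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # average the negative log-likelihood over each query's positive set
    pos_counts = match.sum(axis=1)
    loss_per_query = -(match * log_probs).sum(axis=1) / np.maximum(pos_counts, 1)
    return loss_per_query[pos_counts > 0].mean()

# toy example: 3 queries, 4 targets, multiple positives per query
rng = np.random.default_rng(0)
q = rng.normal(size=(3, 8)); q /= np.linalg.norm(q, axis=1, keepdims=True)
t = rng.normal(size=(4, 8)); t /= np.linalg.norm(t, axis=1, keepdims=True)
match = np.array([[1, 1, 0, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]])
loss = many_to_many_infonce(q, t, match)
print(float(loss))
```

Because each task's target space is fixed, the target embeddings `t` can be computed once and cached at inference time, which is how the abstract motivates UWE's low-latency ranking.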
Related papers
- Task Prototype-Based Knowledge Retrieval for Multi-Task Learning from Partially Annotated Data [38.55691652000724]
Multi-task learning (MTL) is critical in real-world applications such as autonomous driving and robotics. Existing methods for partially labeled MTL typically rely on predictions from unlabeled tasks. We propose a prototype-based knowledge retrieval framework that achieves robust MTL instead of relying on predictions from unlabeled tasks.
arXiv Detail & Related papers (2026-01-12T12:27:02Z) - Tracking and Segmenting Anything in Any Modality [75.32774085793498]
We propose a universal tracking and segmentation framework named SATA, which unifies a broad spectrum of tracking and segmentation subtasks with any modality input. SATA demonstrates superior performance on 18 challenging tracking and segmentation benchmarks, offering a novel perspective for more generalizable video understanding.
arXiv Detail & Related papers (2025-11-22T09:09:22Z) - Multi-Task Label Discovery via Hierarchical Task Tokens for Partially Annotated Dense Predictions [44.78165979575075]
We propose a novel approach to optimize a set of compact learnable hierarchical task tokens. The global task tokens are designed for effective cross-task feature interactions in a global context. A group of fine-grained task-specific spatial tokens for each task is learned from the corresponding global task tokens. The learned global and local fine-grained task tokens are further used to discover pseudo task-specific dense labels at different levels of granularity.
arXiv Detail & Related papers (2024-11-27T23:53:27Z) - Joint-Task Regularization for Partially Labeled Multi-Task Learning [30.823282043129552]
Multi-task learning has become increasingly popular in the machine learning field, but its practicality is hindered by the need for large, labeled datasets.
We propose Joint-Task Regularization (JTR), an intuitive technique which leverages cross-task relations to simultaneously regularize all tasks in a single joint-task latent space.
arXiv Detail & Related papers (2024-04-02T14:16:59Z) - Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework, where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks with little, or non-overlapping annotations.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z) - BridgeNet: Comprehensive and Effective Feature Interactions via Bridge Feature for Multi-task Dense Predictions [29.049866510120093]
Multi-task dense prediction aims at handling multiple pixel-wise prediction tasks within a unified network simultaneously for visual scene understanding.
To tackle these under-explored issues, we propose a novel BridgeNet framework, which extracts comprehensive and discriminative intermediate Bridge Features.
To the best of our knowledge, this is the first work considering the completeness and quality of feature participants in cross-task interactions.
arXiv Detail & Related papers (2023-12-21T01:30:44Z) - Disentangled Latent Spaces Facilitate Data-Driven Auxiliary Learning [9.571499333904969]
Auxiliary tasks facilitate learning in situations where data is scarce or the principal task of interest is extremely complex. We propose a novel framework, dubbed Detaux, whereby a weakly supervised disentanglement procedure is used to discover a new unrelated auxiliary classification task. The disentanglement procedure works at the representation level, isolating the variation related to the principal task into an isolated subspace.
arXiv Detail & Related papers (2023-10-13T17:40:39Z) - Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding [52.723297744257536]
Pre-trained language models (LMs) have shown effectiveness in scientific literature understanding tasks.
We propose a multi-task contrastive learning framework, SciMult, to facilitate common knowledge sharing across different literature understanding tasks.
arXiv Detail & Related papers (2023-05-23T16:47:22Z) - Leveraging sparse and shared feature activations for disentangled representation learning [112.22699167017471]
We propose to leverage knowledge extracted from a diversified set of supervised tasks to learn a common disentangled representation.
We validate our approach on six real world distribution shift benchmarks, and different data modalities.
arXiv Detail & Related papers (2023-04-17T01:33:24Z) - Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z) - Distribution Matching for Heterogeneous Multi-Task Learning: a Large-scale Face Study [75.42182503265056]
Multi-Task Learning has emerged as a methodology in which multiple tasks are jointly learned by a shared learning algorithm.
We deal with heterogeneous MTL, simultaneously addressing detection, classification & regression problems.
We build FaceBehaviorNet, the first framework for large-scale face analysis, by jointly learning all facial behavior tasks.
arXiv Detail & Related papers (2021-05-08T22:26:52Z) - Learning Task-oriented Disentangled Representations for Unsupervised Domain Adaptation [165.61511788237485]
Unsupervised domain adaptation (UDA) aims to address the domain-shift problem between a labeled source domain and an unlabeled target domain.
We propose a dynamic task-oriented disentangling network (DTDN) to learn disentangled representations in an end-to-end fashion for UDA.
arXiv Detail & Related papers (2020-07-27T01:21:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.