Words as Bridges: Exploring Computational Support for Cross-Disciplinary Translation Work
- URL: http://arxiv.org/abs/2503.18471v1
- Date: Mon, 24 Mar 2025 09:19:29 GMT
- Title: Words as Bridges: Exploring Computational Support for Cross-Disciplinary Translation Work
- Authors: Calvin Bao, Yow-Ting Shiue, Marine Carpuat, Joel Chan,
- Abstract summary: We develop a prototype cross-domain search engine that uses aligned domain-specific embeddings to support conceptual exploration.<n>We discuss qualitative insights into the promises and pitfalls of this approach to translation work.
- Score: 13.058644000074654
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Scholars often explore literature outside of their home community of study. This exploration process is frequently hampered by field-specific jargon. Past computational work often focuses on supporting translation work by removing jargon through simplification and summarization; here, we explore a different approach that preserves jargon as useful bridges to new conceptual spaces. Specifically, we cast different scholarly domains as different language-using communities, and explore how to adapt techniques from unsupervised cross-lingual alignment of word embeddings to explore conceptual alignments between domain-specific word embedding spaces.We developed a prototype cross-domain search engine that uses aligned domain-specific embeddings to support conceptual exploration, and tested this prototype in two case studies. We discuss qualitative insights into the promises and pitfalls of this approach to translation work, and suggest design insights for future interfaces that provide computational support for cross-domain information seeking.
Related papers
- LexGen: Domain-aware Multilingual Lexicon Generation [40.97738267067852]
We propose a new model to generate dictionary words for 6 Indian languages in the multi-domain setting.
Our model consists of domain-specific and domain-generic layers that encode information.
We release a new benchmark dataset across 6 Indian languages that span 8 diverse domains.
arXiv Detail & Related papers (2024-05-18T07:02:43Z) - Understanding Cross-Lingual Alignment -- A Survey [52.572071017877704]
Cross-lingual alignment is the meaningful similarity of representations across languages in multilingual language models.
We survey the literature of techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field.
arXiv Detail & Related papers (2024-04-09T11:39:53Z) - OV-VG: A Benchmark for Open-Vocabulary Visual Grounding [33.02137080950678]
This research endeavor introduces novel and challenging open-vocabulary visual tasks.
The overarching aim is to establish connections between language descriptions and the localization of novel objects.
We have curated a benchmark, encompassing 7,272 OV-VG images and 1,000 OV-PL images.
arXiv Detail & Related papers (2023-10-22T17:54:53Z) - A Recent Survey of Heterogeneous Transfer Learning [15.830786437956144]
heterogeneous transfer learning has become a vital strategy in various tasks.
We offer an extensive review of over 60 HTL methods, covering both data-based and model-based approaches.
We explore applications in natural language processing, computer vision, multimodal learning, and biomedicine.
arXiv Detail & Related papers (2023-10-12T16:19:58Z) - Towards Open Vocabulary Learning: A Survey [146.90188069113213]
Deep neural networks have made impressive advancements in various core tasks like segmentation, tracking, and detection.
Recently, open vocabulary settings were proposed due to the rapid progress of vision language pre-training.
This paper provides a thorough review of open vocabulary learning, summarizing and analyzing recent developments in the field.
arXiv Detail & Related papers (2023-06-28T02:33:06Z) - Soft Prompt Guided Joint Learning for Cross-Domain Sentiment Analysis [26.974822569543786]
We propose a soft prompt-based joint learning method for cross domain aspect term extraction.
By incorporating external linguistic features, the proposed method learn domain-invariant representations between source and target domains.
Experiments are conducted on the benchmark datasets and the experimental results demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-03-01T20:33:37Z) - Robust Unsupervised Cross-Lingual Word Embedding using Domain Flow
Interpolation [48.32604585839687]
Previous adversarial approaches have shown promising results in inducing cross-lingual word embedding without parallel data.
We propose to make use of a sequence of intermediate spaces for smooth bridging.
arXiv Detail & Related papers (2022-10-07T04:37:47Z) - Revise and Resubmit: An Intertextual Model of Text-based Collaboration
in Peer Review [52.359007622096684]
Peer review is a key component of the publishing process in most fields of science.
Existing NLP studies focus on the analysis of individual texts.
editorial assistance often requires modeling interactions between pairs of texts.
arXiv Detail & Related papers (2022-04-22T16:39:38Z) - Structured Latent Embeddings for Recognizing Unseen Classes in Unseen
Domains [108.11746235308046]
We propose a novel approach that learns domain-agnostic structured latent embeddings by projecting images from different domains.
Our experiments on the challenging DomainNet and DomainNet-LS benchmarks show the superiority of our approach over existing methods.
arXiv Detail & Related papers (2021-07-12T17:57:46Z) - Unsupervised Word Translation Pairing using Refinement based Point Set
Registration [8.568050813210823]
Cross-lingual alignment of word embeddings play an important role in knowledge transfer across languages.
Current unsupervised approaches rely on similarities in geometric structure of word embedding spaces across languages.
This paper proposes BioSpere, a novel framework for unsupervised mapping of bi-lingual word embeddings onto a shared vector space.
arXiv Detail & Related papers (2020-11-26T09:51:29Z) - Positioning yourself in the maze of Neural Text Generation: A
Task-Agnostic Survey [54.34370423151014]
This paper surveys the components of modeling approaches relaying task impacts across various generation tasks such as storytelling, summarization, translation etc.
We present an abstraction of the imperative techniques with respect to learning paradigms, pretraining, modeling approaches, decoding and the key challenges outstanding in the field in each of them.
arXiv Detail & Related papers (2020-10-14T17:54:42Z) - A Common Semantic Space for Monolingual and Cross-Lingual
Meta-Embeddings [10.871587311621974]
This paper presents a new technique for creating monolingual and cross-lingual meta-embeddings.
Existing word vectors are projected to a common semantic space using linear transformations and averaging.
The resulting cross-lingual meta-embeddings also exhibit excellent cross-lingual transfer learning capabilities.
arXiv Detail & Related papers (2020-01-17T15:42:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.