URL-BERT: Training Webpage Representations via Social Media Engagements
- URL: http://arxiv.org/abs/2310.16303v1
- Date: Wed, 25 Oct 2023 02:22:50 GMT
- Title: URL-BERT: Training Webpage Representations via Social Media Engagements
- Authors: Ayesha Qamar, Chetan Verma, Ahmed El-Kishky, Sumit Binnani, Sneha
Mehta, Taylor Berg-Kirkpatrick
- Abstract summary: We introduce a new pre-training objective that can be used to adapt LMs to understand URLs and webpages.
Our proposed framework consists of two steps: (1) scalable graph embeddings to learn shallow representations of URLs based on user engagement on social media.
We experimentally demonstrate that our continued pre-training approach improves webpage understanding on a variety of tasks and Twitter internal and external benchmarks.
- Score: 31.6455614291821
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding and representing webpages is crucial to online social networks
where users may share and engage with URLs. Common language model (LM) encoders
such as BERT can be used to understand and represent the textual content of
webpages. However, these representations may not model thematic information of
web domains and URLs or accurately capture their appeal to social media users.
In this work, we introduce a new pre-training objective that can be used to
adapt LMs to understand URLs and webpages. Our proposed framework consists of
two steps: (1) scalable graph embeddings to learn shallow representations of
URLs based on user engagement on social media and (2) a contrastive objective
that aligns LM representations with the aforementioned graph-based
representation. We apply our framework to the multilingual version of BERT to
obtain the model URL-BERT. We experimentally demonstrate that our continued
pre-training approach improves webpage understanding on a variety of tasks and
Twitter internal and external benchmarks.
Related papers
- SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization [70.11167263638562]
Social relation reasoning aims to identify relation categories such as friends, spouses, and colleagues from images.
We first present a simple yet well-crafted framework named name, which combines the perception capability of Vision Foundation Models (VFMs) and the reasoning capability of Large Language Models (LLMs) within a modular framework.
arXiv Detail & Related papers (2024-10-28T18:10:26Z) - EDGE: Enhanced Grounded GUI Understanding with Enriched Multi-Granularity Synthetic Data [15.801018643716437]
This paper aims to enhance the GUI understanding and interacting capabilities of large vision-language models (LVLMs) through a data-driven approach.
We propose EDGE, a general data synthesis framework that automatically generates large-scale, multi-granularity training data from webpages across the Web.
Our approach significantly reduces the dependence on manual annotations, empowering researchers to harness the vast public resources available on the Web to advance their work.
arXiv Detail & Related papers (2024-10-25T10:46:17Z) - Representing Web Applications As Knowledge Graphs [0.0]
The proposed method models each node as a structured representation of the application's current state, with edges reflecting user-initiated actions or transitions.
This structured representation enables a more comprehensive and functional understanding of web applications, offering valuable insights for downstream tasks such as automated testing and behavior analysis.
arXiv Detail & Related papers (2024-10-06T02:50:41Z) - SocialQuotes: Learning Contextual Roles of Social Media Quotes on the Web [9.130915550141337]
We liken social media embeddings to quotes, formalize the page context as structured natural language signals, and identify a taxonomy of roles for quotes within the page context.
We release SocialQuotes, a new data set built from the Common Crawl of over 32 million social quotes, 8.3k of them with crowdsourced quote annotations.
arXiv Detail & Related papers (2024-07-22T19:21:01Z) - AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization [57.34659640776723]
We propose an end-to-end framework named AddressCLIP to solve the problem with more semantics.
We have built three datasets from Pittsburgh and San Francisco on different scales specifically for the IAL problem.
arXiv Detail & Related papers (2024-07-11T03:18:53Z) - Text-Video Retrieval with Global-Local Semantic Consistent Learning [122.15339128463715]
We propose a simple yet effective method, Global-Local Semantic Consistent Learning (GLSCL)
GLSCL capitalizes on latent shared semantics across modalities for text-video retrieval.
Our method achieves comparable performance with SOTA as well as being nearly 220 times faster in terms of computational cost.
arXiv Detail & Related papers (2024-05-21T11:59:36Z) - SoMeR: Multi-View User Representation Learning for Social Media [1.7949335303516192]
We propose SoMeR, a Social Media user representation learning framework that incorporates temporal activities, text content, profile information, and network interactions to learn comprehensive user portraits.
SoMeR encodes user post streams as sequences of timestamped textual features, uses transformers to embed this along with profile data, and jointly trains with link prediction and contrastive learning objectives.
We demonstrate SoMeR's versatility through two applications: 1) Identifying inauthentic accounts involved in coordinated influence operations by detecting users posting similar content simultaneously, and 2) Measuring increased polarization in online discussions after major events by quantifying how users with different beliefs moved farther apart
arXiv Detail & Related papers (2024-05-02T22:26:55Z) - URLBERT:A Contrastive and Adversarial Pre-trained Model for URL
Classification [10.562100395816595]
URLs play a crucial role in understanding and categorizing web content.
This paper introduces URLBERT, the first pre-trained representation learning model applied to a variety of URL classification or detection tasks.
arXiv Detail & Related papers (2024-02-18T07:51:20Z) - FLIP: Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction [49.510163437116645]
Click-through rate (CTR) prediction plays as a core function module in personalized online services.
Traditional ID-based models for CTR prediction take as inputs the one-hot encoded ID features of tabular modality.
Pretrained Language Models(PLMs) has given rise to another paradigm, which takes as inputs the sentences of textual modality.
We propose to conduct Fine-grained feature-level ALignment between ID-based Models and Pretrained Language Models(FLIP) for CTR prediction.
arXiv Detail & Related papers (2023-10-30T11:25:03Z) - Pix2Struct: Screenshot Parsing as Pretraining for Visual Language
Understanding [58.70423899829642]
We present Pix2Struct, a pretrained image-to-text model for purely visual language understanding.
We show that a single pretrained model can achieve state-of-the-art results in six out of nine tasks across four domains.
arXiv Detail & Related papers (2022-10-07T06:42:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.