TAROT: A Hierarchical Framework with Multitask Co-Pretraining on
Semi-Structured Data towards Effective Person-Job Fit
- URL: http://arxiv.org/abs/2401.07525v2
- Date: Wed, 17 Jan 2024 23:06:15 GMT
- Title: TAROT: A Hierarchical Framework with Multitask Co-Pretraining on
Semi-Structured Data towards Effective Person-Job Fit
- Authors: Yihan Cao, Xu Chen, Lun Du, Hao Chen, Qiang Fu, Shi Han, Yushu Du,
Yanbin Kang, Guangming Lu, Zi Li
- Abstract summary: We propose TAROT, a hierarchical multitask co-pretraining framework, to better utilize structural and semantic information for informative text embeddings.
TAROT targets semi-structured text in profiles and jobs, and it is co-pretrained with multi-grained pretraining tasks that constrain the semantic information acquired at each level.
- Score: 60.31175803899285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Person-job fit is an essential part of online recruitment platforms in
serving various downstream applications like Job Search and Candidate
Recommendation. Recently, pretrained large language models have further
enhanced effectiveness by leveraging richer textual information in user
profiles and job descriptions, in addition to user behavior features and job
metadata. However, their general-domain design struggles to capture the
unique structural information within user profiles and job descriptions,
leading to a loss of latent semantic correlations. We propose TAROT, a
hierarchical multitask co-pretraining framework, to better utilize structural
and semantic information for informative text embeddings. TAROT targets
semi-structured text in profiles and jobs, and it is co-pretrained with
multi-grained pretraining tasks that constrain the semantic information acquired
at each level. Experiments on a real-world LinkedIn dataset show significant
performance improvements, proving its effectiveness in person-job fit tasks.
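Since the abstract stays high level, here is a minimal, hypothetical PyTorch sketch of what hierarchical multitask co-pretraining over semi-structured profile and job text could look like. The section layout, model sizes, and the two objectives (within-section masked-token prediction plus profile-job contrastive alignment) are illustrative assumptions, not the paper's exact pretraining tasks.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalEncoder(nn.Module):
    """Token-level encoding within each section, then section-level encoding."""
    def __init__(self, vocab_size=30522, dim=128, heads=4):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.token_enc = nn.TransformerEncoder(layer, num_layers=2)    # low level
        self.section_enc = nn.TransformerEncoder(layer, num_layers=2)  # high level
        self.mlm_head = nn.Linear(dim, vocab_size)

    def forward(self, sections):
        # sections: (batch, n_sections, seq_len) integer token ids
        b, s, t = sections.shape
        tok = self.token_enc(self.tok_emb(sections.view(b * s, t)))
        tok_logits = self.mlm_head(tok).view(b, s, t, -1)     # token-level head
        sec_emb = tok.mean(dim=1).view(b, s, -1)              # tokens -> section vectors
        doc_emb = self.section_enc(sec_emb).mean(dim=1)       # sections -> document vector
        return tok_logits, doc_emb

def co_pretraining_loss(model, masked_profile, mask_labels, job):
    # Multi-grained objectives: (1) masked-token prediction inside sections
    # constrains low-level semantics; (2) in-batch contrastive alignment of
    # matching profile/job pairs constrains document-level semantics. For
    # brevity, the masked input also yields the profile embedding here.
    tok_logits, prof_emb = model(masked_profile)
    mlm = F.cross_entropy(tok_logits.flatten(0, 2), mask_labels.flatten(),
                          ignore_index=-100)
    _, job_emb = model(job)
    sim = F.normalize(prof_emb, dim=-1) @ F.normalize(job_emb, dim=-1).T
    contrast = F.cross_entropy(sim / 0.07, torch.arange(sim.size(0)))
    return mlm + contrast

# Toy usage (real inputs would have masked token ids):
model = HierarchicalEncoder()
profile = torch.randint(0, 30522, (2, 3, 16))   # 2 profiles, 3 sections, 16 tokens
labels = torch.full((2, 3, 16), -100)
labels[:, :, 0] = profile[:, :, 0]              # predict the first token of each section
loss = co_pretraining_loss(model, profile, labels, torch.randint(0, 30522, (2, 3, 16)))
```

The point of the hierarchy is that each loss constrains a different granularity: the token head shapes within-section semantics, while the contrastive term shapes the document-level embedding used for matching.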
Related papers
- Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z)
- Instruction Embedding: Latent Representations of Instructions Towards Task Identification [20.327984896070053]
For instruction data, the most important aspect is the task it represents, rather than its specific semantics and knowledge.
In this work, we introduce a new concept, instruction embedding, and construct the Instruction Embedding Benchmark (IEB) for its training and evaluation (see the sketch after this list).
arXiv Detail & Related papers (2024-09-29T12:12:24Z)
- Prompt-based Personality Profiling: Reinforcement Learning for Relevance Filtering [8.20929362102942]
Author profiling is the task of inferring characteristics about individuals by analyzing content they share.
We propose a new method for author profiling which first distinguishes relevant from irrelevant content and then performs the actual user profiling only on the relevant data.
We evaluate our method for Big Five personality trait prediction on two Twitter corpora.
arXiv Detail & Related papers (2024-09-06T08:43:10Z)
- Unified Pretraining for Recommendation via Task Hypergraphs [55.98773629788986]
We propose a novel multitask pretraining framework named Unified Pretraining for Recommendation via Task Hypergraphs.
For a unified learning pattern to handle diverse requirements and nuances of various pretext tasks, we design task hypergraphs to generalize pretext tasks to hyperedge prediction.
A novel transitional attention layer is devised to discriminatively learn the relevance between each pretext task and recommendation.
arXiv Detail & Related papers (2023-10-20T05:33:21Z)
- Generalization with Lossy Affordances: Leveraging Broad Offline Data for Learning Visuomotor Tasks [65.23947618404046]
We introduce a framework that acquires goal-conditioned policies for unseen temporally extended tasks via offline reinforcement learning on broad data.
When faced with a novel task goal, the framework uses an affordance model to plan a sequence of lossy representations as subgoals that decomposes the original task into easier problems.
We show that our framework can be pre-trained on large-scale datasets of robot experiences from prior work and efficiently fine-tuned for novel tasks, entirely from visual inputs without any manual reward engineering.
arXiv Detail & Related papers (2022-10-12T21:46:38Z)
- Leveraging Natural Supervision for Language Representation Learning and Generation [8.083109555490475]
We describe three lines of work that seek to improve the training and evaluation of neural models using naturally-occurring supervision.
We first investigate self-supervised training losses to help enhance the performance of pretrained language models for various NLP tasks.
We propose a framework that uses paraphrase pairs to disentangle semantics and syntax in sentence representations.
arXiv Detail & Related papers (2022-07-21T17:26:03Z)
- ORCA: Interpreting Prompted Language Models via Locating Supporting Data Evidence in the Ocean of Pretraining Data [38.20984369410193]
Large pretrained language models have been performing increasingly well in a variety of downstream tasks via prompting.
It remains unclear from where the model learns the task-specific knowledge, especially in a zero-shot setup.
In this work, we want to find evidence of the model's task-specific competence from pretraining and are specifically interested in locating a very small subset of pretraining data.
arXiv Detail & Related papers (2022-05-25T09:25:06Z)
- Automated Concatenation of Embeddings for Structured Prediction [75.44925576268052]
We propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks.
We follow strategies from reinforcement learning to optimize the parameters of the controller, computing the reward from the accuracy of a task model (see the sketch after this list).
arXiv Detail & Related papers (2020-10-10T14:03:20Z)
- ConCET: Entity-Aware Topic Classification for Open-Domain Conversational Agents [9.870634472479571]
We introduce ConCET: a Concurrent Entity-aware conversational Topic classifier.
We propose a simple and effective method for generating synthetic training data.
We evaluate ConCET on a large dataset of human-machine conversations with real users, collected as part of the Amazon Alexa Prize.
arXiv Detail & Related papers (2020-05-28T06:29:08Z)
- Pre-training Text Representations as Meta Learning [113.3361289756749]
We introduce a learning algorithm which directly optimizes the model's ability to learn text representations for effective learning of downstream tasks.
We show that there is an intrinsic connection between multi-task pre-training and model-agnostic meta-learning with a sequence of meta-train steps (see the sketch after this list).
arXiv Detail & Related papers (2020-04-12T09:05:47Z)
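For the instruction-embedding entry above, a hypothetical sketch of the core idea: embed instructions and group them by the task they express. The encoder (an off-the-shelf sentence-transformers model) and the KMeans clustering are stand-in assumptions, not the paper's IEB training procedure.

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

instructions = [
    "Translate the following sentence into French.",
    "Render this paragraph in French.",
    "Summarize the article in two sentences.",
    "Give a brief two-sentence summary of the text.",
]
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in instruction encoder
emb = encoder.encode(instructions)                 # (n, 384) instruction embeddings
labels = KMeans(n_clusters=2, n_init=10).fit_predict(emb)
print(labels)  # instructions for the same task should share a cluster id
```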
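For the ACE entry, a hypothetical sketch of a reinforcement-learning controller over embedding concatenations: Bernoulli gates sample which candidate embeddings to concatenate, and a REINFORCE-style update pushes the controller toward higher-reward selections. The toy evaluate function stands in for training a task model on the concatenation and measuring its dev accuracy.

```python
import torch

n_candidates = 4                      # e.g. word, char, BERT, ELMo embeddings
logits = torch.zeros(n_candidates, requires_grad=True)  # controller parameters
opt = torch.optim.Adam([logits], lr=0.1)
baseline = 0.0

def evaluate(mask):
    # Stand-in for task-model dev accuracy; this toy reward prefers {0, 2}.
    target = torch.tensor([1., 0., 1., 0.])
    return 1.0 - (mask - target).abs().mean().item()

for step in range(200):
    probs = torch.sigmoid(logits)
    mask = torch.bernoulli(probs)                 # sample a concatenation
    reward = evaluate(mask)
    baseline = 0.9 * baseline + 0.1 * reward      # moving-average baseline
    log_prob = (mask * probs.log() + (1 - mask) * (1 - probs).log()).sum()
    loss = -(reward - baseline) * log_prob        # REINFORCE gradient estimator
    opt.zero_grad(); loss.backward(); opt.step()

print("selection probabilities:", torch.sigmoid(logits).tolist())
```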
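For the pre-training-as-meta-learning entry, a hypothetical first-order (Reptile-style) sketch of multi-task pre-training as a sequence of meta-train steps: adapt a copy of the shared model to one sampled task, then move the shared initialization toward the adapted weights. The toy sine-regression tasks and hyperparameters are illustrative assumptions, not the paper's algorithm.

```python
import copy
import torch
import torch.nn as nn

meta_model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))

def sample_task():
    # A toy "task": regress a random sine wave, standing in for a pretext task.
    a, p = torch.rand(1) * 4 + 1, torch.rand(1) * 3
    x = torch.rand(16, 1) * 10 - 5
    return x, a * torch.sin(x + p)

for step in range(1000):
    model = copy.deepcopy(meta_model)             # inner loop on a fresh copy
    inner_opt = torch.optim.SGD(model.parameters(), lr=0.02)
    x, y = sample_task()
    for _ in range(5):                            # k inner meta-train steps
        inner_opt.zero_grad()
        nn.functional.mse_loss(model(x), y).backward()
        inner_opt.step()
    with torch.no_grad():                         # meta step toward adapted weights
        for mp, p in zip(meta_model.parameters(), model.parameters()):
            mp += 0.1 * (p - mp)
```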
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.