Related papers: AI-Driven Decision-Making System for Hiring Process

URL: http://arxiv.org/abs/2512.20652v1
Date: Wed, 17 Dec 2025 18:45:17 GMT
Title: AI-Driven Decision-Making System for Hiring Process
Authors: Vira Filatova, Andrii Zelenchuk, Dmytro Filatov
Abstract summary: This paper presents an AI-driven, modular multi-agent hiring assistant. It integrates (i) document and video preprocessing, (ii) structured candidate profile construction, (iii) public-data verification, (iv) technical/culture-fit scoring with explicit risk penalties, and (v) human-in-the-loop validation via an interactive interface.
Score: 0.0
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Early-stage candidate validation is a major bottleneck in hiring, because recruiters must reconcile heterogeneous inputs (resumes, screening answers, code assignments, and limited public evidence). This paper presents an AI-driven, modular multi-agent hiring assistant that integrates (i) document and video preprocessing, (ii) structured candidate profile construction, (iii) public-data verification, (iv) technical/culture-fit scoring with explicit risk penalties, and (v) human-in-the-loop validation via an interactive interface. The pipeline is orchestrated by an LLM under strict constraints to reduce output variability and to generate traceable component-level rationales. Candidate ranking is computed by a configurable aggregation of technical fit, culture fit, and normalized risk penalties. The system is evaluated on 64 real applicants for a mid-level Python backend engineer role, using an experienced recruiter as the reference baseline and a second, less experienced recruiter for additional comparison. Alongside precision/recall, we propose an efficiency metric measuring expected time per qualified candidate. In this study, the system improves throughput and achieves 1.70 hours per qualified candidate versus 3.33 hours for the experienced recruiter, with substantially lower estimated screening cost, while preserving a human decision-maker as the final authority.
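
The abstract describes candidate ranking as a configurable aggregation of technical fit, culture fit, and normalized risk penalties, and an efficiency metric based on time per qualified candidate. The following is a minimal sketch of what such an aggregation might look like, not the authors' implementation. The linear form, the default weights, the [0, 1] score ranges, and all names (Candidate, aggregate_score, rank_candidates, hours_per_qualified) are assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical candidate record; all scores are assumed to lie in [0, 1]."""
    name: str
    technical_fit: float
    culture_fit: float
    risk_penalty: float  # assumed to be already normalized


def aggregate_score(c, w_tech=0.6, w_culture=0.4, w_risk=0.3):
    """Assumed linear aggregation: weighted fit scores minus a weighted risk penalty."""
    return (w_tech * c.technical_fit
            + w_culture * c.culture_fit
            - w_risk * c.risk_penalty)


def rank_candidates(candidates, **weights):
    """Return candidates sorted from highest to lowest aggregate score."""
    return sorted(candidates,
                  key=lambda c: aggregate_score(c, **weights),
                  reverse=True)


def hours_per_qualified(total_hours, qualified_count):
    """Assumed reading of the efficiency metric: screening hours divided by
    the number of qualified candidates found."""
    if qualified_count <= 0:
        raise ValueError("qualified_count must be positive")
    return total_hours / qualified_count


if __name__ == "__main__":
    pool = [
        Candidate("A", technical_fit=0.9, culture_fit=0.5, risk_penalty=0.8),
        Candidate("B", technical_fit=0.7, culture_fit=0.8, risk_penalty=0.1),
    ]
    for c in rank_candidates(pool):
        print(c.name, round(aggregate_score(c), 3))
```

Keeping the weights as keyword arguments mirrors the "configurable" aspect mentioned in the abstract: a recruiter could, for example, raise w_risk to penalize unverified claims more heavily without changing the rest of the pipeline.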

Related papers

Scaling Agentic Verifier for Competitive Coding [66.11758166379092]
Large language models (LLMs) have demonstrated strong coding capabilities but still struggle to solve competitive programming problems correctly in a single attempt. Execution-based re-ranking offers a promising test-time scaling strategy, yet existing methods are constrained by either difficult test case generation or inefficient random input sampling. We propose Agentic Verifier, an execution-based agent that actively reasons about program behaviors and searches for highly discriminative test inputs.
arXiv Detail & Related papers (2026-02-04T06:30:40Z)
When LLM meets Fuzzy-TOPSIS for Personnel Selection through Automated Profile Analysis [0.5949779668853556]
This study presents an automated personnel selection system that utilizes sophisticated natural language processing (NLP) methods to assess and rank software engineering applicants. A distinctive dataset was created by aggregating LinkedIn profiles that include essential features such as education, work experience, abilities, and self-introduction. For candidate ranking, the DistilRoBERTa model was fine-tuned and integrated with the fuzzy TOPSIS method, achieving rankings closely aligned with human expert evaluations.
arXiv Detail & Related papers (2026-01-30T00:57:35Z)
AgentIF-OneDay: A Task-level Instruction-Following Benchmark for General AI Agents in Daily Scenarios [49.90735676070039]
The capacity of AI agents to effectively handle tasks of increasing duration and complexity continues to grow. We argue that current evaluations prioritize increasing task difficulty without sufficiently addressing the diversity of agentic tasks. We propose AgentIF-OneDay, aimed at determining whether general users can utilize natural language instructions and AI agents to complete a diverse array of daily tasks.
arXiv Detail & Related papers (2026-01-28T13:49:18Z)
SelfAI: Building a Self-Training AI System with LLM Agents [79.10991818561907]
SelfAI is a general multi-agent platform that combines a User Agent for translating high-level research objectives into standardized experimental configurations. An Experiment Manager orchestrates parallel, fault-tolerant training across heterogeneous hardware while maintaining a structured knowledge base for continuous feedback. Across regression, computer vision, scientific computing, medical imaging, and drug discovery benchmarks, SelfAI consistently achieves strong performance and reduces redundant trials.
arXiv Detail & Related papers (2025-11-29T09:18:39Z)
MLAR: Multi-layer Large Language Model-based Robotic Process Automation Applicant Tracking [0.0]
This paper introduces an innovative Applicant Tracking System (ATS) enhanced by a novel Robotic Process Automation (RPA) framework, referred to as MLAR. MLAR addresses these challenges by employing Large Language Models (LLMs) in three distinct layers: extracting key characteristics from job postings in the first layer, parsing applicant resumes to identify education, experience, and skills in the second layer, and similarity matching in the third layer. Our approach integrates seamlessly into existing RPA pipelines, automating resume parsing, job matching, and candidate notifications.
arXiv Detail & Related papers (2025-07-14T16:53:19Z)
AI Agents-as-Judge: Automated Assessment of Accuracy, Consistency, Completeness and Clarity for Enterprise Documents [0.0]
This study presents a modular, multi-agent system for the automated review of highly structured enterprise business documents using AI agents. It uses modern orchestration tools such as LangChain, CrewAI, TruLens, and Guidance to enable section-by-section evaluation of documents. It achieves 99% information consistency (vs. 92% for humans), halving error and bias rates, and reducing average review time from 30 to 2.5 minutes per document.
arXiv Detail & Related papers (2025-06-23T17:46:15Z)
AI Hiring with LLMs: A Context-Aware and Explainable Multi-Agent Framework for Resume Screening [12.845918958645676]
We propose a multi-agent framework for resume screening using Large Language Models (LLMs). The framework consists of four core agents, including a resume extractor, an evaluator, a summarizer, and a score formatter. This dynamic adaptation enables personalized recruitment, bridging the gap between AI automation and talent acquisition.
arXiv Detail & Related papers (2025-04-01T12:56:39Z)
CritiQ: Mining Data Quality Criteria from Human Preferences [91.44025907584931]
We introduce CritiQ, a novel data selection method that automatically mines criteria from human preferences for data quality. CritiQ Flow employs a manager agent to evolve quality criteria and worker agents to make pairwise judgments. We demonstrate the effectiveness of our method in the code, math, and logic domains.
arXiv Detail & Related papers (2025-02-26T16:33:41Z)
Prompt Tuning as User Inherent Profile Inference Machine [68.16976932088708]
We propose UserIP-Tuning, which uses prompt-tuning to infer user profiles. UserIP-Tuning outperforms state-of-the-art recommendation algorithms. The presented solution has been deployed in Huawei AppGallery's Explore page since May 2025.
arXiv Detail & Related papers (2024-08-13T02:25:46Z)
On Speeding Up Language Model Evaluation [48.51924035873411]
We propose an adaptive approach to explore this space. We lean on multi-armed bandits to sequentially identify the next (method, validation sample)-pair to evaluate. We show that it can identify the top-performing method using only 5-15% of the typical resources.
arXiv Detail & Related papers (2024-07-08T17:48:42Z)
Fairness in AI-Driven Recruitment: Challenges, Metrics, Methods, and Future Directions [0.0]
The recruitment process significantly impacts an organization's performance, productivity, and culture. This paper systematically reviews biases identified in AI-driven recruitment systems, categorizes fairness metrics and bias mitigation techniques, and highlights auditing approaches used in practice.
arXiv Detail & Related papers (2024-05-30T05:25:14Z)
Query Performance Prediction using Relevance Judgments Generated by Large Language Models [53.97064615557883]
We propose a new query performance prediction (QPP) framework using automatically generated relevance judgments (QPP-GenRE). QPP-GenRE decomposes QPP into independent subtasks of predicting the relevance of each item in a ranked list to a given query. We predict an item's relevance by using open-source large language models (LLMs) to ensure scientific reproducibility.
arXiv Detail & Related papers (2024-04-01T09:33:05Z)
Bias in Multimodal AI: Testbed for Fair Automatic Recruitment [73.85525896663371]
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data. We train automatic recruitment algorithms using a set of multimodal synthetic profiles consciously scored with gender and racial biases. Our methodology and results show how to generate fairer AI-based tools in general, and in particular fairer automated recruitment systems.
arXiv Detail & Related papers (2020-04-15T15:58:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.