Interdisciplinary Fairness in Imbalanced Research Proposal Topic Inference: A Hierarchical Transformer-based Method with Selective Interpolation
- URL: http://arxiv.org/abs/2309.01717v3
- Date: Tue, 4 Jun 2024 02:01:18 GMT
- Title: Interdisciplinary Fairness in Imbalanced Research Proposal Topic Inference: A Hierarchical Transformer-based Method with Selective Interpolation
- Authors: Meng Xiao, Min Wu, Ziyue Qiao, Yanjie Fu, Zhiyuan Ning, Yi Du, Yuanchun Zhou
- Abstract summary: Automated topic inference can reduce human errors caused by manual topic filling, bridge the knowledge gap between funding agencies and project applicants, and improve system efficiency.
Existing methods overlook the gap in scale between interdisciplinary research proposals and non-interdisciplinary ones, leading to an unjust phenomenon.
In this paper, we implement a topic label inference system based on a Transformer encoder-decoder architecture.
- Score: 26.30701957043284
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The objective of topic inference in research proposals aims to obtain the most suitable disciplinary division from the discipline system defined by a funding agency. The agency will subsequently find appropriate peer review experts from their database based on this division. Automated topic inference can reduce human errors caused by manual topic filling, bridge the knowledge gap between funding agencies and project applicants, and improve system efficiency. Existing methods focus on modeling this as a hierarchical multi-label classification problem, using generative models to iteratively infer the most appropriate topic information. However, these methods overlook the gap in scale between interdisciplinary research proposals and non-interdisciplinary ones, leading to an unjust phenomenon where the automated inference system categorizes interdisciplinary proposals as non-interdisciplinary, causing unfairness during the expert assignment. How can we address this data imbalance issue under a complex discipline system and hence resolve this unfairness? In this paper, we implement a topic label inference system based on a Transformer encoder-decoder architecture. Furthermore, we utilize interpolation techniques to create a series of pseudo-interdisciplinary proposals from non-interdisciplinary ones during training based on non-parametric indicators such as cross-topic probabilities and topic occurrence probabilities. This approach aims to reduce the bias of the system during model training. Finally, we conduct extensive experiments on a real-world dataset to verify the effectiveness of the proposed method. The experimental results demonstrate that our training strategy can significantly mitigate the unfairness generated in the topic inference task.
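The interpolation step described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the function names (`cross_topic_probs`, `interpolate`, `make_pseudo_interdisciplinary`), the flat feature vectors, the fixed mixing weight `lam`, and the toy topics are all assumptions made for illustration. It uses only a cross-topic probability estimated from label co-occurrence and leaves out the topic occurrence probability that the abstract also mentions.

```python
import random
from collections import Counter


def cross_topic_probs(label_sets):
    """Estimate P(b | a): the fraction of proposals with topic a that also carry topic b."""
    pair_counts, topic_counts = Counter(), Counter()
    for labels in label_sets:
        for a in labels:
            topic_counts[a] += 1
            for b in labels:
                if a != b:
                    pair_counts[(a, b)] += 1
    return {pair: n / topic_counts[pair[0]] for pair, n in pair_counts.items()}


def interpolate(x_a, x_b, lam):
    """Convex combination of two equal-length feature vectors."""
    return [lam * u + (1.0 - lam) * v for u, v in zip(x_a, x_b)]


def make_pseudo_interdisciplinary(samples, probs, rng, lam=0.5):
    """Mix single-topic samples into pseudo-interdisciplinary ones.

    samples: list of (features, topic) pairs for non-interdisciplinary proposals.
    The partner topic is drawn in proportion to the cross-topic probability;
    topics that never co-occur with an available topic are skipped (the
    "selective" part of this sketch).
    """
    by_topic = {}
    for feats, topic in samples:
        by_topic.setdefault(topic, []).append(feats)

    pseudo = []
    for feats, topic in samples:
        partners = [(b, p) for (a, b), p in probs.items()
                    if a == topic and b in by_topic]
        if not partners:
            continue
        topics, weights = zip(*partners)
        partner_topic = rng.choices(topics, weights=weights, k=1)[0]
        partner_feats = rng.choice(by_topic[partner_topic])
        mixed = interpolate(feats, partner_feats, lam)
        soft_label = {topic: lam, partner_topic: 1.0 - lam}
        pseudo.append((mixed, soft_label))
    return pseudo


# Toy data (hypothetical): label sets of observed proposals and
# single-topic training samples with 2-dimensional features.
label_sets = [{"CS", "Bio"}, {"CS", "Bio"}, {"CS", "Math"}, {"Physics"}]
probs = cross_topic_probs(label_sets)
samples = [([1.0, 0.0], "CS"), ([0.0, 1.0], "Bio"), ([0.5, 0.5], "Physics")]
pseudo = make_pseudo_interdisciplinary(samples, probs, random.Random(0))
```

In this toy run the "Physics" sample is left untouched because it never co-occurs with another topic, while the "CS" and "Bio" samples are mixed with each other. A real system would interpolate learned representations inside the model and would likely sample `lam` rather than fix it.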
Related papers
- Reconciling Heterogeneous Effects in Causal Inference [44.99833362998488]
We apply the Reconcile algorithm for model multiplicity in machine learning to reconcile heterogeneous effects in causal inference.
Our results have tangible implications for ensuring fair outcomes in high-stakes domains such as healthcare, insurance, and housing.
arXiv Detail & Related papers (2024-06-05T18:43:46Z)
- Resolving the Imbalance Issue in Hierarchical Disciplinary Topic Inference via LLM-based Data Augmentation [5.98277339029019]
This study leverages large language models (Llama V1) as data generators to augment research proposals categorized within intricate disciplinary hierarchies.
Our experiments attest to the efficacy of the generated data, demonstrating that research proposals produced using the prompts can effectively address the aforementioned issues.
arXiv Detail & Related papers (2023-10-09T00:45:20Z)
- Interactive System-wise Anomaly Detection [66.3766756452743]
Anomaly detection plays a fundamental role in various applications.
It is challenging for existing methods to handle the scenarios where the instances are systems whose characteristics are not readily observed as data.
We develop an end-to-end approach which includes an encoder-decoder module that learns system embeddings.
arXiv Detail & Related papers (2023-04-21T02:20:24Z)
- Combating Exacerbated Heterogeneity for Robust Models in Federated Learning [91.88122934924435]
Combining adversarial training with federated learning can lead to undesired robustness deterioration.
We propose a novel framework called Slack Federated Adversarial Training (SFAT).
We verify the rationality and effectiveness of SFAT on various benchmarked and real-world datasets.
arXiv Detail & Related papers (2023-03-01T06:16:15Z)
- In Search of Insights, Not Magic Bullets: Towards Demystification of the Model Selection Dilemma in Heterogeneous Treatment Effect Estimation [92.51773744318119]
This paper empirically investigates the strengths and weaknesses of different model selection criteria.
We highlight that there is a complex interplay between selection strategies, candidate estimators and the data used for comparing them.
arXiv Detail & Related papers (2023-02-06T16:55:37Z)
- Hierarchical MixUp Multi-label Classification with Imbalanced Interdisciplinary Research Proposals [22.458438099629277]
We propose a hierarchical mixup multi-label classification framework, which we call H-MixUp.
The number of proposals is imbalanced between non-interdisciplinary and interdisciplinary research.
We develop a fused training method of Word-level MixUp, Word-level CutMix, Manifold MixUp, and Document-level MixUp to address the third issue.
arXiv Detail & Related papers (2022-09-28T08:27:52Z)
- Hierarchical Interdisciplinary Topic Detection Model for Research Proposal Classification [33.06389455749012]
We develop a deep Hierarchical Interdisciplinary Research Proposal Classification Network (HIRPCN).
We first propose a hierarchical transformer to extract the textual semantic information of proposals.
We then design an interdisciplinary graph and leverage GNNs for learning representations of each discipline.
arXiv Detail & Related papers (2022-09-16T16:59:25Z)
- Causal Fairness Analysis [68.12191782657437]
We introduce a framework for understanding, modeling, and possibly solving issues of fairness in decision-making settings.
The main insight of our approach will be to link the quantification of the disparities present on the observed data with the underlying, and often unobserved, collection of causal mechanisms.
Our effort culminates in the Fairness Map, which is the first systematic attempt to organize and explain the relationship between different criteria found in the literature.
arXiv Detail & Related papers (2022-07-23T01:06:34Z)
- Who Should Review Your Proposal? Interdisciplinary Topic Path Detection for Research Proposals [24.995369698179317]
It has been a longstanding challenge to assign proposals to appropriate reviewers.
Existing systems mainly collect topic labels manually reported by discipline investigators.
What role can AI play in developing a fair and precise proposal review system?
arXiv Detail & Related papers (2022-03-07T03:30:50Z)
- Through the Data Management Lens: Experimental Analysis and Evaluation of Fair Classification [75.49600684537117]
Data management research is showing an increasing presence and interest in topics related to data and algorithmic fairness.
We contribute a broad analysis of 13 fair classification approaches and additional variants, over their correctness, fairness, efficiency, scalability, and stability.
Our analysis highlights novel insights on the impact of different metrics and high-level approach characteristics on different aspects of performance.
arXiv Detail & Related papers (2021-01-18T22:55:40Z)
- Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns about whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)