Leveraging Domain Knowledge for Inclusive and Bias-aware Humanitarian
Response Entry Classification
- URL: http://arxiv.org/abs/2305.16756v2
- Date: Tue, 30 May 2023 13:16:33 GMT
- Title: Leveraging Domain Knowledge for Inclusive and Bias-aware Humanitarian
Response Entry Classification
- Authors: Nicolò Tamagnone, Selim Fekih, Ximena Contla, Nayid Orozco, Navid
Rekabsaz
- Abstract summary: We aim to provide an effective and ethically-aware system for humanitarian data analysis.
We introduce a novel architecture adjusted to the humanitarian analysis framework.
We also propose a systematic way to measure and mitigate biases.
- Score: 3.824858358548714
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate and rapid situation analysis during humanitarian crises is critical
to delivering humanitarian aid efficiently and is fundamental to humanitarian
imperatives and the Leave No One Behind (LNOB) principle. This data analysis
can highly benefit from language processing systems, e.g., by classifying the
text data according to a humanitarian ontology. However, approaching this by
simply fine-tuning a generic large language model (LLM) involves considerable
practical and ethical issues, particularly the lack of effectiveness on
data-sparse and complex subdomains, and the encoding of societal biases and
unwanted associations. In this work, we aim to provide an effective and
ethically-aware system for humanitarian data analysis. We approach this by (1)
introducing a novel architecture adjusted to the humanitarian analysis
framework, (2) creating and releasing a novel humanitarian-specific LLM called
HumBert, and (3) proposing a systematic way to measure and mitigate biases. Our
experimental results show that our approach outperforms strong baseline models
in both zero-shot and full-training settings, while also revealing the
existence of biases in the resulting LLMs. Utilizing a
targeted counterfactual data augmentation approach, we significantly reduce
these biases without compromising performance.
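To make the bias-mitigation step concrete, the sketch below shows a minimal, generic form of targeted counterfactual data augmentation: group-referring terms in a training entry are swapped with their counterparts and both versions are kept with the same labels. The term pairs and helper names are illustrative assumptions, not the paper's actual resources or implementation.

```python
# Illustrative sketch of targeted counterfactual data augmentation (CDA):
# swap group-referring terms in each training entry with their counterparts
# and keep both versions with the same labels. Term pairs and names below
# are assumptions for illustration, not the paper's actual resources.
import re

TERM_PAIRS = [("women", "men"), ("woman", "man"), ("girls", "boys"), ("she", "he")]

def counterfactual(text: str) -> str:
    """Swap every listed term with its counterpart in a single pass."""
    mapping = {}
    for a, b in TERM_PAIRS:
        mapping[a], mapping[b] = b, a
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: mapping[m.group(0).lower()], text)

def augment(dataset):
    """Return original entries plus their counterfactual copies (labels unchanged)."""
    out = []
    for text, labels in dataset:
        out.append((text, labels))
        out.append((counterfactual(text), labels))
    return out

# Example entry labelled with (hypothetical) humanitarian sectors.
data = [("Displaced women lack access to clean water.", ["WASH", "Protection"])]
print(augment(data))
```

In a targeted variant, the swap lists would presumably be restricted to the group terms where the bias measurement flags unwanted associations.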
Related papers
- Bias in Large Language Models: Origin, Evaluation, and Mitigation [4.606140332500086]
Large Language Models (LLMs) have revolutionized natural language processing, but their susceptibility to biases poses significant challenges.
This comprehensive review examines the landscape of bias in LLMs, from its origins to current mitigation strategies.
Ethical and legal implications of biased LLMs are discussed, emphasizing potential harms in real-world applications such as healthcare and criminal justice.
arXiv Detail & Related papers (2024-11-16T23:54:53Z) - Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models [94.39278422567955]
Fine-tuning large language models (LLMs) on human preferences has proven successful in enhancing their capabilities.
However, ensuring the safety of LLMs during the fine-tuning remains a critical concern.
We propose a supervised learning framework called Bi-Factorial Preference Optimization (BFPO) to address this issue.
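The summary names the goal (one supervised objective balancing safety and helpfulness) but not the mechanics; as a loose illustration only, and not BFPO's actual re-parameterization, one could fold the two factors into a single ranking where safety dominates and helpfulness breaks ties:

```python
# Loose illustration (assumed, not BFPO's actual formulation): fold a binary
# safety label and a helpfulness score into one ranking in which safety
# dominates, then derive supervised preference pairs from that ranking.
def bi_factorial_rank(safe: bool, helpfulness: float, safety_weight: float = 10.0) -> float:
    """Unsafe responses fall below all safe ones; helpfulness breaks ties."""
    return (safety_weight if safe else 0.0) + helpfulness

def build_preference_pairs(responses):
    """responses: list of (text, safe, helpfulness); returns (chosen, rejected) pairs."""
    ranked = sorted(responses, key=lambda r: bi_factorial_rank(r[1], r[2]), reverse=True)
    return [(ranked[i][0], ranked[j][0])
            for i in range(len(ranked)) for j in range(i + 1, len(ranked))]

# A safe but modest answer should outrank an unsafe but "helpful" one.
candidates = [
    ("Detailed but unsafe answer", False, 0.9),
    ("Safe, partially helpful answer", True, 0.6),
    ("Safe refusal with alternatives", True, 0.4),
]
print(build_preference_pairs(candidates))
```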
arXiv Detail & Related papers (2024-08-27T17:31:21Z) - Joint Demonstration and Preference Learning Improves Policy Alignment with Human Feedback [58.049113055986375]
We develop a single stage approach named Alignment with Integrated Human Feedback (AIHF) to train reward models and the policy.
The proposed approach admits a suite of efficient algorithms, which can easily reduce to, and leverage, popular alignment algorithms.
We demonstrate the efficiency of the proposed solutions with extensive experiments involving alignment problems in LLMs and robotic control problems in MuJoCo.
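As a rough illustration of a single-stage objective that uses demonstrations and preferences together, the sketch below mixes a supervised (demonstration) loss with a DPO-style preference loss; the weighting and exact formulation are assumptions for illustration, not the AIHF algorithm itself.

```python
# Rough sketch of a single-stage objective mixing demonstrations and preferences.
# The DPO-style preference term and the alpha/beta weights are illustrative
# assumptions, not the AIHF formulation.
import torch
import torch.nn.functional as F

def joint_alignment_loss(logp_demo, logp_chosen, logp_rejected,
                         beta: float = 0.1, alpha: float = 0.5):
    # Demonstration term: maximize likelihood of expert demonstrations.
    demo_loss = -logp_demo.mean()
    # Preference term: logistic loss on an implicit reward margin.
    pref_loss = -F.logsigmoid(beta * (logp_chosen - logp_rejected)).mean()
    # One weighted objective instead of separate reward-model and policy stages.
    return alpha * demo_loss + (1.0 - alpha) * pref_loss

# Example with dummy sequence log-probabilities.
demo = torch.tensor([-12.3, -9.8])
chosen = torch.tensor([-10.1, -8.4])
rejected = torch.tensor([-11.5, -9.9])
print(joint_alignment_loss(demo, chosen, rejected))
```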
arXiv Detail & Related papers (2024-06-11T01:20:53Z) - Aligning Large Language Models with Human Preferences through Representation Engineering [41.81020951061438]
Drawing inspiration from the emerging field of representation engineering (RepE), this study aims to identify relevant representations for high-level human preferences embedded in patterns of activity within an LLM.
This novel approach, denoted as Representation Alignment from Human Feedback (RAHF), proves to be effective, computationally efficient, and easy to implement.
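A toy version of the general representation-engineering recipe this alludes to is sketched below: estimate a preference direction from the difference of hidden activations on preferred versus dispreferred responses, then steer new activations along it. RAHF's actual procedure (which layers, how the model is updated) is not reproduced here.

```python
# Toy sketch of the general representation-engineering recipe: estimate a
# "preference direction" from hidden states of preferred vs. dispreferred
# responses, then steer new activations along it. Layer choice and how RAHF
# actually fits the model are not reproduced here.
import numpy as np

def preference_direction(preferred: np.ndarray, dispreferred: np.ndarray) -> np.ndarray:
    """Inputs: (num_examples, hidden_dim) activations from one chosen layer."""
    direction = preferred.mean(axis=0) - dispreferred.mean(axis=0)
    return direction / np.linalg.norm(direction)

def steer(hidden_state: np.ndarray, direction: np.ndarray, strength: float = 2.0) -> np.ndarray:
    """Shift an activation vector along the preference direction at inference."""
    return hidden_state + strength * direction

# Example with random stand-in activations.
rng = np.random.default_rng(0)
pref, dispref = rng.normal(size=(32, 768)), rng.normal(size=(32, 768)) - 0.1
print(steer(rng.normal(size=768), preference_direction(pref, dispref)).shape)
```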
arXiv Detail & Related papers (2023-12-26T11:01:36Z) - Which Prompts Make The Difference? Data Prioritization For Efficient
Human LLM Evaluation [9.452326973655445]
We find that metric-based methods enhance the efficiency of human evaluations by minimizing the number of required annotations.
We show that our method is effective across widely used model families, reducing instances of indecisive (or "tie") outcomes by up to 54%.
This potential reduction in required human effort positions our approach as a valuable strategy in future large language model evaluations.
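One plausible reading of such metric-based prioritization, sketched below with hypothetical names, is to score both models' responses with an automatic metric and hand annotators the prompts with the largest metric gap, where a decisive (non-tie) judgement is most likely; the actual prioritization metrics in the paper may differ.

```python
# Hypothetical sketch of metric-based prompt prioritization: score both models'
# responses with an automatic metric and send humans the prompts with the
# largest metric gap, where a decisive (non-tie) judgement is most likely.
# The metric, names, and budget are illustrative assumptions.
def prioritize_prompts(prompts, responses_a, responses_b, metric, budget):
    """Return the `budget` prompts with the largest automatic-metric gap."""
    scored = []
    for prompt, a, b in zip(prompts, responses_a, responses_b):
        scored.append((abs(metric(prompt, a) - metric(prompt, b)), prompt))
    scored.sort(reverse=True)
    return [prompt for _, prompt in scored[:budget]]

# Example with a toy length-based metric standing in for a real scorer.
toy_metric = lambda prompt, response: len(response.split())
print(prioritize_prompts(
    ["Summarize the report.", "Translate this sentence."],
    ["A short summary.", "Une phrase."],
    ["A much longer and more detailed summary of the report.", "One sentence."],
    toy_metric,
    budget=1,
))
```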
arXiv Detail & Related papers (2023-10-22T21:48:51Z) - SALMON: Self-Alignment with Instructable Reward Models [80.83323636730341]
This paper presents a novel approach, namely SALMON, to align base language models with minimal human supervision.
We develop an AI assistant named Dromedary-2 with only 6 exemplars for in-context learning and 31 human-defined principles.
arXiv Detail & Related papers (2023-10-09T17:56:53Z) - Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z) - Principle-Driven Self-Alignment of Language Models from Scratch with
Minimal Human Supervision [84.31474052176343]
Recent AI-assistant agents, such as ChatGPT, rely on supervised fine-tuning (SFT) with human annotations and reinforcement learning from human feedback to align the output with human intentions.
This dependence can significantly constrain the true potential of AI-assistant agents due to the high cost of obtaining human supervision.
We propose a novel approach called SELF-ALIGN, which combines principle-driven reasoning and the generative power of LLMs for the self-alignment of AI agents with minimal human supervision.
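At a high level, that recipe can be pictured as in the hedged sketch below: prepend the human-written principles (plus a few exemplars) to each query, let the base model answer, and reuse the resulting pairs as supervised fine-tuning data. The principles and the `generate` placeholder are illustrative, not the paper's prompt.

```python
# Minimal sketch of the principle-driven recipe at a high level: prepend
# human-written principles (plus exemplars, omitted here) to each query, let
# the base model answer, and reuse the pairs as supervised fine-tuning data.
# The principles and `generate` placeholder are illustrative, not the paper's.
PRINCIPLES = [
    "1. Be helpful and answer the user's question directly.",
    "2. Refuse requests that could cause harm, and explain why.",
    "3. Admit uncertainty instead of fabricating facts.",
]

def build_prompt(query: str) -> str:
    return ("You must follow these principles:\n"
            + "\n".join(PRINCIPLES)
            + f"\n\nUser: {query}\nAssistant:")

def synthesize_sft_data(queries, generate):
    """`generate` is any callable str -> str standing in for the base LLM."""
    return [(q, generate(build_prompt(q))) for q in queries]

# Example with a dummy generator.
dummy_generate = lambda prompt: "I'm not certain, but here is what I know..."
print(synthesize_sft_data(["What causes monsoon flooding?"], dummy_generate))
```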
arXiv Detail & Related papers (2023-05-04T17:59:28Z) - D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling
Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
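A toy version of the edge-deletion idea is sketched below, with an assumed linear refit standing in for D-BIAS's actual simulation method: after removing the protected-attribute edge, the affected variable is re-simulated from its remaining parents.

```python
# Toy sketch of deleting a biased causal edge and re-simulating the affected
# variable from its remaining parents. The linear refit is an assumption for
# illustration; D-BIAS's actual simulation method is more involved.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
gender = rng.integers(0, 2, n)                   # protected attribute
experience = rng.normal(5, 2, n)                 # legitimate feature
# Biased data-generating process: a direct gender -> score edge.
score = 2.0 * experience - 1.5 * gender + rng.normal(0, 1, n)

# "Delete" the gender -> score edge: refit score on its remaining parent
# and re-simulate it, preserving the residual noise scale.
X = np.column_stack([experience, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, score, rcond=None)
residual_std = np.std(score - X @ coef)
debiased_score = X @ coef + rng.normal(0, residual_std, n)

# Correlation with the protected attribute drops in the debiased column.
print(np.corrcoef(gender, score)[0, 1], np.corrcoef(gender, debiased_score)[0, 1])
```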
arXiv Detail & Related papers (2022-08-10T03:41:48Z)