WHOSe Heritage: Classification of UNESCO World Heritage "Outstanding
Universal Value" Documents with Smoothed Labels
- URL: http://arxiv.org/abs/2104.05547v1
- Date: Mon, 12 Apr 2021 15:18:41 GMT
- Title: WHOSe Heritage: Classification of UNESCO World Heritage "Outstanding
Universal Value" Documents with Smoothed Labels
- Authors: Nan Bai, Renqian Luo, Pirouz Nourian, Ana Pereira Roders
- Abstract summary: This study applies state-of-the-art NLP models to build a classifier on a new real-world dataset containing official OUV justification statements.
Label smoothing is innovatively adapted to transform the task smoothly between multi-class and multi-label classification.
The study shows that the best models fine-tuned from BERT and ULMFiT can reach 94.3% top-3 accuracy.
- Score: 1.6440434996206623
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The UNESCO World Heritage List (WHL) aims to identify exceptionally
valuable cultural and natural heritage to be preserved for mankind as a whole.
Evaluating and justifying the Outstanding Universal Value (OUV) of each
nomination to the WHL is essential for a property to be inscribed, yet it
remains a complex task even for experts, since the criteria are not mutually
exclusive. Furthermore, manual annotation of heritage values, which is
currently dominant in the field, is knowledge-demanding and time-consuming,
impeding systematic analysis of such authoritative documents in terms of their
implications on heritage management. This study applies state-of-the-art NLP
models to build a classifier on a new real-world dataset containing official
OUV justification statements, seeking an explainable, scalable, and less biased
automation tool to facilitate the nomination, evaluation, and monitoring
processes of World Heritage properties. Label smoothing is innovatively adapted
to transform the task smoothly between multi-class and multi-label
classification by adding prior inter-class relationship knowledge into the
labels, improving the performance of most baselines. The study shows that the
best models fine-tuned from BERT and ULMFiT can reach 94.3% top-3 accuracy,
which is promising to be further developed and applied in heritage research and
practice.
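The core idea described in the abstract, softening one-hot targets with prior inter-class relationship knowledge so the task sits between multi-class and multi-label classification, can be sketched roughly as follows. The prior matrix, the smoothing strength `alpha`, and the function name are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

def smooth_labels(y, prior, alpha=0.1):
    """Soften one-hot targets with a prior inter-class similarity matrix.

    y     : (n,) integer class indices
    prior : (k, k) row-stochastic matrix; prior[i, j] encodes the assumed
            relatedness of class j to class i
    alpha : smoothing strength (0 recovers hard one-hot targets)
    """
    k = prior.shape[0]
    onehot = np.eye(k)[y]                        # (n, k) hard targets
    # mix the hard targets with their prior-weighted neighborhoods
    return (1 - alpha) * onehot + alpha * onehot @ prior

# toy example: 3 classes, where classes 0 and 1 are assumed to be related
prior = np.array([[0.6, 0.4, 0.0],
                  [0.4, 0.6, 0.0],
                  [0.0, 0.0, 1.0]])
targets = smooth_labels(np.array([0, 2]), prior, alpha=0.2)
```

Because each row of the prior sums to one, the smoothed targets remain valid probability distributions, so they can be fed directly to a cross-entropy loss while still rewarding predictions of related classes.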
Related papers
- Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions [62.12545440385489]
Large language models (LLMs) have brought substantial advancements in text generation, but their potential for enhancing classification tasks remains underexplored.
We propose a framework for thoroughly investigating fine-tuning LLMs for classification, including both generation- and encoding-based approaches.
We instantiate this framework in edit intent classification (EIC), a challenging and underexplored classification task.
arXiv Detail & Related papers (2024-10-02T20:48:28Z) - LocalValueBench: A Collaboratively Built and Extensible Benchmark for Evaluating Localized Value Alignment and Ethical Safety in Large Language Models [0.0]
The proliferation of large language models (LLMs) requires robust evaluation of their alignment with local values and ethical standards.
LocalValueBench is a benchmark designed to assess LLMs' adherence to Australian values.
arXiv Detail & Related papers (2024-07-27T05:55:42Z) - HD-Eval: Aligning Large Language Model Evaluators Through Hierarchical
Criteria Decomposition [92.17397504834825]
HD-Eval is a framework that iteratively aligns large language model evaluators with human preferences.
HD-Eval inherits the essence of the evaluation mindset of human experts and enhances the alignment of LLM-based evaluators.
Extensive experiments on three evaluation domains demonstrate the superiority of HD-Eval in further aligning state-of-the-art evaluators.
arXiv Detail & Related papers (2024-02-24T08:01:32Z) - Modeling Legal Reasoning: LM Annotation at the Edge of Human Agreement [3.537369004801589]
We study the classification of legal reasoning according to jurisprudential philosophy.
We use a novel dataset of historical United States Supreme Court opinions annotated by a team of domain experts.
We find that generative models perform poorly when given the same instructions as those presented to human annotators.
arXiv Detail & Related papers (2023-10-27T19:27:59Z) - Towards Verifiable Generation: A Benchmark for Knowledge-aware Language Model Attribution [48.86322922826514]
This paper defines a new task of Knowledge-aware Language Model Attribution (KaLMA).
First, we extend attribution source from unstructured texts to Knowledge Graph (KG), whose rich structures benefit both the attribution performance and working scenarios.
Second, we propose a new "Conscious Incompetence" setting considering the incomplete knowledge repository.
Third, we propose a comprehensive automatic evaluation metric encompassing text quality, citation quality, and text citation alignment.
arXiv Detail & Related papers (2023-10-09T11:45:59Z) - Robust Representation Learning for Unreliable Partial Label Learning [86.909511808373]
Partial Label Learning (PLL) is a type of weakly supervised learning where each training instance is assigned a set of candidate labels, but only one label is the ground-truth.
When the candidate set may itself be noisy and need not contain the true label, the problem is known as Unreliable Partial Label Learning (UPLL), which introduces additional complexity due to the inherent unreliability and ambiguity of partial labels.
We propose the Unreliability-Robust Representation Learning framework (URRL) that leverages unreliability-robust contrastive learning to help the model fortify against unreliable partial labels effectively.
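As a concrete illustration of the PLL setup summarized above (a naive baseline, not the URRL framework itself), one classic approach treats every candidate label as equally plausible and minimizes the negative log of the total probability mass the model assigns to each instance's candidate set. All names and values here are illustrative:

```python
import numpy as np

def pll_loss(logits, candidate_sets):
    """Average negative log of the probability mass assigned to each
    instance's candidate label set (a naive PLL baseline)."""
    # softmax over classes, stabilized by subtracting the row max
    z = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = z / z.sum(axis=1, keepdims=True)
    mass = np.array([probs[i, cands].sum()
                     for i, cands in enumerate(candidate_sets)])
    return float(-np.log(mass).mean())

# one instance, 3 classes; labels 0 and 1 are candidates, only one is true
loss = pll_loss(np.array([[2.0, 1.0, -1.0]]), [[0, 1]])
```

Note that this baseline cannot distinguish which candidate is correct, and it breaks down entirely under UPLL, where the true label may be missing from the candidate set; that failure mode is what representation-learning approaches such as URRL aim to address.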
arXiv Detail & Related papers (2023-08-31T13:37:28Z) - Deep Dive into the Language of International Relations: NLP-based
Analysis of UNESCO's Summary Records [0.0]
The inscription process on the UNESCO World Heritage List and the UNESCO Representative List of the Intangible Cultural Heritage of Humanity often leads to tensions and conflicts among states.
We propose innovative topic modelling and tension detection methods based on UNESCO's summary records.
We have developed an application tailored for diplomats, lawyers, political scientists, and international relations researchers.
arXiv Detail & Related papers (2023-07-31T11:06:08Z) - KoLA: Carefully Benchmarking World Knowledge of Large Language Models [87.96683299084788]
We construct a Knowledge-oriented LLM Assessment benchmark (KoLA).
We mimic human cognition to form a four-level taxonomy of knowledge-related abilities, covering 19 tasks.
We use both Wikipedia, a corpus prevalently pre-trained by LLMs, along with continuously collected emerging corpora, to evaluate the capacity to handle unseen data and evolving knowledge.
arXiv Detail & Related papers (2023-06-15T17:20:46Z) - Heterogeneous Value Alignment Evaluation for Large Language Models [91.96728871418]
The rise of Large Language Models (LLMs) has made it crucial to align their values with those of humans.
We propose a Heterogeneous Value Alignment Evaluation (HVAE) system to assess the success of aligning LLMs with heterogeneous values.
arXiv Detail & Related papers (2023-05-26T02:34:20Z) - Heri-Graphs: A Workflow of Creating Datasets for Multi-modal Machine
Learning on Graphs of Heritage Values and Attributes with Social Media [7.318997639507268]
Values (why to conserve) and Attributes (what to conserve) are essential concepts of cultural heritage.
Recent studies have used social media to map the values and attributes that the public associates with cultural heritage.
This study presents a methodological workflow for constructing such multi-modal datasets using posts and images on Flickr.
arXiv Detail & Related papers (2022-05-16T09:45:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.