Improving Biomedical Abstractive Summarisation with Knowledge
Aggregation from Citation Papers
- URL: http://arxiv.org/abs/2310.15684v1
- Date: Tue, 24 Oct 2023 09:56:46 GMT
- Title: Improving Biomedical Abstractive Summarisation with Knowledge
Aggregation from Citation Papers
- Authors: Chen Tang, Shun Wang, Tomas Goldsack and Chenghua Lin
- Abstract summary: Existing language models struggle to generate technical summaries that are on par with those produced by biomedical experts.
We propose a novel attention-based citation aggregation model that integrates domain-specific knowledge from citation papers.
Our model outperforms state-of-the-art approaches and achieves substantial improvements in abstractive biomedical text summarisation.
- Score: 24.481854035628434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Abstracts derived from biomedical literature possess distinct domain-specific
characteristics, including specialised writing styles and biomedical
terminologies, which necessitate a deep understanding of the related
literature. As a result, existing language models struggle to generate
technical summaries that are on par with those produced by biomedical experts,
given the absence of domain-specific background knowledge. This paper aims to
enhance the performance of language models in biomedical abstractive
summarisation by aggregating knowledge from external papers cited within the
source article. We propose a novel attention-based citation aggregation model
that integrates domain-specific knowledge from citation papers, allowing neural
networks to generate summaries by leveraging both the paper content and
relevant knowledge from citation papers. Furthermore, we construct and release
a large-scale biomedical summarisation dataset that serves as a foundation for
our research. Extensive experiments demonstrate that our model outperforms
state-of-the-art approaches and achieves substantial improvements in
abstractive biomedical text summarisation.
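The abstract gives no architectural detail, so the following is only a minimal sketch of what attention-based aggregation over citation papers could look like. It assumes each paper has already been encoded as a fixed-size vector. The function names, the scaled dot-product scoring, and the additive fusion are all illustrative assumptions, not the authors' published model.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_citations(paper_vec, citation_vecs):
    """Attention-pool citation vectors, using the paper vector as the query.

    Hypothetical helper: scaled dot-product attention stands in for whatever
    scoring function the paper actually uses.
    """
    if not citation_vecs:
        # No citations available: fall back to the paper content alone.
        return list(paper_vec), []
    scale = math.sqrt(len(paper_vec))
    scores = [
        sum(p * c for p, c in zip(paper_vec, cit)) / scale
        for cit in citation_vecs
    ]
    weights = softmax(scores)
    # Weighted sum of citation vectors, one dimension at a time.
    pooled = [
        sum(w * cit[i] for w, cit in zip(weights, citation_vecs))
        for i in range(len(paper_vec))
    ]
    # Additive fusion (an assumption): paper content plus pooled citation knowledge.
    fused = [p + k for p, k in zip(paper_vec, pooled)]
    return fused, weights


# Toy vectors: the first citation aligns with the paper, the last opposes it.
paper = [1.0, 0.0]
citations = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
fused, weights = aggregate_citations(paper, citations)
```

In this toy setup the citation most similar to the source paper receives the largest weight, which is the intuition behind letting a summariser attend to relevant cited work.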
Related papers
- Generalized knowledge-enhanced framework for biomedical entity and relation extraction [0.6856896119187885]
We develop a novel framework to construct a task-independent and reusable background knowledge graph for biomedical entity and relation extraction.
The design of our model is inspired by how humans learn domain-specific topics.
Our framework employs such a common-knowledge-sharing mechanism to build a general neural-network knowledge graph that transfers effectively to different domain-specific biomedical texts.
arXiv Detail & Related papers (2024-08-13T04:06:45Z)
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey [75.47055414002571]
The integration of biomolecular modeling with natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry and biology.
We provide an analysis of recent advancements achieved through cross modeling of biomolecules and natural language.
arXiv Detail & Related papers (2024-03-03T14:59:47Z)
- Multi-level biomedical NER through multi-granularity embeddings and enhanced labeling [3.8599767910528917]
This paper proposes a hybrid approach that integrates the strengths of multiple models.
BERT provides contextualized word embeddings and a pre-trained multi-channel CNN captures character-level information, followed by a BiLSTM + CRF for sequence labelling and modelling dependencies between the words in the text.
We evaluate our model on the benchmark i2b2/2010 dataset, achieving an F1-score of 90.11.
arXiv Detail & Related papers (2023-12-24T21:45:36Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- High-throughput Biomedical Relation Extraction for Semi-Structured Web Articles Empowered by Large Language Models [1.9665865095034865]
We formulate the relation extraction task as binary classifications for large language models.
We designate the main title as the tail entity and explicitly incorporate it into the context.
Longer contents are sliced into text chunks, embedded, and retrieved with additional embedding models.
arXiv Detail & Related papers (2023-12-13T16:43:41Z)
- Enhancing Biomedical Lay Summarisation with External Knowledge Graphs [28.956500948255677]
We investigate the effectiveness of three different approaches for incorporating knowledge graphs within lay summarisation models.
Our results confirm that integrating graph-based domain knowledge can significantly benefit lay summarisation by substantially increasing the readability of generated text.
arXiv Detail & Related papers (2023-10-24T10:25:21Z)
- EBOCA: Evidences for BiOmedical Concepts Association Ontology [55.41644538483948]
This paper proposes EBOCA, an ontology that describes (i) biomedical domain concepts and associations between them, and (ii) evidence supporting these associations.
Test data coming from a subset of DISNET and automatic association extractions from texts have been transformed to create a Knowledge Graph that can be used in real scenarios.
arXiv Detail & Related papers (2022-08-01T18:47:03Z)
- Discovering Drug-Target Interaction Knowledge from Biomedical Literature [107.98712673387031]
The Interaction between Drugs and Targets (DTI) in the human body plays a crucial role in biomedical science and applications.
As millions of papers come out every year in the biomedical domain, automatically discovering DTI knowledge from literature becomes an urgent demand in the industry.
We explore the first end-to-end solution for this task by using generative approaches.
We regard the DTI triplets as a sequence and use a Transformer-based model to directly generate them without using the detailed annotations of entities and relations.
arXiv Detail & Related papers (2021-09-27T17:00:14Z)
- CitationIE: Leveraging the Citation Graph for Scientific Information Extraction [89.33938657493765]
We use the citation graph of referential links between citing and cited papers.
We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
arXiv Detail & Related papers (2021-06-03T03:00:12Z)
- Literature Triage on Genomic Variation Publications by Knowledge-enhanced Multi-channel CNN [5.187865216685969]
The aim of this study is to investigate the correlation between genomic variation and certain diseases or phenotypes.
We adopt a multi-channel convolutional network to utilize rich textual information and bridge the semantic gaps from different corpora.
Our model improves the accuracy of biomedical literature triage results.
arXiv Detail & Related papers (2020-05-08T13:47:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.