OpenFraming: We brought the ML; you bring the data. Interact with your
data and discover its frames
- URL: http://arxiv.org/abs/2008.06974v1
- Date: Sun, 16 Aug 2020 18:59:30 GMT
- Title: OpenFraming: We brought the ML; you bring the data. Interact with your
data and discover its frames
- Authors: Alyssa Smith, David Assefa Tofu, Mona Jalal, Edward Edberg Halim,
Yimeng Sun, Vidya Akavoor, Margrit Betke, Prakash Ishwar, Lei Guo, Derry
Wijaya
- Abstract summary: We introduce a Web-based system for analyzing and classifying frames in text documents.
We provide both state-of-the-art pre-trained frame classification models on various issues and a user-friendly pipeline for training novel classification models.
The code making up our system is also open-sourced and well-documented, making the system transparent and expandable.
- Score: 13.695739582457872
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When journalists cover a news story, they can cover the story from multiple
angles or perspectives. A news article written about COVID-19, for example,
might focus on personal preventative actions such as mask-wearing, while
another might focus on COVID-19's impact on the economy. These perspectives are
called "frames," which, when used, may influence public perception and opinion of
the issue. We introduce a Web-based system for analyzing and classifying frames
in text documents. Our goal is to make effective tools for automatic frame
discovery and labeling based on topic modeling and deep learning widely
accessible to researchers from a diverse array of disciplines. To this end, we
provide both state-of-the-art pre-trained frame classification models on
various issues as well as a user-friendly pipeline for training novel
classification models on user-provided corpora. Researchers can submit their
documents and obtain frames of the documents. The degree of user involvement is
flexible: users can run models that have been pre-trained on select issues;
submit labeled documents and train a new model for frame classification; or
submit unlabeled documents and obtain potential frames of the documents. The
code making up our system is also open-sourced and well-documented, making the
system transparent and expandable. The system is available on-line at
http://www.openframing.org and via our GitHub page
https://github.com/davidatbu/openFraming .
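The three levels of user involvement described in the abstract can be read as a simple routing decision. The following is a minimal sketch of that reading, not OpenFraming's actual code: the function name `choose_workflow`, its parameters, and the returned mode strings are all hypothetical, and pairing unlabeled corpora with topic modeling is an assumption drawn from the abstract's mention of "frame discovery".

```python
from typing import Optional, Sequence


def choose_workflow(
    documents: Sequence[str],
    labels: Optional[Sequence[str]] = None,
    pretrained_issue: Optional[str] = None,
) -> str:
    """Pick one of the three involvement levels described in the abstract.

    All names and return values here are illustrative assumptions.
    """
    if not documents:
        raise ValueError("at least one document is required")
    if pretrained_issue is not None:
        # Level 1: run a model already pre-trained on a select issue.
        return "classify_with_pretrained_model"
    if labels is not None:
        if len(labels) != len(documents):
            raise ValueError("need exactly one label per document")
        # Level 2: labeled corpus, so a new frame classifier can be trained.
        return "train_new_classifier"
    # Level 3: unlabeled corpus, so potential frames are suggested instead
    # (assumed here to come from topic modeling).
    return "discover_frames_with_topic_model"
```

For example, calling `choose_workflow(docs, labels=frame_labels)` with one label per document would select the training path, while calling it with documents alone falls through to frame discovery.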
Related papers
- Contextual Document Embeddings [77.22328616983417]
We propose two complementary methods for contextualized document embeddings.
First, an alternative contrastive learning objective that explicitly incorporates the document neighbors into the intra-batch contextual loss.
Second, a new contextual architecture that explicitly encodes neighbor document information into the encoded representation.
arXiv Detail & Related papers (2024-10-03T14:33:34Z)
- FaKnow: A Unified Library for Fake News Detection [11.119667583594483]
FaKnow is a unified and comprehensive fake news detection algorithm library.
It covers the full spectrum of the model training and evaluation process.
It furnishes a series of auxiliary functionalities and tools, including visualization and logging.
arXiv Detail & Related papers (2024-01-27T13:29:17Z)
- Follow Anything: Open-set detection, tracking, and following in real-time [89.83421771766682]
We present a robotic system to detect, track, and follow any object in real-time.
Our approach, dubbed "follow anything" (FAn), is an open-vocabulary and multimodal model.
FAn can be deployed on a laptop with a lightweight (6-8 GB) graphics card, achieving a throughput of 6-20 frames per second.
arXiv Detail & Related papers (2023-08-10T17:57:06Z)
- Towards Open-Domain Topic Classification [69.21234350688098]
We introduce an open-domain topic classification system that accepts a user-defined taxonomy in real time.
Users will be able to classify a text snippet with respect to any candidate labels they want, and get an instant response from our web interface.
arXiv Detail & Related papers (2023-06-29T20:25:28Z)
- SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation [15.953725529361874]
Document layout analysis is a well-known problem in the document research community.
As internet connectivity has spread into personal life, an enormous number of documents have become available in the public domain.
We address this challenge with a self-supervised, vision-based approach, unlike the few existing self-supervised document segmentation approaches.
arXiv Detail & Related papers (2023-05-01T12:47:55Z)
- Unifying Vision, Text, and Layout for Universal Document Processing [105.36490575974028]
We propose a Document AI model which unifies text, image, and layout modalities together with varied task formats, including document understanding and generation.
Our method sets the state-of-the-art on 9 Document AI tasks, e.g., document understanding and QA, across diverse data domains like finance reports, academic papers, and websites.
arXiv Detail & Related papers (2022-12-05T22:14:49Z)
- Synthetic Document Generator for Annotation-free Layout Recognition [15.657295650492948]
We describe a synthetic document generator that automatically produces realistic documents with labels for spatial positions, extents and categories of layout elements.
We empirically illustrate that a deep layout detection model trained purely on the synthetic documents can match the performance of a model that uses real documents.
arXiv Detail & Related papers (2021-11-11T01:58:44Z)
- SelfDoc: Self-Supervised Document Representation Learning [46.22910270334824]
SelfDoc is a task-agnostic pre-training framework for document image understanding.
Our framework exploits the positional, textual, and visual information of every semantically meaningful component in a document.
It achieves superior performance on multiple downstream tasks with significantly fewer document images used in the pre-training stage compared to previous works.
arXiv Detail & Related papers (2021-06-07T04:19:49Z)
- Framing Unpacked: A Semi-Supervised Interpretable Multi-View Model of Media Frames [32.06056273913706]
We develop a novel semi-supervised model for understanding how news media frame political issues.
The model learns to embed local information about the events and related actors in a news article through an auto-encoding framework.
Our experiments show that our model outperforms previous models of frame prediction.
arXiv Detail & Related papers (2021-04-22T13:05:53Z)
- DOC2PPT: Automatic Presentation Slides Generation from Scientific Documents [76.19748112897177]
We present a novel task and approach for document-to-slide generation.
We propose a hierarchical sequence-to-sequence approach to tackle our task in an end-to-end manner.
Our approach exploits the inherent structures within documents and slides and incorporates paraphrasing and layout prediction modules to generate slides.
arXiv Detail & Related papers (2021-01-28T03:21:17Z)