A Multi-Perspective Machine Learning Approach to Evaluate Police-Driver
Interaction in Los Angeles
- URL: http://arxiv.org/abs/2402.01703v3
- Date: Fri, 9 Feb 2024 05:25:11 GMT
- Title: A Multi-Perspective Machine Learning Approach to Evaluate Police-Driver
Interaction in Los Angeles
- Authors: Benjamin A.T. Graham, Lauren Brown, Georgios Chochlakis, Morteza
Dehghani, Raquel Delerme, Brittany Friedman, Ellie Graeden, Preni Golazizian,
Rajat Hebbar, Parsa Hejabi, Aditya Kommineni, Mayagüez Salinas, Michael
Sierra-Arévalo, Jackson Trager, Nicholas Weller, and Shrikanth Narayanan
- Abstract summary: Police officers, the most visible and contacted agents of the state, interact with the public more than 20 million times a year during traffic stops.
Body-worn cameras (BWCs) are lauded as a means to enhance police accountability and improve police-public interactions.
This article proposes an approach to developing new multi-perspective, multimodal machine learning (ML) tools to analyze the audio, video, and transcript information from this BWC footage.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interactions between government officials and civilians affect public
wellbeing and the state legitimacy that is necessary for the functioning of a
democratic society. Police officers, the most visible and contacted agents of
the state, interact with the public more than 20 million times a year during
traffic stops. Today, these interactions are regularly recorded by body-worn
cameras (BWCs), which are lauded as a means to enhance police accountability
and improve police-public interactions. However, the timely analysis of these
recordings is hampered by a lack of reliable automated tools capable of handling
these complex and contested police-public interactions. This
article proposes an approach to developing new multi-perspective, multimodal
machine learning (ML) tools to analyze the audio, video, and transcript
information from this BWC footage. Our approach begins by identifying the
aspects of communication most salient to different stakeholders, including both
community members and police officers. We move away from modeling approaches
built around the existence of a single ground truth and instead utilize new
advances in soft labeling to incorporate variation in how different observers
perceive the same interactions. We argue that this inclusive approach to the
conceptualization and design of new ML tools is broadly applicable to the study
of communication and development of analytic tools across domains of human
interaction, including education, medicine, and the workplace.
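To make the soft-labeling idea concrete, here is a minimal sketch (in PyTorch; the annotator counts, feature size, and model are illustrative assumptions, not the authors' implementation) of training a classifier against the distribution of annotator judgments rather than a single majority-vote label:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical example: three annotators rate the same BWC segment for
# perceived officer respectfulness on classes {low, medium, high}.
# Rather than collapsing to a majority vote, keep the full distribution.
votes = torch.tensor([1.0, 2.0, 0.0])     # 1 x low, 2 x medium, 0 x high
soft_label = votes / votes.sum()          # [0.33, 0.67, 0.00]

# Stand-in for a multimodal encoder over audio/video/transcript features.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 3))
features = torch.randn(1, 128)            # placeholder fused features
logits = model(features)

# cross_entropy accepts probability targets (PyTorch >= 1.10), so the loss
# rewards matching the annotators' disagreement, not one "ground truth".
loss = F.cross_entropy(logits, soft_label.unsqueeze(0))
loss.backward()
```

At evaluation time, the predicted distribution can be compared against held-out annotator distributions (e.g., via KL divergence) rather than scored for accuracy against a single label.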
Related papers
- Auto-Drafting Police Reports from Noisy ASR Outputs: A Trust-Centered LLM Approach
This study presents an innovative AI-driven system designed to generate police report drafts from complex, noisy, and multi-role dialogue data.
Our approach intelligently extracts key elements of law enforcement interactions and includes them in the draft.
This framework holds the potential to transform the reporting process, ensuring greater oversight, consistency, and fairness in future policing practices.
arXiv Detail & Related papers (2025-02-11T16:27:28Z)
- Coherence-Driven Multimodal Safety Dialogue with Active Learning for Embodied Agents
M-CoDAL is a multimodal-dialogue system specifically designed for embodied agents to better understand and communicate in safety-critical situations.
Our approach is evaluated using a newly created multimodal dataset comprising 1K safety violations extracted from 2K Reddit images.
Results with this dataset demonstrate that our approach improves resolution of safety situations, user sentiment, as well as safety of the conversation.
arXiv Detail & Related papers (2024-10-18T03:26:06Z)
- Multimodal Fusion with LLMs for Engagement Prediction in Natural Conversation
We focus on predicting engagement in dyadic interactions by scrutinizing verbal and non-verbal cues, aiming to detect signs of disinterest or confusion.
In this work, we collect a dataset featuring 34 participants engaged in casual dyadic conversations, each providing self-reported engagement ratings at the end of each conversation.
We introduce a novel fusion strategy using Large Language Models (LLMs) to integrate multiple behavior modalities into a "multimodal transcript" (a sketch of this idea appears after this list).
arXiv Detail & Related papers (2024-09-13T18:28:12Z)
- Learning Manipulation by Predicting Interaction
We propose a general pre-training pipeline that learns Manipulation by Predicting the Interaction.
The experimental results demonstrate that MPI yields improvements of 10% to 64% over the previous state of the art on real-world robot platforms.
arXiv Detail & Related papers (2024-06-01T13:28:31Z)
- AntEval: Evaluation of Social Interaction Competencies in LLM-Driven Agents
Large Language Models (LLMs) have demonstrated their ability to replicate human behaviors across a wide range of scenarios.
However, their capability in handling complex, multi-character social interactions has yet to be fully explored.
We introduce the Multi-Agent Interaction Evaluation Framework (AntEval), encompassing a novel interaction framework and evaluation methods.
arXiv Detail & Related papers (2024-01-12T11:18:00Z)
- Enhancing HOI Detection with Contextual Cues from Large Vision-Language Models
ConCue is a novel approach for improving visual feature extraction in HOI detection.
We develop a transformer-based feature extraction module with a multi-tower architecture that integrates contextual cues into both instance and interaction detectors.
arXiv Detail & Related papers (2023-11-26T09:11:32Z)
- Datastore Design for Analysis of Police Broadcast Audio at Scale
We describe preliminary work towards enabling Speech Emotion Recognition (SER) in an analysis of the Chicago Police Department's (CPD) broadcast audio.
We demonstrate the pipelined creation of a datastore to enable a multimodal analysis of composed raw audio files.
arXiv Detail & Related papers (2023-10-25T19:52:19Z)
- Re-mine, Learn and Reason: Exploring the Cross-modal Semantic Correlations for Language-guided HOI detection
Human-Object Interaction (HOI) detection is a challenging computer vision task.
We present a framework that enhances HOI detection by incorporating structured text knowledge.
arXiv Detail & Related papers (2023-07-25T14:20:52Z)
- Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A Survey
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance.
This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings.
Some major observations are that the most often used nonverbal cue is speaking activity, the most common computational method is the support vector machine, and the typical interaction environment and sensing setup is a meeting of 3-4 persons equipped with microphones and cameras.
arXiv Detail & Related papers (2022-07-20T13:37:57Z)
- The Frost Hollow Experiments: Pavlovian Signalling as a Path to Coordination and Communication Between Agents
This paper contributes a multi-faceted study into what we term Pavlovian signalling.
We establish Pavlovian signalling as a natural bridge between fixed signalling paradigms and fully adaptive communication learning.
Our results point to an actionable, constructivist path towards continual communication learning between reinforcement learning agents.
arXiv Detail & Related papers (2022-03-17T17:49:45Z)
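As referenced above, here is a minimal sketch of the "multimodal transcript" idea from the engagement-prediction paper: nonverbal cues are serialized into text alongside speech so that a single LLM prompt can reason over both. The cue names, formatting, and prompt are illustrative assumptions, not the paper's actual scheme.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    speaker: str
    text: str
    gaze_on_partner: float   # fraction of the turn, hypothetical cue
    smile: bool              # hypothetical nonverbal cue

def to_multimodal_transcript(turns: list[Turn]) -> str:
    """Serialize verbal and nonverbal streams into one text transcript."""
    lines = []
    for t in turns:
        cues = f"[gaze={t.gaze_on_partner:.0%}, smile={'yes' if t.smile else 'no'}]"
        lines.append(f"{t.speaker} {cues}: {t.text}")
    return "\n".join(lines)

turns = [
    Turn("A", "So how was the conference?", 0.9, True),
    Turn("B", "Fine, I guess.", 0.2, False),  # low gaze: possible disengagement
]
prompt = (
    "Rate speaker B's engagement from 1 to 5 given this annotated transcript:\n"
    + to_multimodal_transcript(turns)
)
print(prompt)  # this prompt would then be sent to an LLM of choice
```

The design choice worth noting is that fusion happens in text space: once cues are rendered into the transcript, any off-the-shelf LLM can consume them without architectural changes.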