A Qualitative Study of User Perception of M365 AI Copilot
- URL: http://arxiv.org/abs/2503.17661v2
- Date: Sun, 30 Mar 2025 01:08:08 GMT
- Title: A Qualitative Study of User Perception of M365 AI Copilot
- Authors: Muneera Bano, Didar Zowghi, Jon Whittle, Liming Zhu, Andrew Reeson, Rob Martin, Jen Parsons,
- Abstract summary: We present results from a six month trial of M365 Copilot conducted at our organisation in 2024.<n>The study explored user perceptions of M365 Copilot's effectiveness, productivity impact, evolving expectations, ethical concerns, and overall satisfaction.<n>While M365 Copilot demonstrated value in specific operational areas, its broader impact remained constrained by usability limitations and the need for human oversight.
- Score: 11.684396657620981
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adopting AI copilots in professional workflows presents opportunities for enhanced productivity, efficiency, and decision making. In this paper, we present results from a six month trial of M365 Copilot conducted at our organisation in 2024. A qualitative interview study was carried out with 27 participants. The study explored user perceptions of M365 Copilot's effectiveness, productivity impact, evolving expectations, ethical concerns, and overall satisfaction. Initial enthusiasm for the tool was met with mixed post trial experiences. While some users found M365 Copilot beneficial for tasks such as email coaching, meeting summaries, and content retrieval, others reported unmet expectations in areas requiring deeper contextual understanding, reasoning, and integration with existing workflows. Ethical concerns were a recurring theme, with users highlighting issues related to data privacy, transparency, and AI bias. While M365 Copilot demonstrated value in specific operational areas, its broader impact remained constrained by usability limitations and the need for human oversight to validate AI generated outputs.
Related papers
- Generative AI in Knowledge Work: Perception, Usefulness, and Acceptance of Microsoft 365 Copilot [1.3362350462539474]
We assess usefulness, ease of use, output quality and reliability, and usefulness for typical knowledge-work activities.<n>Copilot is widely viewed as user-friendly and technically reliable, with greatest added value for clearly structured, text-based tasks.
arXiv Detail & Related papers (2026-02-20T19:32:58Z) - "Can You Tell Me?": Designing Copilots to Support Human Judgement in Online Information Seeking [2.901725877154321]
This paper introduces an LLM-based conversational copilot designed to scaffold information evaluation.<n>Our mixed-methods analysis reveals that users engaged deeply with the copilot, demonstrating metacognitive reflection.<n>The copilot did not significantly improve answer correctness or search engagement, largely due to a "time-on-chat vs. exploration" trade-off.
arXiv Detail & Related papers (2026-01-16T13:33:54Z) - Developer Productivity With and Without GitHub Copilot: A Longitudinal Mixed-Methods Case Study [1.745178216563833]
This study investigates the real-world impact of the generative AI (GenAI) tool GitHub Copilot on developer activity and perceived productivity.<n>We analyzed 26,317 unique non-merge commits from 703 of NAV IT's GitHub repositories over a two-year period.<n>Our analysis of activity metrics revealed that individuals who used Copilot were consistently more active than non-users, even prior to Copilot's introduction.
arXiv Detail & Related papers (2025-09-24T17:55:56Z) - A Human Centric Requirements Engineering Framework for Assessing Github Copilot Output [0.0]
GitHub Copilot introduces new challenges in how these software tools address human needs.<n>I analyzed GitHub Copilot's interaction with users through its chat interface.<n>I established a human-centered requirements framework with clear metrics to evaluate these qualities.
arXiv Detail & Related papers (2025-08-05T21:33:23Z) - Human Decision-making is Susceptible to AI-driven Manipulation [87.24007555151452]
AI systems may exploit users' cognitive biases and emotional vulnerabilities to steer them toward harmful outcomes.<n>This study examined human susceptibility to such manipulation in financial and emotional decision-making contexts.
arXiv Detail & Related papers (2025-02-11T15:56:22Z) - Towards Objective and Unbiased Decision Assessments with LLM-Enhanced Hierarchical Attention Networks [6.520709313101523]
This work investigates cognitive bias identification in high-stake decision making process by human experts.
We propose bias-aware AI-augmented workflow that surpass human judgment.
In our experiments, both the proposed model and the agentic workflow significantly improves on both human judgment and alternative models.
arXiv Detail & Related papers (2024-11-13T10:42:11Z) - SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories [55.161075901665946]
Super aims to capture the realistic challenges faced by researchers working with Machine Learning (ML) and Natural Language Processing (NLP) research repositories.
Our benchmark comprises three distinct problem sets: 45 end-to-end problems with annotated expert solutions, 152 sub problems derived from the expert set that focus on specific challenges, and 602 automatically generated problems for larger-scale development.
We show that state-of-the-art approaches struggle to solve these problems with the best model (GPT-4o) solving only 16.3% of the end-to-end set, and 46.1% of the scenarios.
arXiv Detail & Related papers (2024-09-11T17:37:48Z) - Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction [54.23208041792073]
Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review.
A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods.
We propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels.
arXiv Detail & Related papers (2024-06-26T05:30:21Z) - Design and evaluation of AI copilots -- case studies of retail copilot templates [2.7274834772504954]
Building a successful AI copilot requires a systematic approach.
This paper is divided into two sections, covering the design and evaluation of a copilot respectively.
arXiv Detail & Related papers (2024-06-17T17:31:33Z) - Adaptive Control in Assistive Application -- A Study Evaluating Shared Control by Users with Limited Upper Limb Mobility [4.858212893290674]
This study assesses an adaptive Degrees of Freedom control method specifically tailored for individuals with upper limb impairments.
It employs a between-subjects analysis with 24 participants, conducting 81 trials across three distinct input devices in a realistic everyday-task setting.
arXiv Detail & Related papers (2024-06-10T08:36:55Z) - From User Surveys to Telemetry-Driven Agents: Exploring the Potential of
Personalized Productivity Solutions [22.443000599725313]
We first conducted a survey with 363 participants, exploring various aspects of productivity, communication style, agent approach, personality traits, personalization, and privacy.
We developed a GPT-4 powered personalized productivity agent that utilizes telemetry data from information workers to provide tailored assistance.
Our findings highlight the importance of user-centric design, adaptability, and the balance between personalization and privacy in AI-assisted productivity tools.
arXiv Detail & Related papers (2024-01-17T04:20:10Z) - Improving Task Instructions for Data Annotators: How Clear Rules and Higher Pay Increase Performance in Data Annotation in the AI Economy [0.0]
The global surge in AI applications is transforming industries, leading to displacement and complementation of existing jobs, while also giving rise to new employment opportunities.
Data annotation, encompassing the labelling of images or annotating of texts by human workers, crucially influences the quality of a dataset directly influences the quality of AI models trained on it.
This paper delves into the economics of data annotation, with a specific focus on the impact of task instruction design and monetary incentives on data quality and costs.
arXiv Detail & Related papers (2023-12-22T09:50:57Z) - Human-in-the-loop Fairness: Integrating Stakeholder Feedback to Incorporate Fairness Perspectives in Responsible AI [4.0247545547103325]
Fairness is a growing concern for high-risk decision-making using Artificial Intelligence (AI)
There is no universally accepted fairness measure, fairness is context-dependent, and there might be conflicting perspectives on what is considered fair.
Our work follows an approach where stakeholders can give feedback on specific decision instances and their outcomes with respect to their fairness.
arXiv Detail & Related papers (2023-12-13T11:17:29Z) - Exploring Federated Unlearning: Review, Comparison, and Insights [101.64910079905566]
federated unlearning enables the selective removal of data from models trained in federated systems.<n>This paper examines existing federated unlearning approaches, examining their algorithmic efficiency, impact on model accuracy, and effectiveness in preserving privacy.<n>We propose the OpenFederatedUnlearning framework, a unified benchmark for evaluating federated unlearning methods.
arXiv Detail & Related papers (2023-10-30T01:34:33Z) - Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow [49.724842920942024]
Industries such as finance, meteorology, and energy generate vast amounts of data daily.
We propose Data-Copilot, a data analysis agent that autonomously performs querying, processing, and visualization of massive data tailored to diverse human requests.
arXiv Detail & Related papers (2023-06-12T16:12:56Z) - Assisting Human Decisions in Document Matching [52.79491990823573]
We devise a proxy matching task that allows us to evaluate which kinds of assistive information improve decision makers' performance.
We find that providing black-box model explanations reduces users' accuracy on the matching task.
On the other hand, custom methods that are designed to closely attend to some task-specific desiderata are found to be effective in improving user performance.
arXiv Detail & Related papers (2023-02-16T17:45:20Z) - Human-Centric Multimodal Machine Learning: Recent Advances and Testbed
on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z) - Let's Go to the Alien Zoo: Introducing an Experimental Framework to
Study Usability of Counterfactual Explanations for Machine Learning [6.883906273999368]
Counterfactual explanations (CFEs) have gained traction as a psychologically grounded approach to generate post-hoc explanations.
We introduce the Alien Zoo, an engaging, web-based and game-inspired experimental framework.
As a proof of concept, we demonstrate the practical efficacy and feasibility of this approach in a user study.
arXiv Detail & Related papers (2022-05-06T17:57:05Z) - Understanding the Usability Challenges of Machine Learning In
High-Stakes Decision Making [67.72855777115772]
Machine learning (ML) is being applied to a diverse and ever-growing set of domains.
In many cases, domain experts -- who often have no expertise in ML or data science -- are asked to use ML predictions to make high-stakes decisions.
We investigate the ML usability challenges present in the domain of child welfare screening through a series of collaborations with child welfare screeners.
arXiv Detail & Related papers (2021-03-02T22:50:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.