AIssistant: An Agentic Approach for Human--AI Collaborative Scientific Work on Reviews and Perspectives in Machine Learning
- URL: http://arxiv.org/abs/2509.12282v1
- Date: Sun, 14 Sep 2025 15:50:31 GMT
- Authors: Sasi Kiran Gaddipati, Farhana Keya, Gollam Rabby, Sören Auer
- Abstract summary: We present here the first experiments with AIssistant for perspective and review research papers in machine learning. Our system integrates modular tools and agents for literature synthesis, section-wise experimentation, citation management, and automatic paper text generation. Despite its effectiveness, we identify key limitations, including hallucinated citations, difficulty adapting to dynamic paper structures, and incomplete integration of multimodal content.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Advances in AI-assisted research have introduced powerful tools for literature retrieval, hypothesis generation, experimentation, and manuscript preparation. However, systems remain fragmented and lack human-centred workflows. To address these gaps, we introduce AIssistant, an agentic, open-source Human-AI collaborative framework designed to simplify the end-to-end creation of scientific workflows. Since our development is still in an early stage, we present here the first experiments with AIssistant for perspective and review research papers in machine learning. Our system integrates modular tools and agents for literature synthesis, section-wise experimentation, citation management, and automatic LaTeX paper text generation, while maintaining human oversight at every stage to ensure accuracy, coherence, and scholarly rigour. We conducted a comprehensive evaluation across three layers: (1) Independent Human Review, following NeurIPS double-blind standards; (2) Automated LLM Review, using GPT-5 as a scalable human review proxy; and (3) Program Chair Oversight, where the chair monitors the entire review process and makes final validation and acceptance decisions. The results demonstrate that AIssistant improves drafting efficiency and thematic consistency. Nonetheless, Human-AI collaboration remains essential for maintaining factual correctness, methodological soundness, and ethical compliance. Despite its effectiveness, we identify key limitations, including hallucinated citations, difficulty adapting to dynamic paper structures, and incomplete integration of multimodal content.
Related papers
- The Story is Not the Science: Execution-Grounded Evaluation of Mechanistic Interpretability Research [56.80927148740585]
We address the challenges of scalability and rigor by flipping the dynamic and developing AI agents as research evaluators. We use mechanistic interpretability research as a testbed, build standardized research output, and develop MechEvalAgent. Our work demonstrates the potential of AI agents to transform research evaluation and paves the way for rigorous scientific practices.
arXiv Detail & Related papers (2026-02-05T19:00:02Z) - Beyond Static Scoring: Enhancing Assessment Validity via AI-Generated Interactive Verification [0.4260312058817663]
Large Language Models (LLMs) challenge the validity of traditional open-ended assessments by blurring the lines of authorship. This paper introduces a novel Human-AI Collaboration framework that enhances assessment integrity by combining rubric-based automated scoring with AI-generated, targeted follow-up questions.
arXiv Detail & Related papers (2025-12-14T08:13:53Z) - SelfAI: Building a Self-Training AI System with LLM Agents [79.10991818561907]
SelfAI is a general multi-agent platform that combines a User Agent, which translates high-level research objectives into standardized experimental configurations, with an Experiment Manager that orchestrates parallel, fault-tolerant training across heterogeneous hardware while maintaining a structured knowledge base for continuous feedback. Across regression, computer vision, scientific computing, medical imaging, and drug discovery benchmarks, SelfAI consistently achieves strong performance and reduces redundant trials.
arXiv Detail & Related papers (2025-11-29T09:18:39Z) - CoCoNUTS: Concentrating on Content while Neglecting Uninformative Textual Styles for AI-Generated Peer Review Detection [60.52240468810558]
We introduce CoCoNUTS, a content-oriented benchmark built upon a fine-grained dataset of AI-generated peer reviews. We also develop CoCoDet, an AI review detector via a multi-task learning framework, to achieve more accurate and robust detection of AI involvement in review content.
arXiv Detail & Related papers (2025-08-28T06:03:11Z) - The next question after Turing's question: Introducing the Grow-AI test [51.56484100374058]
This study extends the framework for assessing artificial intelligence with GROW-AI, which is designed to answer the question "Can machines grow up?", a natural successor to the Turing Test. The originality of the work lies in the conceptual transposition of the process of "growing up" from the human world to that of artificial intelligence.
arXiv Detail & Related papers (2025-08-22T10:19:42Z) - The AI Imperative: Scaling High-Quality Peer Review in Machine Learning [49.87236114682497]
We argue that AI-assisted peer review must become an urgent research and infrastructure priority. We propose specific roles for AI in enhancing factual verification, guiding reviewer performance, assisting authors in quality improvement, and supporting ACs in decision-making.
arXiv Detail & Related papers (2025-06-09T18:37:14Z) - Human-AI Collaboration or Academic Misconduct? Measuring AI Use in Student Writing Through Stylometric Evidence [0.0]
This research investigates the use of authorship verification (AV) techniques to quantify AI assistance in academic writing. We use three datasets, including a public dataset (PAN-14) and two collected from University of Melbourne students across various courses. We develop an adapted Feature Vector Difference AV methodology to construct robust academic writing profiles for students.
arXiv Detail & Related papers (2025-05-13T00:36:36Z) - ScholarMate: A Mixed-Initiative Tool for Qualitative Knowledge Work and Information Sensemaking [2.1783708115247866]
We present ScholarMate, an interactive system designed to augment qualitative analysis by unifying AI assistance with human oversight. ScholarMate enables researchers to dynamically arrange and interact with text snippets on a non-linear canvas, leveraging AI for theme suggestions, multi-level summarization, and evidence-based theme naming.
arXiv Detail & Related papers (2025-04-19T21:11:40Z) - Survey on Vision-Language-Action Models [0.2636873872510828]
This work does not represent original research, but highlights how AI can help automate literature reviews. Future research will focus on developing a structured framework for AI-assisted literature reviews.
arXiv Detail & Related papers (2025-02-07T11:56:46Z) - Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions [101.67121669727354]
Recent advancements in AI have highlighted the importance of guiding AI systems towards the intended goals, ethical principles, and values of individuals and groups, a concept broadly recognized as alignment.
The lack of clarified definitions and scopes of human-AI alignment poses a significant obstacle, hampering collaborative efforts across research domains to achieve this alignment.
We introduce a systematic review of over 400 papers published between 2019 and January 2024, spanning multiple domains such as Human-Computer Interaction (HCI), Natural Language Processing (NLP), Machine Learning (ML)
arXiv Detail & Related papers (2024-06-13T16:03:25Z)