Advancing Cyber Incident Timeline Analysis Through Rule Based AI and Large Language Models
- URL: http://arxiv.org/abs/2409.02572v3
- Date: Wed, 25 Sep 2024 06:50:29 GMT
- Title: Advancing Cyber Incident Timeline Analysis Through Rule Based AI and Large Language Models
- Authors: Fatma Yasmine Loumachi, Mohamed Chahine Ghanem
- Abstract summary: This paper introduces a novel framework, GenDFIR, which combines Rule-Based Artificial Intelligence (R-BAI) algorithms with Large Language Models (LLMs) to enhance and automate the Timeline Analysis process.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract:   Timeline Analysis (TA) plays a crucial role in Timeline Forensics (TF) within the field of Digital Forensics (DF). It focuses on examining and analyzing time-based digital artefacts, such as timestamps derived from event logs, file metadata, and other relevant data, to correlate events linked to cyber incidents and reconstruct their chronological sequence. Traditional tools often struggle to efficiently handle the large volume and variety of data generated during DF investigations and Incident Response (IR) processes. This paper introduces a novel framework, GenDFIR, which combines Rule-Based Artificial Intelligence (R-BAI) algorithms with Large Language Models (LLMs) to enhance and automate the TA process. The proposed approach consists of two key stages: (1) R-BAI is used to identify and select anomalous digital artefacts based on predefined rules. (2) The selected artefacts are then transformed into embeddings for processing by an LLM with the assistance of a Retrieval-Augmented Generation (RAG) agent. The LLM uses its capabilities to perform automated TA on the artefacts and predict potential incident outcomes. To validate the framework, we evaluated its performance, efficiency, and reliability. Several metrics were applied to simulated cyber incident scenarios, which were presented as forensic case documents. Our findings demonstrate the significant potential of integrating R-BAI and LLMs for TA. This innovative approach underscores the power of Generative AI (GenAI), particularly LLMs, and opens up new possibilities for advanced threat detection and incident reconstruction, marking a significant advancement in the field. 
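The two stages described in the abstract can be illustrated with a minimal sketch. This is not the GenDFIR implementation: the `Event` type, the three rules, the sample log entries, and the bag-of-words `embed` function are all hypothetical stand-ins chosen for illustration. A real system would use a learned embedding model and pass the retrieved context to an LLM, which is omitted here.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    timestamp: datetime
    source: str
    message: str


# Hypothetical predefined rules (name, predicate); not taken from the paper.
RULES = [
    ("failed_login", lambda e: "failed login" in e.message.lower()),
    ("off_hours", lambda e: e.timestamp.hour < 6),
    ("log_cleared", lambda e: "log cleared" in e.message.lower()),
]

# Made-up sample artefacts for demonstration only.
EVENTS = [
    Event(datetime(2024, 9, 1, 2, 14), "auth", "Failed login for user admin from 10.0.0.5"),
    Event(datetime(2024, 9, 1, 9, 30), "app", "Scheduled backup completed"),
    Event(datetime(2024, 9, 1, 2, 20), "auth", "Successful login for user admin from 10.0.0.5"),
    Event(datetime(2024, 9, 1, 2, 45), "system", "Security log cleared by admin"),
]


def select_anomalous(events):
    """Stage 1: keep events that match at least one rule, with the rule names."""
    selected = []
    for event in events:
        hits = [name for name, predicate in RULES if predicate(event)]
        if hits:
            selected.append((event, hits))
    return selected


def embed(text):
    """Toy bag-of-words vector; a stand-in for a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_timeline(selected, query, k=3):
    """Stage 2 (retrieval only): take the k most similar events, then sort by time."""
    q = embed(query)
    ranked = sorted(
        selected, key=lambda item: cosine(embed(item[0].message), q), reverse=True
    )[:k]
    return sorted(ranked, key=lambda item: item[0].timestamp)


def build_prompt(timeline, query):
    """Assemble the context that would be handed to an LLM (the LLM call is omitted)."""
    lines = [
        f"{e.timestamp.isoformat()} [{e.source}] {e.message} (rules: {', '.join(hits)})"
        for e, hits in timeline
    ]
    return "Question: " + query + "\nEvidence:\n" + "\n".join(lines)


if __name__ == "__main__":
    flagged = select_anomalous(EVENTS)
    print(build_prompt(retrieve_timeline(flagged, "admin login activity", k=2),
                       "admin login activity"))
```

Filtering with rules before retrieval keeps the context passed to the model small and chronologically ordered, which is the division of labour the abstract attributes to the R-BAI and RAG stages.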
 
      
        Related papers
- Respecting Temporal-Causal Consistency: Entity-Event Knowledge Graphs for Retrieval-Augmented Generation [69.45495166424642]
 We develop a robust and discriminative QA benchmark to measure temporal, causal, and character consistency understanding in narrative documents.
We then introduce Entity-Event RAG (E2RAG), a dual-graph framework that keeps separate entity and event subgraphs linked by a bipartite mapping.
Across ChronoQA, our approach outperforms state-of-the-art unstructured and KG-based RAG baselines, with notable gains on causal and character consistency queries.
 arXiv  Detail & Related papers  (2025-06-06T10:07:21Z)
- Deepfake Forensic Analysis: Source Dataset Attribution and Legal Implications of Synthetic Media Manipulation [5.764826667785188]
 Synthetic media generated by Generative Adversarial Networks (GANs) pose challenges in verifying authenticity and tracing dataset origins.
This paper introduces a novel forensic framework for identifying the training dataset (e.g., CelebA or FFHQ) of GAN-generated images through interpretable feature analysis.
 arXiv  Detail & Related papers  (2025-05-16T10:47:18Z)
- Forecasting from Clinical Textual Time Series: Adaptations of the Encoder and Decoder Language Model Families [6.882042556551609]
 We introduce the forecasting problem from textual time series, where timestamped clinical findings serve as the primary input for prediction.
We evaluate a diverse suite of models, including fine-tuned decoder-based large language models and encoder-based transformers.
 arXiv  Detail & Related papers  (2025-04-14T15:48:56Z)
- See it, Think it, Sorted: Large Multimodal Models are Few-shot Time Series Anomaly Analyzers [23.701716999879636]
 Time series anomaly detection (TSAD) is becoming increasingly vital due to the rapid growth of time series data.
We introduce a pioneering framework called the Time Series Anomaly Multimodal Analyzer (TAMA) to enhance both the detection and interpretation of anomalies.
 arXiv  Detail & Related papers  (2024-11-04T10:28:41Z)
- Enhancing Legal Case Retrieval via Scaling High-quality Synthetic Query-Candidate Pairs [67.54302101989542]
 Legal case retrieval aims to provide similar cases as references for a given fact description.
Existing works mainly focus on case-to-case retrieval using lengthy queries.
Data scale is insufficient to satisfy the training requirements of existing data-hungry neural models.
 arXiv  Detail & Related papers  (2024-10-09T06:26:39Z)
- Metadata Matters for Time Series: Informative Forecasting with Transformers [70.38241681764738]
 We propose a Metadata-informed Time Series Transformer (MetaTST) for time series forecasting.
To tackle the unstructured nature of metadata, MetaTST formalizes them into natural languages by pre-designed templates.
A Transformer encoder is employed to communicate series and metadata tokens, which can extend series representations by metadata information.
 arXiv  Detail & Related papers  (2024-10-04T11:37:55Z)
- Retrieval-Enhanced Machine Learning: Synthesis and Opportunities [60.34182805429511]
 Retrieval-enhancement can be extended to a broader spectrum of machine learning (ML).
This work introduces a formal framework of this paradigm, Retrieval-Enhanced Machine Learning (REML), by synthesizing the literature in various domains in ML with consistent notation, which is missing from the current literature.
The goal of this work is to equip researchers across various disciplines with a comprehensive, formally structured framework of retrieval-enhanced models, thereby fostering interdisciplinary future research.
 arXiv  Detail & Related papers  (2024-07-17T20:01:21Z)
- State-Space Modeling in Long Sequence Processing: A Survey on Recurrence in the Transformer Era [59.279784235147254]
 This survey provides an in-depth summary of the latest approaches that are based on recurrent models for sequential data processing.
The emerging picture suggests that there is room for thinking of novel routes, constituted by learning algorithms which depart from the standard Backpropagation Through Time.
 arXiv  Detail & Related papers  (2024-06-13T12:51:22Z)
- PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
 Motivated by increasing privacy concerns, we propose a Parameter-efficient Federated Anomaly Detection framework named PeFAD.
We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
 arXiv  Detail & Related papers  (2024-06-04T13:51:08Z)
- Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? [0.0]
 We focus on a specific use case, pharmaceutical manufacturing investigations.
We propose that leveraging historical records of manufacturing incidents and deviations can be beneficial for addressing and closing new cases.
We show that semantic search on vector embeddings of deviation descriptions can be used to identify similar records.
 arXiv  Detail & Related papers  (2024-04-24T00:56:22Z)
- System for systematic literature review using multiple AI agents: Concept and an empirical evaluation [5.194208843843004]
 We introduce a novel multi-AI agent model designed to fully automate the process of conducting Systematic Literature Reviews.
The model operates through a user-friendly interface where researchers input their topic.
It generates a search string used to retrieve relevant academic papers.
The model then autonomously summarizes the abstracts of these papers.
 arXiv  Detail & Related papers  (2024-03-13T10:27:52Z)
- Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
 Large Language Models (LLMs) have presented impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
 arXiv  Detail & Related papers  (2024-03-12T13:31:14Z)
- Position: What Can Large Language Models Tell Us about Time Series Analysis [69.70906014827547]
 We argue that current large language models (LLMs) have the potential to revolutionize time series analysis.
Such advancement could unlock a wide range of possibilities, including time series modality switching and question answering.
 arXiv  Detail & Related papers  (2024-02-05T04:17:49Z)
- It Is Time To Steer: A Scalable Framework for Analysis-driven Attack Graph Generation [50.06412862964449]
 Attack Graph (AG) represents the best-suited solution to support cyber risk assessment for multi-step attacks on computer networks.
Current solutions propose to address the generation problem from the algorithmic perspective and postulate the analysis only after the generation is complete.
This paper rethinks the classic AG analysis through a novel workflow in which the analyst can query the system anytime.
 arXiv  Detail & Related papers  (2023-12-27T10:44:58Z)
- AART: AI-Assisted Red-Teaming with Diverse Data Generation for New LLM-powered Applications [5.465142671132731]
 Adversarial testing of large language models (LLMs) is crucial for their safe and responsible deployment.
We introduce a novel approach for automated generation of adversarial evaluation datasets to test the safety of LLM generations on new downstream applications.
We call it AI-assisted Red-Teaming (AART) - an automated alternative to current manual red-teaming efforts.
 arXiv  Detail & Related papers  (2023-11-14T23:28:23Z)
- A Comprehensive Analysis of the Role of Artificial Intelligence and Machine Learning in Modern Digital Forensics and Incident Response [0.0]
 The goal is to look closely at how AI and ML techniques are used in digital forensics and incident response.
This endeavour digs far beneath the surface to unearth the intricate ways AI-driven methodologies are shaping these crucial facets of digital forensics practice.
Ultimately, this paper underscores the significance of AI and ML integration in digital forensics, offering insights into their benefits, drawbacks, and broader implications for tackling modern cyber threats.
 arXiv  Detail & Related papers  (2023-09-13T16:23:53Z)
- TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series [61.436361263605114]
 Time series data are often scarce or highly sensitive, which precludes the sharing of data between researchers and industrial organizations.
We introduce Time Series Generative Modeling (TSGM), an open-source framework for the generative modeling of synthetic time series.
 arXiv  Detail & Related papers  (2023-05-19T10:11:21Z)
- RESAM: Requirements Elicitation and Specification for Deep-Learning Anomaly Models with Applications to UAV Flight Controllers [24.033936757739617]
 We present RESAM, a requirements process that integrates knowledge from domain experts, discussion forums, and formal product documentation.
We present a case-study based on a flight control system for small Uncrewed Aerial Systems and demonstrate that its use guides the construction of effective anomaly detection models.
 arXiv  Detail & Related papers  (2022-07-18T18:09:59Z)
- A Review of Open Source Software Tools for Time Series Analysis [0.0]
 This paper describes a typical Time Series Analysis (TSA) framework with an architecture and lists the main features of a TSA framework.
Overall, this article considered 60 time series analysis tools, of which 32 provided forecasting modules and 21 included anomaly detection.
 arXiv  Detail & Related papers  (2022-03-10T07:12:20Z)
- Anomaly Detection for Aggregated Data Using Multi-Graph Autoencoder [21.81622481466591]
 We focus on creating anomaly detection models for system logs.
We present a thorough analysis of the aggregated data and the relationships between aggregated events.
We propose the Multi-Graph Autoencoder (MGAE), a novel convolutional graph-autoencoder model.
 arXiv  Detail & Related papers  (2021-01-11T17:38:42Z)
- Learning summary features of time series for likelihood free inference [93.08098361687722]
 We present a data-driven strategy for automatically learning summary features from time series data.
Our results indicate that learning summary features from data can compete with and even outperform LFI methods based on hand-crafted values.
 arXiv  Detail & Related papers  (2020-12-04T19:21:37Z)
- A Causal-based Framework for Multimodal Multivariate Time Series Validation Enhanced by Unsupervised Deep Learning as an Enabler for Industry 4.0 [0.0]
 A conceptual validation framework for multi-level contextual anomaly detection is developed.
A Long Short-Term Memory Autoencoder is successfully evaluated to validate the learnt representation of contexts associated with multiple assets of a blast furnace.
A research roadmap is identified to combine causal discovery and representation learning as an enabler for unsupervised Root Cause Analysis applied to the process industry.
 arXiv  Detail & Related papers  (2020-08-05T14:48:02Z)
- Meta-learning framework with applications to zero-shot time-series forecasting [82.61728230984099]
 This work provides positive evidence using a broad meta-learning framework.
Residual connections act as a meta-learning adaptation mechanism.
We show that it is viable to train a neural network on a source TS dataset and deploy it on a different target TS dataset without retraining.
 arXiv  Detail & Related papers  (2020-02-07T16:39:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.