Organization sPecific Threat Intelligence Mining and sharing

Project Information

OPTIMA

Grant agreement ID: 101063107

DOI

10.3030/101063107

Closed project

EC signature date 11 July 2022

Start date 1 December 2022

End date 31 March 2025

Funded under

Marie Skłodowska-Curie Actions (MSCA)

Total cost

No data

EU contribution

€ 188 590,08

Coordinated by

UNIVERSITA DEGLI STUDI DI PADOVA
Italy

Periodic Reporting for period 1 - OPTIMA (Organization sPecific Threat Intelligence Mining and sharing)

Reporting period: 2022-12-01 to 2025-03-31

Digitalisation is creating new opportunities in finance, healthcare, industrial control systems, network security, and AI-driven cybersecurity, but it also introduces critical risks. These include cyber threats, adversarial AI attacks, privacy concerns, and regulatory challenges. To maximize the benefits while mitigating risks, organisations must adopt explainable AI, privacy-preserving federated learning, secure blockchain integration, and proactive cyber threat intelligence mechanisms. Strengthening cybersecurity frameworks with robust attack detection, secure data-sharing, and adversarial resilience is essential to ensure a secure digital future. Thus, the OPTIMA project (Organization sPecific Threat Intelligence Mining and sharing) aimed to design techniques and tools that use ML algorithms to extract Threat Intelligence targeted to organizations, and to share attack records effectively using privacy-preserving methods.

The Research & Innovation Objectives (RIO) of the project are as follows:
1. RIO1: To develop techniques for automatic extraction of threat intelligence from OSINT data for multiple institutions (e.g., healthcare, finance, IoT, education) using deep learning approaches.
2. RIO2: To create a novel automated system that derives Indicators of Compromise (IOCs) from word embeddings and syntactic dependencies of words in order to identify unseen IOCs. Using the extracted IOCs, a threat index will be estimated to define the impact of threats and attack trends across individual organizations.
3. RIO3: To build a system that integrates cryptographic tools and federated learning, enabling an organization to share threat logs anonymously with different parties in a privacy-preserving manner.

The OPTIMA project (Organization-sPecific Threat Intelligence Mining and shAring) developed advanced AI-driven tools and frameworks to generate, analyze, and securely share cyber threat intelligence (CTI) tailored to organizational needs. The core outcome, OSTIS, enables organization-specific CTI generation through a dedicated crawler and NLP pipeline that extracts threat data from reliable sources, classifies it by domain (e.g. healthcare, finance), and visualizes attack patterns via knowledge graphs. Explainable AI tools like SHAP were integrated to interpret threat predictions and support trust in automation. Complementing OSTIS, we proposed SeCTIS, a privacy-preserving CTI sharing framework using Blockchain and Swarm Learning. SeCTIS ensures secure collaboration and verifiable trust among participants through Zero-Knowledge Proofs. The MoRSE and IntellBot systems advanced AI-based CTI delivery by deploying Retrieval-Augmented Generation (RAG) models to provide accurate, real-time cybersecurity insights. Additionally, our efforts in darknet traffic analysis, malware visualization, and multi-modal threat detection delivered interpretable models using SHAP, GradCAM, and LIME. In parallel, we addressed security in federated learning (FL) with tools like DLShield, SecDefender, and LFGuard, which detect low-quality or poisoned models and improve global accuracy while preserving privacy. Through these contributions, OPTIMA has enhanced both the granularity and trustworthiness of CTI across diverse domains, enabling proactive, explainable, and collaborative cybersecurity defense.

The OPTIMA project delivered a suite of advanced, AI-driven frameworks and tools for the collection, analysis, generation, and secure sharing of organization-specific Cyber Threat Intelligence (CTI).
(1) Threat Intelligence Generation and Analysis
-Developed OSTIS, a novel end-to-end framework that collects threat data from curated online sources using a custom web crawler.
-Applied BERT-based deep learning models for domain classification (F1-score of 0.93) and for entity and relation extraction (F1-scores of 0.95 and 0.89, respectively).
-Introduced Explainable AI techniques (SHAP) to provide interpretability of CTI predictions, aiding human analysts in trust calibration.
-Constructed OSTIKG, an organization-specific threat knowledge graph, enabling contextual visualization of attack patterns, actors, and tools.
(2) CTI Sharing and Privacy-Preserving Collaboration
-Designed SeCTIS, a secure CTI sharing framework combining Blockchain and Swarm Learning.
-Integrated Zero Knowledge Proofs to assess data/model integrity and validate participant trustworthiness during collaborative CTI exchange.
-Demonstrated resistance to data inference and poisoning attacks through rigorous threat modeling.
(3) Explainable Cyber Threat Analysis
-Conducted darknet traffic analysis using SHAP, LIME, and counterfactuals across ISCXTor2016 and CIC-Darknet2020 datasets.
-Identified key discriminative features (e.g. Protocol, Source Port, IdleMax) and extracted malicious IPs, malware types, and TTPs such as MITRE’s T1071.
(4) Malware Detection and Visualization
-Proposed multi-modal deep learning fusion techniques combining visual features (e.g. entropy graphs, SimHash) for malware classification.
-Achieved 100% detection rate on benchmark datasets (Malhub, BIG2015) and integrated adversarial robustness using GAN-based retraining.
-Developed vDefender for hypervisor-layer introspection, achieving 95.8% F1-score in detecting new malware behaviors.
(5) Secure Federated Learning
-Addressed FL security via DLShield, SecDefender, and LFGuard, capable of detecting low-quality or poisoned client models.
-Demonstrated up to 24% improvement in source class recall and 22.8% reduction in attack success rate, with minimal degradation in global accuracy.
(6) LLM-based CTI Delivery Systems
-Introduced MoRSE and IntellBot, cybersecurity-focused Retrieval-Augmented Generation (RAG) chatbots, outperforming GPT-4 in answer correctness and relevance by over 10%, verified on 600 cybersecurity-specific queries.

Key Results and Alignment

1)OSTIS Framework and Threat Knowledge Graph:
-Developed a web-crawling pipeline and entity extraction system to collect and process threat data from trusted sources.
-Designed domain classification and relation extraction models achieving F1-scores of 0.93 and 0.89 respectively.
-Constructed the OSTIS Knowledge Graph (OSTIKG), enabling graph-based threat reasoning and visualization tailored to domains such as healthcare, ICS, IoT, etc.
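
As a rough illustration of how extracted entities and relations can be assembled into a domain-specific graph such as OSTIKG, the sketch below stores (domain, subject, relation, object) triples and answers a simple neighbourhood query. The triples, the placeholder identifiers, and the `build_graph` and `neighbours` helpers are hypothetical and are not taken from the OSTIS implementation.

```python
from collections import defaultdict

# Hypothetical triples as an entity/relation extractor might emit them:
# (domain, subject, relation, object). All names are placeholders.
TRIPLES = [
    ("healthcare", "ActorA", "uses", "MalwareX"),
    ("healthcare", "MalwareX", "exploits", "CVE-0000-0001"),
    ("finance", "ActorB", "uses", "MalwareY"),
    ("healthcare", "ActorA", "targets", "HospitalNetwork"),
]

def build_graph(triples, domain):
    """Return adjacency lists {subject: [(relation, object), ...]} for one domain."""
    graph = defaultdict(list)
    for dom, subj, rel, obj in triples:
        if dom == domain:
            graph[subj].append((rel, obj))
    return dict(graph)

def neighbours(graph, node, relation=None):
    """List objects linked from `node`, optionally filtered by relation type."""
    return [obj for rel, obj in graph.get(node, []) if relation in (None, rel)]

healthcare_graph = build_graph(TRIPLES, "healthcare")
print(neighbours(healthcare_graph, "ActorA"))
print(neighbours(healthcare_graph, "ActorA", relation="uses"))
```

Filtering by domain before building the adjacency lists keeps each organization's view limited to threats relevant to its own sector, which is the property the report attributes to OSTIKG.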

2)RAG-based CTI Delivery Systems
-Developed two cybersecurity co-pilot systems, MoRSE and IntellBot, based on Retrieval-Augmented Generation (RAG) architecture and evaluated them on over 600 cybersecurity-specific queries.
-MoRSE uses real-time cybersecurity keyword detection to improve response accuracy by 10% compared to GPT-4, addressing the need for timely and precise security analysis.
-IntellBot aggregates data from diverse sources such as threat reports, vulnerability databases, and CTI feeds. In question-answering (QA) evaluations it achieves high relevance, with BERT scores above 0.8 and cosine similarity scores between 0.8 and 1.0.
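
The retrieval step of a Retrieval-Augmented Generation pipeline can be sketched as follows. This is a minimal example under stated assumptions: it ranks documents by cosine similarity over simple term counts, whereas production systems typically use dense embeddings, and it stops at building the prompt without calling any language model. The document set and the `retrieve` and `build_prompt` helpers are hypothetical and do not reflect the internals of MoRSE or IntellBot.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base; a real system would index threat reports,
# vulnerability databases, and CTI feeds.
DOCUMENTS = [
    "Ransomware encrypts files and demands payment for the decryption key.",
    "Phishing emails trick users into revealing credentials.",
    "A botnet is a network of compromised machines controlled by an attacker.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine_similarity(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query, documents):
    """Prepend retrieved context to the question before it goes to a language model."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How do phishing emails steal credentials?", DOCUMENTS))
```

The same cosine similarity function can also be used to compare a generated answer with a reference answer, which is one way similarity scores like those reported for the QA evaluation can be obtained.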

3)Secure CTI Sharing
-Developed SeCTIS, a secure CTI sharing framework that adopts a Swarm Learning (SL) network to generate a CTI model collaboratively in a distributed manner. Because SL does not require data to be shared with a central entity, raw data never leaves the local node, which preserves the confidentiality of each organization’s data.
-Trust among participants and quality of CTI data: SeCTIS provides a process based on validator nodes to assess CTI data and model quality using reputation scores. In addition, the Zero-Knowledge Proofs (ZKP) mechanism allows validator activities to be verified, ensuring that malicious entities cannot compromise the system. These mechanisms also make SeCTIS a collaborative trust framework.
-Interoperability and automation: SeCTIS provides a middleware that can manage heterogeneous data formats and establish a unique methodology to be employed. Indeed, only model parameters (weights and biases) are shared, not the raw data. Moreover, different ML frameworks can be used to train local models as long as they can produce compatible model parameters for aggregation, thus enhancing interoperability across different platforms and tools. Furthermore, automation is achieved through both the decentralized and autonomous model training and the automated validation of participants, thus reducing the need for manual oversight and intervention.
-Scalability: SeCTIS offers a scalable solution by distributing the workload of training models across multiple nodes which reduces the burden on any single entity and allows the network to expand as needed.
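
The idea that only model parameters are exchanged, and that reputation scores influence whose contribution counts, can be sketched as a reputation-weighted parameter average. This is an assumption-laden illustration rather than the SeCTIS algorithm: the `aggregate` function, the `min_reputation` threshold, and the weighting scheme are hypothetical, and the blockchain and Zero-Knowledge Proof layers are omitted entirely.

```python
def aggregate(updates, reputations, min_reputation=0.5):
    """Reputation-weighted average of parameter vectors from trusted nodes.

    updates: {node_id: [float, ...]} local model parameters, all the same length
    reputations: {node_id: float} scores assumed to come from validator nodes
    """
    # Keep only nodes whose reputation meets the (assumed) threshold.
    trusted = {
        node: params
        for node, params in updates.items()
        if reputations.get(node, 0.0) >= min_reputation
    }
    if not trusted:
        raise ValueError("no node meets the reputation threshold")
    total = sum(reputations[node] for node in trusted)
    size = len(next(iter(trusted.values())))
    return [
        sum(reputations[node] * params[i] for node, params in trusted.items()) / total
        for i in range(size)
    ]

# Each organization trains locally and shares parameters only, never raw logs.
local_updates = {
    "org_a": [0.2, 0.4],
    "org_b": [0.4, 0.8],
    "org_c": [9.0, -9.0],  # low-reputation node, excluded by the threshold
}
reputation = {"org_a": 1.0, "org_b": 1.0, "org_c": 0.1}
global_params = aggregate(local_updates, reputation)
print(global_params)
```

With these illustrative values the outlier from `org_c` does not affect the result, and the aggregate is approximately the mean of the two trusted updates.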

4)Explainable Threat Detection and Interpretation
-Integrated SHAP and LIME for interpretability of predictions across all AI models.
-Applied XAI techniques to darknet traffic analysis, revealing key features and adversary strategies mapped to MITRE ATT&CK knowledge base.
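
SHAP attributes a prediction to input features using Shapley values. For a handful of features these can be computed exactly by averaging each feature's marginal contribution over all feature orderings, as in the sketch below. The scoring function, the baseline, and the feature values are hypothetical stand-ins (the feature names echo those mentioned for darknet traffic), and the SHAP library itself relies on approximations that this toy example does not reproduce.

```python
from itertools import permutations

def toy_model(x):
    """Hypothetical scoring function standing in for a trained classifier."""
    return 2.0 * x["Protocol"] + 1.0 * x["SourcePort"] + 0.5 * x["Protocol"] * x["IdleMax"]

def shapley_values(model, instance, baseline):
    """Exact Shapley values: mean marginal contribution over all feature orderings."""
    contrib = {feature: 0.0 for feature in instance}
    orderings = list(permutations(instance))
    for order in orderings:
        current = dict(baseline)       # start from the reference input
        previous = model(current)
        for feature in order:
            current[feature] = instance[feature]  # switch one feature on
            value = model(current)
            contrib[feature] += value - previous
            previous = value
    return {feature: c / len(orderings) for feature, c in contrib.items()}

instance = {"Protocol": 1.0, "SourcePort": 1.0, "IdleMax": 2.0}
baseline = {"Protocol": 0.0, "SourcePort": 0.0, "IdleMax": 0.0}
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The attributions sum to the difference between the model output for the instance and for the baseline, and the interaction term is split evenly between the two features involved. Enumerating all orderings grows factorially with the number of features, which is why practical tools approximate.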

5)Advanced Malware Detection
-Proposed multimodal fusion approaches for malware detection using CNNs trained on grayscale, entropy, and SimHash image representations.
-Achieved near-perfect detection rates (F1 > 0.99) with robustness against adversarial attacks using Generative Adversarial Networks (GAN).
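
Two of the image representations named above can be sketched in a few lines: a grayscale view that treats each byte of a binary as one pixel, and a block-wise Shannon entropy profile. This is a minimal illustration, assuming a fixed image width and block size chosen here for convenience; the helper names are hypothetical, the SimHash representation and the CNN itself are omitted, and none of this is taken from the project's code.

```python
import math
from collections import Counter

def bytes_to_grayscale(data, width=16):
    """Reshape raw bytes into rows of pixel intensities (0-255), zero-padding the last row."""
    rows = [list(data[i:i + width]) for i in range(0, len(data), width)]
    if rows and len(rows[-1]) < width:
        rows[-1] += [0] * (width - len(rows[-1]))
    return rows

def shannon_entropy(block):
    """Shannon entropy in bits per byte: 0.0 for constant data, at most 8.0."""
    if not block:
        return 0.0
    n = len(block)
    return sum((c / n) * math.log2(n / c) for c in Counter(block).values())

def entropy_profile(data, block_size=256):
    """Entropy of consecutive blocks; packed or encrypted regions tend to score high."""
    return [shannon_entropy(data[i:i + block_size]) for i in range(0, len(data), block_size)]

# Stand-in for a binary file: 256 zero bytes followed by every byte value once.
sample = bytes(256) + bytes(range(256))
image = bytes_to_grayscale(sample)
print(len(image), len(image[0]))
print(entropy_profile(sample))
```

The two blocks of the sample sit at the extremes of the entropy scale, which shows why an entropy graph can separate padded or sparse regions from compressed or encrypted ones.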

6)Federated Learning Security
-Designed FL-specific defenses including DLShield, SecDefender, and LFGuard.
-Demonstrated 22.8% reduction in attack success rate and 24% improvement in source class recall across benchmark datasets.
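
The two metrics quoted above can be computed as follows for a targeted label-flipping scenario. The definitions are the ones commonly used for such attacks and are an assumption here, since the report does not spell them out: source class recall is the share of source-class samples still classified correctly, and attack success rate is the share of source-class samples classified as the attacker's target class. The labels and predictions are invented for illustration and are not project results.

```python
def source_class_recall(y_true, y_pred, source):
    """Share of source-class samples that are classified correctly."""
    preds = [p for t, p in zip(y_true, y_pred) if t == source]
    return sum(p == source for p in preds) / len(preds)

def attack_success_rate(y_true, y_pred, source, target):
    """Share of source-class samples misclassified as the attacker's target class."""
    preds = [p for t, p in zip(y_true, y_pred) if t == source]
    return sum(p == target for p in preds) / len(preds)

# Illustrative data: the attacker tries to make "malware" look "benign".
y_true = ["malware"] * 5 + ["benign"] * 3
pred_undefended = ["malware", "malware", "benign", "benign", "benign"] + ["benign"] * 3
pred_defended = ["malware", "malware", "malware", "malware", "benign"] + ["benign"] * 3

for name, pred in [("undefended", pred_undefended), ("defended", pred_defended)]:
    recall = source_class_recall(y_true, pred, "malware")
    asr = attack_success_rate(y_true, pred, "malware", "benign")
    print(f"{name}: source recall={recall:.2f}, attack success rate={asr:.2f}")
```

Reporting both numbers side by side makes it clear whether a defense restores correct classification of the attacked class or merely shifts its errors to other classes.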

CTI Co-pilot for sharing CTI

CTI Generation Framework from Twitter

Organisation Specific Threat Intelligence Mining and Sharing

CTI sharing using Federated Learning

Secure CTI Sharing Framework using Swarm Learning and Zero Knowledge Proof
