
    Dataset: AVIATOR: A MITRE Emulation Plan-Derived Living Dataset for Advanced Persistent Threat Detection and Investigation

    Alternate identifier:
    -
    Related identifier:
    (Is Supplement To) 10.1109/BigData62323.2024.10826006 - DOI
    Creator/Author:
    Liu, Qi https://orcid.org/0000-0002-9334-953X [Karlsruhe Institute of Technology (KIT)]
    Contributors:
    -
    Title:
    AVIATOR: A MITRE Emulation Plan-Derived Living Dataset for Advanced Persistent Threat Detection and Investigation
    Additional titles:
    -
    Description:
    (Abstract) As more detection and investigation systems for Advanced Persistent Threats (APTs) are developed, the lack of sound and authentic datasets becomes increasingly visible. New datasets for research on APT detection and investigation have been released at an accelerating pace over the past few years. Yet our examination of the existing datasets finds a significant gap between their attack scenarios and real-world APT attacks. Recognizing the flaws of prior datasets, particularly in attack scenario complexity and authenticity, we develop a novel, sound dataset called Aviator, which is backed by MITRE emulation plans. MITRE has released nearly a dozen emulation plans that closely reproduce real-world attack campaigns of APT groups observed in the past. However, MITRE has not published any datasets, so we resort to stringently implementing these emulation plans. Further, we extend these emulation plans to include an industrial control system and attack steps on it, mimicking the APT groups best known for their past attacks against critical infrastructure. Compared to existing datasets, Aviator has the highest attack scenario complexity and authenticity. Moreover, Aviator is designed with dataset operability, usability, reproducibility, and extensibility in mind, areas in which existing datasets lag far behind. Specifically, along with the Aviator dataset, we provide log shipping tools, log parsing tools, and logging configuration files to encourage other researchers to build their own datasets, which may better suit the evaluation of their detection systems. We also plan to add more log types in future versions of Aviator. We are committed to maintaining Aviator as a living dataset.

    Keywords:
    Dataset
    Advanced Persistent Threat emulation
    data provenance analysis
    auditing
    logging
    Related information:
    -
    Language:
    English
    Publishers:
    Karlsruhe Institute of Technology
    Production year:
    2024
    Subject areas:
    Computer Science
    Resource type:
    Dataset
    Data source:
    -
    Software used:
    -
    Data processing:
    -
    Publication year:
    2025
    Rights holders:
    Karlsruhe Institute of Technology
    Funding:
    Helmholtz-Gemeinschaft - (37.12.01)
    Helmholtz-Gemeinschaft - (46.23.02)
    Karlsruhe Institute of Technology
    Status:
    Published
    Uploaded by:
    72904bf710abeefd03f3ce1779041e37
    Created on:
    2024-09-30
    Archiving date:
    2025-03-03
    Archive size:
    109.3 GB
    Archive creator:
    04776b2a56abc08138e1cfae264e938e
    Archive checksum:
    4d03f85f6d65b6e3849646fac4b5d734 (MD5)
    Embargo period:
    -
    DOI: 10.35097/8s5b0u5yqgfs2y0d
    Publication date: 2025-03-03
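Because the archive is large (109.3 GB), it can be worth checking a download against the MD5 checksum listed above before unpacking it. The following is a minimal sketch, assuming the downloaded file is byte-identical to the archive that the checksum describes; the file name `aviator_archive.tar` and the helper names `md5_of_file` and `matches_expected` are hypothetical and not part of the record.

```python
import hashlib
from pathlib import Path

# Archive checksum (MD5) as listed in this record's technical metadata.
EXPECTED_MD5 = "4d03f85f6d65b6e3849646fac4b5d734"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks
    so that a very large archive never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Compare the file's MD5 digest with the expected hex string."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical local file name; adjust to wherever the download was saved.
    archive = Path("aviator_archive.tar")
    if archive.exists():
        print("checksum OK" if matches_expected(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; adjust the path")
```

MD5 is used here only because it is the checksum type the record publishes; it detects transfer corruption but is not a defense against deliberate tampering.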
    Statistics:
    0 views, 0 downloads
    Rights statement for the dataset:
    This work is licensed under CC BY 4.0.
    Cite Dataset
    Liu, Qi (2025): AVIATOR: A MITRE Emulation Plan-Derived Living Dataset for Advanced Persistent Threat Detection and Investigation. Karlsruhe Institute of Technology. DOI: 10.35097/8s5b0u5yqgfs2y0d

    RADAR4KIT is an internet-accessible service for archiving and publishing research data from completed scientific studies and projects, provided for researchers at KIT. It is operated by the Karlsruhe Institute of Technology (KIT). RADAR4KIT builds on the RADAR service offered by FIZ Karlsruhe. Data is stored exclusively on KIT IT infrastructure at the Steinbuch Centre for Computing (SCC).

    Assessment of content and quality control are carried out solely by the data providers.

    1. The usage relationship between you (the "data user") and KIT is limited to downloading data packages or metadata. KIT reserves the right to restrict the use of RADAR4KIT or to discontinue the service entirely.
    2. If you register as a data user or authenticate via Shibboleth, the data provider may also grant you access to unpublished documents.
    3. The protection of your personal data is explained in the privacy policy.
    4. KIT assumes no warranty or liability for the correctness, currency, or reliability of the content provided, except where liability is mandatory by law.
    5. KIT does not charge you, as a data user, for searching RADAR4KIT or for downloading data packages.
    6. You must comply with the license terms associated with the data package.