
Execution

The ability to collect, quantify, evaluate and enrich your data.

Introduction

A SOC's ability to develop good detections for different techniques and tactics relies heavily on its ability to execute them. The short story: logging is the dark side of detection.

Evaluating your data quality and visibility to measure your detection execution capabilities is not an easy task. In this part, we will talk about the two main drivers of this second dimension:

  • Event Visibility: data prioritization, collection, and processing.

    • Identify critical data sources.

    • Define your collection strategy.

    • Define your log storage approach.

    • Data observability.

  • Event Traceability: data source quality and richness.

    • Evaluate the quality of your data.

Event Visibility

Data is the air detection breathes; without it, detection is dead. Ensuring you have good visibility over the data sources that matter to you is crucial. There are four main things to keep in mind, derived from four questions, when you evaluate your visibility from a SOC perspective:

  • What do you need to collect?

  • How do you want to collect it?

  • How are you planning on storing it?

  • How will you know when you're not collecting it?

Identify Critical Data Sources

Prioritize your data source types at a high level first, then go deeper. You can adopt the approach in the tweet by @cyb3rops as a first step, then adapt it to your environment's needs. This gives you a first overview of your visibility, to which you can add more criteria, such as the number of instances per log source type, so that you have an initial list of the solutions integrated in your SIEM and can account for high-availability instances and replication servers. You can also consider event types, since each asset might produce several: application logs, service logs, system logs, etc. In the next part we will talk about a much deeper level of evaluation, Event Traceability.

Other useful resources like NIST SP 800-92 (Guide to Computer Security Log Management) or the MITRE ATT&CK Framework can help you at the beginning. You can also use the Attack-Python-Project Jupyter notebooks by @Cyb3rWard0g to interact with the MITRE ATT&CK Framework and identify the most relevant telemetry and the data sources needed to cover the techniques of interest to you.

Create a Collection Strategy

Your collection strategy can be driven by security operations, such as compliance requirements, or by threat detection, such as detecting post-exploitation techniques used in Windows environments. It can also help with tuning: reducing alert fatigue, false positives, and data storage costs, and optimizing EPS licensing. Here are some concepts to take into consideration.

  • Volume vs Relevance: Depending on what drives your log collection strategy (SecOps, threat detection, or a hybrid), if you did the previous step, "Identify Critical Data Sources", you now know what you need to collect and can build a balanced, hybrid list of when to collect everything and when to keep only the most relevant events for your use cases.

  • Log Retention Policy: Your retention duration can be influenced by regulatory requirements and by your detection needs, such as how far back your analysts look when investigating alerts or hunting threats; the use of historical correlations can also be impacted by your retention policy. We will come back to this in Log Storage. A rough sizing sketch follows this list.

  • Agent vs Agentless: Technically there is no agentless approach to log collection; there are built-in agents and third-party agents. Going with an agent or not is simply a matter of scale: the bigger the environment, the more difficult it is to manage. For a Windows environment there is a built-in mechanism that automatically forwards logs, called Windows Event Forwarding (WEF). A SANS video by @SecurityMapper and @packetengineer explains the advantages and disadvantages of WEF/WEC in depth; a summary is in the figure below.

Figure: WEF/WEC pros and cons from the SANS talk.
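To make the volume and retention trade-offs concrete, here is a minimal sizing sketch in Python; the EPS rate, average event size, and retention split below are hypothetical placeholders that should be replaced with measurements from your own environment.

```python
# Back-of-the-envelope ingest and retention sizing.
# Every input value is a hypothetical example, not a recommendation.

AVG_EPS = 2500           # average events per second across all sources (assumed)
AVG_EVENT_BYTES = 800    # average stored event size in bytes (assumed)
RETENTION = {"hot": 30, "warm": 90, "cold": 245}  # assumed retention split in days

daily_gb = AVG_EPS * AVG_EVENT_BYTES * 86_400 / 1024**3
print(f"daily ingest: ~{daily_gb:,.1f} GB/day at {AVG_EPS} EPS")

for tier, days in RETENTION.items():
    print(f"{tier:>5}: ~{daily_gb * days:,.0f} GB needed for {days} days of retention")
```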

If you're looking for how to configure Windows Event Forwarding, here is a great blog from Elastic on how to configure WEF/WEC: Ingest Windows Event Logs via WEC & WEF (Elastic Blog).

Log Storage

Your SIEM's database type and log storage approach matter and affect your execution capabilities if you care about query speed and data availability.

Database Types

There are two main types of SIEM databases for a logging use case:

  • Schema-on-Write

  • Schema-on-Read

Schema-on-Write databases define the schema of your data (fields, structure, and mappings) at ingestion time, while Schema-on-Read applies the schema at search time. With a Schema-on-Write database, search time decreases but ingestion time increases, because the data is already indexed and mapped and the heavy work was done at ingest. Schema-on-Read, on the other hand, prioritizes ingesting data to avoid data loss, at the cost of slower queries.
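To make the distinction concrete, here is a minimal sketch against Elasticsearch, assuming a local unauthenticated cluster at localhost:9200 and illustrative index and field names: the first call defines an explicit mapping at index creation time (Schema-on-Write), while the second defines a runtime field that is only evaluated at search time (Schema-on-Read style, available as runtime fields since Elasticsearch 7.11).

```python
import requests

ES = "http://localhost:9200"  # assumed local, unauthenticated cluster

# Schema-on-Write: fields are mapped when the index is created,
# so the heavy work happens at ingest time and searches stay fast.
requests.put(f"{ES}/logs-write-example", json={
    "mappings": {
        "properties": {
            "@timestamp": {"type": "date"},
            "source.ip": {"type": "ip"},
            "event.action": {"type": "keyword"},
        }
    }
})

# Schema-on-Read: a runtime field is computed for each query at search time,
# trading query speed for ingest flexibility. Assumes the target index already
# holds an event.action keyword field.
requests.post(f"{ES}/logs-read-example/_search", json={
    "runtime_mappings": {
        "event.action_upper": {
            "type": "keyword",
            "script": "emit(doc['event.action'].value.toUpperCase())",
        }
    },
    "query": {"term": {"event.action_upper": "LOGON"}},
    "fields": ["event.action_upper"],
})
```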

Several SIEMs claim they can do both; trust but verify. Going for a hybrid approach is recommended in a security monitoring use case. Here are some pros and cons of the Schema-on-Write approach:

  • Pros:

    • Faster search

    • Fewer computing resources are needed at search time

    • Easier correlations.

  • Cons:

    • Disk write speed can be affected, and data loss is a risk (which may impact forensic evidence requirements for a court case).

    • Requires knowledge of your data models

    • Difficult to handle unstructured data.

Reference: What is Schema On Read and Schema On Write in Hadoop? (GeeksforGeeks)

Storage Architecture

Each SIEM vendor has its own method for sizing a security monitoring solution's storage capacity, but most of them adopt the same approach internally: a Hot, Warm, and Cold architecture (a sample lifecycle policy sketch follows this list):

  • Hot: The most recent and active logs to monitor. These nodes are known for fast disk-write capabilities (SSDs) and lower storage capacity. Most analysts or threat hunters query a window from the last 15 minutes to the last 7 days; depending on your query rate and look-back time, retention on this type of node should preferably be set to 30 days or more.

  • Warm: Once past the time frame of most use, logs can be moved from SSDs to slower but larger media like hard disks or tape. These are typically stored for at least 90 days.

  • Cold: Beyond the first 90 days, the chances of needing a particular log file are slim, but not zero. Cold storage is a cheap long-term solution, but it will take a long time to spool back up for use if needed.
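In the Elastic Stack used throughout this series, this tiering is usually implemented with an Index Lifecycle Management (ILM) policy. The following is a minimal sketch, assuming a local unauthenticated cluster at localhost:9200 and illustrative thresholds that mirror the durations above; every value should be tuned to your own retention policy.

```python
import requests

ES = "http://localhost:9200"  # assumed local, unauthenticated cluster

# Illustrative hot/warm/cold lifecycle: roll active indices over daily or at 50 GB,
# shrink them in warm after 30 days, mark them read-only in cold after 90 days,
# and delete them after one year.
policy = {
    "policy": {
        "phases": {
            "hot": {"actions": {"rollover": {"max_age": "1d", "max_size": "50gb"}}},
            "warm": {"min_age": "30d", "actions": {"shrink": {"number_of_shards": 1}}},
            "cold": {"min_age": "90d", "actions": {"readonly": {}}},
            "delete": {"min_age": "365d", "actions": {"delete": {}}},
        }
    }
}

resp = requests.put(f"{ES}/_ilm/policy/soc-logs-policy", json=policy)
print(resp.status_code, resp.json())
```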

Reference: How To Create a Logging Strategy (deepwatch)

Data Observability

Do you get notified when a data source is down or a new one is integrated? Can you tell when last week's data stream is much smaller than the week before, or the month before? Do you know when an important event is no longer collected from a specific asset? Do you notice when an event field is no longer populated?

Observability means maintaining a data pipeline with minimal downtime and maximum reliability by running regular health checks. Data observability is important for your detection engineering and can be applied at many levels:

  • Index level: when an index stores much less data than usual.

  • Log source type level: when a data source type, like a firewall cluster or a group of web servers, stops sending events.

  • Asset level: when a single log source stops sending data.

  • Event level: when a specific Event ID, for example, stops being recorded from a data source.

  • Field level: when a field such as Process Command Line stops being populated.

If your SIEM has an API, you can use it for daily automated health checks and measure specific metrics at the observability levels you care about. Combining Jupyter notebooks and Vega visualizations such as trends, as demonstrated here, can be very effective. A tweet by @nas_bench started a great conversation related to data observability.
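As a minimal sketch of such an automated check at the log source type and asset levels, assuming an Elasticsearch backend reachable at localhost:9200 and illustrative names (a logs-* index pattern and the ECS host.name field), the script below compares this week's event count per host with last week's and flags large drops or newly appeared sources.

```python
import requests

ES = "http://localhost:9200"   # assumed local, unauthenticated cluster
INDEX = "logs-*"               # illustrative index pattern
DROP_THRESHOLD = 0.5           # flag hosts that lost more than 50% of their volume

def events_per_host(time_range: dict) -> dict:
    """Return {host.name: doc count} for the given @timestamp range."""
    body = {
        "size": 0,
        "query": {"range": {"@timestamp": time_range}},
        "aggs": {"hosts": {"terms": {"field": "host.name", "size": 10000}}},
    }
    resp = requests.post(f"{ES}/{INDEX}/_search", json=body).json()
    return {b["key"]: b["doc_count"] for b in resp["aggregations"]["hosts"]["buckets"]}

last_week = events_per_host({"gte": "now-14d/d", "lt": "now-7d/d"})
this_week = events_per_host({"gte": "now-7d/d", "lt": "now/d"})

for host, before in last_week.items():
    after = this_week.get(host, 0)
    if after < before * DROP_THRESHOLD:
        print(f"[!] {host}: {before} -> {after} events (possible visibility gap)")

for host in set(this_week) - set(last_week):
    print(f"[+] new log source detected: {host}")
```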

Event Traceability

After approaching Event Visibility and defining relevant data sources, Event Traceability helps you estimate the reliability of your data at a much deeper level, so that you end up with simplified and trusted detection implementations. The following are some use cases for evaluating the event traceability of a few data sources.

Successful event traceability requires a data model in place to parse and normalize your events. I won't go through these aspects, since this article is already long enough, but I will suggest great references to check out. If you're still not convinced, this great talk by @Cyb3rWard0g and @Cyb3rPandaH, the AttackCon2018 presentation by Roberto and Jose Rodriguez, will show you the added value of data modeling and data quality for your detection and threat hunting needs.

Example 1: Antivirus Data Traceability

I started by defining the event types I will need for my security operations and threat detection development, regardless of the vendor. For example, I need to be informed when a virus is detected, when an asset's license expires, or when a virus deletion fails, etc. After that, I listed the event fields that must be present for each event type and evaluated them based on the following color scale:

  • GREEN: Collected, parsed, and normalized

  • YELLOW: Collected and parsed, but not normalized

  • ORANGE: Only collected

  • RED: Not collected

  • GREY: Not applicable

Figure: Example of an AV event traceability evaluation.

You can use numbers instead of colors and calculate a score for each data source traceability evaluation, as in the sketch below. You can also use such scores to define goals for improvement, following the approaches adopted in ATT&CK Datamap by @olafhartong (https://github.com/olafhartong/ATTACKdatamap; see "Assess your data potential with ATT&CK Datamap" on Medium) or in DeTT&CT by @bakk3rm and @rubenb_2 (https://github.com/rabobank-cdc/DeTTECT; see "DeTT&CT: Mapping your Blue Team to MITRE ATT&CK" on MB Secure).
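Here is a minimal sketch of such a scoring exercise; the weights per color and the example field matrix are hypothetical and should be replaced with your own evaluation.

```python
# Hypothetical weights for each traceability state (adjust to taste).
SCORES = {"GREEN": 3, "YELLOW": 2, "ORANGE": 1, "RED": 0, "GREY": None}

# Example evaluation of an antivirus "virus detected" event type; the field
# names and their states are illustrative, not a vendor-specific assessment.
virus_detected = {
    "event.created":    "GREEN",
    "host.name":        "GREEN",
    "file.path":        "YELLOW",
    "file.hash.sha256": "ORANGE",
    "threat.name":      "RED",
    "user.name":        "GREY",   # not applicable for this event type
}

applicable = {f: SCORES[c] for f, c in virus_detected.items() if SCORES[c] is not None}
score = 100 * sum(applicable.values()) / (3 * len(applicable))
print(f"virus detected traceability score: {score:.0f}%")
```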

Example 2: Windows Authentication Event Traceability

Figure: Example of a perfect score in a Windows authentication event traceability evaluation.

Example 3 : Windows Process Creation and DNS Event Traceability

As you can notice in the following figures, doing such an exercise lets you know which events you can rely on for relevant data during your detection engineering process. For example, Process Command Line is not audited by default in EID 4688, and a process GUID is very helpful for correlations, as it can be used by SIEM platforms like Elastic to create a process tree context of execution.

Figure: Process creation event traceability evaluation.
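To illustrate why those process identifiers matter, here is a hedged sketch of an EQL sequence query sent to Elasticsearch's _eql/search endpoint, assuming the same local unauthenticated cluster, an illustrative logs-endpoint-* index pattern, and ECS-style fields: it joins a child process to its parent through process.entity_id, the identifier Elastic derives from values such as Sysmon's ProcessGuid, to rebuild one level of the process tree.

```python
import requests

ES = "http://localhost:9200"   # assumed local, unauthenticated cluster
INDEX = "logs-endpoint-*"      # illustrative index pattern with ECS process events

# EQL sequence: any process started as a direct child of winword.exe,
# joined on the process entity identifier (one level of the process tree).
eql_query = """
sequence by host.name
  [process where event.type == "start" and process.name : "winword.exe"] by process.entity_id
  [process where event.type == "start"] by process.parent.entity_id
"""

resp = requests.post(f"{ES}/{INDEX}/_eql/search", json={"query": eql_query})
for seq in resp.json().get("hits", {}).get("sequences", []):
    parent, child = seq["events"]
    print(parent["_source"]["process"]["name"], "->", child["_source"]["process"]["name"])
```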

This example highlights the differences between Windows DNS debug logs, which are written to a file on the Windows DNS server, and Sysmon EID 22, which is recorded in the Windows event log and generated on the client side. It is important to know these differences, since your agent can only collect Windows Server DNS logs if it can read and parse them from the dnslog.txt file written to disk. If you're using WEF/WEC to collect logs, you won't be able to collect them from a file, and an agent like Winlogbeat won't do it either unless you also use Filebeat (yet another agent). QRadar's WinCollect, for example, can collect both Windows event logs and DNS debug logs, but custom filtering can be more complex.

We should also take OS version limitations into consideration. Per the Sysmon documentation, the telemetry for EID 22 was added for Windows 8.1, so it is not available on Windows 7 and earlier; DNS debug logging, on the other hand, is available on Windows Server 2003, 2008, 2012, 2016, and 2019.

Figure: DNS request event traceability evaluation.

