This guest blog was provided by Jared Atkinson, Chief Strategist at SpecterOps
One of the top cybersecurity priorities for organizations is to detect and prevent threats across all of their systems. Historically, red team assessments, penetration tests and non-optimized purple team assessments have all been used to accomplish this. However, as attacks grow more complex, these assessments struggle to surface all of the threats they are meant to identify.
This complexity comes from the many possible variations within the techniques attackers deploy. Assessment services typically test a system's defense against a limited number of techniques, often using only one (or a few) variations of each. This approach falls far short of capturing the full scope of variations available to attackers.
Individual attack techniques can have a staggering number of variants. For instance, one technique I examined had 39,000 variations, and another had 2.4 million. This vast array of potential variations makes it difficult for organizations to determine whether they are genuinely secure or merely prepared for the specific technique variant employed by the red team. Given the countless options available, the odds that a real attacker uses the same variant as the red team are quite low.
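To get a feel for why the counts climb so quickly, consider a quick back-of-the-envelope sketch. If a technique is a chain of steps and each step can be performed several different ways, the number of variants is the product of the choices at each step. The step names and counts below are hypothetical, chosen purely to illustrate the multiplication:

```python
from math import prod

# Hypothetical example: a technique built from four steps, where each step
# can be implemented by some number of interchangeable functions or options.
choices_per_step = {
    "open target": 6,
    "allocate memory": 8,
    "write payload": 10,
    "start execution": 5,
}

# The total variant count is the product of the choices at each step.
total_variants = prod(choices_per_step.values())
print(total_variants)  # 2400
```

Four steps with a handful of options each already yields thousands of variants; longer chains and richer option sets push the totals into the millions.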
To improve security assessments, many security teams have started to use purple teaming, an approach in which red and blue teams work more collaboratively. This has many benefits, but the problem of myriad possible variations of each technique remains, especially when teams don't account for or understand those variations in the first place. Purple team assessments must adapt to solve this problem.
Building a Representative Sample of Attack Techniques
Instead of evaluating defenses one technique variation at a time, purple team assessments should test a representative sample of attack technique variants. Testing every variant of an attack technique is clearly impractical, as highlighted by the technique with 2.4 million variants. Teams should first determine the techniques they wish to assess, then catalog the variants of those attacks to the best of their knowledge, and finally select a representative, diverse sample of those variants. The underlying assumption is that if defenders can detect the variants in the sample, they can likely detect the similar variants around them; that inference only holds if the test cases are diverse enough to represent the full range of potential techniques an adversary might use.
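As a rough sketch of that workflow (with invented variant names and a simplified grouping key), the example below catalogs variants, groups them by the operation chain they use, and samples one test case per chain so that coverage spans every distinct chain rather than clustering on one:

```python
import random
from collections import defaultdict

# Hypothetical catalog: each variant of a technique, tagged with the
# operation chain it uses. All names are invented for illustration.
catalog = [
    {"name": "variant-a", "chain": ("open", "alloc", "write", "thread")},
    {"name": "variant-b", "chain": ("open", "alloc", "write", "thread")},
    {"name": "variant-c", "chain": ("open", "section", "map", "thread")},
    {"name": "variant-d", "chain": ("open", "section", "map", "thread")},
    {"name": "variant-e", "chain": ("open", "apc-queue")},
]

# Group variants by operation chain, then sample one test case per chain so
# the assessment covers every distinct chain instead of one chain many times.
by_chain = defaultdict(list)
for variant in catalog:
    by_chain[variant["chain"]].append(variant)

sample = [random.choice(group) for group in by_chain.values()]
print([v["name"] for v in sample])  # one representative per distinct chain
```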
Selecting a representative sample is good in theory, but it is easier said than done, because cybersecurity currently has no in-depth system for cataloging attack variants. The model we have now glosses over too much detail. Traditionally, attack techniques are broken down into three levels: tactics (like Persistence), techniques (like Kerberoasting) and procedures (the specific tools or steps used to execute a technique, like the Invoke-Kerberoast tool created by Will Schroeder). This model loses too much detail, particularly in the “procedures” category. For example, a technique like Credential Dumping can be accomplished with many different procedures, like Mimikatz or Dumpert.
Each procedure can have many different sequences of function calls, making the definition of a procedure quite complex and challenging to articulate!
To solve this problem, I believe we should break down the attack techniques even further into five or six levels:
Tactics
Techniques
Sub-techniques
Procedures (potentially)
Operations
Functions
The Six Levels of Attack Techniques
In my new proposed system, tactics are short-term, tactical adversary goals. Examples of tactics include Defense Evasion and Lateral Movement (definitions and examples are from the MITRE ATT&CK framework). Techniques are how adversaries achieve those tactical goals; for example, Process Injection and Rootkit are both techniques for accomplishing the Defense Evasion tactic. Sub-techniques are more specific means by which adversaries achieve tactical goals, at a lower level than techniques. For example, Dynamic-link Library Injection and Asynchronous Procedure Call are two distinct types of Process Injection.
Operations represent the specific actions that must be taken against resources on the target system or environment to implement the (sub-)technique. For example, the Dynamic-link Library Injection sub-technique can be accomplished with these four operations: Process Open -> Memory Allocate -> Process Write -> Thread Create. Or it can be done with a different sequence: Process Open -> Section Create -> Section Map (local) -> Section Map (remote) -> Thread Create. The final level is functions: the literal API functions that a specific tool calls to implement the operations in the chain. Developers often encounter many nearly identical functions provided by the operating system, and the subtle distinctions between these functions can be enough to evade detection. A single operation typically serves as a category for multiple API functions, so the same four-operation chain might be implemented by either of these two sequences of functions: “OpenProcess -> VirtualAllocEx -> WriteProcessMemory -> CreateRemoteThread” or “NtOpenProcess -> NtAllocateVirtualMemory -> NtWriteVirtualMemory -> NtCreateThreadEx.”
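One way to make the operation-to-function relationship concrete is to model each operation as a set of candidate API functions and enumerate the cross product. The mapping below is a deliberately simplified sketch of the Dynamic-link Library Injection chain described above, not an exhaustive catalog:

```python
from itertools import product

# Each operation in the Process Open -> Memory Allocate -> Process Write ->
# Thread Create chain can be satisfied by several Windows API functions.
# This mapping is a simplified illustration with two candidates apiece.
operation_functions = {
    "Process Open":    ["OpenProcess", "NtOpenProcess"],
    "Memory Allocate": ["VirtualAllocEx", "NtAllocateVirtualMemory"],
    "Process Write":   ["WriteProcessMemory", "NtWriteVirtualMemory"],
    "Thread Create":   ["CreateRemoteThread", "NtCreateThreadEx"],
}

# Every path through the cross product is a distinct function-level variant
# of the same sub-technique: 2 * 2 * 2 * 2 = 16 variants from this tiny map.
for variant in product(*operation_functions.values()):
    print(" -> ".join(variant))
```

Even this tiny mapping yields 16 function-level variants of a single operation chain, which is exactly why variant counts balloon so quickly once every operation has a realistic number of candidate functions.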
I also include the “Procedure” layer, which I define as “a chain of operations.” It is occasionally useful to include in this breakdown, but not always, as it is less precisely defined than the other levels. In any case, this five- or six-layered model captures attack techniques far more comprehensively, allowing defenders to select more representative test cases for their assessments.
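For readers who prefer to see the whole model in one place, here is a minimal sketch that encodes the levels as a nested structure, with a procedure represented as a chain of operations exactly as defined above. The field names and layout are my own illustration, not an established schema:

```python
from dataclasses import dataclass

@dataclass
class Procedure:
    # A procedure is a chain of operations; each operation maps to the
    # candidate API functions that can implement it.
    operations: list[str]
    functions: dict[str, list[str]]

# Tactic -> technique -> sub-technique -> procedures, per the model above.
taxonomy = {
    "Defense Evasion": {                          # tactic
        "Process Injection": {                    # technique
            "Dynamic-link Library Injection": [   # sub-technique
                Procedure(
                    operations=["Process Open", "Memory Allocate",
                                "Process Write", "Thread Create"],
                    functions={"Process Open": ["OpenProcess",
                                                "NtOpenProcess"]},
                ),
            ],
        },
    },
}
```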
Assessments are a vital component of evaluating an organization's security, but in their current state they are not sufficient for comprehensively testing defenses. To make these assessments comprehensive, we must expand the models we test through, which will boost their effectiveness and yield improved results. You can explore some of my posts here to learn more about threat detection and assessment concepts and challenges.