PSScriptAnalyzer: Tokens and ASTs
Introduction
I bumped into PSScriptAnalyzer the other day, along with some very interesting work by Daniel Bohannon on detecting PowerShell script obfuscation using ASTs. So I figured it might be interesting to create a small write-up about ASTs, PSScriptAnalyzer, and custom PSScriptAnalyzer rules, and to put all the info that's out there in one spot for the interested parties. So here goes the rant... 😁
ASTs and PSScriptAnalyzer
Abstract Syntax Trees (ASTs) represent the structure of source code as a tree of nodes, abstracting away surface details such as whitespace and punctuation. They are mostly used by compilers and interpreters.
PSScriptAnalyzer is an open source tool developed by Microsoft that was designed as a static code checker for PowerShell scripts and modules. The main idea is to check the quality of PowerShell code by running the script content through a set of rules built on the script's AST object types.
PowerShell Script ASTs Analysis
Ok great! At this point we know what ASTs are and that PSScriptAnalyzer is built to work with ASTs. Cool... now what?
Well, let's take a look at what a PowerShell AST looks like!
A PowerShell script AST is built from a set of tokens. The process of creating a set of tokens from code is called lexical analysis. You can extract all the tokens from a PowerShell script using the ASTHelper module.
In the example below, we will be using the "Get-VaultCredential.ps1" script from PowerSploit since it is publicly available, but any custom PowerShell script will do.
$script = "C:\Research\Get-VaultCredential.ps1"
$tokens = Invoke-Tokenize $script
$tokens > full_tokens_dataset.txt
Each token will have related "Content" and "Type" fields. Let's take a look at the different token types present in our script:
$tokens | Group-Object Type | Sort-Object Count -Descending | Select-Object Count, Name
Hm, so in here, we have a bunch of interesting token types: CommandArgument, Command, String, CommandParameter, and many more.
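If you don't have ASTHelper at hand, a rough equivalent can be sketched with the built-in legacy tokenizer, [System.Management.Automation.PSParser]::Tokenize, which also returns tokens with "Content" and "Type" fields. This is only a sketch: the sample script text and variable names below are made up for illustration, and I am assuming (not claiming) that the output is comparable to what Invoke-Tokenize gives you.

```powershell
# Hypothetical sample script text, used here instead of a file on disk.
$sampleCode = @'
$name = "world"
Write-Host -Object "hello $name"
Get-ChildItem C:\Temp
'@

# Tokenize the text; parse errors (if any) are returned through the [ref] variable.
$parseErrors = $null
$sampleTokens = [System.Management.Automation.PSParser]::Tokenize($sampleCode, [ref]$parseErrors)

# Same grouping as above: how many tokens of each type are present?
$sampleTokens | Group-Object Type | Sort-Object Count -Descending | Select-Object Count, Name

# Show only the Command tokens.
$sampleTokens | Where-Object { $_.Type -eq 'Command' } | Select-Object Content, Type
```

To run this against a real script, the text could be loaded with Get-Content -Raw and passed in instead of the here-string.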
Now that we know the token types, let's analyze the ones we like, such as Command, String, or CommandArgument:
$tokens | Where-Object {$_.Type -eq 'Command'} | Group-Object Content | Sort-Object Count -Descending | Select-Object Count, Name
$tokens | Where-Object {$_.Type -eq 'CommandArgument'} | Group-Object Content | Sort-Object Count -Descending | Select-Object Count, Name
$tokens | Where-Object {$_.Type -eq 'String'} | Group-Object Content | Sort-Object Count -Descending | Select-Object Count, Name
Things that stand out, to name a few, are Get-VaultCredential, System.Reflection.AssemblyName, vaultcli.dll, VaultOpenVault, and so on.
Ok, I wonder if we can create a custom PSScriptAnalyzer rule to pick up scripts like that? How would one go about creating one?
Well, first we would need to figure out which AST object types these interesting strings belong to. The process of creating an AST is called syntax analysis, and it converts our tokens into a tree that represents the actual structure of the code. We have two options to view the PowerShell script AST:
- We can use the PowerShell module ShowPSAst to visualize the tree
- Or we can use the PowerShell module ASTHelper to pull all AST object types from the AST and to investigate each AST object type separately
Let's take a look at our script using ShowPSAst first. Each assignment statement, loop, and command inside the PowerShell script will be represented as some kind of an AST object type. For example, the full $OSVersion = [Environment]::OSVersion.Version statement has an AST object type AssignmentStatementAst. The AssignmentStatementAst consists of VariableExpressionAst and CommandExpressionAst, which in turn consist of other AST object types.
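The same structure can be checked without a GUI by parsing that one statement with the built-in parser, [System.Management.Automation.Language.Parser]::ParseInput. This is a minimal sketch; the variable names are mine and only serve the illustration.

```powershell
# Parse a single statement into an AST (the text is only parsed, never executed).
$statementText = '$OSVersion = [Environment]::OSVersion.Version'
$parsedTokens = $null
$parseErrors  = $null
$statementAst = [System.Management.Automation.Language.Parser]::ParseInput(
    $statementText, [ref]$parsedTokens, [ref]$parseErrors)

# Find the first AssignmentStatementAst node in the tree.
$assignment = $statementAst.Find(
    { param($node) $node -is [System.Management.Automation.Language.AssignmentStatementAst] },
    $true)

# Left and right sides of the assignment, and what the right side wraps.
$assignment.Left.GetType().Name               # VariableExpressionAst
$assignment.Right.GetType().Name              # CommandExpressionAst
$assignment.Right.Expression.GetType().Name   # MemberExpressionAst
```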
If you want to list all AST object types present in the script or dig into some specific AST object types like CommandAst, you can use ASTHelper cmdlets for this.
Get-AstType $script
Get-AstObject $script -Type CommandAst | select -First 1
$commandASTs = Get-AstObject $script -Type CommandAst | select -First 1
$commandASTs.Extent.Text
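For reference, here is a rough sketch of how a similar listing could be produced with nothing but the built-in FindAll method on the AST. The sample script text is made up, and I am not claiming this is how the ASTHelper cmdlets work internally; it only approximates the kind of information they return.

```powershell
# Hypothetical sample script text; it is parsed, not executed.
$sampleCode = @'
function Get-Greeting { param($Name) "hello $Name" }
$result = Get-Greeting -Name 'world'
Write-Output $result
'@

$parsedTokens = $null
$parseErrors  = $null
$sampleAst = [System.Management.Automation.Language.Parser]::ParseInput(
    $sampleCode, [ref]$parsedTokens, [ref]$parseErrors)

# Every node in the tree, including nodes inside nested script blocks.
$allNodes = $sampleAst.FindAll({ $true }, $true)

# Count nodes per AST object type.
$allNodes | Group-Object { $_.GetType().Name } | Sort-Object Count -Descending | Select-Object Count, Name

# Only the CommandAst nodes, with the source text each one covers.
$commandAsts = @($sampleAst.FindAll(
    { param($node) $node -is [System.Management.Automation.Language.CommandAst] },
    $true))
$commandAsts | ForEach-Object { $_.Extent.Text }
```

The second argument to FindAll controls whether nested script blocks (function bodies, for example) are searched too.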
Great! Now we have a good understanding of what kind of tokens and AST object types the script contains, and we can proceed to rule creation.
PSScriptAnalyzer Custom Rule
Let's say we want to create a PSScriptAnalyzer rule that will pick up all the scripts that contain the VaultOpenVault, vaultcli.dll, DefinePInvokeMethod, and ::Winapi strings inside a single AssignmentStatementAst, which is the AST object type the rule below casts each node to.
The process of creating custom PSScriptAnalyzer rules is kind of documented in the PSScriptAnalyzer documentation. But I think that "A Crash Course in Writing Your Own PSScriptAnalyzer Rules" by Thomas Rayner is a bit more useful than the Microsoft documentation. Another great source for figuring out how to put these rules together is Daniel Bohannon's custom set of rules designed to detect obfuscated PowerShell scripts.
In any case, the PSScriptAnalyzer rules must be stored in a .psm1 file, and as long as you know the AST object types you want to create a rule for, it does not take too long to put one together.
<#
.DESCRIPTION
    Custom Rule Description
#>
function Detect-GetVaultCredential
{
    [CmdletBinding()]
    [OutputType([Microsoft.Windows.PowerShell.ScriptAnalyzer.Generic.DiagnosticRecord[]])]
    param
    (
        [Parameter(Mandatory = $true)]
        [ValidateNotNullOrEmpty()]
        [System.Management.Automation.Language.ScriptBlockAst]
        $ScriptBlockAst
    )

    # Predicate: true for an assignment statement whose text contains all four strings in order.
    [ScriptBlock] $predicate = {
        param ([System.Management.Automation.Language.Ast] $Ast)

        # -as yields $null for any node that is not an AssignmentStatementAst.
        $targetAst = $Ast -as [System.Management.Automation.Language.AssignmentStatementAst]

        # Strip newlines so that '.*' can match across a multi-line statement.
        if (($targetAst.Extent.Text -replace "`n", "") -match 'DefinePInvokeMethod.*VaultOpenVault.*vaultcli\.dll.*::winapi')
        {
            return $true
        }
    }

    # $false: do not descend into nested script blocks.
    $foundNodes = $ScriptBlockAst.FindAll($predicate, $false)

    # Emit one diagnostic record per matching node.
    foreach ($foundNode in $foundNodes)
    {
        [Microsoft.Windows.PowerShell.ScriptAnalyzer.Generic.DiagnosticRecord] @{
            "Message"  = "Found: " + $foundNode.Extent.Text
            "Extent"   = $foundNode.Extent
            "RuleName" = "CustomRule1"
            "Severity" = "Warning"
        }
    }
}
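Before wiring the rule into PSScriptAnalyzer, the predicate logic can be tried on its own against a parsed snippet. The sketch below uses a made-up snippet that merely contains the same strings (it is not copied from PowerSploit), and it calls FindAll directly instead of going through PSScriptAnalyzer, so treat it as a sanity check of the regex and AST type only.

```powershell
# Hypothetical snippet containing the strings the rule looks for; parsed only, never executed.
$sampleCode = @'
$PInvokeMethod = $TypeBuilder.DefinePInvokeMethod('VaultOpenVault', 'vaultcli.dll',
    'Public, Static', [Reflection.CallingConventions]::Standard, [Int32], $null,
    [Runtime.InteropServices.CallingConvention]::Winapi, [Runtime.InteropServices.CharSet]::Auto)
$harmless = Get-Date
'@

$parsedTokens = $null
$parseErrors  = $null
$sampleAst = [System.Management.Automation.Language.Parser]::ParseInput(
    $sampleCode, [ref]$parsedTokens, [ref]$parseErrors)

# Same predicate logic as in the rule above.
$predicate = {
    param ([System.Management.Automation.Language.Ast] $Ast)
    $targetAst = $Ast -as [System.Management.Automation.Language.AssignmentStatementAst]
    ($targetAst.Extent.Text -replace "`n", "") -match 'DefinePInvokeMethod.*VaultOpenVault.*vaultcli\.dll.*::winapi'
}

$foundNodes = @($sampleAst.FindAll($predicate, $true))
$foundNodes | ForEach-Object { "Found: " + $_.Extent.Text }

# With PSScriptAnalyzer installed, the rule module itself would be loaded like this
# (VaultRules.psm1 is a hypothetical file name):
# Invoke-ScriptAnalyzer -Path $script -CustomRulePath .\VaultRules.psm1
```

Only the first assignment should match here; the $harmless line is included to show that unrelated assignments are ignored.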
And here you go - now you know a little bit about ASTs, PSScriptAnalyzer, and PSScriptAnalyzer custom rules. Idk, to me, it seems that ASTs could be used to detect not only obfuscated but also malicious PowerShell scripts, and with ML involved, ASTs could provide a script evaluation/detection solution that picks up custom scripts just as well as boilerplate PowerShell Empire scripts - but I am too lazy to dig into this.
Here is a good start though.
References
https://github.com/thomasrayner/AstHelper
https://github.com/danielbohannon/DevSec-Defense
https://www.youtube.com/watch?v=xHqj7Icc3LM
https://tosbourn.com/abstract-syntax-trees/
https://github.com/PowerShell/PSScriptAnalyzer