Introduction and Use Case:

Workspace Transformation Rules are a very effective way to fine tune your ingest volume. Perhaps you need data from the SecurityEvent table but not ALL of the EventIDs that go with it? Let’s take out the trash!

In this post we will:

  • Identify the most voluminous table
  • Identify the most frequently thrown EventID in that table
  • Determine which machines are throwing this EventID and how often
  • Build a DCR Transformation rule to filter out a verbose EventID that does not contribute any detection or investigation value.

Find the Most Voluminous Table:

Lets craft a KQL query to return EventIDs from the SecurityEvent table and their respective count (number of times they’ve fired over a given period) to figure out which EventIDs are the loudest, and more importantly, how much it’s costing you.

SecurityEvent				//<-- Query the SecurityEvent table
| where TimeGenerated > ago(30d)	//<-- Query the last 30 days
| summarize count() by EventID	        //<-- Summarize EventIDs by number of times they fire



We now know that the most frequently thrown EventID in the SecurityEvents table is EventID 8002. This is potentially the “trash” we need to take out. Let’s see how much of it has gathered in our environment in terms of GB in our next query:

SecurityEvent						//<-- Query the SecurityEvent table
| where TimeGenerated > ago(1h)			        //<-- Query the last hour
| where EventID == 8002					//<-- Query for EventID 8002
| summarize GB=sum(_BilledSize)/1000/1000/1000	        //<-- Summarize billable volume in GB



18GB / hour is pretty steep. From my experience, EventID 8002 offers little to no detection (ability to detect malicious activity) or investigation value (post-breach). Let’s filter 18GB of 8002 / hour from our billable ingest volume and save the day!

Notes:

  • You can change the last line in the above query to the following if you’re a stickler for Gibibytes (GB) versus Gigabytes(GiB):
    | summarize GB=sum(_BilledSize)/1024/1024/1024	//<-- Summarize billable volume in GiB
    


  • The _BilledSize table column is a standard column in Azure Monitor Logs, but not explicitly listed to the left of the query GUI for you to choose from. You can find out more about _BilledSize and other Standard columns.

  • Before implementing a Workspace Transformation Rule, it may be worth figuring out which devices are throwing this EventID. If this is a single problem device, then it may make more sense to troubleshoot locally before tuning out this event. In this scenario however, this EventID provides no value regardless, and so can be safely dropped from this environment in this specific example.

  • Please confirm with your administrator or supervisor prior to implementing a Workspace Transformation Rule, as they can result in catastrophic results if implemented incorrectly.

Implementing a Workspace Transformation Rule:

  1. First, go to your Log Analytics Workspace:


  2. Select the Tables blade:


  3. Search for the SecurityEvent table:


  4. Click on the “…” for that table and then on Create transformation.


  5. Name the transformation rule:


  6. Click on </> Transformation editor


  7. Use KQL to define ‘everything except 8002’ for collection as illustrated below:


  8. Review and confirm:


  9. To test, we can query the SecurityEvent table for EventID 8002 after the workspace transformation rule was deployed (just over 5 minutes ago):



EventID 8002 has effectively been excluded from ingest volume, saving 18GB / hour and therefore lots of $$$ over the course of a month or quarter. Make sure you check in with accounting whenever you successfully knock one of these out and forever secure your legacy as THE KQL ninja in your network.


WARNING: While transformations themselves don’t incur direct costs, the following scenarios can result in additional charges:

  • If a transformation increases the size of the incoming data, such as by adding a calculated column, you’ll be charged the standard ingestion rate for the extra data.

  • If a transformation reduces the incoming data by more than 50% and Sentinel is NOT deployed, you will be charged for the amount of filtered data above 50%.


Summary:

In this post, we learned how to:

  • Identify the most verbose table
  • Identify the most frequently thrown EventID in that table
  • Determine which machines are throwing this EventID and how often
  • Build a DCR Transformation rule to filter out a verbose EventID that does not contribute any detection or investigation value.


References: