Mastering JSON Manipulation in Cortex XQL - Beyond the Basics

In modern security operations, data is rarely flat. In Cortex XDR, the most valuable insights—like asset ownership, vulnerability details, and cloud metadata—are often stored as JSON strings within fields like xdm.issue.extended_fields or raw_log.

To be a master threat hunter, you must know how to peel back these layers. This guide provides a deep dive into JSON manipulation with realistic data samples and detailed XQL queries.

The Sample Dataset

Throughout this guide, we will refer to a hypothetical extended_fields JSON object that follows the standard Cortex CSM (Cloud Security Management) structure:

{
  "cve_id": "CVE-2024-1234",
  "xdm_assets": [
    {
      "xdm__asset__name": "SEC-SVR-01",
      "xdm__asset__realm": "Cloud-Prod-West",
      "xdm__asset__type": "Virtual Machine",
      "owner": {
        "owner_name": "DevSecOps Team",
        "email": "[email protected]"
      }
    }
  ],
  "user_context": {
    "department": "Finance",
    "login_geo": "US"
  },
  "metrics": {
    "sent_bytes": "15728640",
    "usage_pct": "87.5"
  },
  "indicators": ["192.168.1.100", "8.8.8.8", "malicious-site.com"]
}

1. The Workhorse: `json_extract_scalar`

This function is designed to pull a single value (string, number, or boolean) and return it as an XQL-native string.

The Goal: Identify the Department

We want to extract the department from the user_context object to audit finance-related activity.

The Query:

dataset = xdr_data 
| alter dept = json_extract_scalar(additional_data, "$.user_context.department")
| filter dept == "Finance"
| comp count() as finance_activity by action_process_name

Detailed Breakdown:

$.user_context.department: The $ represents the root of the JSON. We then navigate through keys using dot notation.
Result: The function returns "Finance".
Limitation: If you tried to extract $.user_context, the function would return null because user_context is an object, not a scalar value.
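
To see the scalar-versus-object distinction outside of Cortex, here is a minimal Python sketch that imitates the behavior described above against a trimmed copy of the sample document. The helper extract_scalar and its tiny dot-path parser are hypothetical stand-ins written for illustration. They are not XQL, and they simply assume the null-on-object behavior stated in the breakdown.

```python
import json

# Trimmed copy of the sample extended_fields document.
SAMPLE = '{"user_context": {"department": "Finance", "login_geo": "US"}}'

def extract_scalar(json_text, path):
    """Return the scalar at `path` as a string, or None for objects, arrays, or misses."""
    node = json.loads(json_text)
    # Drop the leading "$." (the root marker) and walk one key at a time.
    keys = path[2:] if path.startswith("$.") else path
    for key in keys.split("."):
        if not isinstance(node, dict) or key not in node:
            return None  # the path does not exist
        node = node[key]
    if node is None or isinstance(node, (dict, list)):
        return None  # objects and arrays are not scalars
    # Strings pass through; numbers and booleans are rendered as text.
    return node if isinstance(node, str) else json.dumps(node)

print(extract_scalar(SAMPLE, "$.user_context.department"))  # Finance
print(extract_scalar(SAMPLE, "$.user_context"))             # None
```

The second call returns None because the path stops at an object, which mirrors the limitation noted above.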

2. Handling Nested Structures: `json_extract`

When you need to extract a whole sub-section (like an entire array or object) to process it later, use json_extract.

The Goal: Isolate Asset Metadata

In vulnerability management, you often need to grab the entire asset record to perform multiple extractions from it.

The Query:

dataset = issues
| alter first_asset = json_extract(xdm.issue.extended_fields, "$.xdm_assets[0]")
| alter asset_type = json_extract_scalar(first_asset, "$.xdm__asset__type")
| filter asset_type != null

Detailed Breakdown:

$.xdm_assets[0]: Uses array indexing to grab the first item in the asset list.
Return Value: Unlike json_extract_scalar, this returns the entire stringified JSON: {"xdm__asset__name": "SEC-SVR-01", ...}.
Why use this? It saves you from writing $.xdm_assets[0] over and over again in subsequent alter stages.
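
The extract-once, reuse-many-times pattern can be sketched in Python as follows. This is an emulation under stated assumptions, not XQL: walk and extract are hypothetical helpers that support only .key and [index] path steps, and extract re-serializes whatever it finds so that the result can be fed back in, much like the first_asset field in the query.

```python
import json
import re

SAMPLE = json.dumps({
    "xdm_assets": [{
        "xdm__asset__name": "SEC-SVR-01",
        "xdm__asset__type": "Virtual Machine",
    }]
})

def walk(json_text, path):
    """Resolve a simple path made of .key and [index] steps; None if missing."""
    node = json.loads(json_text)
    for key, index in re.findall(r"\.(\w+)|\[(\d+)\]", path):
        if key and isinstance(node, dict) and key in node:
            node = node[key]
        elif index and isinstance(node, list) and int(index) < len(node):
            node = node[int(index)]
        else:
            return None
    return node

def extract(json_text, path):
    """Return the sub-tree at `path` re-serialized as JSON text."""
    node = walk(json_text, path)
    return None if node is None else json.dumps(node)

# Pull the first asset once, then run further extractions against it.
first_asset = extract(SAMPLE, "$.xdm_assets[0]")
asset_type = json.loads(extract(first_asset, "$.xdm__asset__type"))
print(first_asset)
print(asset_type)
```

Because first_asset is itself JSON text, later paths start from its root ($.xdm__asset__type) rather than repeating the $.xdm_assets[0] prefix.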

3. Dealing with Lists: Array Functions

Handling arrays of IP addresses or indicators is a common SOC requirement.

The Goal: Find a Specific Malicious IP

We want to check if the indicators list contains a known malicious IP.

The Query:

dataset = cloud_logs
| alter ioc_list = json_extract_array(raw_log, "$.indicators")
| filter array_contains(ioc_list, "192.168.1.100")
| fields _time, ioc_list, action

Detailed Breakdown:

json_extract_array: Converts the JSON string ["192.168.1.100", ...] into a native XQL array.
array_contains: This function only works on native arrays, making the previous step mandatory.
json_extract_scalar_array: If your only goal is to display the indicators cleanly in a report without quotes, use this function instead.
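
Why the conversion to a native array matters can be shown with a small Python analogy (not XQL). While the list is still text, a membership check is really a substring match. Once it is parsed, the check compares whole elements.

```python
import json

RAW_LOG = '{"indicators": ["192.168.1.100", "8.8.8.8", "malicious-site.com"]}'

# Still text: "in" performs a substring match, so partial values match too.
indicators_text = json.dumps(json.loads(RAW_LOG)["indicators"])
partial_hit_in_text = "8.8.8" in indicators_text

# Parsed into a list: "in" compares against each whole element.
ioc_list = json.loads(RAW_LOG)["indicators"]
exact_hit_in_list = "192.168.1.100" in ioc_list
partial_hit_in_list = "8.8.8" in ioc_list

print(partial_hit_in_text)  # True
print(exact_hit_in_list)    # True
print(partial_hit_in_list)  # False
```

The partial value 8.8.8 matches the raw text but not the parsed list, which is the kind of false positive an element-wise check avoids.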

4. Advanced: The "Swiss Army Knife" (JSONPath)

For complex extractions, Cortex supports recursive descent and wildcards through json_path_extract.

The Goal: Find the Owner Name Anywhere

If your JSON structure changes (e.g., owner is sometimes in asset and sometimes in project), you can search for the key globally.

The Query:

dataset = issues
| alter owner = json_path_extract(xdm.issue.extended_fields, "$..owner_name")
// The $.. syntax triggers a recursive search
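
Conceptually, a recursive search for a key visits every nested object and array and collects each match. The Python sketch below illustrates that idea only. It makes no claim about how Cortex implements it, find_key is a hypothetical helper, and the "project" document shape and its owner value are invented for the example.

```python
import json

# Two shapes of the same record: owner under an asset, or under a project.
# The project shape and its value are made up for this illustration.
ASSET_SHAPE = json.loads(
    '{"xdm_assets": [{"owner": {"owner_name": "DevSecOps Team"}}]}'
)
PROJECT_SHAPE = json.loads(
    '{"project": {"owner": {"owner_name": "Platform Team"}}}'
)

def find_key(node, key):
    """Yield every value stored under `key`, at any depth."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            # Keep descending: the key may also appear deeper down.
            yield from find_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)

print(list(find_key(ASSET_SHAPE, "owner_name")))    # ['DevSecOps Team']
print(list(find_key(PROJECT_SHAPE, "owner_name")))  # ['Platform Team']
```

The same call works on both shapes, which is the appeal of a recursive search. The trade-off is that it can return several matches when the key appears more than once.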

Syntactic Sugar: The Operators

Cortex provides two extremely helpful operators for cleaner code:

->: Shortcut for json_extract.
->->: Shortcut for json_extract_scalar.

Modernized Query:

dataset = issues
| alter name = xdm.issue.extended_fields ->-> "$.xdm_assets[0].xdm__asset__name"

5. Final Step: Data Type Casting

The Goal: Filter by Traffic Volume (Bytes)

Values extracted from JSON come back as strings, so sent_bytes has to be cast to a number before it can be compared or summed.

The Query:

dataset = network_logs
| alter bytes = to_integer(json_extract_scalar(raw_payload, "$.metrics.sent_bytes"))
| filter bytes > 10485760 // Greater than 10MB
| comp sum(bytes) as total_outbound by bin(_time, 1h)

Cast Function	Use Case
`to_integer()`	Count of issues, byte sizes, port numbers.
`to_float()`	Percentages (like `usage_pct`), risk scores.
`to_timestamp()`	Custom event times within JSON logs.
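
The reason casting matters is that text compares character by character, not by magnitude. The short Python sketch below (an analogy, not XQL) uses the metrics values from the sample dataset to show the pitfall and the fix.

```python
# Values as they appear in the sample JSON: numbers stored as strings.
metrics = {"sent_bytes": "15728640", "usage_pct": "87.5"}

TEN_MB = 10 * 1024 * 1024  # 10485760 bytes, the threshold used in the query

# Pitfall: comparing text is lexicographic, so "9" sorts after "10485760".
string_compare = "9" > "10485760"

# Fix: cast first, then compare numerically.
number_compare = int("9") > TEN_MB
sent_bytes = int(metrics["sent_bytes"])
usage_pct = float(metrics["usage_pct"])

print(string_compare)       # True  (wrong answer for byte counts)
print(number_compare)       # False (correct)
print(sent_bytes > TEN_MB)  # True: 15728640 exceeds 10485760
print(usage_pct)            # 87.5
```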

Best Practices Summary

Casing: Always double-check your casing. $.Owner ≠ $.owner.
Validation: Use to_json_string() if your extraction returns null on a field you know is there—the field might not be properly typed as JSON yet.
Visualization: Use json_extract_scalar_array for dashboard tables; it removes the brackets and quotes that often clutter UI widgets.

Mastering these JSON functions transforms you from a basic user into a technical power user who can squeeze every bit of value from Cortex logs.

Happy Hunting!
