Splunk SPL Commands

What Is the stats Command?

The stats command is the most essential transforming command in Splunk’s Search Processing Language (SPL). It computes aggregate statistics over a result set — counting events, summing values, calculating averages, and much more. If you learn one SPL command deeply, make it this one.

Unlike chart or timechart, the stats command does not require a time axis. It simply groups your data by specified fields and applies statistical functions to produce summary rows.

Core Syntax

| stats <function>(<field>) [AS <alias>] [BY <field-list>]

The BY clause defines how results are grouped. Without it, stats returns a single row summarizing the entire result set.

Essential Statistical Functions

Function Description Example
count Number of events stats count BY src_ip
dc(field) Distinct count of values stats dc(user) BY host
sum(field) Sum of numeric values stats sum(bytes) BY dest_ip
avg(field) Average of numeric values stats avg(duration) BY action
min / max Minimum / Maximum values stats min(response_time) max(response_time)
values(field) Distinct values (multivalue) stats values(dest_port) BY src_ip
list(field) All values including duplicates stats list(action) BY user
earliest / latest First / last event by time stats earliest(_time) latest(_time) BY session_id

Security Operations Examples

1. Failed Authentication Summary (NIST AC-7)

Identify accounts with excessive failed login attempts — a key indicator for brute force detection and NIST 800-53 AC-7 (Unsuccessful Logon Attempts) compliance.

index=security sourcetype=WinEventLog:Security EventCode=4625
| stats count AS failed_attempts dc(src_ip) AS unique_sources earliest(_time) AS first_attempt latest(_time) AS last_attempt BY Account_Name
| where failed_attempts > 10
| sort - failed_attempts

2. Network Traffic Baseline by Source

Establish bandwidth baselines per source IP to identify anomalous data exfiltration patterns.

index=network sourcetype=firewall
| stats sum(bytes_out) AS total_bytes_out avg(bytes_out) AS avg_bytes_out count AS connection_count BY src_ip
| eval total_MB = round(total_bytes_out / 1048576, 2)
| sort - total_MB

3. Vulnerability Scan Coverage (NIST RA-5)

Track vulnerability scanning activity across your environment to ensure compliance with NIST 800-53 RA-5 (Vulnerability Monitoring and Scanning).

index=vulnerability sourcetype=qualys OR sourcetype=nessus
| stats dc(dest_ip) AS hosts_scanned count AS total_findings values(severity) AS severity_levels BY scan_date
| sort - scan_date

Multiple Aggregations in One Query

One of the most powerful patterns is combining multiple stats functions to build comprehensive summaries in a single pass.

index=web sourcetype=access_combined
| stats count AS total_requests
        dc(clientip) AS unique_visitors
        avg(response_time) AS avg_response_ms
        sum(bytes) AS total_bytes
        values(status) AS status_codes
        BY uri_path
| sort - total_requests

Best Practices

  • Filter before aggregating — Use where or field filters upstream of stats to reduce the dataset and improve performance.
  • Use dc() for cardinality checks — Distinct count is invaluable for identifying lateral movement (e.g., one user authenticating to many hosts).
  • Combine with eval — Post-process stats results with eval for unit conversions, thresholds, and conditional labeling.
  • Alias your fields — Always use AS to name your aggregated fields for dashboard readability.

Common Pitfalls

  • Forgetting the BY clause — Without BY, you get a single summary row. This is useful but often not what analysts intend.
  • Using list instead of valueslist includes duplicates and preserves order, while values returns distinct values. Choose intentionally.
  • Not pre-filtering — Running stats across an entire index without time bounds or source filters is a performance killer.

Next in the series: We dive into timechart — the essential command for time-series analysis and dashboard visualization.


Next in the Rhombic SPL Series → Splunk timechart Command