What Is the stats Command?
The stats command is the most essential transforming command in Splunk’s Search Processing Language (SPL). It computes aggregate statistics over a result set — counting events, summing values, calculating averages, and much more. If you learn one SPL command deeply, make it this one.
Unlike chart or timechart, the stats command does not require a time axis. It simply groups your data by specified fields and applies statistical functions to produce summary rows.
Core Syntax
| stats <function>(<field>) [AS <alias>] [BY <field-list>]
The BY clause defines how results are grouped. Without it, stats returns a single row summarizing the entire result set.
Essential Statistical Functions
| Function | Description | Example |
|---|---|---|
count |
Number of events | stats count BY src_ip |
dc(field) |
Distinct count of values | stats dc(user) BY host |
sum(field) |
Sum of numeric values | stats sum(bytes) BY dest_ip |
avg(field) |
Average of numeric values | stats avg(duration) BY action |
min / max |
Minimum / Maximum values | stats min(response_time) max(response_time) |
values(field) |
Distinct values (multivalue) | stats values(dest_port) BY src_ip |
list(field) |
All values including duplicates | stats list(action) BY user |
earliest / latest |
First / last event by time | stats earliest(_time) latest(_time) BY session_id |
Security Operations Examples
1. Failed Authentication Summary (NIST AC-7)
Identify accounts with excessive failed login attempts — a key indicator for brute force detection and NIST 800-53 AC-7 (Unsuccessful Logon Attempts) compliance.
index=security sourcetype=WinEventLog:Security EventCode=4625
| stats count AS failed_attempts dc(src_ip) AS unique_sources earliest(_time) AS first_attempt latest(_time) AS last_attempt BY Account_Name
| where failed_attempts > 10
| sort - failed_attempts
2. Network Traffic Baseline by Source
Establish bandwidth baselines per source IP to identify anomalous data exfiltration patterns.
index=network sourcetype=firewall
| stats sum(bytes_out) AS total_bytes_out avg(bytes_out) AS avg_bytes_out count AS connection_count BY src_ip
| eval total_MB = round(total_bytes_out / 1048576, 2)
| sort - total_MB
3. Vulnerability Scan Coverage (NIST RA-5)
Track vulnerability scanning activity across your environment to ensure compliance with NIST 800-53 RA-5 (Vulnerability Monitoring and Scanning).
index=vulnerability sourcetype=qualys OR sourcetype=nessus
| stats dc(dest_ip) AS hosts_scanned count AS total_findings values(severity) AS severity_levels BY scan_date
| sort - scan_date
Multiple Aggregations in One Query
One of the most powerful patterns is combining multiple stats functions to build comprehensive summaries in a single pass.
index=web sourcetype=access_combined
| stats count AS total_requests
dc(clientip) AS unique_visitors
avg(response_time) AS avg_response_ms
sum(bytes) AS total_bytes
values(status) AS status_codes
BY uri_path
| sort - total_requests
Best Practices
- Filter before aggregating — Use
whereor field filters upstream ofstatsto reduce the dataset and improve performance. - Use
dc()for cardinality checks — Distinct count is invaluable for identifying lateral movement (e.g., one user authenticating to many hosts). - Combine with
eval— Post-process stats results withevalfor unit conversions, thresholds, and conditional labeling. - Alias your fields — Always use
ASto name your aggregated fields for dashboard readability.
Common Pitfalls
- Forgetting the BY clause — Without
BY, you get a single summary row. This is useful but often not what analysts intend. - Using
listinstead ofvalues—listincludes duplicates and preserves order, whilevaluesreturns distinct values. Choose intentionally. - Not pre-filtering — Running
statsacross an entire index without time bounds or source filters is a performance killer.
Next in the series: We dive into timechart — the essential command for time-series analysis and dashboard visualization.
Next in the Rhombic SPL Series → Splunk timechart Command