Metrics Quantization
Sumo ingests individual metric data points from your metric sources. In metric visualizations, rather than charting individual data points, Sumo presents the aggregated value of the data points received during an interval.
Quantization is the process of aggregating metric data points for a time series over an interval, for example, an hour or a minute, using a particular aggregation function: avg, min, max, sum, or count.
Quantization terminology
This section defines the quantization-related terms we use in Sumo. At a high level, quantization is the process that Sumo Logic performs on raw data points to produce the aggregated metric values that your metric queries run against.
Buckets
We use the term bucket to refer to the intervals across which Sumo quantizes your metrics.
When you run a metric query, Sumo divides your metric query time range into contiguous buckets, either automatically or based on the interval you specify in the quantize operator. For example, given this query:
cpu | quantize to 15m
Sumo divides your time range into 15-minute buckets.
For each bucket, Sumo uses a rollup type, described below, to aggregate the values of all the data points in the bucket. The aggregated values are displayed in your metric visualization or processed further in the pipeline.
By default, Sumo uses the avg rollup type. You can specify another rollup type by using the quantize operator, as described in Quantize with rollup type specified below.
Rollup types
We use the term rollup to refer to the aggregation function Sumo uses when quantizing metrics. This table describes the different rollup types you can select when running a query.
Rollup type | Description |
avg | Calculates the average value of the data points for a time series in each bucket. |
min | Calculates the minimum value among the data points for a time series in each bucket. |
max | Calculates the maximum value among the data points for a time series in each bucket. |
sum | Calculates the sum of the values of the data points for a time series in each bucket. |
count | Calculates the count of data points for a time series in each bucket. |
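The bucket and rollup definitions above can be sketched in a few lines of Python. This is an illustrative sketch only, not Sumo's implementation: the function name quantize, the (timestamp, value) representation of data points, and the alignment of buckets to multiples of the interval are all assumptions made for the example.

```python
from collections import defaultdict

# Rollup functions keyed by the rollup type names from the table above.
ROLLUPS = {
    "avg": lambda values: sum(values) / len(values),
    "min": min,
    "max": max,
    "sum": sum,
    "count": len,
}


def quantize(points, interval_s, rollup="avg"):
    """Aggregate (timestamp_seconds, value) pairs for one time series.

    Returns {bucket_start: aggregated_value}. Aligning buckets to
    multiples of interval_s is an assumption of this sketch.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        bucket_start = ts - (ts % interval_s)  # floor to the bucket boundary
        buckets[bucket_start].append(value)
    aggregate = ROLLUPS[rollup]
    return {start: aggregate(values) for start, values in sorted(buckets.items())}


# Five data points quantized to 15-minute (900-second) buckets.
points = [(0, 10.0), (300, 20.0), (899, 30.0), (900, 40.0), (1200, 60.0)]
print(quantize(points, 900, "avg"))    # {0: 20.0, 900: 50.0}
print(quantize(points, 900, "max"))    # {0: 30.0, 900: 60.0}
print(quantize(points, 900, "count"))  # {0: 3, 900: 2}
```

Each bucket yields exactly one value per time series, which is why the choice of rollup type changes what a chart shows for the same raw data.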
Sumo quantizes metrics upon ingestion and at query time.
Quantization at ingestion
Upon ingestion, Sumo quantizes raw metric data points to one-hour resolution for all rollup types: avg, min, max, sum, and count. This data is stored in one-hour rollup tables in Sumo. The raw data is stored in a table referred to as the baseline table. For information about retention times, see Metric Ingestion and Storage.
Automatic quantization at query time
This section describes how Sumo quantizes metrics when you run a metric query without the quantize operator.
If you do not use the quantize operator in your metric query, Sumo automatically determines an optimal quantization interval, based on the age of the data you are querying and the query time range. The quantization interval is shown at the top of the metric query tab.
The age of the metrics in the time range governs the minimum quantization interval (based on what rollups are available for the query time range). Sumo retains only the last 30 days of raw metric data. So, when you query metrics that are more than 30 days old, Sumo must quantize the data to at least 1 hour, because that’s the minimum resolution rollup available given the age of the data.
You can override the automatic quantization interval. In the Metrics Explorer's basic mode, set the quantization interval in the row creator in the UI. In advanced mode, use the quantize operator and specify the interval that fits your needs.
Sumo Logic sets the actual quantization interval to be as close to your selection as possible. If it is not possible to set the actual interval to the targeted interval (typically because too many buckets would be produced to reasonably show on the chart), Sumo displays a message to that effect.
Sumo Logic will never decrease the quantization interval that you specify. We’ll either use that interval, or increase it as appropriate.
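The two constraints described in this section (a one-hour floor for data older than 30 days, and an interval that is only ever kept or increased) can be sketched as follows. This is a hedged illustration, not Sumo's actual logic: the function effective_interval and the max_buckets parameter are hypothetical, and the real limit on the number of buckets is not stated in this document.

```python
RAW_RETENTION_DAYS = 30      # raw data points are kept for the last 30 days
HOURLY_ROLLUP_SECONDS = 3600  # coarsest resolution available for older data


def effective_interval(requested_s, oldest_data_age_days, time_range_s, max_buckets=None):
    """Return the quantization interval (seconds) used under this sketch's rules.

    max_buckets is a hypothetical cap on buckets per chart; pass None to ignore it.
    """
    interval = requested_s
    # Data older than the raw retention window exists only in one-hour rollups.
    if oldest_data_age_days > RAW_RETENTION_DAYS:
        interval = max(interval, HOURLY_ROLLUP_SECONDS)
    # Hypothetical bucket cap: widen the interval until the bucket count fits.
    if max_buckets is not None:
        smallest_allowed = -(-time_range_s // max_buckets)  # ceiling division
        interval = max(interval, smallest_allowed)
    # Every branch uses max(), so the result is never below requested_s.
    return interval


print(effective_interval(60, 45, 86400))                   # 3600
print(effective_interval(60, 5, 86400))                    # 60
print(effective_interval(1, 1, 86400, max_buckets=1000))   # 87
```

Because every adjustment is a max() against the requested value, the sketch mirrors the guarantee that a specified interval is used as-is or increased, never decreased.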
How Sumo chooses rollup table and quantization interval
If you don't specify a rollup type in your query, Sumo Logic runs the query using the avg rollup, unless the query contains a max or min aggregation after the first pipe, in which case the query runs against the max or min rollup, respectively.
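The rollup selection rule can be expressed as a small function. This is a simplified sketch under stated assumptions: choose_rollup is a hypothetical helper, it splits the query on the pipe character rather than parsing it properly, and it reads "after the first pipe" as "the first operator that is not a quantize operator". It also honors an explicit using clause on a quantize operator.

```python
def choose_rollup(query):
    """Guess which rollup a metric query runs against (illustrative only)."""
    stages = [stage.strip() for stage in query.split("|")][1:]  # drop the selector
    for stage in stages:
        words = stage.split()
        if not words:
            continue
        if words[0] == "quantize":
            # An explicit "using <rollup>" on the quantize operator wins.
            if "using" in words:
                idx = words.index("using")
                if idx + 1 < len(words):
                    return words[idx + 1]
            continue  # no rollup given here; look at the next stage
        # First non-quantize operator: only min and max change the rollup.
        return words[0] if words[0] in ("min", "max") else "avg"
    return "avg"


print(choose_rollup("cpu"))                              # avg
print(choose_rollup("cpu | max"))                        # max
print(choose_rollup("cpu | quantize to 1m | sum"))       # avg
print(choose_rollup("cpu | quantize to 15m using sum"))  # sum
```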
The table below shows how Sumo Logic selects a quantization interval based on the query time range when you do not specify an interval explicitly using the quantize operator.
Query time range | Default quantization interval |
400 days | 3 days |
200 days | 2 days |
150 days | 1 day |
90 days | 12 hours |
30 days | 6 hours |
14 days | 2 hours |
7 days | 1 hour |
3 days | 30 minutes |
1 day | 10 minutes |
6 hours | 3 minutes |
3 hours | 1 minute |
1 hour | 30 seconds |
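The table can be read as a lookup from query time range to default interval. The sketch below encodes the rows as data. How Sumo treats time ranges that fall between the listed rows is not stated here, so the rule used (take the first row whose range is at least as long as the query range, and the last row for anything longer) is an assumption, as is the helper name default_interval.

```python
from datetime import timedelta

# (query time range, default quantization interval), shortest range first.
DEFAULT_INTERVALS = [
    (timedelta(hours=1), timedelta(seconds=30)),
    (timedelta(hours=3), timedelta(minutes=1)),
    (timedelta(hours=6), timedelta(minutes=3)),
    (timedelta(days=1), timedelta(minutes=10)),
    (timedelta(days=3), timedelta(minutes=30)),
    (timedelta(days=7), timedelta(hours=1)),
    (timedelta(days=14), timedelta(hours=2)),
    (timedelta(days=30), timedelta(hours=6)),
    (timedelta(days=90), timedelta(hours=12)),
    (timedelta(days=150), timedelta(days=1)),
    (timedelta(days=200), timedelta(days=2)),
    (timedelta(days=400), timedelta(days=3)),
]


def default_interval(time_range):
    """Look up the default interval for a query time range (assumed rule)."""
    for listed_range, interval in DEFAULT_INTERVALS:
        if time_range <= listed_range:
            return interval
    return DEFAULT_INTERVALS[-1][1]  # longer than every listed range


week = timedelta(days=7)
print(default_interval(week))         # 1:00:00
print(week / default_interval(week))  # 168.0 buckets
```

For the listed rows, the defaults keep the bucket count in a similar range: a 7-day query at 1 hour produces 168 buckets, and a 1-hour query at 30 seconds produces 120.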
Explicit quantization at query time
When you run a metric query, you can optionally use the quantize operator to specify a quantization interval, a rollup type, or both.
When you run a query with the quantize operator, the way that Sumo quantizes your metric data points depends on the rollup type you specify, if any, in the quantize clause of your query. Rollup types include avg, min, max, sum, and count. (Specifying a rollup type is optional for the quantize operator.)
Quantize with rollup type specified
If your metric query uses the quantize operator and specifies a rollup type, Sumo quantizes metric data points using only that rollup type. For example, given this query:
cpu | quantize to 15m using sum
Sumo quantizes using the sum rollup type.
Quantize with no rollup type specified
If your metric query uses the quantize operator without specifying a rollup type, Sumo Logic uses the default rollup (typically avg). The rollup depends on the aggregation operator, if any, that follows the quantize operator:
Query | What happens |
"cpu | quantize to 1m" | Uses the avg rollup. |
"cpu | quantize to 1m | min" | Uses the min rollup. |
"cpu | quantize to 1m | max" | Uses the max rollup. |
"cpu | quantize to 1m | sum" | Uses the avg rollup. |
"cpu | quantize to 1m | count" | Uses the avg rollup. |
"cpu | quantize to 1m | avg" | Uses the avg rollup. |
quantize operator is followed by a parse operator
Data points might be passed through an operator without change. For example, the parse operator changes time series metadata but lets data points through unchanged:
... | quantize to 5s | parse field=_sourceHost - as cluster,instance | ..
quantize operator is followed by another quantize operator
In this case, the first rollup type that is explicitly specified is used. For example,
... | quantize to 15s using max | quantize to 1m | ...
and
... | quantize to 15s | quantize to 1m using max | ...
both use max quantization.
In the following example:
... | quantize to 15s using min | quantize to 1m using max | ...
the data is first quantized to 15s using min quantization, and then max quantization to 1m is applied to the result of the previous step.
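The two-step behavior in the last example can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than Sumo's implementation: the quantize helper, the sample data points, and the alignment of buckets to multiples of the interval are all made up for the example.

```python
from collections import defaultdict


def quantize(points, interval_s, rollup):
    """Aggregate (timestamp_seconds, value) pairs into fixed-width buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % interval_s].append(value)  # floor to the bucket start
    return [(start, rollup(values)) for start, values in sorted(buckets.items())]


# One minute of made-up raw data points for a single time series.
raw = [(0, 5), (10, 3), (15, 8), (20, 9), (30, 2), (44, 7), (45, 6), (59, 4)]

# ... | quantize to 15s using min | quantize to 1m using max | ...
step1 = quantize(raw, 15, min)    # minimum of each 15-second bucket
step2 = quantize(step1, 60, max)  # maximum of those minimums per minute

print(step1)  # [(0, 3), (15, 8), (30, 2), (45, 4)]
print(step2)  # [(0, 8)]

# For comparison, a single max rollup over the raw points gives a different value.
print(quantize(raw, 60, max))  # [(0, 9)]
```

The chained result (8) differs from a direct max rollup over the same minute (9), because the second step sees only the output of the first.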