
Varnish


The Sumo Logic App for Varnish provides dashboards that help you analyze log and metric events generated by Varnish servers. This app allows you to identify traffic sources, monitor and improve application and website workflows, and understand how customers use your product.

This app has been tested with the following versions:

  • Kubernetes environments: Varnish version 6.4.
  • Non-Kubernetes environments: Varnish version 6.0.7.

Sample Log Messages

{
"timestamp": 1625219282000,
"log": "187.255.220.191 - - [01/Jul/2021:15:15:53 +0700] \"GET /_includes/wp/blog/wp-content/themes/sumologic/style.css HTTP/1.1\" 200 33229114 \"http://www.greylock.com\" \"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; en-us) AppleWebKit/533.21.1 (KHTML, like Gecko) Chrome/19.0.1084.30 Safari/536.5\"",
"stream": "stdout",
"time": "2021-07-02T09:21:20.005706219Z"
}
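
To make the structure of the sample message concrete, here is a minimal Python sketch that unwraps the JSON record and splits the access-log line in the log field. The regular expression and the group names (client_ip, url, and so on) are illustrative assumptions, not the field names the app itself extracts, and the URL is shortened for readability.

```python
import json
import re

# Pattern for an NCSA combined-style access line. Group names are
# illustrative only; they are not Sumo Logic field names.
NCSA_PATTERN = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_record(raw_json: str) -> dict:
    """Decode the JSON wrapper, then split the access-log line into fields."""
    record = json.loads(raw_json)
    match = NCSA_PATTERN.match(record["log"])
    if match is None:
        raise ValueError("log line does not match the expected access-log layout")
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    fields["stream"] = record.get("stream")
    return fields


# A shortened copy of the sample message, serialized so inner quotes are escaped.
sample = json.dumps({
    "timestamp": 1625219282000,
    "log": '187.255.220.191 - - [01/Jul/2021:15:15:53 +0700] '
           '"GET /style.css HTTP/1.1" 200 33229114 '
           '"http://www.greylock.com" "Mozilla/5.0"',
    "stream": "stdout",
    "time": "2021-07-02T09:21:20.005706219Z",
})

parsed = parse_record(sample)
print(parsed["client_ip"], parsed["method"], parsed["url"], parsed["status"])
```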

Collecting Logs and Metrics for Varnish

This section provides instructions for configuring log and metric collection for the Sumo Logic App for Varnish.

Configuring log and metric collection for the Varnish App includes the following tasks:

Step 1: Configure Fields in Sumo Logic

Create the following Fields in Sumo Logic before configuring the collection. This ensures that your logs and metrics are tagged with relevant metadata, which the app dashboards require. For information on setting up fields, see Sumo Logic Fields.

If you're using Varnish in a Kubernetes environment, create the fields:

  • pod_labels_component
  • pod_labels_environment
  • pod_labels_cache_system
  • pod_labels_cache_cluster

Step 2: Configure Logs and Metrics Collection

Instructions below show how to configure Kubernetes and Non-Kubernetes environments.

In Kubernetes environments, the Sumo Logic App for Varnish has been tested with Varnish version 6.4.

In Kubernetes environments, we use the Telegraf Operator, which is packaged with our Kubernetes collection. You can learn more about it here. The diagram below illustrates how data is collected from Varnish in a Kubernetes environment. In the architecture shown below, there are four services that make up the metric collection pipeline: Telegraf, Telegraf Operator, Prometheus, and Sumo Logic Distribution for OpenTelemetry Collector.

[Diagram: Varnish data collection in a Kubernetes environment]

The first service in the pipeline is Telegraf. Telegraf collects metrics from Varnish. Note that we’re running Telegraf in each pod we want to collect metrics from as a sidecar deployment; that is, Telegraf runs in the same pod as the containers it monitors. Telegraf uses the Varnish input plugin to obtain metrics. For simplicity, the diagram doesn’t show the input plugins. The injection of the Telegraf sidecar container is done by the Telegraf Operator. Prometheus pulls metrics from Telegraf and sends them to Sumo Logic Distribution for OpenTelemetry Collector, which enriches metadata and sends metrics to Sumo Logic.

In the logs pipeline, Sumo Logic Distribution for OpenTelemetry Collector collects logs written to standard out and forwards them to another instance of Sumo Logic Distribution for OpenTelemetry Collector, which enriches metadata and sends logs to Sumo Logic.

Prerequisites

It’s assumed that you are using the latest Helm chart version. If not, upgrade using the instructions here.

Configure Metrics Collection

This section explains the steps to collect Varnish metrics from a Kubernetes environment.

  1. Set up Kubernetes Collection with the Telegraf Operator.
  2. On your Varnish Pods, add the following annotations:
    annotations:
      telegraf.influxdata.com/class: sumologic-prometheus
      prometheus.io/scrape: "true"
      prometheus.io/port: "9273"
      telegraf.influxdata.com/inputs: |+
        [[inputs.varnish]]
          use_sudo = true
          binary = "/usr/bin/varnishstat"
          stats = ["*"]
          [inputs.varnish.tags]
            component="cache"
            environment="dev_CHANGEME"
            cache_system="varnish"
            cache_cluster="varnish_on_k8s_CHANGEME"

Enter values for the following parameters (marked CHANGEME in the snippet above):

  • telegraf.influxdata.com/inputs - This contains the required configuration for the Telegraf Varnish input plugin. Refer to this doc for more information on configuring the Varnish input plugin for Telegraf. Note: As Telegraf will be run as a sidecar, the host should always be localhost.
    • In the input plugins section, which is [[inputs.varnish]]
      • binary - The location of the varnishstat binary. Override it as per your configuration.
      • use_sudo - If running as a restricted user, set this to prepend sudo for additional access.
      • stats - The stats to collect. Set it to ["*"] to collect all stats. See this doc for more information on additional parameters for configuring the Varnish input plugin for Telegraf.
    • In the tags section, which is [inputs.varnish.tags]
      • environment - The deployment environment where the Varnish cluster resides. For example: dev, prod, or qa. While this value is optional, we highly recommend setting it.
      • cache_cluster - Enter a name to identify this Varnish cluster. This cluster name will be shown in the Sumo Logic dashboards.
The following values are also set by this configuration. Do not modify them, as changing them will cause the Sumo Logic apps to not function correctly.

  • telegraf.influxdata.com/class: sumologic-prometheus - This instructs the Telegraf operator what output to use. This should not be changed.
  • prometheus.io/scrape: "true" - This ensures our Prometheus will scrape the metrics.
  • prometheus.io/port: "9273" - This tells Prometheus which port to scrape. This should not be changed.
  • telegraf.influxdata.com/inputs
    • In the tags section, [inputs.varnish.tags]
      • component: "cache" - This value is used by Sumo Logic apps to identify application components.
      • cache_system: "varnish" - This value identifies the cache system.

For other parameters that can be configured in the Telegraf agent globally, see this doc.
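
Because only the environment and cache_cluster values (and optionally the binary path) are meant to change, the annotation set can be generated rather than hand-edited. The Python sketch below is one way to do that under that assumption; the function name varnish_pod_annotations is hypothetical, and the fixed keys and values are copied from the snippet above.

```python
def varnish_pod_annotations(environment: str, cache_cluster: str,
                            binary: str = "/usr/bin/varnishstat") -> dict:
    """Build the pod annotations, filling in only the CHANGEME values."""
    # Telegraf input configuration (TOML), mirroring the documented snippet.
    inputs = "\n".join([
        "[[inputs.varnish]]",
        "  use_sudo = true",
        f'  binary = "{binary}"',
        '  stats = ["*"]',
        "  [inputs.varnish.tags]",
        '    component="cache"',          # must not be changed
        f'    environment="{environment}"',
        '    cache_system="varnish"',     # must not be changed
        f'    cache_cluster="{cache_cluster}"',
    ]) + "\n"
    return {
        "telegraf.influxdata.com/class": "sumologic-prometheus",
        "prometheus.io/scrape": "true",
        "prometheus.io/port": "9273",
        "telegraf.influxdata.com/inputs": inputs,
    }


annotations = varnish_pod_annotations("prod", "varnish_on_k8s_prod")
print(annotations["telegraf.influxdata.com/inputs"])
```

The returned dictionary can be serialized into the pod template's metadata.annotations with whatever tooling you already use to manage manifests.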

  3. Sumo Logic Kubernetes collection will automatically start collecting metrics from the pods having the labels and annotations defined in the previous step.
  4. Verify metrics in Sumo Logic.

Configure Logs Collection

This section explains the steps to collect Varnish logs from a Kubernetes environment.

  1. (Recommended Method) Add labels on your Varnish pods to capture logs from standard output. Follow the instructions below to capture Varnish logs from stdout on Kubernetes.
    1. Apply the following labels to the Varnish pods:
      environment: "prod_CHANGEME"
      component: "cache"
      cache_system: "varnish"
      cache_cluster: "varnish_on_k8s_CHANGEME"
    2. Enter values for the following parameters (marked CHANGEME in the snippet above):
      • environment - The deployment environment where the Varnish cluster resides. For example: dev, prod, or qa. While this value is optional, we highly recommend setting it.
      • cache_cluster - Enter a name to identify this Varnish cluster. This cluster name will be shown in the Sumo Logic dashboards.

The following values are also set by this configuration. Do not modify them, as changing them will cause the Sumo Logic apps to not function correctly.

  • component: "cache" - This value is used by Sumo Logic apps to identify application components.
  • cache_system: "varnish" - This value identifies the cache system.

For other parameters that can be configured in the Telegraf agent globally, see this doc.

  2. (Optional) Collecting Varnish Logs from a Log File. Follow the steps below to capture Varnish logs from a log file on Kubernetes.
    1. Install the Sumo Logic tailing sidecar operator.
    2. Add the following annotation in addition to the existing annotations.
      annotations:
        tailing-sidecar: sidecarconfig;<mount>:<path_of_Varnish_log_file>/<Varnish_log_file_name>
      Example:
      annotations:
        tailing-sidecar: sidecarconfig;data: /var/log/varnish/varnishncsa.log
    3. Make sure that the Varnish pods are running and annotations are applied by using the command:
      kubectl describe pod <Varnish_pod_name>
    4. Sumo Logic Kubernetes collection will automatically start collecting logs from the pods having the annotations defined above.
    5. Verify logs in Sumo Logic.
  3. Add an FER to normalize the fields in Kubernetes environments. Labels created in Kubernetes environments are automatically prefixed with pod_labels. To normalize these for the app to work, create a Field Extraction Rule if one has not already been created for Cache Application Components. To do so:
    1. Go to Manage Data > Logs > Field Extraction Rules.
    2. Click the + Add button on the top right of the table.
    3. The Add Field Extraction Rule form will appear:
    4. Enter the following options:
    • Rule Name. Enter the name as App Observability - Cache.
    • Applied At. Choose Ingest Time
    • Scope. Select Specific Data
    • Scope: Enter the following keyword search expression:
      pod_labels_environment=* pod_labels_component=cache pod_labels_cache_cluster=* pod_labels_cache_cluster=
    • Parse Expression. Enter the following parse expression:
      if (!isEmpty(pod_labels_environment), pod_labels_environment, "") as environment
      | pod_labels_component as component
      | pod_labels_cache_system as cache_system
      | pod_labels_cache_cluster as cache_cluster
    5. Click Save to create the rule.
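
To illustrate what the Field Extraction Rule does to each matching log, the following Python sketch imitates the parse expression on a plain dictionary of fields. It is only a model of the renaming logic; it does not call Sumo Logic, and the function name normalize_cache_fields is hypothetical.

```python
def normalize_cache_fields(fields: dict) -> dict:
    """Copy pod_labels_* values to the unprefixed names the app expects."""
    normalized = dict(fields)
    # Like the if(!isEmpty(...)) clause: fall back to "" when the label
    # is missing or empty.
    normalized["environment"] = fields.get("pod_labels_environment") or ""
    # The remaining labels are copied as-is under their unprefixed names.
    for name in ("component", "cache_system", "cache_cluster"):
        normalized[name] = fields.get(f"pod_labels_{name}")
    return normalized


log_fields = {
    "pod_labels_environment": "prod",
    "pod_labels_component": "cache",
    "pod_labels_cache_system": "varnish",
    "pod_labels_cache_cluster": "varnish_on_k8s_prod",
}
result = normalize_cache_fields(log_fields)
print(result["environment"], result["component"],
      result["cache_system"], result["cache_cluster"])
```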

Installing Varnish Monitors

Sumo Logic has provided pre-packaged alerts available through Sumo Logic monitors to help you proactively determine if a Varnish cluster is available and performing as expected. These monitors are based on metric and log data and include pre-set thresholds that reflect industry best practices and recommendations. For more information about individual alerts, see Varnish Alerts.

To install these monitors, you must have the Manage Monitors role capability.

There are limits to how many alerts can be enabled. For details, see Monitors.

You can install monitors by importing a JSON file or using a Terraform script.

Method A: Importing a JSON file

  1. Download the JSON file that describes the monitors.
  2. The JSON contains the alerts based on Sumo Logic searches that do not have any scope filters. Therefore, it will apply to all Varnish clusters, the data for which has been collected via the instructions in the previous sections. However, if you would like to restrict these alerts to specific clusters or environments, update the JSON file by replacing the text cache_cluster=* with <Your Custom Filter>. Custom filter examples:
    • For alerts applicable only to a specific cluster, your custom filter would be cache_cluster=dev-varnish01
    • For alerts applicable to all clusters that start with varnish-prod, your custom filter would be cache_cluster=varnish-prod*
    • For alerts applicable to a specific cluster within a production environment, your custom filter would be cache_cluster=dev-varnish01 AND environment=prod. This assumes you have set the optional environment tag while configuring collection.
  3. Go to Manage Data > Alerts > Monitors.
  4. Click Add.
  5. Click Import.
  6. On the Import Content popup, enter Varnish in the Name field, paste the JSON into the popup, and click Import.
  7. The monitors are created in a "Varnish" folder. The monitors are disabled by default. See the Monitors topic for information about enabling monitors and configuring notifications or connections.
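
The scope replacement in step 2 is a plain text substitution, so it can be scripted before you paste the JSON into the import popup. The Python sketch below assumes the filter contains no double quotes or backslashes (true of the examples above); the one-monitor document and the metric name in it are made-up stand-ins for the downloaded file, whose real layout may differ.

```python
import json


def scope_monitors(monitors_json: str, custom_filter: str) -> str:
    """Replace the unscoped cache_cluster=* text with a custom filter."""
    scoped = monitors_json.replace("cache_cluster=*", custom_filter)
    # Re-parse to confirm the substitution left the document valid JSON.
    json.loads(scoped)
    return scoped


# Hypothetical stand-in for the downloaded monitors file.
sample_json = json.dumps({
    "name": "Varnish - Backend Busy",
    "query": "cache_cluster=* metric=hypothetical_metric_name",
})

scoped_json = scope_monitors(sample_json, "cache_cluster=varnish-prod*")
print(json.loads(scoped_json)["query"])
```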

Method B: Using a Terraform script

  1. Generate an access key and access ID for a user with the Manage Monitors role capability; for instructions, see Access Keys.
  2. Download Terraform 0.13 or later and install it.
  3. Download the Sumo Logic Terraform package for Varnish monitors. The alerts package is available in the Sumo Logic GitHub repository. You can either download it using the git clone command or as a zip file.
  4. Alert Configuration. After extracting the package, navigate to the terraform-sumologic-sumo-logic-monitor/monitor_packages/Varnish/ directory.

Edit the varnish.auto.tfvars file and add the Sumo Logic Access Key and Access ID from Step 1 and your Sumo Logic deployment. If you're not sure of your deployment, see Sumo Logic Endpoints and Firewall Security.

access_id   = "<SUMOLOGIC ACCESS ID>"
access_key = "<SUMOLOGIC ACCESS KEY>"
environment = "<SUMOLOGIC DEPLOYMENT>"

The Terraform script installs the alerts without any scope filters; if you would like to restrict the alerts to specific clusters or environments, update the varnish_data_source variable. For example:

  • To configure alerts for a specific cluster, set varnish_data_source to something like: cache_cluster=varnish.prod.01
  • To configure alerts for all clusters in an environment, set varnish_data_source to something like: environment=prod
  • To configure alerts for multiple clusters using a wildcard, set varnish_data_source to something like: cache_cluster=varnish-prod*
  • To configure alerts for a specific cluster within a specific environment, set varnish_data_source to something like: cache_cluster=varnish-1 and environment=prod. This assumes you have configured and applied Fields as described in Step 1: Configure Fields in Sumo Logic.

All monitors are disabled by default on installation. To enable all of the monitors, set the monitors_disabled parameter to false.

By default, the monitors will be located in a "Varnish" folder on the Monitors page. To change the name of the folder, update the monitor folder name in the folder variable in the varnish.auto.tfvars file.

  5. If you want the alerts to send email or connection notifications, edit the varnish_notifications.auto.tfvars file to populate the connection_notifications and email_notifications sections. Examples are provided below.
PagerDuty connection example
connection_notifications = [
  {
    connection_type = "PagerDuty",
    connection_id = "<CONNECTION_ID>",
    payload_override = "{\"service_key\": \"your_pagerduty_api_integration_key\",\"event_type\": \"trigger\",\"description\": \"Alert: Triggered {{TriggerType}} for Monitor {{Name}}\",\"client\": \"Sumo Logic\",\"client_url\": \"{{QueryUrl}}\"}",
    run_for_trigger_types = ["Critical", "ResolvedCritical"]
  },
  {
    connection_type = "Webhook",
    connection_id = "<CONNECTION_ID>",
    payload_override = "",
    run_for_trigger_types = ["Critical", "ResolvedCritical"]
  }
]

In the variable definition above, replace <CONNECTION_ID> with the connection ID of the Webhook connection. You can obtain the Webhook connection ID by calling the Monitors API.

For information about overriding the payload for different connection types, see Set Up Webhook Connections.
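
The payload_override value is a JSON document embedded in a quoted string, so every inner quote must be escaped, which is easy to get wrong by hand. As a sketch, the escaped literal can be produced by serializing twice in Python. The keys mirror the PagerDuty example above; the function name is hypothetical, and the approach assumes the payload contains no Terraform template sequences such as ${.

```python
import json


def pagerduty_payload_override(service_key: str) -> str:
    """Return the payload as a quoted, escaped string literal for a tfvars file."""
    payload = {
        "service_key": service_key,
        "event_type": "trigger",
        "description": "Alert: Triggered {{TriggerType}} for Monitor {{Name}}",
        "client": "Sumo Logic",
        "client_url": "{{QueryUrl}}",
    }
    compact = json.dumps(payload)
    # Serializing the JSON text again wraps it in quotes and escapes the
    # inner quotes, matching the style of the example above.
    return json.dumps(compact)


literal = pagerduty_payload_override("your_pagerduty_api_integration_key")
print(f"payload_override = {literal},")
```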

Email notifications example
email_notifications = [
  {
    connection_type = "Email",
    recipients = ["abc@example.com"],
    subject = "Monitor Alert: {{TriggerType}} on {{Name}}",
    time_zone = "PST",
    message_body = "Triggered {{TriggerType}} Alert on {{Name}}: {{QueryURL}}",
    run_for_trigger_types = ["Critical", "ResolvedCritical"]
  }
]
  6. Install Monitors:
    1. Navigate to the terraform-sumologic-sumo-logic-monitor/monitor_packages/varnish/ directory and run terraform init. This will initialize Terraform and download the required components.
    2. Run terraform plan to view the monitors that Terraform will create or modify.
    3. Run terraform apply.

Installing the Varnish App

This section demonstrates how to install the Varnish App.

Locate and install the app you need from the App Catalog. If you want to see a preview of the dashboards included with the app before installing, click Preview Dashboards.

  1. From the App Catalog, search for and select the app.
  2. Select the version of the service you're using and click Add to Library. Version selection applies only to a few apps currently. For more information, see Install the Apps from the Library.
  3. To install the app, complete the following fields.
    • App Name. You can retain the existing name or enter a name of your choice for the app.

    • Data Source. Choose Enter a Custom Data Filter, and enter a custom Varnish cluster filter. Examples:
      • For all Varnish clusters: cache_cluster=*
      • For a specific cluster: cache_cluster=varnish.dev.01
      • Clusters within a specific environment: cache_cluster=varnish-1 and environment=prod. This assumes you have set the optional environment tag while configuring collection.
  4. Advanced. Select the Location in the Library (the default is the Personal folder in the library), or click New Folder to add a new folder.
  5. Click Add to Library.

Once an app is installed, it will appear in your Personal folder or another folder that you specified. From here, you can share it with your organization.

Panels will start to fill automatically. It's important to note that each panel slowly fills with data matching the time range query and received since the panel was created. Results won't immediately be available, but you'll see full graphs and maps in a bit of time.

Viewing Varnish Dashboards

Filter with template variables

Template variables provide dynamic dashboards that can rescope data on the fly. As you apply variables to troubleshoot through your dashboard, you view dynamic changes to the data for a quicker resolution to the root cause. You can use template variables to drill down and examine the data on a granular level. For more information, see Filter with template variables.

Overview

The Varnish - Overview Dashboard provides a high-level view of the activity and health of Varnish servers on your network. Dashboard panels display visual graphs and detailed information on visitor geographic locations, traffic volume and distribution, responses over time, and time comparisons for visitor locations, uptime, cache hits, requests, and VCL.

Use this dashboard to:

  • Analyze backend and frontend requests, VCLs, pools, threads, VMODs, and cache hit rate.
  • Analyze HTTP requests by status code.
  • Gain insights into network traffic for your Varnish server.
  • Gain insights into where traffic originates by region. This can help you allocate compute resources to different regions according to their needs.
  • Gain insights into client and server responses on the Varnish server. This helps you identify errors in the Varnish server.

Visitor Traffic Insight

The Varnish - Visitor Traffic Insight Dashboard provides detailed information on the top documents accessed, top referrers, top search terms from popular search engines, and the media types served.

Use this dashboard to:

  • Gain insights into visitor traffic.
  • Identify top URLs visited.
  • Determine visitor locations.
  • Identify the platforms, browsers, and PC or Mac versions that visitors use to access your site.

Web Server Operations

The Varnish - Web Server Operations Dashboard provides a high-level view combined with detailed information on the top ten bots, geographic locations, and data for clients with high error rates, server errors over time, and non-200 response status codes. Dashboard panels also show server error logs, error log levels, error responses by the server, and the top URLs responsible for 404 responses.

Use this dashboard to:

  • Identify failed responses.
  • Identify visitor locations with 4xx errors.
  • Gain insights into clients causing a large number of 4xx errors.

Traffic Timeline Analysis

The Varnish - Traffic Timeline Analysis dashboard provides a high-level view of the activity and health of Varnish servers on your network. Dashboard panels display visual graphs and detailed information on traffic volume and distribution, responses over time, as well as time comparisons for visitor locations and server hits.

Use this dashboard to:

  • Understand the traffic distribution across servers and inform resource planning by analyzing data volume and bytes served.
  • Gain insights into where traffic originates by region. This can help you allocate compute resources to different regions according to their needs.

Outlier Analysis

The Varnish - Outlier Analysis dashboard provides a high-level view of Varnish server outlier metrics for bytes served, the number of visitors, and server errors. You can select the time interval over which outliers are aggregated, then hover the cursor over the graph to display detailed information for that point in time.

Use this dashboard to:

  • Detect outliers in your infrastructure with Sumo Logic’s machine learning algorithm.
  • Identify outliers in incoming traffic and the number of errors encountered by your servers.

Threat Intel

The Varnish - Threat Intel Dashboard provides an at-a-glance view of threats to Varnish servers on your network. Dashboard panels display threats count over a selected time period, geographic locations where threats occurred, source breakdown, actors responsible for threats, severity, and a correlation of IP addresses, method, and status code of threats.

Use this dashboard to:

  • Gain insights into threats in incoming traffic and discover potential IOCs.
  • Analyze incoming traffic requests using the Sumo Logic CrowdStrike threat feed.

Backend Servers

The Varnish - Backend Servers dashboard provides several metrics that describe the communication between Varnish and its backend servers.

Use this dashboard to:

  • Review and manage the health of backend and frontend communication.

Bans and Bans Lurker

The Varnish - Bans and Bans Lurker dashboard lists the ban filters applied to keep Varnish from serving stale content.

Use this dashboard to:

  • Gain insights into bans and make sure that Varnish is serving the latest content.

Cache Performance

The Varnish - Cache Performance dashboard provides worker thread related metrics to tell you if your thread pools are healthy and functioning well.

Use this dashboard to:

  • Gain insights into the performance and health of Varnish Cache.
  • Determine if any corrective actions are required to provide high performance and availability.

Clients

The Varnish - Clients dashboard displays Varnish metrics regarding connections and requests.

Use this dashboard to:

  • Review the current sessions and load on Varnish.
  • Determine if there are failures because of overloading and if additional resources are required.

Threads

The Varnish - Threads Dashboard helps you keep track of thread metrics to monitor your Varnish Cache.

Use this dashboard to:

  • Manage and understand threads in the Varnish system.

Varnish Alerts

Sumo Logic has provided out-of-the-box alerts available via Sumo Logic monitors to help you quickly determine if the Varnish cache is available and performing as expected.

| Alert Type (Metrics/Logs) | Alert Name | Alert Description | Trigger Type (Critical / Warning) | Alert Condition | Recover Condition |
| --- | --- | --- | --- | --- | --- |
| Metrics | Varnish - Backend Busy | This alert fires when the Varnish backend is busy for more than 5 minutes and is unable to serve requests. | Warning | > 0 | <= 0 |
| Metrics | Varnish - Backend Connection Retries | This alert fires when there are more than 5 backend connection retries, which can indicate misconfiguration. | Warning | > 5 | <= 5 |
| Metrics | Varnish - Backend Failed Connections | This alert fires when there are failed connections to the backend. | Warning | > 0 | <= 0 |
| Metrics | Varnish - Unhealthy Backend | This alert fires when we detect that a backend server is unhealthy for more than 5 minutes. | Critical | > 0 | <= 0 |
| Metrics | Varnish - Thread creation failed | This alert fires when Varnish is unable to create threads, which indicates either under-provisioning or misconfiguration. | Warning | > 0 | <= 0 |
| Logs | Varnish - Access from Highly Malicious Sources | This alert fires when Varnish is accessed from highly malicious IP addresses. | Critical | > 0 | <= 0 |
| Logs | Varnish - High 4XX Error Rate | This alert fires when too many HTTP requests (>5%) have a response status of 4xx. | Critical | > 5 | <= 5 |
| Logs | Varnish - High 5XX Error Rate | This alert fires when too many HTTP requests (>5%) have a response status of 5xx. | Critical | > 5 | <= 5 |
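
To show how the alert and recover conditions relate, here is a small Python sketch of the High 4XX Error Rate logic: it computes the percentage of requests in a status class and compares it with the threshold of 5. It only models the arithmetic; the function names are hypothetical and the actual monitor queries are defined in the downloaded monitor package.

```python
def error_rate_percent(status_codes, status_class):
    """Percentage of responses whose first digit matches status_class (4 or 5)."""
    if not status_codes:
        return 0.0
    matching = sum(1 for code in status_codes if code // 100 == status_class)
    return 100.0 * matching / len(status_codes)


def alert_state(value, threshold):
    """Alert when value > threshold; recover when value <= threshold."""
    return "alert" if value > threshold else "recovered"


# 10 of 100 requests returned 404, so the 4xx rate is 10%.
noisy_window = [200] * 90 + [404] * 10
# 5 of 100 requests returned 404, so the rate sits exactly at the threshold.
quiet_window = [200] * 95 + [404] * 5

print(alert_state(error_rate_percent(noisy_window, 4), 5))
print(alert_state(error_rate_percent(quiet_window, 4), 5))
```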

Copyright © 2023 by Sumo Logic, Inc.