SNMP polling for performance data

There are many commercial products that will collect device and interface statistics for network devices. I am investigating open source products to perform SNMP polling, data storage and graphing. There are some open source products (like Observium) that will do fine with a small number of devices, but I am looking for something that can scale to over 10,000 network devices.

Currently I’m looking at collectd for actually collecting the data. So far I have only tested it as a local collector for Linux servers, so I still have to determine how well the SNMP poller performs.
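As a rough sketch, a collectd SNMP polling configuration looks something like the following; the hostname, address, community string and interval are placeholders:

```
# collectd.conf snippet: poll interface octet counters from one device.
# Hostname, address and community string below are placeholders.
LoadPlugin snmp

<Plugin snmp>
  <Data "std_traffic">
    Type "if_octets"
    Table true
    Instance "IF-MIB::ifDescr"
    Values "IF-MIB::ifInOctets" "IF-MIB::ifOutOctets"
  </Data>

  <Host "router1.example.com">
    Address "192.0.2.1"
    Version 2
    Community "public"
    Collect "std_traffic"
    Interval 60
  </Host>
</Plugin>
```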

Most system monitoring tools use RRD for data storage. There seems to be quite a bit of innovation around time series databases. I started looking at OpenTSDB and InfluxDB for data storage.

My current setup consists of collectd pushing data into InfluxDB (via write_graphite). For graphing, Grafana seems to be the most commonly used front end. If you have done any work with the ELK stack, Grafana will look very familiar, as it began as a fork of Kibana.
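For reference, the write_graphite side of that setup looks roughly like the following; the host and port are placeholders, and InfluxDB needs its Graphite listener enabled to accept the data:

```
# collectd.conf snippet: ship metrics to InfluxDB's Graphite-protocol input.
# Host and port are placeholders.
LoadPlugin write_graphite

<Plugin write_graphite>
  <Node "influxdb">
    Host "influxdb.example.com"
    Port "2003"
    Protocol "tcp"
    Prefix "collectd."
    StoreRates true
  </Node>
</Plugin>
```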

If I make any meaningful progress, I will document the configurations of the applications. I also plan on investigating some of the more recent methods of collecting performance data.

ThousandEyes Alerts – API and Webhooks

Introduction

There are a few different ways of handling ThousandEyes alerts. Active alerts can be polled via the API, forwarded by email, sent to PagerDuty, or delivered via webhook. I will be covering alerting via the API and notifications via webhooks.

API

Accessing the alerts API is simple, and the examples from my previous post on the ThousandEyes API apply here as well.
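Here is a short Python sketch that queries the active alerts and pretty-prints them as JSON; the login and API token are placeholders for your own credentials:

```python
# Query active alerts and pretty-print the JSON response.
# The account login and API token below are placeholders.
import json
import requests

TE_USER = 'user@example.com'   # account login
TE_TOKEN = 'your-api-token'    # user API token from account settings

response = requests.get('https://api.thousandeyes.com/alerts.json',
                        auth=(TE_USER, TE_TOKEN))
print(json.dumps(response.json(), indent=4))
```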

By default, only active alerts are returned. In order to pull back previous alerts, time ranges should be used; these are also referenced in the alerts API documentation. I will append ?window=5d to return alerts from the last five days.
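Building on the sketch above, the only change is the query string:

```python
# Same request with a window appended, so alerts from the last five days
# are returned rather than only the currently active ones.
response = requests.get('https://api.thousandeyes.com/alerts.json?window=5d',
                        auth=(TE_USER, TE_TOKEN))
print(json.dumps(response.json(), indent=4))
```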

Output (removed all but one test location):

Webhook

Webhooks are configured via the Alerts page. You provide a name and a URL, and can optionally configure HTTP basic authentication. You should use both authentication and SSL/TLS to avoid sending the alert traffic in the clear.

Here is a simple webhook handler in Python. It does not include SSL/TLS or authentication and I would not use this for anything other than simple testing.
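A minimal sketch using Flask (the framework choice here is just an example; any small HTTP server would do):

```python
# Minimal webhook receiver for testing only: no TLS and no authentication.
import json

from flask import Flask, request

app = Flask(__name__)

@app.route('/webhook', methods=['POST'])
def webhook():
    # ThousandEyes POSTs the alert notification as a JSON body.
    notification = request.get_json(force=True)
    print(json.dumps(notification, indent=4))
    return '', 200

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080)
```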

A notification message is sent when an alert is triggered, and another is sent when the alert condition clears.

Conclusion

I have covered how to pull back alert information via the ThousandEyes API and how to receive alert notifications via webhooks. The next step would be to integrate with a logging, management or ticketing system.

Accessing the ThousandEyes API

The ThousandEyes API is very well documented. The examples in the documentation show the request/response transactions using curl. I will be writing my API calls in Python so the responses can be post-processed easily, using the requests and json Python modules.

Here is the “Hello World” of requesting API data and saving it as JSON.
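A minimal sketch, using the status endpoint as a simple first target:

```python
# Request the status endpoint and load the response as JSON.
import json
import requests

response = requests.get('https://api.thousandeyes.com/status.json')
j = json.loads(response.text)
print(json.dumps(j, indent=4))
```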

Output:

The API uses HTTP basic authentication to access your account. The username is your account login and the password is the user token, which can be found in your user settings. To test access to my account, I am going to pull back a list of tests.
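A sketch of that request; the login and token are placeholders for your own credentials:

```python
# List the tests configured on the account, authenticating with HTTP basic
# auth: account login as the username, API token as the password.
import json
import requests

response = requests.get('https://api.thousandeyes.com/tests.json',
                        auth=('user@example.com', 'your-api-token'))
j = response.json()
print(json.dumps(j, indent=4))
```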

Output:

I currently have one test created, so the API response contains information about that test. You can see that the response data contains API links to the test data. To programmatically access the response data, you can treat the parsed JSON object as a Python dictionary. For instance, to return the API link for BGP test data, you would access j['test'][0]['apiLinks'][4]['href'], which in this case is 'https://api.thousandeyes.com/net/bgp-metrics/39575'. This link will return the BGP metrics from the test.
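As a rough follow-up, fetching the metrics behind that link might look like this (appending .json to ask for a JSON response, with placeholder credentials):

```python
# Follow the apiLinks entry for BGP metrics and fetch the data itself.
bgp_url = j['test'][0]['apiLinks'][4]['href'] + '.json'
bgp = requests.get(bgp_url, auth=('user@example.com', 'your-api-token'))
print(json.dumps(bgp.json(), indent=4))
```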

This post has gone over accessing the ThousandEyes API. In my next post I will review additional API functionality.

Try the Docker tutorial

As network engineers we must keep up with the technologies and applications that will be leveraging the network. Server virtualization is quite common at this point, but many network engineers may be unfamiliar with the concept of containers.

While virtual machines require the installation of a full operating system and supporting libraries to run an application, containers virtualize at the operating system level and share the host’s kernel; a container only requires the application and its supporting libraries.

Docker is a popular platform for deploying applications in containers. There is a tutorial available that walks you through setting up a container and running an application within it. It is a good way for a network engineer to become familiar with how developers might deploy an application with Docker.

Networking in the container world is interesting. When a Docker container is created, Docker attaches the container’s virtual interface to the docker0 bridge on the host (by default). Leveraging iptables NAT rules, the container can then be made accessible to the outside world.
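As an example, publishing a container port and then looking at the NAT rules Docker installs shows that plumbing at work; the image and port numbers here are just illustrations:

```
# Publish container port 80 on host port 8080, then inspect the DNAT
# rules Docker adds to the nat table.
docker run -d -p 8080:80 nginx
sudo iptables -t nat -L DOCKER -n
```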

See Docker’s advanced networking page for more details on the container’s network implementation.

Try the Docker Tutorial

TCP Time Sequence graphs with tcptrace

Overview

In Wireshark, you may have seen TCP time sequence graphs under Statistics > TCP StreamGraph. Time sequence graphs can be useful for troubleshooting TCP flows, but the Wireshark graphs lack some of the details that tcptrace provides.

Boring Wireshark tcptrace time sequence graph

Fancy tcptrace graphed with xplot

Notice that the tcptrace/xplot version of the graph flags the SYN packet. The S around 40ms represents a SACK. The white arrows (zoomed image below) indicate packets. An arrow with a diamond at the top indicates a packet with the PSH flag set. For more details on the different markings on a tcptrace time sequence graph, refer to the manual.

SACK / packet detailed view

Installation and usage

In order to use tcptrace on a Debian/Ubuntu system, install tcptrace and xplot.
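On Debian/Ubuntu the package names are roughly as follows; xplot-xplot.org provides the xplot.org binary used further down:

```
sudo apt-get install tcptrace xplot xplot-xplot.org
```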

In this example, I am providing tcptrace with a pcap containing a single TCP stream. tcptrace has more advanced methods for filtering, but if you are already analyzing a flow in Wireshark, it is easy enough to export the selected stream to a separate pcap.

Using tcptrace, create the xplot .xpl files. The -zxy flag anchors both axes at 0; -S creates the time sequence graphs.
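With the exported single-stream capture (the filename is a placeholder):

```
# Produce the .xpl files; -S gives time sequence graphs, -zxy anchors
# both axes at zero.
tcptrace -S -zxy single-stream.pcap
```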

Graph the output using xplot. I had to use the xplot.org binary as vanilla xplot would not open the files generated by tcptrace.
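tcptrace names the time sequence files after the two directions of the stream:

```
xplot.org a2b_tsg.xpl b2a_tsg.xpl
```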

In addition to time sequence graphs, tcptrace will generate throughput, round trip time, owin (or “bytes in flight”), and segment size graphs. Use tcptrace with the -G flag to generate .xpl files for all graph types.

I hope you enjoy your new and improved TCP graphs. They are useful for getting a high-level picture of the health of a TCP stream.

My new blog – JO Packet

I finally got around to starting my blog. I have started fresh with a new domain, new hosting provider and new blog platform.

Originally I installed Ghost because I wanted to work with Node.js on the server side. I had a full Ghost setup behind nginx, which was a bit different from the standard nginx and php-fpm setup that I have experience with.

Ghost is a great platform if your goal is to write blogs with no bells and whistles. I liked writing with Markdown, but the lack of commenting pushed me back towards WordPress.

In any case, the page is up and running with an A+ from SSL Labs. In the future I plan to write about packet analysis, networking, Linux and automation.