Thursday 26 September 2013

Auroraplot: data processing software for AuroraWatchNet

With the dispatch of the AuroraWatch schools' magnetometers imminent I have implemented a  Python  toolkit to process the data. The numpy and matplotlib modules are used extensively. The toolkit provides an API to load data and perform various processing actions on it, such as plotting data. The concept is influenced by my previous Multi-instrument Analysis toolbox for Matlab. In addition to the loading and processing of magnetic field data auroraplot allows other data types to be added later.

Loading data

Data can be loaded with arbitrary start and end times very simply:

md = ap.load_data('AURORAWATCHNET', 'LAN1', 'MagData',
                  np.datetime64('2013-09-20T12:00:00+0000'),
                  np.datetime64('2013-09-21T12:00:00+0000'))


In this case the selected portion of data crosses midnight and two data files must be loaded, concatenated and trimmed to get the desired time range. This is performed automatically by auroraplot, the user need not be concerned with the format of the files or where they are located. It is even possible for the files to be downloaded on-the-fly using FTP or HTTP transfer protocols.

load_data returns an object (of type MagData) to the user containing the actual magnetic field data and various other metadata, such as a timestamp for each sample and the data units. Each object can store more multiple data channels but all data points must share the same timestamps, be of the same type and share the same units. Therefore it is not possible to store operating temperatures (units °C) in an object holding magnetic field strength (units tesla). The operating temperature data can be accessed as:

td = ap.load_data('AURORAWATCHNET', 'LAN1', 'TemperatureData',
                  np.datetime64('2013-09-20T12:00:00+0000'),
                  np.datetime64('2013-09-21T12:00:00+0000'))


Battery voltage (data type VoltageData) can be accessed in a similar same way.

Plotting data

High-level plot functions enable the data be be plotted very simply, for the magnetic field data loaded previously

md.plot()


will produce a matplotlib figure with a title and the axes labelled with the correct units. Temperature and voltage data are plotted in the same way.

I have created some tools to make working with numpy's datetime64 and timedelta64 objects more convenient, including rounding functions (round, ceil and floor) which round to an interval. They are useful for finding the start of an hour, or the end of a day. I have also created Locator and Formatter classes to sensibly label time axes using datetime64 times and timedelta64 intervals. Tick marks are located on the nearest second, minute, hour, day, month or year boundary (or multiple thereof) depending on the time interval being displayed. Thanks to matplotlib's structure the labels are automatically regenerated with the most appropriate time units when a plot is zoomed.

Other operations

Quiet-day curves

Other operations include the generation of quiet-day curves. These are the curves from which we measure geomagnetic activity and are of critical importance for AuroraWatch UK. There are are not flat but have a daily variation caused by the equatorial electrojet. The empirical algorithm selects the days (typically 5) with the least geomagnetic activity. A truncated Fourier series is used to guarantee that the quiet-day curves are cyclic, with the start and end points having the same magnitude and slope. This is essential otherwise our rolling plots would show up the discontinuities in the QDC at midnight, and would falsely cause step changes in the geomagnetic activity. An example QDC is shown below.

qdc
Quiet-day curve for magnetometer at Lancaster ,UK. This is derived from recorded data
and clearly shows the Sq current system caused by the equatorial electrojet.

From this we can see that even on a geomagnetically quiet day we would expect a 30nT variation in field strength seen by the magnetometer. The AuroraWatch threshold for minor geomagnetic activity is 50nT so this shows the importance of using a quiet-day curve instead of a flat line when calculating geomagnetic activity.

Stack plots

Stack plots (also called magnetograms) are a convenient representation for magnetic field data from a set of magnetometers separated in latitude.Data from the northernmost instruments is placed at the top and that from the southernmost at the bottom. An example stackplot is shown below:

20130920
Stackplot showing data from two Lancaster stations and from Ormskirk.
The magnetometer at Ormskirk is operated by the Met Office as part of a test. The stackplots will be more interesting as the network grows.

Open source

The source code is available under a BSD-type license from Github.You will need python, along with the numpy (version 1.7), matplotlib and scipy python modules.auroraplot has been tested under Debian Linux (64 bit version) and Raspbian on the Raspberry Pi.