Linux Australia expense breakdown – a data visualisation in d3.js

Linux Australia expenses 2015-2016 infographic

After learning a lot of new techniques and approaches (and gotchas) in d3.js in my last data visualisation (Geelong Regional Libraries by branch), I wanted to turn my new-found skills to Linux Australia’s end of year report. This is usually presented in a fairly dry manner at the organisation’s AGM each year, and although we have a Timeline of Events, it was time to add some visual interest to the presentation of data.

Collecting and cleaning the data

The dataset that I chose to explore was the organisation’s non-event expenses – that is, the expenditure of the organisation not utilised on specific events – items like insurance, stationery, subscriptions to online services and so on. These were readily available in the accounting software – Xero, then a small amount of data cleansing yielded a simple CSV file. The original file had a ‘long tail’ distribution – there were many data point that had only a marginal value and didn’t help in explaining the data, so I combined these into an ‘other’ category.

Visualising the data

Using the previous annular (donut chart) visualisation as the base, I set some objectives for the visualisation;

  • The colours chosen had to match those of Linux Australia’s branding
  • The donut chart required lines and labels
  • The donut chart required markers inside each arc
  • The donut chart had to be downloadable in svg format so that it could be copied and pasted into Inkscape (which has svg as its standard save format)
Colour choice

There was much prototyping involved with colour selection. The first palette selected used shading of a base colour (#ff0000 – red), but the individual arcs were difficult to distinguish. A second attempt added many (16) colours into the palette, but they didn’t work as a colour set. I settled on a combination of three colours (red, yellow, dark grey) and shades of these, with the shading becoming less saturated the smaller the values of the arc.

For anyone interested, the color range was defined as a d3.scaleOrdinal object as below.

var color = d3.scaleOrdinal()
    .range([
      '#ffc100',
      '#ff0000',
      '#393939',
      '#ffcd33',
      '#ff3333',
      '#616161',
      '#ffda66',
      '#ff6666',
      '#888888',
      '#ffe699',
      '#ff9999',
      '#b0b0b0',
      '#fff3cc',
      '#ffcccc',
      '#fff'
    ])
Lines and markers

I hadn’t used lines (polylines) and markers in d3.js before and this visualisation really needed them – because the data series labels were too wordy to easily fit on the donut chart itself. There were some examples that were particularly useful and relevant in figuring this out:

The key learning from this exercise about svg polylines is that the polyline is essentially a series of x,y Cartesian co-ordinates – the tricky part is actually using the right circular trigonometry to calculate the correct co-ordinates. This took me right back to sin and cos basics, and I found it helpful to sketch out a diagram of where I wanted the polyline points to be before actually trying to code them in d3.js.

A gotcha that tripped me up for about half an hour here was that I hadn’t correctly associated the markers with the polylines – because the markers only had a class attribute, but not an id attribute. Whenever I use markers on polylines from now on, I’ll be specifying both class and id attributes.

    .attr('class', 'marker')
    .attr('id', 'marker')

I initially experimented with a polyline that was drawn not just from the centroid of the arc for each data point out past the outerArc, but one that also went horizontally across to the left / right margin of the svg. While I was able to achieve this eventually, I couldn’t get the horizontal spacing looking good because there were so many data points on the donut chart – this would work well with a donut chart with far fewer data points.

Markers were also generally straightforward to get right, after reading up a bit on their attributes. Again, one of the gotchas I encountered here was ensuring that the markerWidth and markerHeight attributes were large enough to contain the entire marker – for a while, the markers were getting truncated, and I couldn’t figure out why.

    .attr('markerWidth', '12')
    .attr('markerHeight', '12')
Labels

Once the positioning for the polylines was solved, then positioning the labels was relatively straightforward, as many of the same trigonometric functions were used.

The challenge I encountered here was that d3.js by default has no text wrapping solution built in to the package, although alternative approaches such as the below had been documented elsewhere. From what I could figure out, d3.js does not support the tspan svg element. That is, I can’t just append tspan elements to text elements to achieve word-wrapping.

  • Example block from Mike Bostock – ‘Wrapping long labels‘: in which Mike Bostock (the creator and maintainer of d3.js) has written a custom function for wrapping text
  • d3-textwrap function from Vijith Assar: which provides a function that can be included into d3.js projects, like a plugin

In the end I ended up just abbreviating a couple of the data point labels rather than sink several hours into text wrapping approaches. It seems odd that svg provides such poor native support for text wrapping, but considering the myriad ways that text – particularly foreign language text – can be wrapped – it’s incredibly complex.

Downloadable svg

The next challenge with this visualisation was to allow the rendered svg to be downloaded – as the donut chart was intended to be part of a larger infographic. Again, I was surprised that a download function wasn’t part of the core d3.js library, but again a number of third party functions and approaches were available:

  • Example block from Miłosz Kłosowicz – ‘Download svg generated from d3‘: in this example, the svg node is converted to base-64 encoded ASCII then downloaded.
  • d3-save-svg plugin: this plugin provides a number of methods to download the svg, and convert it to a raster file format (such as PNG). This is a fork of the svg-crowbar tool, written for similar purposes by the New York Times data journalism team.

I chose to use the d3-save-svg plugin simply because of the abstraction it provided. However, I came up against a number of hurdles. When I first used the example code to try and create a download button, the download function was not being triggered. To work around this, I referenced the svg object by id:

d3_save_svg.save(d3.select('#BaseSvg').node(), config);

The other hiccup with this approach was that CSS rules were not preserved in the svg download if the CSS selector had scope outside the svg object itself. For instance, I had applied basic font styling rules to the entire body selector, but in order for font styling to be preserved in the download, I had to re-specify the font styling at the svg selector level in the CSS file. This was a little frustrating, but the ease of using a function to do the download compensated for this.

Linux Australia expenses 2015-2016 infographic
Linux Australia expenses 2015-2016 infographic

 

Save