Linux Australia expense breakdown – a data visualisation in d3.js

After learning a lot of new techniques and approaches (and gotchas) in d3.js in my last data visualisation (Geelong Regional Libraries by branch), I wanted to turn my new-found skills to Linux Australia’s end of year report. This is usually presented in a fairly dry manner at the organisation’s AGM each year, and although we have a Timeline of Events, it was time to add some visual interest to the presentation of data.

Collecting and cleaning the data

The dataset that I chose to explore was the organisation’s non-event expenses – that is, the expenditure of the organisation not utilised on specific events – items like insurance, stationery, subscriptions to online services and so on. These were readily available in the accounting software – Xero, then a small amount of data cleansing yielded a simple CSV file. The original file had a ‘long tail’ distribution – there were many data point that had only a marginal value and didn’t help in explaining the data, so I combined these into an ‘other’ category.

Visualising the data

Using the previous annular (donut chart) visualisation as the base, I set some objectives for the visualisation;

  • The colours chosen had to match those of Linux Australia’s branding
  • The donut chart required lines and labels
  • The donut chart required markers inside each arc
  • The donut chart had to be downloadable in svg format so that it could be copied and pasted into Inkscape (which has svg as its standard save format)
Colour choice

There was much prototyping involved with colour selection. The first palette selected used shading of a base colour (#ff0000 – red), but the individual arcs were difficult to distinguish. A second attempt added many (16) colours into the palette, but they didn’t work as a colour set. I settled on a combination of three colours (red, yellow, dark grey) and shades of these, with the shading becoming less saturated the smaller the values of the arc.

For anyone interested, the color range was defined as a d3.scaleOrdinal object as below.

var color = d3.scaleOrdinal()
    .range([
      '#ffc100',
      '#ff0000',
      '#393939',
      '#ffcd33',
      '#ff3333',
      '#616161',
      '#ffda66',
      '#ff6666',
      '#888888',
      '#ffe699',
      '#ff9999',
      '#b0b0b0',
      '#fff3cc',
      '#ffcccc',
      '#fff'
    ])
Lines and markers

I hadn’t used lines (polylines) and markers in d3.js before and this visualisation really needed them – because the data series labels were too wordy to easily fit on the donut chart itself. There were some examples that were particularly useful and relevant in figuring this out:

The key learning from this exercise about svg polylines is that the polyline is essentially a series of x,y Cartesian co-ordinates – the tricky part is actually using the right circular trigonometry to calculate the correct co-ordinates. This took me right back to sin and cos basics, and I found it helpful to sketch out a diagram of where I wanted the polyline points to be before actually trying to code them in d3.js.

A gotcha that tripped me up for about half an hour here was that I hadn’t correctly associated the markers with the polylines – because the markers only had a class attribute, but not an id attribute. Whenever I use markers on polylines from now on, I’ll be specifying both class and id attributes.

    .attr('class', 'marker')
    .attr('id', 'marker')

I initially experimented with a polyline that was drawn not just from the centroid of the arc for each data point out past the outerArc, but one that also went horizontally across to the left / right margin of the svg. While I was able to achieve this eventually, I couldn’t get the horizontal spacing looking good because there were so many data points on the donut chart – this would work well with a donut chart with far fewer data points.

Markers were also generally straightforward to get right, after reading up a bit on their attributes. Again, one of the gotchas I encountered here was ensuring that the markerWidth and markerHeight attributes were large enough to contain the entire marker – for a while, the markers were getting truncated, and I couldn’t figure out why.

    .attr('markerWidth', '12')
    .attr('markerHeight', '12')
Labels

Once the positioning for the polylines was solved, then positioning the labels was relatively straightforward, as many of the same trigonometric functions were used.

The challenge I encountered here was that d3.js by default has no text wrapping solution built in to the package, although alternative approaches such as the below had been documented elsewhere. From what I could figure out, d3.js does not support the tspan svg element. That is, I can’t just append tspan elements to text elements to achieve word-wrapping.

  • Example block from Mike Bostock – ‘Wrapping long labels‘: in which Mike Bostock (the creator and maintainer of d3.js) has written a custom function for wrapping text
  • d3-textwrap function from Vijith Assar: which provides a function that can be included into d3.js projects, like a plugin

In the end I ended up just abbreviating a couple of the data point labels rather than sink several hours into text wrapping approaches. It seems odd that svg provides such poor native support for text wrapping, but considering the myriad ways that text – particularly foreign language text – can be wrapped – it’s incredibly complex.

Downloadable svg

The next challenge with this visualisation was to allow the rendered svg to be downloaded – as the donut chart was intended to be part of a larger infographic. Again, I was surprised that a download function wasn’t part of the core d3.js library, but again a number of third party functions and approaches were available:

  • Example block from Miłosz Kłosowicz – ‘Download svg generated from d3‘: in this example, the svg node is converted to base-64 encoded ASCII then downloaded.
  • d3-save-svg plugin: this plugin provides a number of methods to download the svg, and convert it to a raster file format (such as PNG). This is a fork of the svg-crowbar tool, written for similar purposes by the New York Times data journalism team.

I chose to use the d3-save-svg plugin simply because of the abstraction it provided. However, I came up against a number of hurdles. When I first used the example code to try and create a download button, the download function was not being triggered. To work around this, I referenced the svg object by id:

d3_save_svg.save(d3.select('#BaseSvg').node(), config);

The other hiccup with this approach was that CSS rules were not preserved in the svg download if the CSS selector had scope outside the svg object itself. For instance, I had applied basic font styling rules to the entire body selector, but in order for font styling to be preserved in the download, I had to re-specify the font styling at the svg selector level in the CSS file. This was a little frustrating, but the ease of using a function to do the download compensated for this.

Linux Australia expenses 2015-2016 infographic
Linux Australia expenses 2015-2016 infographic

 

Save

Geelong Libraries by branch – a data visualisation

At a glance

Introduction

Geelong Regional Libraries Corporation (GRLC) came on board GovHack this year, and as well as being a sponsor, opened a number of datasets for the hackathon. Being the lead organiser for GovHack, I didn’t actually get a chance to explore the open data during the competition. However, while I was going for a walk one day – as it always does – I had an idea around how the GRLC datasets could be visualised. I’d previously done some work visualising data using a d3.chord layout, and while this data wasn’t suitable for that type of layout, the concept of using annulars – donut charts – to represent and compare the datasets seemed appealing. There was only one problem – I’d never tackled anything like this before.

Challenge: accepted

Understanding what problem I was trying to solve

Of course the first question here was what problem I was trying to solve (thanks Radia Perlman for teaching me to always solve the right problem – I’ll never forget your LCA2013 keynote). Was this an exploratory data visualisation or an explanatory one? This led to formulating a problem statement:

How do the different Libraries in the Geelong region compare to each other in terms of holdings, membership, visits and other attributes?

This clearly established some parameters for the visualisation: it was going to be exploratory, and comparative. It would need to have a way to identify each Library – likely via colour code, and have appropriate use of shapes and axes to allow for comparison. While I was tempted to use a stacked bar chart, I really wanted to dig deeper into d3.js and extend my skills in this Javascript library – so resolved to visualise the data using circular rings.

Colour selection

The first challenge was to ensure that the colours of the visualisation were both appealling and appropriate. While this seems an unlikely starting place for a visualisation – with most practitioners opting to get the basic shape right first, for this project getting the colours right felt like the best starting point. For inspiration, I turned to the Geelong Regional Library Corporation’s Annual Report, and used the ColorZilla extension to eyedropper the key brand colours used in the report. However, this only provided about 7 colours, and I needed 17 in order to map each of the different libraries. In order to identify ‘in between’ colours, I used this nifty tool from Meyerweb, which is super-handy for calculating gradients. The colours were then used as an array for a d3.scaleOrdinal object, and mapped to each library.

var color = d3.scaleOrdinal()
    .range([
        "#59d134",
        "#4CCA62",
        "#40C28F",
        "#33bbbd",
        "#36AFC7",
        "#3b98da",
        "#427DC9",
        "#5148a6",
        "#8647A8",
        "#BC47A9",
        "#f146ab",
        "#F03E85",
        "#F0355E",
        "#f0431e",
        "#F8880F",
        "#FFCC00",
        "#ACCF1A"
      ])
    .domain([
        "Geelong",
        "Belmont",
        "Corio",
        "Geelong West",
        "Waurn Ponds",
        "Ocean Grove",
        "Newcomb",
        "Torquay",
        "Drysdale",
        "Lara",
        "Bannockburn",
        "Queenscliff",
        "Chilwell",
        "Highton",
        "Mobile Libraries",
        "Barwon Heads",
        "Western Heights College"
    ]);

Annular representation of data using d3.pie

Annular representation of data - step 1
First step in annular representation

The first attempt at representing the data was … a first attempt. While I was able to create an annular representation (donut chart) from the data using d3.pie and d3.arc, the labels of the Libraries themselves weren’t positioned well. The best tutorial I’ve read on this topic by far is from data visualisation superstar, Nadieh Bremer, over on her blog, Visual Cinnamon. I decided to leave labels on the arcs as a challenge for later in the process, and instead focus on the next part of visualisation – multiple annulars in one visualisation.

Multiple annulars in one visualisation

Annular representation of data - step 2
Uh-oh!

The second challenge was to place multiple annulars – one for each dataset – within the same svg. Normally with d3.js, you create an svg object which is appended to the body element of the html document. So what happens when you place two d3.pie objects on the svg object? You guessed it! Fail! The two annulars were positioned one under the other, rather than over the top of each other. I was stuck on this problem for a while, until I realised that the solution was to place different annulars on different layers within the svg object. This also gave more control over the visualisation. However, SVG doesn’t have layers as part of its definition – objects in SVG are drawn one on top of the other, with the last drawn object ‘on top’ – sometimes called stacking . But by creating groups within the BaseSvg like the below, for shapes to be drawn within, I was able to approximate layering.

var BaseSvg = d3.select("body").append("svg")
    .attr("width", width)
    .attr("height", height)
    .append("g")
    .attr("transform", "translate(" + (width / 2 - annularXOffset) + "," + (height / 2 - annularYOffset) + ")");

/*
  Layers for each annular
*/

var CollectionLayer = BaseSvg.append('g');
var LoansLayer      = BaseSvg.append('g');
var MembersLayer    = BaseSvg.append('g');
var EventsLayer     = BaseSvg.append('g');
var VisitsLayer     = BaseSvg.append('g');
var WirelessLayer   = BaseSvg.append('g');
var InternetLayer   = BaseSvg.append('g');
var InternetLayer   = BaseSvg.append('g');
var TitleLayer      = BaseSvg.append('g');
var LegendLayer     = BaseSvg.append('g');

At this point I found Scott Murray’s SVG Primer very good reading.

Annular representation of data - step 3
The annulars are now positioned concentrically

I was a step closer!

Adding in parameters for spacing and width of the annulars

Once I’d figured out how to get annulars rendering on top of each other, it was time to experiment with the size and shape of the rings. In order to do this, I tried to define a general approach to the shapes that were being built. That general approach looked a little like this (well, it was a lot more scribble).

General approach to calculating size and proportion of multiple annulars
General approach to calculating size and proportion of multiple annulars

By being able to define a general approach, I was able to declare variables for elements such as the annular width and annular spacing, which became incredibly useful later as more annulars were added – the positioning and shape of the arcs for each annular could be calculated mathematically using these variables (see the source code for how this was done).

var annularXOffset  = 100; // how much to shift the annulars horizontally from centre
var annularYOffset  = 0; // how much to shift the annulars vertically from centre
var annularSpacing  = 26; // space between different annulars
var annularWidth    = 22; // width of each annular
var annularMargin   = 70; // margin between annulars and canvas
var padAngle        = 0.027; // amount that each segment of an annular is padded
var cornerRadius    = 4; // amount that the sectors are rounded

This allowed me to ‘play around’ with the size and shape of the annulars until I got something that was ‘about right’.

Annular representation of data - step 4
Annular spacing overlapped

 

Annular representation of data - step 3
Annular widths and spacing looking better

At this stage I also experimented with the padAngle of the annular arcs (also defined as a variable for easy tweaking), and with the stroke weight and colour, which was defined in CSS. Again, I took inspiration from GRLC’s corporate branding.

Placing dataset labels on the arcs

Now that I had the basic shape of the visualisation, the next challenge was to add dataset labels. This was again a major blocking point, and it took me a lot of tinkering to finally realise that the dataset labels would need to be svg text, sitting on paths created from separate arcs than that rendered by the d3.pie function. Without separate paths, the text wrapped around each arc segment in the annular – shown below. So, for each dataset, I created a new arc and path for the dataset label to be rendered on, and then appended a text element to the path. I’d never used this technique in svg before and it was an interesting learning experience.

Annular representation of data - step 6
Text on arcs is a dark art

Having sketched out a general approach again helped here, as with the addition of a few extra variables I was able to easily create new arcs for the dataset text to sit on. A few more variables to control the positioning of the dataset labels, and voila!

Annular representation of data - step 7
Dataset labels looking good

Adding a legend

The next challenge was to add a legend to the diagram, mostly because I’d decided that the infographic would be too busy with Library labels on each data point. This again took a bit of working through, because while d3.js has a d3.legend function for constructing legends, it’s only intended for data plotted horizontally or vertically, not 7 data sets plotted on consecutive annulars. This tutorial from Zero Viscosity and this one from Competa helped me understand that a legend is really just a group of related rectangles.

var legend = LegendLayer.selectAll("g")
    .data(color.domain())
    .enter()
    .append('g')
    .attr('x', legendPlacementX)
    .attr('y', legendPlacementY)
    .attr('class', 'legend')
    .attr('transform', function(d, i) {
        return 'translate(' + (legendPlacementX + legendWidth) + ',' + (legendPlacementY + (i * legendHeight)) + ')';
});

legend.append('rect')
    .attr('width', legendWidth)
    .attr('height', legendHeight)
    .attr('class', 'legendRect')
    .style('fill', color)
    .style('stroke', legendStrokeColor);

legend.append('text')
    .attr('x', legendWidth + legendXSpacing)
    .attr('y', legendHeight - legendYSpacing)
    .attr('class', 'legendText')
    .text(function(d) { return d; });
Annular representation of data - step 8
The legend isn’t positioned correctly

Again, the positioning took a little work, but eventually I got the legend positioned well.

Annular representation of data - step 9
The legend is finally positioned well

Responsive design and data visualisation with d3.js

One of the other key challenges with this project was attempting to have a reasonably responsive design. This appears to be incredibly hard to do with d3.js. I experimented with a number of settings to aim for a more responsive layout. Originally, the narrative text was positioned in a sidebar to the right of the image, but at different screen resolutions the CSS float rendered awkwardly, so I decided to use a one column layout instead, and this worked much better at different resolutions.

Next, I experimented with using the Javascript values innerWidth and innerHeight to help set the width and height of the svg element, and also dynamically positioned the legend. This gave a much better, while not perfect, rendering at different resolutions. It’s still a little hinkey, particularly at smaller resolutions, but is still an incremental improvement.

Thinking this through more deeply, although SVG and d3.js in general are vector-based, and therefore lend themselves well to responsive design to begin with, there are a number of elements which don’t scale well at different resolutions – such as text sizes. Unless all these elements were to be made dynamic, and likely conditional on viewport and orientation, then it’s going to be challenging indeed to produce a visualisation that’s fully responsive.

Adding tooltips

While I was reasonably pleased with the progress on the project, I felt that the visualisation needed an interactive element. I considered using some sort of arc tween to show movement between data sets, but given that historical data (say for previous years) wasn’t available, this didn’t seem to be an appropriate choice.

After getting very frustrated with the lack of built in tooltips in d3.js itself, I happened upon the d3.tip library. This was a beautifully written addition to d3.js, and although its original intent was for horizontal and vertical chart elements, it worked passably on annular segments.

Annular representation of data - step 10
Adding tooltips

Drawbacks in using d3.tip for circular imagery

One downside I found in using this library was the way in which it considers the positioning of the tooltip – this has some unpredictable, and visually unpleasant results when data is being represented in circular format. In particular, the way that d3.tip calculates the ‘centre’ of the object that it is applied to does not translate well to arc and circular shapes. For instance, look at how the d3.tip is applied to arc segments that are large and have only small amounts of curvature – such as the Geelong arc segment for ‘Members’. I’ve had a bit of a think about how to solve this problem, and the solution involves a more optimal approach to calculating the the ‘centre’ point of an arc segment.

This is beyond what I’m capable of with d3.js, but wanted to call this out as a future enhancement and exploration.

Adding percentage values to the tooltip with d3.nest and d3.sum

The next key challenge was to include the percentage figure, as well as Library and data value in the d3.tip. This was significantly more challenging than I had anticipated, and meant learning up on d3.nest and d3.sum functions. These tutorials from Phoebe Bright, and LearnJS were helpful, and Zan Armstrong’s tutorial on d3.format helped me get the precision formatting correct. After much experimentation, it turned out that summing the values of each dataset (in order to calculate percentage) was but a mere three lines of Javascript:

var CollectionItemCount = d3.nest()
    .rollup(function (v) { return d3.sum(v, function(d) { return d.Items})})
    .entries(CollectionData);

Concluding remarks

Data visualisation is much more challenging than I thought it would be, and the learning curve for d3.js is steep – but it’s worth it. This exercise drew on a range of technical skills, including circular trigonometry, HTML and knowledge of the DOM, CSS and Javascript, and above all the ability to ‘break a problem down’ and look at it from multiple angles (no pun intended).

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Valentine’s Day: A data visualization to learn Chords in d3.js

One of my learning goals this year was to really understand d3.js, and become more proficient in creating interactive data visualizations. In turn, this lead me to attempting to learn and analyse Chord diagrams. Chord diagrams visualize relationships either unilaterally or bilaterally. For example, they have been used to show capital flows inbound and outbound in financial visualization.

Learning how Chord diagrams are constructed in d3.js

Firstly, I wanted a solid primer on how Chord diagrams are constructed in d3.js. Steven Hall’s excellent blog post for Delimited provided the best overview, and clearly articulated elements of a Chord diagram such as the Matrix, the map of flows, arcs and paths. Using this approach, I decided to construct my own scenario and see if I could visualize it. The scenario had to have:

  • Bi-directional data flows which may be asymmetric (x has a relationship with y, but y may have a different relationship with x)
  • A small enough dataset that I could manually construct it (without having to do a lot of CSV or json processing – this exercise was about learning Chords, not about grokking data loading in d3.js)
  • A dataset that could be easily understood by a layperson

I settled on the concept of Valentine’s Day crushes, because they satisfy the above criteria. Next, I constructed a number of statements that were to be visualized. They assumed that one person expressed a crush on one other person, and that this may or may not be mutual. After doing an initial list, I had to call myself out – I’d assumed hetero-normative relationships (male attracted to female and vice-versa), but of course that’s simply not diverse or inclusive thinking.

  • Bob (male, hetero) likes Emily (female, hetero)
  • Giovanni (male, hetero) likes Emily (female, hetero)
  • Kevin (male, hetero) likes Poh (female, hetero)
  • Art (male, hetero) likes Viva (female, hetero)
  • Pyotr (male, hetero) likes Viva (female, hetero)
  • Rohan (male, hetero) likes Rachel (female, hetero)
  • Sasha (male, same-sex attracted) likes Pyotr
  • Emily likes Bob
  • Rachel likes Rohan
  • Poh (female, hetero) likes Sasha
  • Viva likes Pyotr
  • Lee (female, same-sex attracted) likes Poh

The next step was to convert these statements into a matrix.

The matrix

Matrices are usually built from spreadsheet or other tabular datasets. Therefore, it was helpful for me to represent the above relationships in a table.

Valentine’s Day preferences
Name Bob Giovanni Steve Kevin Art Pyotr Rohan Sasha Emily Rachel Poh Viva Lee
Bob No No No No No No No Yes No No No No
Giovanni No No No No No No No Yes No No No No
Steve No No No No No No No No Yes No No No
Kevin No No No No No No No No No Yes No No
Art No No No No No No No No No No Yes No
Pyotr No No No No No No No No No No Yes No
Rohan No No No No No No No No Yes No No No
Sasha No No No No No Yes No No No No No No
Emily Yes No No No No No No No No No No No
Rachel No No No No No No Yes No No No No No
Poh No No No No No No No Yes No No No No
Viva No No No No No Yes No No No No No No
Lee No No No No No No No No No No Yes No

The matrix is an inherent part of Chord diagrams. Chord diagrams are based on a symmetric matrix  – that is, there are as many rows in the matrix as there are columns. One of the first mistakes I made in this exercise was not to have the columns and rows in the same order – I ordered the names of people in the columns differently to the rows. When this was implemented as a Chord layout in d3.js, it was an incorrect representation.

Trap: Ensure that in your data matrix, that the data in rows and columns is in the same order. If you don’t have your data in the same order, the Chord diagram will assume that your row data is in the same order as your column data.

From this table, I was then able to declare a matrix variable:

var matrix = [
  [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0], // Bob
  [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0], // Giovanni
  [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0], // Steve
  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0], // Kevin
  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0], // Art
  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0], // Pyotr
  [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0], // Rohan
  [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0], // Rachel
  [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], // Sasha
  [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0], // Emily
  [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0], // Lee
  [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0], // Viva
  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]  // Poh
]

Chords and ribbons

The next step in this process is to calculate the flows in the matrix in both directions. In this example, it means calculating the relationship between each of the people, and that relationship may not be equal or symmetric. For instance, Rachel likes Rohan, and Rohan likes Rachel this is a symmetric relationship. However, Kevin likes Poh, but Poh likes Sasha. The relationship is asymmetric. This is reasonably simple to do (and Steven Hall provides an excellent pseudocode example in his blog post). d3.js being the knock-your-socks-off piece of awesome that is provides an inbuilt function for this:

chords(matrix)

This function calculates several values needed to calculate the chords in a Chord diagram, including the starting and ending angle of the chords, as well as the values (inbound and outbound) of the chords.  See the d3.js manual entry for more information.

Arcs

A Chord diagram often has labels around the outside of the diagram, and these are produced in d3.js by passing the subgroups (from the matrix() function) to the

d3.arc()

generator. See the d3.js manual entry for more information.

Problems encountered

One of the key issues I encountered in getting this far with the data visualization was the syntax changes between version 3 of d3.js and version 4. There are a number of changes to method names, and the way that they are called under version 4. Many of the examples I used as a jumping off point were done prior to mid-2016 (when version 4 was released), using the older syntax. My example used version 4, and this resulted in a number of syntax errors if I was ‘copying and pasting’ code.

The way I handled this was to have a good read through the ChangeLog for version 4, noting in particular the changes to Chord and Ribbon methods.

Another issue that occupied some brain cycles was the different way that the matrices were calculated. Many earlier examples and tutorials included a ‘mapper’ function where series data was mapped to a matrix. In my example, the matrix() function did mapping as well.

As someone who’s familiar with OO-style programming, I’m still getting used to the way that d3.js modifies the document object model, first by selecting, entering and then modifying one or more DOM elements. This is something I’m just going to have to “get used to” as I use d3.js more.

Trap: Ensure you’re using the correct method calls for the version of d3.js you’re using

Visual design aspects

Data is only one part of the data visualization lifecycle. In order to be useful, it has to be visualized in a meaningful way.

Colours

The first major choice was how to represent the different people in the visualisation. It made sense to use different colours for men and women, and following (traditional, socialised, boring, gender-normative – I get it) what people are expecting, I chose blue for men and pink for women (from the Pantone Colours of Spring 2016 palette. Because Pantone). This provided a pleasant looking graphic, but data visualization is about telling a story.

I decided to add in two more colours to represent the same-sex attracted people in the data series (Sasha and Poh). This added visual interest, and made it easier to interpret some of the interesting details about the whole that were not apparent from the initial statements.

Chord diagram with solid colour in ribbons
Chord diagram with solid colour in ribbons

Using gradients in the ribbons

As you can see from the above, the solid colours (well, solid colours with a transparency applied) don’t really narrate the story of this visualisation very well. The pink colour dominates, and nuances (such as the unrequited love triangle between Poh, Sasha and Lee) in the data are less obvious.

Visually, I wanted the ribbons in the diagram to have gradients. At first glance this looked incredibly complex, and I was about to give up, when I found an excellent article by Nadieh Bremer, one of the gurus of d3.js, on this exact topic. Nadieh’s article provides the mathematical basis for visually appealing gradients in ribbons, including how the direction of the gradient is calculated, based on the position and direction of the ribbon. It’s very well articulated, and you don’t even need basic trigonometry skills to get it – it’s visually explained.

In a nutshell, Nadieh’s code calculates the gradient start and stop points for each ribbon, and the angular direction in which the gradient should be applied.

Using Nadieh’s code, I then applied gradients to the ribbons, for a much more informative and meaningful visualization.

Chord diagram with gradients
Chord diagram with gradients

Arc labels

The next tricky piece visually was adding the name of each person to the arc. For this, I relied on code from this Chord example from AndrewRP.  This included applying CSS styles to the svg text, which I hadn’t done before. Because you’re styling text within an svg element, you need to prefix the selector with the svg element:

svg .titles {
  font-size: 180%;
  font-family: "Abel", sans-serif;
  font-weight: bold
}

This wasn’t something I’d done before, so it was a great learning experience.

I’m still not entirely happy with the labels in the arcs – I would much prefer them to be larger, bolder and centred within the arc segments themselves. A good extension activity for another day.

Telling the story

Of course a key point with a data visualization is for the data to tell a story.

Visualizing the Valentine’s day sentiments that we started off with allows us to derive a lot more meaning from the data overall:

  • We can see a tragic unrequited love triangle. Lee is attracted to Poh, who is attracted to Sasha, who is attracted to Pyotr, but Pyotr and Viva are mutually attracted
  • No-one is attracted to Lee, Giovanni, Kevin or Art
  • We can see that Rohan and Rachel, Bob and Emily and Viva and Pyotr have mutual attraction.
  • Rachel and Viva are both liked by two men and thus are the most popular women

Next steps

There were some additional elements I would have liked to have added to this visualization, but ran out of time to implement – but I’m noting them here as extension activities if I come back to this in the future.

  • As mentioned, I’d like to clean up the arc titles and visually enhance them
  • Being able to click on a ribbon and learn more information about a particular relationship would be useful and visually pleasing. Nadieh Bremer again has a worked example of how to achieve this, however the code is quite complex and requires a large code base for the ToolTip functionality.
  • It would also be ideal to isolate a particular ribbon – especially given that so many ribbons are overlapping in the centre, making it harder to follow visually who is attracted to who. This would use some form of opacity change to ‘fade’ the ribbons not selected.
  • A number of the variables are statically declared in the code, as arrays. While this is totally fine as a learning example, for reusability I’d much prefer to put them into CSV or JSON files, then use Javascript to read them in.

Get the code

See the final visualization at:

http://blog.kathyreid.id.au/valentines/

And get the source code on GitHub at:

https://github.com/KathyReid/valentines-dataviz