Part Four: Data Joins

The data join is a cornerstone of D3’s philosophy and is one of the mechanisms that allow amazing interactive data visualisations to be built with relative simplicity.

What is a data join?

Given an array of data and a D3 selection of HTML or SVG elements, we can attach or ‘join’ each array element to the corresponding element of the selection.

This creates a close relationship between your data and graphical elements, which makes manipulating the elements based on the data straightforward.

How to make a join

There are two functions in D3 that join data to a D3 selection: .datum() and .data().

.datum()

.datum() joins a single entity to a selection, whilst .data() joins an array to a selection. The former is used only occasionally. Let’s look at some examples. Suppose we have five circle elements on our page.

Let’s now select the first circle and join a single piece of data to it:

var s = d3.select('circle');
s.datum(100);

.data()

Similarly, we can select all the circles and join an array to them:

var s = d3.selectAll('circle');
s.data([5, 25, 20, 50, 40]);
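
To make the pairing concrete, here is a minimal sketch in plain JavaScript of what joining by position amounts to. It is not D3's implementation: the joinByIndex helper and the plain objects standing in for the five circle elements are assumptions made for illustration, and the sketch ignores what D3 does when the array and the selection differ in length.

```javascript
// Hypothetical stand-ins for five <circle> elements (plain objects, not DOM nodes)
var circles = [{}, {}, {}, {}, {}];

// Hypothetical helper: pair the i-th array element with the i-th element,
// storing it on the __data__ property (the property name D3 itself uses)
function joinByIndex(elements, data) {
  elements.forEach(function(element, i) {
    if (i < data.length) {
      element.__data__ = data[i];
    }
  });
  return elements;
}

joinByIndex(circles, [5, 25, 20, 50, 40]);

console.log(circles[0].__data__); // 5
console.log(circles[4].__data__); // 40
```

The first circle receives 5, the second 25, and so on, which is the default pairing when no other matching rule is given.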

Manipulation based on joined data

If you recall the previous article on selections, we can manipulate style and attributes using .style() and .attr(), respectively.

For example, to change the radius of each circle we can use:

s.attr('r', 20);

Suppose we want to set the radius of each circle to the value of its joined data. Instead of providing a constant value to .attr(), we provide an anonymous function:

s.attr('r', function(d) {
  return d;
});

What’s happening here is that when D3 sees that we’re passing a function into .attr() it calls this function for each element in the selection. It passes the value of the joined data into the function (via the variable d) and sets the radius to the return value of the function.

You can think of it like this:

For each circle in the selection:
  Call function(d) {return d;}, passing the circle's joined data value into the function
  Set the radius to the return value of the function
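
The steps above can be sketched in plain JavaScript. This is only an illustration of the idea, not D3's source code: the setAttr helper and the plain objects standing in for circles (each with an attributes bag and a __data__ property) are hypothetical.

```javascript
// Hypothetical stand-ins for circles that already have data joined to them
var circles = [5, 25, 20, 50, 40].map(function(value) {
  return {__data__: value, attributes: {}};
});

// Hypothetical helper mimicking the idea behind selection.attr(name, value)
function setAttr(elements, name, value) {
  elements.forEach(function(element, i) {
    // If value is a function, call it with the joined data (and the index);
    // otherwise use the constant as it is
    var result = typeof value === 'function' ? value(element.__data__, i) : value;
    element.attributes[name] = result;
  });
}

// Constant: every circle gets the same radius
setAttr(circles, 'r', 20);

// Function: each circle's radius comes from its own joined data
setAttr(circles, 'r', function(d) {
  return d;
});

var radii = circles.map(function(c) { return c.attributes.r; });
console.log(radii.join(', ')); // 5, 25, 20, 50, 40
```

The same helper handles both the constant and the function case, which mirrors how .attr() accepts either.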

Arrays of objects

This gets more interesting when we join an array of objects, where each object holds several variables. For example:

var cities = [
  {name: 'London', population: 8416500, continent: 'Europe'},
  {name: 'New York City', population: 8419000, continent: 'North America'},
  {name: 'Paris', population: 2241000, continent: 'Europe'},
  {name: 'Shanghai', population: 24150000, continent: 'Asia'},
  {name: 'Tokyo', population: 13297000, continent: 'Asia'},
];

Now let’s join this to the circles:

s.data(cities);

Now let’s set up a scale function and a colour lookup, then set each circle’s area in proportion to the population and its colour according to the continent:

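// Square-root scale, so that circle area (not radius) is proportional to population.
// This is the D3 v3 API; from D3 v4 onwards the equivalent is d3.scaleSqrt().
var radiusScale = d3.scale.sqrt().domain([0, 25000000]).range([0, 50]);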

var colour = {
  Europe: '#66c2a5',
  'North America': '#8da0cb',
  Asia: '#fc8d62'
};

// Set the radius according to the population
s.attr('r', function(d) {
  return radiusScale(d.population);
});

// Set the colour according to the continent
s.style('fill', function(d) {
  return colour[d.continent];
});

This results in a simple visualisation where each city is represented by a circle, with its area proportional to the city’s population and its colour representing the city’s continent.
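
To see why a square-root scale is used for the radius, here is a small sketch that reproduces the scale's arithmetic in plain JavaScript and checks that the resulting areas are in proportion to the populations. The makeSqrtScale helper is a hypothetical stand-in written for this illustration, assuming a domain and range that both start at zero; it is not part of D3.

```javascript
// Hypothetical plain-JavaScript equivalent of a square-root scale with
// domain [0, domainMax] and range [0, rangeMax]
function makeSqrtScale(domainMax, rangeMax) {
  return function(value) {
    return rangeMax * Math.sqrt(value / domainMax);
  };
}

var radiusScale = makeSqrtScale(25000000, 50);

var london = 8416500;
var shanghai = 24150000;

// Circle area is pi * r^2, so comparing r^2 is enough to compare areas
var areaRatio = Math.pow(radiusScale(shanghai), 2) / Math.pow(radiusScale(london), 2);
var populationRatio = shanghai / london;

console.log(areaRatio.toFixed(3), populationRatio.toFixed(3));
```

The two printed ratios agree, whereas a linear scale on the radius would exaggerate the larger city because area grows with the square of the radius.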

Under the hood

It can be useful to know what D3 is actually doing behind the scenes when it joins data. It’s pretty straightforward: it adds a property called __data__, holding the joined data, to each HTML/SVG element. We can inspect this property in the debugging console of Chrome. For example:

  • right click on the left-most circle and select ‘Inspect Element’. This shows the element in Chrome’s Developer Tools:

  • open the ‘Properties’ tab and expand circle, and then __data__
  • you’ll be able to see the joined data, in this case the name, population and continent of the selected city:
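
The joined data can also be read from code rather than through the Properties tab. The sketch below uses a plain object as a hypothetical stand-in for a circle element after s.data(cities), and a hypothetical joinedData helper; on a real page the element would be a DOM node.

```javascript
// Hypothetical stand-in for the first circle after s.data(cities)
var circle = {
  tagName: 'circle',
  __data__: {name: 'London', population: 8416500, continent: 'Europe'}
};

// Hypothetical helper: read whatever has been joined to an element
function joinedData(element) {
  return element.__data__;
}

var d = joinedData(circle);
console.log(d.name + ': ' + d.population + ' (' + d.continent + ')');
// London: 8416500 (Europe)
```

In a browser with D3 loaded, calling .datum() with no arguments on a selection, for example d3.select('circle').datum(), returns the data joined to the first selected element. In Chrome's console, $0.__data__ shows the same for the element currently selected in the Elements panel.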

Summary

In this article we’ve taken a look at D3’s data joins. We’ve seen how a single datum can be joined to a selection using .datum() and how an array can be joined to a selection using .data().

Once data has been joined to a selection, we can manipulate the selection based on the joined data. For example, we can alter an element’s geometry or style based on the data. This is a very powerful technique that allows anything from bar charts to much more customised visualisations to be created.

Data joins are fundamental to D3, so I do recommend experimenting further with them and studying other resources, in particular D3’s documentation on joins.
