Joining data

The data join

D3 works with a concept called the data join. A data join creates a correspondence between an array of data and a selection of elements.

The data(dataArray) method joins a selection to an array of data. This creates three selections: enter, update, and exit.

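[Figure: Venn diagram of two overlapping circles, Data and Elements. Enter is the part of Data outside Elements, Update is the overlap, and Exit is the part of Elements outside Data.] From "Thinking with Joins" by Mike Bostock. https://bost.ocks.org/mike/join/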

const svg = d3.select("body").append("svg")
  .attr("width", "100%")
  .attr("height", "100%")
  .attr("viewBox", "0 0 200 95");

let circle = svg.selectAll("circle")
  .data([10, 20, 30, 40]);    // returns the update selection

circle.enter()                // returns the enter selection
  .append("circle")
    .style("fill", "orange")
    .attr("cy", 50)
    .attr("r",  (d, i) => d)		  
    .attr("cx", (d, i) => i * d + d);	

In the example above, svg.selectAll("circle") returns an empty selection, because the newly created svg element does not contain any circle elements yet.

The data() method binds the specified data array to the selection, which is still empty at this point. It returns a new selection that represents the update selection, which is also still empty. By default, data and selected elements are joined by index (the first datum is assigned to the first element, the second datum to the second element, and so on). An optional key function can be passed as the second argument to data() to change this default assignment rule.

Besides the update selection, data() also defines the enter and exit selections. All three selections are bound to the data.

The enter() method returns the enter selection: placeholder nodes for each datum not bound to an element. Performing an append() on the enter selection appends the "missing" elements, corresponding to the data, to the parent element (appending four circles to the svg element, in the example above).
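To make the accessor functions in the example concrete, the following plain JavaScript sketch evaluates them outside of D3. The helper name circleAttributes is hypothetical and only serves this illustration; it mimics how D3 calls an accessor with each datum d and its index i.

```javascript
// Hypothetical helper (not part of D3): evaluates the accessors from the
// example above for each datum d and its index i.
function circleAttributes(data) {
  return data.map((d, i) => ({
    r: d,           // same as .attr("r",  (d, i) => d)
    cx: i * d + d   // same as .attr("cx", (d, i) => i * d + d)
  }));
}

const attrs = circleAttributes([10, 20, 30, 40]);
console.log(attrs.map(a => a.cx)); // cx values: 10, 40, 90, 160
```

With these values the circles span x = 0 to 20, 20 to 60, 60 to 120, and 120 to 200, so they touch each other and exactly fill the viewBox width of 200.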

Now suppose that the data array changes every time the user clicks a button:


const svg = d3.select("body").append("svg")
  .attr("width", "100%")
  .attr("height", "100%")
  .attr("viewBox", "0 0 300 25");
  
update();
document.querySelector("button")
  .addEventListener("click", update, false);

function update() {
  const text = svg.selectAll("text")
    .data(randomLetters())           // returns the update selection
      .attr("stroke", "blue")
      .attr("stroke-width", "0.5");

  text.enter()                       // returns the enter selection
    .append("text")
      .style("fill", "orange")
      .style("font-size", "1rem")    // no space between number and unit
      .attr("y", 16)
    .merge(text)                     // enter + update: new and existing elements
      .text(d => d)
      .attr("x", (d, i) => i * 16);

  text.exit()                        // returns the exit selection
    .remove();                       // removes superfluous elements, if any
}

function randomLetters() {
  // a random, alphabetically sorted sample of 6 to 25 distinct letters
  return d3.shuffle("abcdefghijklmnopqrstuvwxyz".split(""))
    .slice(0, Math.floor(6 + Math.random() * 20))
    .sort();
}

Original example from: Observable: selection.join

On every update, the data() method binds the new data array to the current selection, i.e., the text elements that already exist. If the new data array contains more data points than there are existing elements, a placeholder node for each extra data point is placed in the enter selection. If it contains fewer data points, the obsolete elements are placed in the exit selection. The remaining existing elements populate the update selection.

In the example above, if the new data array contains more data points than the previous one, .enter().append() appends only the "missing" elements, i.e., elements for the extra data points. The existing elements still need to be positioned along with the new ones, and they need new text content that matches the new data. There is no need to recreate them, though. To apply the same modifications to both new and existing elements, we first merge the enter selection with the update selection.

If the new data array contains fewer data points than the previous one, data() places the "superfluous" elements, which correspond to old data, in the exit selection instead of the update selection. Calling remove() on the exit selection then removes them from the document.
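The bookkeeping described above can be modelled in a few lines of plain JavaScript. This is a simplified sketch of a join by index, not D3's actual implementation; the function joinByIndex and the strings standing in for DOM elements are hypothetical.

```javascript
// Hypothetical model of a join by index (not D3's implementation).
// `elements` stands in for the DOM elements that already exist.
function joinByIndex(elements, data) {
  const shared = Math.min(elements.length, data.length);
  return {
    // existing elements that receive a (new) datum
    update: elements.slice(0, shared).map((el, i) => ({ el, datum: data[i] })),
    // data points without an element: placeholders to append
    enter: data.slice(shared),
    // elements without a datum: candidates for remove()
    exit: elements.slice(shared)
  };
}

const existing = ["text0", "text1", "text2"];

// More data than elements: 3 updates, "d" and "e" enter, nothing exits.
const grow = joinByIndex(existing, ["a", "b", "c", "d", "e"]);
console.log(grow.update.length, grow.enter, grow.exit);

// Fewer data than elements: 2 updates, nothing enters, "text2" exits.
const shrink = joinByIndex(existing, ["x", "y"]);
console.log(shrink.update.length, shrink.enter, shrink.exit);
```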

selection.join

The selection.join method replaces and combines selection.enter, selection.exit, selection.append, selection.merge, selection.remove, and selection.order (the last of which is needed when the order of the new data differs from that of the old data).

Optional enter, update, and exit functions may be specified. In the simplest form, only the name of the element to append is passed:


let rect = svg.selectAll("rect")
  .data([1, 2, 3])
  .join("rect");

This is equivalent to:


let rect = svg.selectAll("rect")
  .data([1, 2, 3])
  .join(
    enter => enter.append("rect"),
    update => update,
    exit => exit.remove()			  
  );	

By specifying enter, update, and exit functions, and by passing a key function to selection.data, you can minimize changes to the DOM and to the attributes and styles of elements, which can improve runtime performance.
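Why a key function reduces changes can be seen in a small plain JavaScript model. This is a sketch under simplifying assumptions (unique keys, records standing in for DOM elements), not D3's implementation; joinByKey is a hypothetical name. In D3 itself, the key function is passed as the second argument, as in .data(letters, d => d).

```javascript
// Hypothetical model of a keyed join (not D3's implementation).
// `elements` is a list of { id, datum } records standing in for DOM
// elements and the datum currently bound to each. Keys are assumed unique.
function joinByKey(elements, data, key) {
  const byKey = new Map(elements.map(el => [key(el.datum), el]));
  const update = [];
  const enter = [];
  for (const d of data) {
    const k = key(d);
    if (byKey.has(k)) {
      update.push({ el: byKey.get(k), datum: d }); // element is reused
      byKey.delete(k);
    } else {
      enter.push(d); // no element carries this key yet
    }
  }
  return { update, enter, exit: [...byKey.values()] }; // leftovers exit
}

const elements = ["a", "b", "c", "d"].map((letter, i) => ({ id: i, datum: letter }));
const result = joinByKey(elements, ["b", "c", "d", "e"], d => d);

// Elements 1, 2 and 3 keep their letters; only "a" exits and "e" enters.
console.log(result.update.map(u => u.el.id));
console.log(result.enter);
console.log(result.exit.map(el => el.id));
```

With a join by index, the same update would assign a new letter to all four elements, so every text node would have to be rewritten.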

The next example updates a string of letters, like the example in the previous section, only now selection.join is used instead of the .enter().append() and .exit().remove() calls, and the update runs on a timer instead of on a button click.


let lastUpdate = 0;
const svg = d3.select("body")
  .append("svg")
    .attr("width", "100%")
    .attr("height", "100%")
    .attr("viewBox", "0 0 300 25");

function update() {
  svg.selectAll("text")
    .data(randomLetters())
    .join(
      enter => enter.append("text")
        // modifications on the enter selection, i.e., the new elements:
        .style("fill", "orange")
        .style("font-size", "1rem")   // no space between number and unit
        .attr("y", 16),
      update => update
        // modifications on the update selection, i.e., the existing elements:
        .attr("stroke", "blue")
        .attr("stroke-width", "0.5")
    )
      // modifications on all elements, after the merge:
      .text(d => d)
      .attr("x", (d, i) => i * 16);
}

function animation(timestamp) {
  // call update() every 2 seconds
  if (timestamp - lastUpdate >= 2000) {
    lastUpdate = timestamp;
    update();
  }
  requestAnimationFrame(animation);
};
requestAnimationFrame(animation);

function randomLetters() {
  return d3.shuffle("abcdefghijklmnopqrstuvwxyz".split(""))
    .slice(0, Math.floor(6 + Math.random() * 20))
    .sort();
}

Note that the exit function is omitted, since .join() calls exit.remove() by default.

Examples

The data joined in D3 needs to be in an array. Raw data is usually not available as a JavaScript array: data in interchange formats such as JSON or CSV first needs to be transformed into a JavaScript array of objects. More about this later.
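As a rough illustration of such a transformation, the sketch below turns a JSON string and a small CSV string into arrays of objects using plain JavaScript. The function parseSimpleCsv is hypothetical and deliberately naive: it assumes a header row, comma separators, and no quoted fields. D3 ships its own parsers for real data (for example d3.csvParse), so this only shows the shape of the result.

```javascript
// JSON: the built-in parser already yields an array of objects.
const fromJson = JSON.parse('[{"name": "Joe", "score": 35}, {"name": "Abby", "score": 16}]');

// Naive CSV-to-objects conversion (hypothetical helper). Assumes a header
// row, comma separators, and no quoted fields.
function parseSimpleCsv(text) {
  const [header, ...rows] = text.trim().split("\n");
  const columns = header.split(",");
  return rows.map(row => {
    const values = row.split(",");
    return Object.fromEntries(columns.map((c, i) => [c, values[i]]));
  });
}

// CSV values arrive as strings, so the score is converted to a number.
const fromCsv = parseSimpleCsv("name,score\nJoe,35\nAbby,16")
  .map(d => ({ name: d.name, score: Number(d.score) }));

console.log(fromJson, fromCsv);
```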

So, in D3 you will typically join arrays of objects and elements:


const players = [
 { name: "Joe", score: 35},
 { name: "Abby", score: 16},
 { name: "Casey", score: 44},
 { name: "Max", score: 62}
];	

const maxScore = d3.max(players, d => d.score); // 62
const barWidth = 20;
const fontSize = 6;	

const svg = d3.select("svg")
  .attr("font-family", "sans-serif")
  .attr("font-size", fontSize)
  .attr("text-anchor", "middle")
  .attr("viewBox", `0 0 ${players.length * barWidth} ${maxScore + (fontSize * 2)}`);

const bar = svg.selectAll("g")
  .data(players)
  .join("g")
    .attr("transform", (d, i) => `translate(${i * barWidth}, 0)`);	

bar
  .append("rect")
    .style("fill", "orange")
    .attr("width", barWidth - 1)
    .attr("height", d => d.score)
    .attr("y", fontSize + 2);

bar
  .append("text")
    .style("fill", "gray")
    .attr("dx", barWidth / 2 - 0.5)
    .attr("y", fontSize)
    .text(d => d.name);

The best way to determine the width and horizontal position of the bars, as well as the position of the names, is to use a D3 scale (such as a band scale). More about this later. For now, it is convenient to join the data to an SVG group (bar in the example above) and to append a rect and a text to each group. This takes only one data join, and each child inherits the datum of its group.

However, a data update will not work with this construction. The join updates the groups (adding or removing groups, in accordance with the data join discussed above), but not their children. Because append() runs on every group again, each update adds a new rect and text on top of the existing ones. You could remove() all children before appending the new ones, but it is generally better to join the data to the elements that actually visualize it: the rect elements in the example above. The next example demonstrates this.


const minScore = 10;	
const maxScore = 100;
const barWidth = 20;
const fontSize = 6;

let players = createData();

const svg = d3.select("svg")
  .attr("font-family", "sans-serif")
  .attr("font-size", fontSize)
  .attr("text-anchor", "middle")
  .attr("viewBox", `0 0 ${players.length * barWidth} ${maxScore + (fontSize * 2)}`);

let lastUpdate = 0;
requestAnimationFrame(animation);
createXaxis();	  

function animation(timestamp) {
  // call update() every 2 seconds
  if (timestamp - lastUpdate >= 2000) {
    lastUpdate = timestamp;
    update();
  }
  requestAnimationFrame(animation);
};

function createXaxis() {
  svg.selectAll("text")
    .data(players)
    .join("text")		
      .attr("transform", (d, i) => `translate(${i * barWidth}, 0)`)
      .style('fill', "gray")
      .attr('dx', barWidth/2 - 0.5 )
      .attr('y', fontSize)
      .text(d => d.name);
};

function update() {
  players = createData();
  svg.selectAll("rect")
    .data(players)
    .join("rect")
      .attr("transform", (d, i) => `translate(${i * barWidth}, 0)`)
      .style('fill', "orange")
      .attr('width', barWidth - 1)	  
      .attr('y', fontSize + 2 )
      .transition().attr('height', d => d.score ); // *) see below			
};

function createData() {
  return [
    { name: "Joe", score: getRandomNumber(minScore, maxScore)},
    { name: "Abby", score: getRandomNumber(minScore, maxScore)},
    { name: "Casey", score: getRandomNumber(minScore, maxScore)},
    { name: "Max", score: getRandomNumber(minScore, maxScore)}
  ];
};	

function getRandomNumber(min, max) {
  return Math.random() * (max - min) + min;
};	

*) Transitions will be explained later.

Create an HTML table

Create an HTML table from a matrix:


const matrix = [
  [ 7, 12,  1, 14],
  [ 2, 13,  8, 11],
  [16,  3, 10,  5],
  [ 9,  6, 15,  4]
];

d3.select("body")
  .append("table")
  .selectAll("tr")
  .data(matrix)
  .join("tr") // joins each array to each row
  .selectAll("td")
  .data(d => d)
  .join("td")
    .text(d => d);	

This example is taken from the D3 documentation.
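The nested join walks the matrix the same way two nested map() calls would: the outer join creates one row per inner array, and the inner join, via data(d => d), creates one cell per number. The plain JavaScript sketch below builds the markup one would expect from that traversal; tableMarkup is a hypothetical helper, not part of D3, and a smaller matrix is used to keep the output short.

```javascript
// Hypothetical helper (not part of D3): mirrors the nested join with two
// nested map() calls and returns the resulting markup as a string.
function tableMarkup(matrix) {
  const rows = matrix.map(row => {
    // inner level: one td per number in the row
    const cells = row.map(cell => `<td>${cell}</td>`).join("");
    // outer level: one tr per inner array
    return `<tr>${cells}</tr>`;
  });
  return `<table>${rows.join("")}</table>`;
}

console.log(tableMarkup([[7, 12], [2, 13]]));
// <table><tr><td>7</td><td>12</td></tr><tr><td>2</td><td>13</td></tr></table>
```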

Create an HTML table from an array of objects:


<table>
  <tr><th>name</th><th>score</th></tr>
</table>
<script>
const players = [
  { name: "Joe", score: 35},
  { name: "Abby", score: 16},
  { name: "Casey", score: 44},
  { name: "Max", score: 62}
];

d3.select("table")
  .selectAll("tr:not(:first-child)")
  .data(players)
  .join("tr") // joins each object to each row
  .selectAll("td")
  .data(d => [d.name, d.score])
  .join("td")
    .text(d => d);
</script>
