Quantile, quantize and threshold scales

Introduction

Quantile, quantize and threshold scales map a continuous domain to a discrete range. The three scales are similar, but each works in a slightly different way. All three divide the domain into classes and map each class to a discrete range value. They differ in how the domain values are divided into classes.


const data = [0, 6, 8, 10, 22, 29, 41, 49, 58, 70, 88, 100];
console.log(data.length); // logs: 12

const quantileScale = d3.scaleQuantile()
  .domain(data)
  .range(['steelblue', 'orange', 'lime']);
console.log(quantileScale.quantiles()); // logs: [ 18, 52 ] // 12/3 = 4 values per class. The quantiles lie between the 4th and 5th values and between the 8th and 9th values.
console.log(quantileScale.invertExtent('steelblue')); // logs: [ 0, 18 ]
console.log(quantileScale.invertExtent('orange')); // logs: [ 18, 52 ] 
console.log(quantileScale.invertExtent('lime')); // logs: [ 52, 100 ]

// ***

const quantizeScale = d3.scaleQuantize()
  .domain(d3.extent(data))
  .range(['steelblue', 'orange', 'lime']);
console.log(quantizeScale.invertExtent('steelblue')); // logs: [ 0, 33.333333333333336 ] 
console.log(quantizeScale.invertExtent('orange')); // logs: [ 33.333333333333336, 66.66666666666667 ] 
console.log(quantizeScale.invertExtent('lime')); // logs: [ 66.66666666666667, 100 ]

// ***

const thresholdScale = d3.scaleThreshold()
  .domain([10, 50])
  .range(['tan', 'steelblue', 'orange']);
console.log(thresholdScale.invertExtent('tan')); // logs: [ undefined, 10 ] // d < 10
console.log(thresholdScale.invertExtent('steelblue')); // logs: [ 10, 50 ] // 10 <= d < 50
console.log(thresholdScale.invertExtent('orange')); // logs: [ 50, undefined ] // d >= 50

These scales are particularly useful for building classed (discrete) sequential or diverging scales, where each class is represented by a distinguishable color that can be looked up in a legend. See the previous section about color scales.

Continuous color scales are generally not suitable for data visualizations where the user should be able to read a color as an actual value without being told the value in some other way. With a continuous color scale it is virtually impossible for the human eye to distinguish the subtle color differences in the gradient. With discrete sequential or diverging scales the values are grouped into classes, each with an easily distinguishable color.

The next example uses a continuous linear scale. Can you see which rectangle represents 100, or tell the values 25, 45 and 100 apart?


const data = [14,25,32,4,10,60,38,19,24,19,2,23,8,100,22,16,11,27,2,13,30,14,45,5,25];
// Map the full value range (0 to the largest value) onto a white-to-red gradient.
const colorScale = d3.scaleLinear([0, d3.max(data)], ["white", "red"]);
d3.select("#container")
  .selectAll("span")
  .data(data)
  .join("span")
  .style("background", d => colorScale(d));

Result:

Quantile scales

The domain of a quantile scale is considered continuous, but it is specified as a population of discrete sample values. The domain is sorted and then separated into classes that each hold a (roughly) equal number of values. The 'cut points', or quantiles, are calculated as the 'edges' of the classes. The number of classes is equal to the number of values in the range.


const data = [14,25,32,4,10,60,38,19,24,19,2,23,8,100,22,16,11,27,2,13,30,14,45,5,25];
const colors = ["white", "pink", "red", "DarkRed"];		
const colorScale = d3.scaleQuantile()
  .domain(data)
  .range(colors);
d3.select("#container")
  .selectAll("span")
  .data(data)
  .join("span")
  .style("background", d => colorScale(d));
  
const legend = d3.select("#legend");
for (let i = 0; i < colors.length; i++) {
  const div = legend.append("div");
  // Lower and upper bound of the class shown in this color.
  const [min, max] = colorScale.invertExtent(colors[i]);
  div.append("span")
    .style("background", colors[i]);
  div.append("text")
    .text(` ${min} - ${max}`);
}

Result:

Legend:

Note that the classes do not span equal ranges: 2 - 11 is much narrower than 27 - 100. This is because the data set is skewed (not symmetric about its mean). The legend gives an indication that the distribution is skewed, but the data visualization itself does not show this.

Roughly 25% of the values are between 2 and 11, 25% between 11 and 19, 25% between 19 and 27 and 25% between 27 and 100. In a real data set, each 25% class might represent the 25% poorest, 25% shortest, 25% happiest etc. of the population.
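To make the cut points concrete, here is a minimal plain-JavaScript sketch that recomputes them for the data set above by linear interpolation between neighbouring sorted values. The helpers `quantileThresholds` and `classIndex` are hypothetical names introduced for illustration; they are not part of D3 and only approximate what the scale does internally.

```javascript
// Plain-JavaScript sketch; quantileThresholds and classIndex are
// hypothetical helpers for illustration, not D3 functions.
const data = [14,25,32,4,10,60,38,19,24,19,2,23,8,100,22,16,11,27,2,13,30,14,45,5,25];

// Compute the cut points by linear interpolation between sorted neighbours.
function quantileThresholds(values, classCount) {
  const sorted = [...values].sort((a, b) => a - b);
  const thresholds = [];
  for (let k = 1; k < classCount; k++) {
    const position = (sorted.length - 1) * (k / classCount);
    const lower = Math.floor(position);
    const upper = Math.min(lower + 1, sorted.length - 1);
    thresholds.push(sorted[lower] + (sorted[upper] - sorted[lower]) * (position - lower));
  }
  return thresholds;
}

// A value's class index is the number of cut points that are <= the value.
function classIndex(thresholds, value) {
  return thresholds.filter(t => t <= value).length;
}

const thresholds = quantileThresholds(data, 4);
const counts = [0, 0, 0, 0];
for (const d of data) counts[classIndex(thresholds, d)]++;

console.log(thresholds); // logs: [ 11, 19, 27 ]
console.log(counts);     // logs: [ 6, 5, 7, 7 ]
```

The counts show why the split is only "roughly" equal: 25 values cannot be divided evenly over four classes, and tied values land in the same class.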

Because the domain is considered continuous, the scale also accepts input values that fall between (or outside) the sorted sample values. Both a domain and a range must be specified; otherwise the scale returns undefined.


const quantileScale = d3.scaleQuantile()
  .domain([13, 1, 99])
  .range(["first interval", "second interval", "third interval"]);

console.log(quantileScale.quantiles()); // logs: [ 9, 41.66666666666666 ]
console.log(quantileScale(-1)); // logs: "first interval"
console.log(quantileScale(8)); // logs: "first interval"
console.log(quantileScale(9)); // logs: "second interval"
console.log(quantileScale(30)); // logs: "second interval"
console.log(quantileScale(313)); // logs: "third interval"

const invalidScale = d3.scaleQuantile()
  .domain([13, 1, 99]);
console.log(invalidScale(8)); // logs: undefined

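D3 also provides scaleSequentialQuantile. Here the sorted domain values themselves act as the quantiles, and the number of distinct output values equals the number of specified domain values. Instead of a discrete range, the scale takes an interpolator (see sequential scales); if none is specified, the scale returns the normalized position between 0 and 1.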


const quantileScale = d3.scaleSequentialQuantile()
  .domain([13, 1, 99]);
  
console.log(quantileScale.quantiles(2)); // logs: [ 1, 13, 99 ]
console.log(quantileScale(-1)); // logs: 0
console.log(quantileScale(1)); // logs: 0
console.log(quantileScale(12)); // logs: 0
console.log(quantileScale(13)); // logs: 0.5
console.log(quantileScale(99)); // logs: 1
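One way to read the outputs above: the scale finds the rank of the input among the sorted domain values and normalizes that rank to the interval [0, 1] before passing it to the interpolator. The sketch below reproduces this in plain JavaScript. `sequentialQuantileSketch` is a hypothetical helper written for illustration, not a D3 function, and it ignores edge cases such as NaN inputs.

```javascript
// Hypothetical helper in plain JavaScript; not a D3 function.
function sequentialQuantileSketch(domain, interpolator = t => t) {
  const sorted = [...domain].sort((a, b) => a - b);
  return function (x) {
    // Count the sorted domain values that are <= x, but never fewer than one,
    // so that inputs below the minimum map to 0.
    const rank = Math.max(1, sorted.filter(v => v <= x).length);
    return interpolator((rank - 1) / (sorted.length - 1));
  };
}

const sketchScale = sequentialQuantileSketch([13, 1, 99]);
console.log([-1, 1, 12, 13, 99].map(x => sketchScale(x))); // logs: [ 0, 0, 0, 0.5, 1 ]
```

Passing a color interpolator as the second argument would turn the normalized rank into a color instead of a number.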

Quantize scales

Quantize scales are similar to linear scales, except the range is discrete. The continuous domain is divided into uniform segments, one for each value in the range.

Figure: quantize scale graph, x-axis (domain) versus y-axis (range).

const data = [14,25,32,4,10,60,38,19,24,19,2,23,8,100,22,16,11,27,2,13,30,14,45,5,25];
const colors = ["white", "pink", "red", "DarkRed"];		
const colorScale = d3.scaleQuantize()
  .domain(d3.extent(data))
  .range(colors);
d3.select("#container")
  .selectAll("span")
  .data(data)
  .join("span")
  .style("background", d => colorScale(d));
  
const legend = d3.select("#legend");
for (let i = 0; i < colors.length; i++) {
  const div = legend.append("div");
  // Lower and upper bound of the class shown in this color.
  const [min, max] = colorScale.invertExtent(colors[i]);
  div.append("span")
    .style("background", colors[i]);
  div.append("text")
    .text(` ${min} - ${max}`);
}

Result:

Legend:

Note that the classes now have equal widths. The data visualization shows that the data distribution is skewed (the data set is not symmetric about its mean): only one value sits in the "highest" class, while the vast majority of the values are grouped in the "lowest" class.
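The equal-width split can be checked with a short plain-JavaScript sketch that divides the extent of the same data set into four uniform segments and counts the values in each. `quantizeBoundaries` is a hypothetical helper introduced for illustration, not a D3 function.

```javascript
// Plain-JavaScript sketch; quantizeBoundaries is a hypothetical helper, not a D3 function.
const data = [14,25,32,4,10,60,38,19,24,19,2,23,8,100,22,16,11,27,2,13,30,14,45,5,25];

// Split [min, max] into classCount segments of equal width and return the inner boundaries.
function quantizeBoundaries(min, max, classCount) {
  const boundaries = [];
  for (let k = 1; k < classCount; k++) {
    boundaries.push(min + (max - min) * k / classCount);
  }
  return boundaries;
}

const boundaries = quantizeBoundaries(Math.min(...data), Math.max(...data), 4);

// A value's class index is the number of boundaries that are <= the value.
const counts = [0, 0, 0, 0];
for (const d of data) counts[boundaries.filter(b => b <= d).length]++;

console.log(boundaries); // logs: [ 26.5, 51, 75.5 ]
console.log(counts);     // logs: [ 18, 5, 1, 1 ]
```

Compared with the quantile split, the class widths are now identical (24.5 each), while the number of values per class is very uneven.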

Previously we showed an example that used a linear scale and rangeRound. The next example produces the same result, but now using scaleQuantize.


const students = [
  { name: "Joe", score: 35},
  { name: "Abby", score: 86},
  { name: "Casey", score: 44},
  { name: "Max", score: 62}
];

const grades = d3.scaleQuantize([0, 1], ["F", "E", "D", "C", "B", "A"]);
  
d3.select("#container")
  .selectAll("span")
  .data(students)
  .join("span")
    .text(d => ` ${d.name} scored a ${grades(d.score / 100)}.`); 

Result:

Threshold scales

Threshold scales are similar to quantize scales, except that you choose the class boundaries yourself. A domain of n thresholds is paired with a range of n + 1 values.
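The lookup itself is simple and can be sketched in plain JavaScript: count how many thresholds are less than or equal to the input and use that count as the index into the range. `thresholdSketch` is a hypothetical helper for illustration, not a D3 function, and it assumes the thresholds are given in ascending order.

```javascript
// Hypothetical helper in plain JavaScript; not a D3 function.
// Assumes thresholds are sorted in ascending order.
function thresholdSketch(thresholds, range) {
  return function (x) {
    // The class index is the number of thresholds that are <= x.
    const index = thresholds.filter(t => t <= x).length;
    return range[index];
  };
}

const colorOf = thresholdSketch([25, 50, 75], ["white", "pink", "red", "DarkRed"]);
console.log(colorOf(24));  // logs: "white"
console.log(colorOf(25));  // logs: "pink" (a value equal to a threshold falls in the upper class)
console.log(colorOf(60));  // logs: "red"
console.log(colorOf(100)); // logs: "DarkRed"
```

Because the boundaries are chosen by hand, they can follow meaningful limits (for example pass marks or legal limits) instead of the shape of the data.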

Figure: threshold scale graph, x-axis (domain) versus y-axis (range).

const data = [14,25,32,4,10,60,38,19,24,19,2,23,8,100,22,16,11,27,2,13,30,14,45,5,25];
const colors = ["white", "pink", "red", "DarkRed"];	
const colorScale = d3.scaleThreshold()
  .domain([25, 50, 75])
  .range(colors);
d3.select("#container")
  .selectAll("span")
  .data(data)
  .join("span")
  .style("background", d => colorScale(d));
  
const legend = d3.select("#legend");
for (let i = 0; i < colors.length; i++) {
  const div = legend.append("div");
  const min = colorScale.invertExtent(colors[i])[0];
  const max = colorScale.invertExtent(colors[i])[1];
  div.append("span")
    .style("background", colors[i]);
  div.append("text")
    .text(` ${min === undefined ? "−∞" : min} - ${max === undefined ? "∞" : max}`);		  
}   

Result:

Legend: