Using barycentric coordinates to plot 3D probabilities and categorical data

Suppose you have some three-dimensional categorical data and you’ve come up with some clever way to model the underlying probabilities. Let’s say, for example, that you have data from a categorization experiment in which there are three possible responses to m stimuli, from each of n experimental subjects. Let’s say that you also have estimates of the probability of giving each response to each stimulus for each subject. Now, suppose further that you would like to present your data and modeled probabilities visually. I’m assuming that you, like me, don’t want to lose any information, and that you want the viewer of your plots to be able to figure out what’s going on as easily as possible.

I’m dealing with a case in which a whole mess of people gave one of three possible responses to each of nine stimuli in each of ten experimental blocks. I’ve got a clever (?) model (more on this at a later date) of these responses, and I want to write it all up for a fancy-pants journal, so I’ve been thinking about how to make informative, accessible figures for a little while now. I think I’ve got a good method for doing so. I won’t be presenting the model or the actual data here, the former since I want to focus on the graphics and the latter since I am not at liberty to do so.

I’ll be focusing on a simplified scenario in which the stimulus and subject aspects of the data are irrelevant, which is to say that I’ll be talking about n generic multinomial data vectors of length three. If you’d like to play along, you can use the following R code to generate random multinomial probabilities and count data. Pick an n that matches your mood today, then copy and paste this code into your R (or, preferably, RStudio) console:

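  n = 100                       # number of probability vectors; pick any n you like
  t = matrix(nrow=n,ncol=3)     # probabilities (R still finds the function t() below)
  da = matrix(nrow=n,ncol=3)    # one response per row
  db = matrix(nrow=n,ncol=3)    # ten responses per row
  for(i in 1:n){
    t[i,] = runif(3)
    t[i,] = t[i,]/sum(t[i,])    # normalize so each row sums to 1
    da[i,] = t(rmultinom(1,1,t[i,]))
    db[i,] = t(rmultinom(1,10,t[i,]))
  }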

Now, we’re going to use barycentric coordinates on a triangle to make a visual representation of these probabilities and count data. Each row of the matrix t has three probabilities that sum to one. So, for example, row one might be (0.43, 0.28, 0.29). You could, if you so desired, use two of these numbers as x and y coordinates, and then you could use the other to determine the size or color of a plotted symbol. Or you could realize that these numbers, properly transformed, determine a unique point in a triangle. By way of explanation, here’s the top part of the figure from the Wikipedia page linked to above:

So, the example given above – (0.43, 0.28, 0.29) – would be in the top left portion of the triangle, since the first element is largest, and it would be close to equidistant from the top right and bottom corners. By way of contrast, the vector (0.1, 0.6, 0.3) would be closest to the top right corner, and much closer to the bottom than to the top left corner. You get the point.
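To make the geometry concrete, here is a minimal sketch that converts a probability vector to x/y coordinates and measures its distance to each corner. The helper names (simplex_corners, bary_to_xy, corner_dist) and the default placement of the triangle are assumptions made for illustration, with the corners arranged as described above: (1,0,0) at the top left, (0,1,0) at the top right, and (0,0,1) at the bottom.

```r
# Hypothetical helpers (not part of the original post).
# Corners of an equilateral triangle with side length a, bottom corner at (x3, y3).
simplex_corners <- function(a = 1, x3 = 0.5, y3 = 0) {
  h <- a * sqrt(3) / 2                      # height of the triangle
  rbind(top_left  = c(x3 - a / 2, y3 + h),  # corner for (1,0,0)
        top_right = c(x3 + a / 2, y3 + h),  # corner for (0,1,0)
        bottom    = c(x3,         y3))      # corner for (0,0,1)
}

# A point is the probability-weighted average of the three corners.
bary_to_xy <- function(p, corners = simplex_corners()) {
  drop(p %*% corners)
}

# Euclidean distance from the plotted point to each corner.
corner_dist <- function(p, corners = simplex_corners()) {
  xy <- bary_to_xy(p, corners)
  sqrt(rowSums(sweep(corners, 2, xy)^2))
}

p1 <- c(0.43, 0.28, 0.29)
p2 <- c(0.1, 0.6, 0.3)
corner_dist(p1)  # smallest for top_left; the other two are nearly equal
corner_dist(p2)  # smallest for top_right, then bottom, then top_left
```

The weighted-average form is the same arithmetic as shifting the bottom corner left by the first probability and right by the second, just written so that the role of each corner is visible.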

So, here’s a function that takes a matrix of n 3D probability vectors and the associated count data, and uses the probabilities to determine the coordinates in the triangle and the count data to specify the RGB values for the plotted symbols. If the count data have only one non-zero element in each row, and this element is a 1, then you get red (1, 0, 0), green (0, 1, 0), or blue (0, 0, 1) symbols. If the count data have values larger than 1 and/or more than one non-zero element, you get graded colors determined by transforming the counts into proportions. Okay, so here’s the function:
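As a small illustration of the color rule just described, the sketch below maps a single count vector to a color with base R's rgb(). The helper name count_to_color is hypothetical and is not part of the plotting function; it only isolates the counts-to-proportions step.

```r
# Hypothetical helper (not part of the original post): one count vector -> color.
count_to_color <- function(counts) {
  total <- sum(counts)
  # rows that sum to more than 1 are rescaled to proportions
  prop <- if (total > 1) counts / total else counts
  rgb(prop[1], prop[2], prop[3])
}

count_to_color(c(1, 0, 0))   # "#FF0000", pure red
count_to_color(c(0, 10, 0))  # "#00FF00", pure green
count_to_color(c(5, 3, 2))   # a graded mix, mostly red
```

Because the three proportions sum to one, a row with mixed responses can never reach full intensity in any channel, which is why mixed rows come out darker than single-response rows.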

plot.simplex <- function(theta, data, a = 1, x3 = .5, y3.off = 0,
                         new.plot = TRUE, xlm = c(0,1), ylm = c(0,1)){

  # corners: (x1,y1) top left = (1,0,0), (x2,y2) top right = (0,1,0),
  # (x3,y3) bottom = (0,0,1); y3 centers the triangle vertically in [0,1]
  y3 = .5*(1-a*sqrt(3)/2) + y3.off
  x1 = x3-.5*a
  x2 = x3+.5*a
  y1 = y3 + a*sqrt(3)/2
  y2 = y3 + a*sqrt(3)/2

  nr = nrow(theta)
  rtot = vector(length=nr)
  clrm = vector(length=nr)
  if(new.plot){
    plot.new()
  }
  plot.window(xlim=xlm,ylim=ylm)
  for(i in 1:nr){
    # color: counts (rescaled to proportions if they sum to more than 1) as RGB
    rtot[i] = sum(data[i,])
    if(rtot[i]>1){
      ctemp = data[i,]/rtot[i]
    }else{
      ctemp = data[i,]
    }
    clrm[i] = rgb(ctemp[1],ctemp[2],ctemp[3])

    # location: probability-weighted average of the three corners
    xi = x3-theta[i,1]*a/2+theta[i,2]*a/2
    yi = y3+sum(theta[i,1:2])*a*sqrt(3)/2
    points(xi,yi,pch=19,col=clrm[i])
  }
  # triangle edges
  lines(c(x3,x1),c(y3,y1),lty=2)
  lines(c(x3,x2),c(y3,y2),lty=2)
  lines(c(x1,x2),c(y1,y2),lty=2)
  # corner labels; maybe need to make cex values flexible
  text(x1,y1,"(1,0,0)",pos=3,cex=.75)
  text(x2,y2,"(0,1,0)",pos=3,cex=.75)
  text(x3,y3,"(0,0,1)",pos=1,cex=.75)
}

x3 determines the horizontal location of the triangle, a is the length of the sides, and y3.off moves the triangle up (if it’s positive) or down (if it’s negative). The new.plot argument can be set to F to plot multiple triangles on the same plot, and the xlm and ylm arguments can be adjusted (with a, x3, and y3.off) as desired (e.g., so that the edges don’t get cut off).

And here’s a pretty (?) picture produced by this code (and the da matrix from the simulation code pasted above):

It’s noisy, but the red dots tend to be near the top left, the green dots near the top right, and the blue dots near the bottom, as they should be. There are ‘mistakes’, of course. The red dot down near the bottom, for example, is a case in which the third element in the probability vector was the largest but the random count datum generated was (1, 0, 0).

Here’s a murkier picture, produced with the same probabilities and the db data, which has rows that sum to 10 rather than 1:

Here the color partially encodes/duplicates the location, since the count data more closely reproduce the underlying probabilities. Hence, the points in the middle tend to be darker red, green, or blue, with the really bright exemplars tending to occur only near the corners.
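The claim that larger counts track the probabilities more closely can be checked with a quick simulation. This is only a sketch: the seed, the sample size, and the helper name mean_gap are arbitrary choices made for illustration and are not taken from the analysis described in this post.

```r
set.seed(1)                          # arbitrary seed, only for reproducibility
n <- 1000
p <- matrix(runif(n * 3), ncol = 3)
p <- p / rowSums(p)                  # each row is a probability vector

# Mean absolute gap between observed proportions and true probabilities
# when each row gets `size` multinomial responses.
mean_gap <- function(size) {
  counts <- t(apply(p, 1, function(pr) rmultinom(1, size, pr)))
  mean(abs(counts / size - p))
}

gap1  <- mean_gap(1)    # one response per row, as in da
gap10 <- mean_gap(10)   # ten responses per row, as in db
c(gap1 = gap1, gap10 = gap10)
```

The gap for ten responses per row should come out well below the gap for a single response, which is the sense in which the colors in the second picture duplicate the locations.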

Finally, I wrote the code so that you can plot multiple triangles (e.g., for different experimental conditions, subjects, stimuli, or whatever). Here’s a set of four triangles plotting four sets of simulated (1-count) data:
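One way to lay out four triangles is sketched below. The grid positions, the side length of 0.4, the widened axis limits, and the names layout4, sim_theta, and sim_data are all assumptions made for illustration; the drawing loop only runs if plot.simplex and two lists of four matrices with those placeholder names already exist in the workspace.

```r
# Hypothetical layout (not from the original post): a 2 x 2 grid of
# triangles with side length 0.4 inside the unit square.
a <- 0.4
layout4 <- expand.grid(x3 = c(0.25, 0.75), y3.off = c(0.25, -0.25))

# Vertical extent of each triangle, using the same formula as plot.simplex
layout4$bottom <- 0.5 * (1 - a * sqrt(3) / 2) + layout4$y3.off
layout4$top    <- layout4$bottom + a * sqrt(3) / 2

# sim_theta and sim_data are placeholder names for lists of four
# probability matrices and four count matrices, respectively.
if (exists("plot.simplex") && exists("sim_theta") && exists("sim_data")) {
  for (k in seq_len(nrow(layout4))) {
    plot.simplex(sim_theta[[k]], sim_data[[k]], a = a,
                 x3 = layout4$x3[k], y3.off = layout4$y3.off[k],
                 new.plot = (k == 1),          # start a new plot only once
                 xlm = c(-0.1, 1.1), ylm = c(-0.1, 1.1))
  }
}
```

The limits are widened slightly beyond the unit square so that the corner labels have some room; other values may suit other devices or symbol sizes better.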

It’s a bit more cluttered inside the triangles, but the size of the symbols is easy enough to change for different situations. For that matter, the symbols themselves can be changed. Little triangles inside the big triangles might be nice, for example. And I suppose I’ll probably add in some code to make the text at the corners optional, too, at some point, since it adds to the clutter in a multi-triangle plot.

As noted above, I’ll try to post soon about the model (and indirectly about the data) that brought me to work on this. I’m hoping to use it to evaluate the fit of the model to the data, but haven’t done so yet. Constructive and/or amusingly abusive feedback is welcome and greatly appreciated.

This entry was posted in statistical graphics.

One Response to Using barycentric coordinates to plot 3D probabilities and categorical data

  1. Mike Brady says:

    Very clever. I like it. One thing I’m thinking about is how this method might scale for visualizing data with more than three possible responses.
