When I decided to start using Python for data analysis and visualization, I was a bit worried about the latter, as I had become fairly fond of the flexibility of the R base graphics. The more I use matplotlib, though, the happier I am with it.

Case in point: today I figured out how to plot ellipses for bivariate normal densities. In much of my work with general recognition theory (GRT), I have focused on a fairly restricted set of models. Specifically, I have focused on cases in which the marginal variances of the modeled perceptual distributions are not identifiable (well, technically speaking, the issue is that the marginal variances and means are not *both* identifiable).

In general, a useful way to illustrate GRT models is by taking a bird’s-eye view and looking down at contours describing the modeled distributions, like so:

In this kind of simple case, you can just pick a level above the plane at which to figuratively slice the modeled densities, and you get contours that are comparable across the perceptual distributions.

However, in certain GRT models and associated data sets (e.g., in which you have more than two response levels on each dimension), the marginal variances are identifiable (along with the means). In this case, slicing each density at the same height produces bad graphics, since densities with larger marginal variances are more spread out and lower down than otherwise comparable densities with smaller marginal variances. Hence, slicing at a height sufficiently high up off the base plane can, counterintuitively, produce *smaller* ellipses for densities with larger marginal variances.

The solution is to plot ellipses that enclose a specified volume (i.e., for which the integral over the region specified by the ellipse takes a particular value) rather than at a specified height.

The code I wrote to plot the height-based figures calculates values for the bivariate normal densities on a grid, then uses built-in ‘contour’ functions to plot the ellipses for a given height. To plot based on volume, this approach would be either very ugly or totally non-functional. Thankfully, I was able to find (and adapt) code that takes an analytical approach based on the density mean and covariance parameters and uses some nice built-in matplotlib features. The adaptation consists mostly of switching from specifying the number of standard deviations to specifying the volume, based on the nice description of the relationships between density parameters and volumes given here.

Here’s a simple example figure (not from a fitted model, just from some semi-randomly chosen covariance and mean values), with each ellipse enclosing half of the volume of each bivariate normal density, and with unit marginal variances for the density located at the origin:

Here’s the adapted code:

def plot_cov_ellipse(cov, pos, volume=.5, ax=None, fc='none', ec=[0,0,0], a=1, lw=2): """ Plots an ellipse enclosing *volume* based on the specified covariance matrix (*cov*) and location (*pos*). Additional keyword arguments are passed on to the ellipse patch artist. Parameters ---------- cov : The 2x2 covariance matrix to base the ellipse on pos : The location of the center of the ellipse. Expects a 2-element sequence of [x0, y0]. volume : The volume inside the ellipse; defaults to 0.5 ax : The axis that the ellipse will be plotted on. Defaults to the current axis. """ import numpy as np from scipy.stats import chi2 import matplotlib.pyplot as plt from matplotlib.patches import Ellipse def eigsorted(cov): vals, vecs = np.linalg.eigh(cov) order = vals.argsort()[::-1] return vals[order], vecs[:,order] if ax is None: ax = plt.gca() vals, vecs = eigsorted(cov) theta = np.degrees(np.arctan2(*vecs[:,0][::-1])) kwrg = {'facecolor':fc, 'edgecolor':ec, 'alpha':a, 'linewidth':lw} # Width and height are "full" widths, not radius width, height = 2 * np.sqrt(chi2.ppf(volume,2)) * np.sqrt(vals) ellip = Ellipse(xy=pos, width=width, height=height, angle=theta, **kwrg) ax.add_artist(ellip)