LESSON: Error bars and measures of dispersion
FOCUS QUESTION: How can I depict uncertainty and variability in data?
This lesson discusses various ways of putting error bars on graphs.
In this lesson you will:

Contents
 DATA FOR THIS LESSON
 SETUP FOR LESSON
 EXAMPLE 1: Load the data about New York contagious diseases
 EXAMPLE 2: Compute the overall mean and standard deviation of measles and chickenpox
 EXAMPLE 3: Compare overall average and SD of monthly counts of measles and chickenpox
 EXAMPLE 4: Compute mean and standard deviation of monthly measles cases by year
 EXAMPLE 5: Plot the SD error bars for measles monthly counts by year
 EXAMPLE 6: Plot the SD error bars on a bar chart for measles
 EXAMPLE 7: Compute median, MAD and IQR by month for measles
 EXAMPLE 8: Plot median monthly measles with IQR for error bars
 EXAMPLE 9: Plot IQR and MAD error bars on the same graph
 SUMMARY OF SYNTAX
DATA FOR THIS LESSON
File  Description 
NYCDiseases.mat  The data set contains the monthly totals of the number of new cases of measles, mumps, and chicken pox for New York City during the years 1931-1971. The file contains the variables measles, mumps, chickenPox, and years.
The data was first published in: Yorke, J.A. and London, W.P. (1973). "Recurrent Outbreaks of Measles, Chickenpox and Mumps", American Journal of Epidemiology, Vol. 98, pp. 469. 
SETUP FOR LESSON
 Create an ErrorBars directory on your V: drive and make it your current directory.
 Download NYCDiseases.mat to your ErrorBars directory.
 Create an ErrorBarLesson script file in your ErrorBars directory.
EXAMPLE 1: Load the data about New York contagious diseases
Create a new cell in which you type and execute:
load NYCDiseases.mat; % Load the disease data
You should see measles, mumps, chickenPox, and years variables in the Workspace Browser.
EXAMPLE 2: Compute the overall mean and standard deviation of measles and chickenpox
Create a new cell in which you type and execute:
measlesAver = mean(measles(:));         % Calculate overall average measles
measlesSD = std(measles(:), 1);         % Calculate overall std measles
chickenPoxAver = mean(chickenPox(:));   % Calculate overall average chickenpox
chickenPoxSD = std(chickenPox(:), 1);   % Calculate overall std chickenpox
You should see the following variables in your Workspace Browser:
 measlesAver  overall average of measles
 measlesSD  overall standard deviation of measles
 chickenPoxAver  overall average of chickenpox
 chickenPoxSD  overall standard deviation of chickenpox
Note: because the second argument of std is 1, we computed the population standard deviation (normalized by N), not the sample standard deviation (normalized by N-1).
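As a minimal sketch of what the second argument of std changes, the following compares both normalizations on a small made-up vector. The variable toyCounts and its values are hypothetical and are not taken from NYCDiseases.mat.

```matlab
% Hypothetical toy data (not lesson data)
toyCounts = [2, 4, 4, 4, 5, 5, 7, 9];
n = numel(toyCounts);

popSD = std(toyCounts, 1);      % second argument 1: normalize by n
sampleSD = std(toyCounts, 0);   % second argument 0 (default): normalize by n - 1

% The same two quantities computed directly from their definitions
sumSq = sum((toyCounts - mean(toyCounts)).^2);
popByHand = sqrt(sumSq / n);
sampleByHand = sqrt(sumSq / (n - 1));

fprintf('Population SD: %g (by hand: %g)\n', popSD, popByHand);
fprintf('Sample SD:     %g (by hand: %g)\n', sampleSD, sampleByHand);
```

For the same data, the population value is always the smaller of the two, and the difference shrinks as the number of observations grows.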
EXAMPLE 3: Compare overall average and SD of monthly counts of measles and chickenpox
Create a new cell in which you type and execute:
figure
hold on
errorbar(1, measlesAver./1000, measlesSD./1000, 'rs');
errorbar(2, chickenPoxAver./1000, chickenPoxSD./1000, 'ko');
hold off
xlabel('Disease')
ylabel('Monthly averages (in thousands)')
title('Childhood diseases NYC: 1931-1971 (SD error bars)')
set(gca, 'XTickMode', 'manual', 'XTick', 1:2, ...
    'XTickLabelMode', 'manual', 'XTickLabel', {'Measles', 'Chicken Pox'}, ...
    'XLim', [0.5, 2.5])
You should see a Figure Window with a labeled error bar plot:
Look at the graph from EXAMPLE 3. What doesn't make sense? (Hint: what does the value at the bottom of the measles SD error bar mean?)
EXERCISE 3: Copy the code in EXAMPLE 3 and modify it to also include mumps.
EXAMPLE 4: Compute mean and standard deviation of monthly measles cases by year
Create a new cell in which you type and execute:
measlesByYearAver = mean(measles, 2);   % Average monthly measles by year
measlesByYearSD = std(measles, 1, 2);   % Std monthly measles by year
You should see the following variables in your Workspace Browser:
 measlesByYearAver  a 41 x 1 array of average monthly measles cases by year
 measlesByYearSD  a 41 x 1 array of standard deviations of monthly measles cases by year
Note: because the second argument of std is 1, we computed the population standard deviation (normalized by N), not the sample standard deviation (normalized by N-1).
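The dimension argument decides whether the statistics are taken along the rows or down the columns. A small sketch, assuming a made-up matrix laid out like the lesson data (rows are years, columns are months); toyCases is hypothetical:

```matlab
% Hypothetical toy data: 3 "years" (rows) by 4 "months" (columns)
toyCases = [ 1,  2,  3,  4;
            10, 20, 30, 40;
             5,  5,  5,  5];

byYearAver = mean(toyCases, 2);    % 3 x 1: one average per row (year)
byYearSD = std(toyCases, 1, 2);    % 3 x 1: population SD per row (year)
byMonthAver = mean(toyCases, 1);   % 1 x 4: one average per column (month)

disp(size(byYearAver))    % size of the by-year result
disp(size(byMonthAver))   % size of the by-month result
disp(byYearSD')           % the constant third row has zero spread
```

Working along dimension 2 collapses the columns and leaves one value per row, which is why the by-year results in this lesson are column vectors.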
EXAMPLE 5: Plot the SD error bars for measles monthly counts by year
Create a new cell in which you type and execute:
figure
errorbar(years, measlesByYearAver./1000, measlesByYearSD./1000, 'ks');
xlabel('Year');
ylabel('Monthly averages (in thousands)')
title('Measles NYC: 1931-1971 (SD error bars)')
set(gca, 'YLimMode', 'manual', 'YLim', [0, 20])
You should see a Figure Window with a labeled error bar plot:
EXAMPLE 6: Plot the SD error bars on a bar chart for measles
Create a new cell in which you type and execute:
figure
hold on
errorbar(years, measlesByYearAver./1000, measlesByYearSD./1000, 'ks');
bar(years, measlesByYearAver./1000, 'FaceColor', [0.5, 0.5, 1])
plot(years, measlesByYearAver./1000, 'LineStyle', 'none', ...
    'Marker', 's', 'MarkerEdgeColor', 'k', 'MarkerFaceColor', 'r')
hold off
xlabel('Year');
ylabel('Monthly averages (in thousands)')
title('Measles NYC: 1931-1971 (SD error bars)')
set(gca, 'YLimMode', 'manual', 'YLim', [0, 20])
You should see a Figure Window with a labeled error bar plot:
EXAMPLE 7: Compute median, MAD and IQR by month for measles
Create a new cell in which you type and execute:
measlesByMonthMedian = median(measles, 1);        % Median by month
measlesByMonthMAD = mad(measles, 1, 1);           % Median absolute deviation by month
measlesByMonthIQR = prctile(measles, [25, 75]);   % 25th and 75th percentiles by month
You should see the following 3 variables in your Workspace Browser:
 measlesByMonthMedian  the median measles by month
 measlesByMonthMAD  median absolute deviation (MAD) by month
 measlesByMonthIQR  25th and 75th percentiles of measles by month (the lower and upper bounds of the IQR)
The rows of measlesByMonthIQR correspond to the percentiles, and the columns correspond to the months.
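To see the shapes these functions return, and why the median-based measures are less sensitive to outliers, here is a small sketch on made-up data. The matrix toy is hypothetical; as in the lesson, mad and prctile are assumed to be available (in MATLAB they come from the Statistics and Machine Learning Toolbox).

```matlab
% Hypothetical toy data: 5 observations (rows) of 2 series (columns)
toy = [  1, 10;
         2, 20;
         3, 30;
         4, 40;
       100, 50];   % column 1 contains one large outlier

toyMedian = median(toy, 1);              % 1 x 2: median of each column
toyMAD = mad(toy, 1, 1);                 % 1 x 2: median absolute deviation
toyMeanAD = mad(toy, 0, 1);              % 1 x 2: mean absolute deviation
toyQuartiles = prctile(toy, [25, 75]);   % 2 x 2: row 1 = 25th, row 2 = 75th

disp(toyMedian)
disp(toyMAD)
disp(toyMeanAD)
disp(size(toyQuartiles))
```

In column 1 the median absolute deviation stays small, while the mean absolute deviation is pulled up by the single outlier.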
EXAMPLE 8: Plot median monthly measles with IQR for error bars
Create a new cell in which you type and execute:
xPositions = 1:12;
lowerDist = measlesByMonthMedian - measlesByMonthIQR(1, :);   % Bottom bar
upperDist = measlesByMonthIQR(2, :) - measlesByMonthMedian;   % Top bar
figure
errorbar(xPositions, measlesByMonthMedian./1000, ...
    lowerDist./1000, upperDist./1000, 'm*')
xlabel('Month');
ylabel('Cases in thousands')
title('Measles cases in NYC: 1931-1971')
legend('Median (IQR error bars)', 'Location', 'Northeast')   % Upper right
You should see the following 3 variables in your Workspace Browser:
 lowerDist  lengths of lower edges of IQR error bars for median
 upperDist  lengths of upper edges of IQR error bars for median
 xPositions  vector with the values 1..12
You should see a Figure Window with median/IQR error bars:
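The four-argument form of errorbar expects lengths measured from the plotted value, not the percentile values themselves. A minimal sketch with made-up numbers (toyMed and toyBounds are hypothetical, not lesson data) shows the conversion:

```matlab
% Hypothetical medians and percentile bounds for three groups
toyMed = [5, 8, 6];
toyBounds = [3,  7, 2;    % row 1: 25th percentiles
             9, 10, 7];   % row 2: 75th percentiles

% Convert percentile values into distances below and above the median
lowerLen = toyMed - toyBounds(1, :);   % distance down to the 25th percentile
upperLen = toyBounds(2, :) - toyMed;   % distance up to the 75th percentile

figure
errorbar(1:3, toyMed, lowerLen, upperLen, 'ko')
set(gca, 'XLim', [0.5, 3.5])
xlabel('Group')
ylabel('Value')
```

Both lengths are non-negative whenever the median lies between the two percentiles, so the wings may have different lengths but never cross the marker.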
EXAMPLE 9: Plot IQR and MAD error bars on the same graph
Create a new cell in which you type and execute:
figure
hold on
errorbar(xPositions - 0.1, measlesByMonthMedian./1000, ...
    lowerDist./1000, upperDist./1000, 'm*')
errorbar(xPositions + 0.1, measlesByMonthMedian./1000, ...
    measlesByMonthMAD./1000, 'ks')
hold off
xlabel('Month');
ylabel('Median in thousands')
title('Measles cases in NYC: 1931-1971')
legend('IQR error bars', 'MAD error bars', 'Location', 'Northeast')
You should see a Figure Window with two sets of error bars:
SUMMARY OF SYNTAX
MATLAB syntax  Description 
errorbar(Y, E)  Create a plot of the values of Y similar to plot(Y). The corresponding values in E give the length of each wing of the error bars that extend above and below the corresponding values in Y.
errorbar(X, Y, E)  Create a plot similar to errorbar(Y, E) except that this function uses the values of X for the horizontal positions rather than using the integers 1, 2, ... .
errorbar(X, Y, L, U)  Create a plot similar to errorbar(X, Y, E) except that this function uses the values of L and U to determine the lengths of the lower and upper wings of the error bars, respectively.
mad(X)  Compute the average or mean absolute deviation for the array X across the first nonsingleton dimension. For 2D arrays, this computes the mean absolute deviation across the rows (resulting in the mean absolute deviations of the columns). 
mad(X, 0, 1)  Compute the average or mean absolute deviation for the array X across dimension 1 (resulting in the mean absolute deviations of the columns). Note: If the second argument is 1, we compute the median absolute deviation. 
mad(X, 0, 2)  Compute the average or mean absolute deviation for the array X across dimension 2 (resulting in the mean absolute deviations of the rows). Note: If the second argument is 1, we compute the median absolute deviation. 
Y = prctile(X, p)  Compute a vector of the percentiles of the vector X. The vector p specifies the percentiles. When X is a 2D array, the ith row of Y contains the percentiles p(i) of the columns of X.
std(X)  Compute the sample standard deviation (normalized by N-1, the square root of the unbiased variance estimate) for the array X across the first nonsingleton dimension. For 2D arrays, this computes the standard deviation across the rows (resulting in the standard deviations of the columns).
std(X, 0, 1)  Compute the sample standard deviation (normalized by N-1) for the array X across dimension 1 (resulting in the standard deviations of the columns). Note: If the second argument is 1, the population standard deviation (normalized by N) is computed.
std(X, 0, 2)  Compute the sample standard deviation (normalized by N-1) for the array X across dimension 2 (resulting in the standard deviations of the rows). Note: If the second argument is 1, the population standard deviation (normalized by N) is computed.
This lesson was written by Kay A. Robbins of the University of Texas at San Antonio and last modified by Dawn Roberson on 26 Jan 2018. Please contact kay.robbins@utsa.edu with comments or suggestions. The photo shows the rate of measles vaccination worldwide (WHO 2007): http://en.wikipedia.org/wiki/File:Measles_vaccination_worldwide.png.