LESSON 3 QUESTIONS: Pie and bar charts

FOCUS QUESTION: How can I show proportions and relative sizes of different data groups?

EXAMPLE 1: Load the data about New York contagious diseases

   load NYCDiseases.mat;    % Load the disease data

EXAMPLE 2: Compute overall totals of the three diseases

   totalMeasles = sum(measles(:));   % Find the total number of measles cases
   totalMumps = sum(mumps(:));       % Find the total number of mumps cases
   totalCP = sum(chickenPox(:));     % Find the total number of chicken pox cases
   diseaseTotals = [totalMeasles, totalMumps, totalCP]; % Make a totals vector

Questions Answers
What does measles(:) do? This expression creates a new array by vertically stacking the column(s) of measles.
How can I find out the dimensions of totalMeasles? Look in your Workspace Browser to see the sizes of all the variables and the types of values they are currently holding.
Why is totalMeasles a single number? Since measles(:) is a single column, its sum is a single value.
What do the values of diseaseTotals represent? The variable diseaseTotals is an array with 1 row of 3 columns. The three values are the total numbers of cases over the entire period for the three diseases (measles, mumps, and chicken pox, respectively).
Why is diseaseTotals a row vector? The square brackets indicate the creation of a vector. The comma specifies values that should be placed side-by-side.
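
The answers above can be tried out on a small made-up array. In this sketch, demo and its values are hypothetical stand-ins for measles; they are not part of the lesson data.

```matlab
demo = [1 2 3; 4 5 6];           % Hypothetical 2-by-3 array (not the real data)
column = demo(:);                % Stack the columns: [1; 4; 2; 5; 3; 6]
columnSize = size(column);       % [6 1]: six rows, one column
total = sum(column);             % The sum of a single column is one value: 21
totalSize = size(total);         % [1 1]: a single number
row = [total, 2*total, 3*total]; % Commas place the values side by side
rowSize = size(row);             % [1 3]: one row, three columns
```

Calling size on a variable in the Command Window gives the same dimensions that the Workspace Browser displays.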

EXAMPLE 3: Show a pie chart of the overall totals of the three diseases

   figure                                      % Create a new figure window
   pie(diseaseTotals, {'Measles', 'Mumps', 'Chicken pox'}) % Draw a pie chart
   title('Contagious childhood diseases in New York City 1931-1971');

Questions Answers
What does a pie chart depict? A pie chart shows the fraction or percentage of the whole as represented by the slices of a pie. The bigger the slice, the larger the fraction. Partial pies are also possible.
Does this pie chart show how many cases of each type of disease there are? No, the slice sizes show only each disease's share of the total. Because the example passes its own labels to pie, the slices are labeled with the disease names rather than with percentages or counts.
How can I convey the total number of cases of each type of disease on this chart? You might add an annotation giving the overall total number of cases (the sum of values in diseaseTotals). Viewers can then estimate the number of cases of the individual diseases from the slice proportions.
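
Another possible way to convey the counts is to build the slice labels from the data. The sketch below is only an illustration: the totals are made up (the real diseaseTotals come from Example 2), and the names exampleTotals, names, percents, and labels are hypothetical.

```matlab
exampleTotals = [600, 100, 300];                 % Made-up case totals
names = {'Measles', 'Mumps', 'Chicken pox'};     % One name per slice
percents = 100 * exampleTotals / sum(exampleTotals); % Share of the whole
labels = cell(size(names));                      % Room for one label per slice
for k = 1:numel(names)
    % Build a label such as 'Measles: 600 cases (60%)'
    labels{k} = sprintf('%s: %d cases (%.0f%%)', names{k}, ...
                        exampleTotals(k), percents(k));
end
figure                                           % Create a new figure window
pie(exampleTotals, labels)                       % Slices labeled with counts
```

Long labels can crowd a small pie chart, so this approach probably works best when there are only a few slices.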

EXAMPLE 4: Make a bar chart of the disease totals

   figure                 % Create a new figure
   bar(diseaseTotals)     % Plot a bar chart in it
   set(gca, ...           % Label the tick marks rather than x-axis
       'XTickLabelMode', 'manual', ...         % Doing tick marks by hand
       'XTickLabel', {'Measles', 'Mumps', 'Chicken pox'})
   ylabel('Cases');       % Label the y-axis
   title('Contagious childhood diseases in NYC: 1931-1971');

Questions Answers
What does the height of each bar show? The height of each bar shows the total number of cases of one of the diseases.
What is gca? The gca function returns a handle (a reference) to the current axes. Use gca to get or set graphical properties of the current axes in the current figure. You can save this handle in a variable and refer to these axes later, when you have created other axes and these are no longer the current ones.
Why do I need to set the mode for the x-axis tick labels to be manual? The 'manual' setting of the 'XTickLabelMode' property specifies that MATLAB should not try to automatically set the tick mark labels. Instead, MATLAB should use the labels that you have set using the 'XTickLabel' property of the current axis.
What tick labels does MATLAB automatically choose for bar graphs? MATLAB uses the bar positions 1, 2, ... if you don't specify your own tick mark labels.
Why are the tick mark labels enclosed with single quotes? The enclosing single quotes specify that Measles is a string or message, not a variable.
Why was the list of tick mark labels enclosed with curly braces rather than square brackets? Use square brackets ([ ]) to create an ordinary array, which must have elements of the same size and type. Use curly braces ({ }) to create a cell array, which may have elements of different sizes and types. In this case each array element is a string, but the strings are of different lengths, so you must use cell arrays.
Based on this graph, estimate the total number of measles cases. There were approximately 700,000 cases of measles during the time period. Although the y-axis tick labels go from 0 to 7, a scale factor is shown above the axis in scientific notation. The y-axis actually goes from 0 to 7 x 10^5.
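
The difference between square brackets and curly braces, and the idea of saving the gca handle, can be sketched as follows. The bar heights are made up, and the variable names joined, labels, and ax are hypothetical.

```matlab
joined = ['Measles', 'Mumps'];   % Brackets join the text: 'MeaslesMumps'
labels = {'Measles', 'Mumps'};   % Braces keep two separate strings
joinedSize = size(joined);       % [1 12]: one row of 12 characters
labelsSize = size(labels);       % [1 2]: a cell array with two elements

figure                           % Create a new figure
bar([5 3])                       % Made-up totals for two diseases
ax = gca;                        % Save the handle to these axes
figure                           % A second figure is now the current one
set(ax, 'XTickLabel', labels)    % Still relabels the first chart
```

Because ax was saved before the second figure was created, the set command changes the first chart even though it is no longer current.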

EXAMPLE 5: Calculate individual and overall monthly totals

   measlesByMonth = sum(measles);       % Find the monthly totals of measles
   mumpsByMonth = sum(mumps);           % Find the monthly totals of mumps
   CPByMonth = sum(chickenPox);         % Find the monthly totals of chicken pox
   byMonth = [measlesByMonth', mumpsByMonth', CPByMonth']; % Make 3 columns of totals

Questions Answers
What is the size of sum(measles)? Since measles is an array with 41 rows and 12 columns, the result of sum(measles) is an array with 1 row of 12 columns. (Remember that sum without a second argument calculates the sums of the individual columns.)
What does [a, b, c] mean? The square brackets ([]) indicate a new array formed by placing the arrays a, b, and c side-by-side.
What does the prime (') mean? The prime (') means to take the transpose of the array. The transpose operator flips the array along the main diagonal to produce a new array. (The rows of the original array are the columns of the transposed array.)
What is the size of measlesByMonth'? Since measlesByMonth is a single row with 12 columns, the size of measlesByMonth' is 1 column of 12 rows.
Why does byMonth have 12 rows and 3 columns? MATLAB forms the array by placing side-by-side three arrays, each consisting of a single column of 12 rows.
Why was the transpose operator necessary? Without the transpose operator, the row vectors measlesByMonth, mumpsByMonth and CPByMonth would be placed side-by-side to form a single row vector of 36 elements. A bar graph of that vector would show the 12 bars of the measles monthly totals, followed by the 12 bars of the mumps monthly totals, and ending with the 12 bars of the chicken pox monthly totals. In order to have a bar graph where the bars representing January for the three diseases are grouped together, the array must have the January values as the first row.
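
The sizes discussed above can be checked on a small made-up array. Here demo is a hypothetical 3-by-4 stand-in for the 41-by-12 measles array, and the other variable names are also only for illustration.

```matlab
demo = [1 2 3 4; 5 6 7 8; 9 10 11 12]; % Hypothetical 3-by-4 array
colSums = sum(demo);                   % Column sums: [15 18 21 24]
colSumsT = colSums';                   % Transpose: 4 rows, 1 column
sideBySide = [colSumsT, 2*colSumsT];   % Two columns side by side: 4-by-2
oneLongRow = [colSums, 2*colSums];     % Without the transpose: 1-by-8
```

The array sideBySide has one row per original column, which is the layout that byMonth needs in order to have one row per month.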

EXAMPLE 6: Make a bar chart of the measles monthly case totals (thousands)

   figure                                        % Create a new figure
   bar(measlesByMonth./1000)   % Plot a bar chart of measles monthly totals
   xlabel('Month')                               % Label the x-axis
   ylabel('Cases (in thousands)')                % Label the y-axis
   title('Measles by month in NYC: 1931-1971');  % Put a title on the graph

Questions Answers
How many bars does this graph have? The graph has 12 bars, representing the 12 months.
Why rescale the data using units of thousands of cases rather than just cases? Tick mark labels of large magnitude are difficult to read and comprehend. MATLAB tries to alleviate this problem by presenting the tick mark labels in scientific notation with the exponent displayed above the axis. However, the exponent is easy to miss. The axis rescaling makes the graph easier for most viewers to understand.

EXAMPLE 7: Make a side-by-side bar chart of monthly totals for the 3 diseases

   figure                                        % Create a new figure
   bar(byMonth./1000)                 % Plot a bar chart of monthly totals
   xlabel('Month')                               % Label the x-axis
   ylabel('Cases (in thousands)')                % Label the y-axis
   title('Childhood diseases by month in NYC: 1931-1971');  % Put a title on the graph
   legend('Measles', 'Mumps', 'Chicken pox')     % Need a legend

Questions Answers
What does each bar in this graph represent? An individual bar represents the total number of cases for a particular disease in a particular month.
How are the bars grouped? MATLAB groups by row number. That is, all of the bars for the first row are displayed at position 1, etc. Thus, the three bars at position 1 correspond to the totals for the three diseases in January.
What is the order of the bars within each group? MATLAB determines the order by the column position. Thus, the bars in group 1 correspond to the January totals for measles, mumps, and chicken pox, respectively.
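
A small made-up array shows the grouping rule at a glance. In this sketch, demo, h, and numSeries are hypothetical names, and the legend entries are placeholders.

```matlab
demo = [1 2 3; 4 5 6];   % Made-up data: 2 rows (groups), 3 columns (series)
figure                   % Create a new figure
h = bar(demo);           % Grouped chart: 2 groups of 3 bars each
numSeries = numel(h);    % bar returns one handle per column: 3
legend('Column 1', 'Column 2', 'Column 3') % One legend entry per column
```

The two rows of demo appear at positions 1 and 2 on the x-axis, and each legend entry corresponds to one column.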

EXAMPLE 8: Make a stacked bar chart of monthly totals for the 3 diseases

   figure                                        % Create a new figure
   bar(byMonth./1000, 'stack') % Plot a stacked bar chart of monthly totals
   xlabel('Month')                               % Label the x-axis
   ylabel('Cases (in thousands)')                % Label the y-axis
   title('Childhood diseases by month in NYC: 1931-1971');  % Put a title on the graph
   legend('Measles', 'Mumps', 'Chicken pox')     % Need a legend

Questions Answers
What does each bar in this graph represent? The total height of each bar corresponds to the total of a row of byMonth (in thousands). For example, bar 1 is the total number of cases of all three diseases for January.
What does each band in this graph represent? Each element in byMonth corresponds to one band in this graph. The bands in bar 1 correspond to the 3 elements in the first row of byMonth, that is, the totals of the three diseases for January.
How is the order of the bands determined within each bar? MATLAB shows the bands in the column order of elements, with the value in column 1 being shown on the bottom. Thus, the band at the bottom of bar 1 corresponds to the total number of cases of measles for January.
How do the groupings in Examples 7 and 8 differ? Each group plots the same elements, but the 'stack' option draws the bands on top of each other in a single bar, while the other version draws the bands as side-by-side bars.
Why use stacked bar charts? A stacked bar chart allows you to better estimate and compare the overall totals of the groups. You can comprehend a larger number of groups with the stacked bars. The chart of Example 7 is rather crowded.
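
The relationship between the bands and the total bar height can be sketched with made-up numbers. The names demo, barHeights, and bandTops are hypothetical. This sketch spells the option 'stacked', which is the form given in the MATLAB documentation; the lesson's examples write it as 'stack'.

```matlab
demo = [1 2 3; 4 5 6];        % Made-up data: 2 rows, 3 columns
barHeights = sum(demo, 2);    % Row totals [6; 15] are the full bar heights
bandTops = cumsum(demo, 2);   % Top edge of each band: [1 3 6; 4 9 15]
figure                        % Create a new figure
bar(demo, 'stacked')          % Two bars, each made of three bands
```

The last column of bandTops equals barHeights, since the top of the final band is the top of the whole bar.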

EXAMPLE 9: Make a horizontal bar chart of monthly totals for the 3 diseases

   figure                                        % Create a new figure
   barh(byMonth./1000)         % Plot a horizontal bar chart of monthly totals
   xlabel('Cases (in thousands)')                % Label the x-axis
   ylabel('Month')                               % Label the y-axis
   title('Childhood diseases by month in NYC: 1931-1971');  % Put a title on the graph
   legend('Measles', 'Mumps', 'Chicken pox')     % Need a legend

Questions Answers
How does the horizontal bar chart in this example differ from the vertical bar chart of Example 7? The two types of charts show the same information, but MATLAB draws them in different orientations.
How do the orientations of vertical and horizontal bar charts differ? For the vertical bar chart, MATLAB draws the bars in an up and down orientation and displays the groups from left to right. For the horizontal bar chart, MATLAB draws the bars in a sideways orientation and displays the groups from bottom to top. Many details can be changed by setting properties.

EXAMPLE 10: Make a stacked horizontal bar chart using a different color scheme

   figure                                        % Create a new figure
   colormap summer                               % Use a new color scheme
   barh(byMonth./1000, 'stack') % Plot a stacked bar chart of monthly totals
   xlabel('Cases (in thousands)')                % Label the x-axis
   ylabel('Month')                               % Label the y-axis
   title('Childhood diseases by month in NYC: 1931-1971');  % Put a title on the graph
   legend('Measles', 'Mumps', 'Chicken pox')     % Need a legend

Questions Answers
What is a color map? A color map is a way of assigning colors based on integers. For example, there is a color 1, color 2, etc.
What determines how colors are assigned? MATLAB keeps a table of colors for each figure designated by the Colormap property. You can change this property for a figure by using the colormap command or by directly editing the figure.
What is summer? MATLAB has several standard tables of colors, and summer designates one of those standard tables.
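
A table of colors is stored as an array with one row per color and three columns holding the red, green, and blue intensities. The sketch below asks summer for a three-color table; the variable names are hypothetical.

```matlab
cmap = summer(3);         % Three colors from the summer table
cmapSize = size(cmap);    % [3 3]: one row per color; columns are R, G, B
inRange = all(cmap(:) >= 0 & cmap(:) <= 1); % Intensities lie between 0 and 1
firstColor = cmap(1, :);  % The red, green, blue values of color 1
```

Displaying cmap in the Command Window shows the numeric values behind color 1, color 2, and color 3.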

These questions were written by Kay A. Robbins of the University of Texas at San Antonio and last modified on 31-Dec-2010. Please contact krobbins@cs.utsa.edu with comments or suggestions.