This handout provides examples of quantifying values from graphs and tables within the text of your paper.
Type | Examples | Comment |
Value | Example 1: By 2007, 17.9 million Americans had been diagnosed with diabetes, while another 5.7 million remained undiagnosed. Example 2: By 2007, the United States had 23.6 million diabetics, 24% of whom were undiagnosed. Example 3: In 2007, 23.6 million Americans (7.8% of the US population) had diabetes. Example 4: In 2007, diagnosed diabetics each incurred approximately $6,649 in excess medical costs, for a total of roughly $120 billion. Example 5: In 2004, Native American diabetics aged 18-39 had an average BMI of 37.1, substantially above the normal range of 18 to 25. Example 6: In 2004, the average BMI for Native American diabetics aged 18-39 was 37.1, corresponding to a 5' 6" individual weighing 230 pounds. A normal-weight subject of the same height would weigh between 112 and 155 pounds. |
Presenting a value without context may not allow readers to make a judgment unless there is a commonly understood standard against which to judge it. Examples 1 and 2 show the breakdown between diagnosed and undiagnosed diabetics in two different ways. Example 3 gives the number of diabetics relative to the population as a whole. Example 4 combines the number of diagnosed diabetics with secondary information (the excess medical cost of diabetes) to give a different characterization of size. Example 5 compares the value (of BMI) to the normal range. Example 6 uses an exemplar individual to give a sense of what the BMI values correspond to in everyday terms. |
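The weights quoted in Example 6 can be checked with a short calculation. This is a minimal sketch that assumes the standard BMI formula in US customary units (BMI = 703 × weight in pounds / height in inches squared); the function name `weight_for_bmi` is introduced here only for illustration.

```python
# Assumption: standard BMI formula for pounds and inches,
# BMI = 703 * weight_lb / height_in**2.
BMI_FACTOR = 703.0

def weight_for_bmi(bmi: float, height_in: float) -> float:
    """Return the weight in pounds that gives `bmi` at `height_in` inches."""
    return bmi * height_in ** 2 / BMI_FACTOR

height = 5 * 12 + 6  # 5 ft 6 in expressed in inches

exemplar = weight_for_bmi(37.1, height)     # average BMI from Example 5
normal_low = weight_for_bmi(18.0, height)   # lower end of the normal range
normal_high = weight_for_bmi(25.0, height)  # upper end of the normal range

print(f"BMI 37.1 at {height} in: about {exemplar:.0f} lb")
print(f"Normal range at {height} in: about {normal_low:.0f} to {normal_high:.0f} lb")
```

Rounded to whole pounds, these reproduce the 230, 112, and 155 pound figures in Example 6.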
Percentage | Example 1: By 2006, 5.8% of the US population had been diagnosed with diabetes. Example 2: By 2006, 58 out of every 1000 people in the United States had been diagnosed with diabetes. Example 3: By 2006, nearly one out of every three black Americans between the ages of 65 and 74 had been diagnosed with diabetes. Example 4: By 2006, the United States had 16.8 million diagnosed diabetics, representing 5.8% of the US population. Example 5: By 2006, 5.8% of the 290 million people in the United States had been diagnosed with diabetes. |
Percentage (or fraction of the whole) gives the proportion of individuals in the group. You can easily convert a percentage to a rate, probability, or odds. If possible, give a numerical value as well as the percentage (Example 4). In addition, if you are given either the total population or the number corresponding to the percentage, you can calculate the other value as well as the rate (Example 5). |
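The conversions described above can be sketched as follows, using the 2006 figures from Examples 4 and 5. The helper name `percentage_views` is hypothetical, and odds are taken here in the usual sense of p / (1 - p).

```python
def percentage_views(count: float, population: float) -> dict:
    """Express a count out of a population as percent, rate per 1000, and odds."""
    p = count / population  # proportion (probability that a random person is in the group)
    return {
        "percent": 100 * p,
        "per_1000": 1000 * p,
        "odds": p / (1 - p),
    }

# Figures from Examples 4 and 5: 16.8 million diagnosed, 290 million people.
views = percentage_views(16.8e6, 290e6)
print(f"{views['percent']:.1f}% of the population")
print(f"{views['per_1000']:.0f} out of every 1000 people")
print(f"odds of about {views['odds']:.3f} to 1")

# Going the other way: recover the count from the percentage and the total.
count_from_percent = 5.8 / 100 * 290e6
print(f"about {count_from_percent / 1e6:.1f} million diagnosed")
```

The same proportion yields the "5.8%" wording of Example 1 and the "58 out of every 1000" wording of Example 2.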
Rank | Example 1: Americans under the age of 40 had by far the lowest incidence of diabetes over all years in the study. Example 2: In 2006, Americans between the ages of 65 and 74 had the highest incidence of diabetes (17.7%), closely followed by those 75 and older (15.5%). Example 3: In the decade 1997-2006, Americans between the ages of 65 and 74 generally had the highest incidence of diabetes, closely followed by those 75 and older. Example 4: With only one minor exception, the average BMI was a decreasing function of age, meaning younger individuals had a higher BMI than older individuals. |
Rank alone works well for a small number of values, but doesn't give the magnitude of separation between the ranked items. Example 3 points out a qualitative rank relationship without focusing on the minor exceptions. Example 4 points out a stronger rank relationship (the four age groups mostly preserve their rank over the entire decade), mentioning the minor exception. |
Percentile | Example 1: A female newborn with a head size of 37.65 cm is in the 95^{th} percentile, while a similarly sized male newborn would fall in the 89^{th} percentile. |
Percentiles are reported in 1% increments. The 10^{th} percentile is the value x such that 10% of the values are less than or equal to x. Other options include deciles (10% increments) and quartiles (25% increments). Example 1 used linear interpolation to find the percentile corresponding to 37.65 cm in the male head-size table. |
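Linear interpolation between two tabulated percentiles can be sketched as below. The table rows are made-up placeholders, not the head-size table used for Example 1, and the function name `interpolate_percentile` is hypothetical.

```python
def interpolate_percentile(value, table):
    """Linearly interpolate the percentile of `value`.

    `table` is a list of (percentile, measurement) pairs with strictly
    increasing measurements.
    """
    for (p_lo, x_lo), (p_hi, x_hi) in zip(table, table[1:]):
        if x_lo <= value <= x_hi:
            # Fraction of the way from the lower row to the upper row.
            frac = (value - x_lo) / (x_hi - x_lo)
            return p_lo + frac * (p_hi - p_lo)
    raise ValueError("value lies outside the tabulated range")

# Hypothetical rows for illustration only: (percentile, head size in cm).
table = [(75, 36.0), (90, 38.0), (95, 39.0)]

# 37.0 cm is halfway between the 75th and 90th percentile rows.
print(interpolate_percentile(37.0, table))
```

With a real growth chart, the same function would be applied to the published rows on either side of the measured value.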
Range | Example 1: The number of diagnosed diabetics went from 5.6 million in 1980 to 16.8 million in 2006, an increase of 11.2 million diagnoses. |
Range gives the values at two endpoints. Often a range statement is combined with a difference. |
Difference | Example 1: In 2006, the CDC reported 16.8 million diagnosed diabetics, an increase of 11.2 million cases over the number reported in 1980. Example 2: The BMI of the youngest group rose from 33.5 in 1995 to 37.1 in 2004, an increase of 3.6 BMI units. In comparison, the BMI of the oldest group went from 30.1 to 31.8 over the same decade. The gap between the two groups widened from 3.4 BMI units in 1995 to 5.3 BMI units in 2004. |
The difference is useful when the size of the difference is meaningful. Even so, try to report an endpoint with the difference. Example 2 illustrates the use of more complicated ranges, such as comparing the gaps between curves and the change in the gaps. |
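The differences and gaps in Example 2 reduce to a few subtractions. The sketch below uses the BMI values quoted in the example and assumes both groups share the 1995 starting year; the variable names are illustrative.

```python
# BMI values quoted in Example 2 (start year assumed to be 1995 for both groups).
youngest = {1995: 33.5, 2004: 37.1}
oldest = {1995: 30.1, 2004: 31.8}

# Change within each group over the period.
change_youngest = youngest[2004] - youngest[1995]
change_oldest = oldest[2004] - oldest[1995]

# Gap between the two groups at each endpoint, and how much it widened.
gap_start = youngest[1995] - oldest[1995]
gap_end = youngest[2004] - oldest[2004]
gap_change = gap_end - gap_start

print(f"Youngest group: +{change_youngest:.1f} BMI units")
print(f"Oldest group:   +{change_oldest:.1f} BMI units")
print(f"Gap: {gap_start:.1f} -> {gap_end:.1f} (widened by {gap_change:.1f})")
```

Reporting the endpoints alongside each difference, as the printed lines do, keeps the reader anchored to the actual values.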
Relative difference (ratio) | Example 1: The incidence of diagnosed diabetes in the US population more than doubled over the years of the study, going from 2.7% in 1980 to 5.8% in 2006. Example 2: The number of diagnosed diabetics in 1980 was one third the number in 2006. |
In Example 1, the reference value is 2.7%. The actual ratio is 5.8/2.7 = 2.15. However, using words such as "doubled" or "tripled" helps readers comprehend the actual size if the ratios are close to two or three. Example 2 uses the greater value as the reference, while Example 1 uses the smaller value. Notice that because the population also grew, the ratio of the counts in Example 2 is 1/3, whereas the ratio of the percentages in Example 1 is closer to 1/2. |
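A short calculation shows how the choice of reference value changes the ratio, and why the ratio of counts differs from the ratio of percentages. The implied population figures are derived here by dividing each count by its percentage; they are not quoted in the handout.

```python
# Values from Examples 1 and 2.
pct_1980, pct_2006 = 2.7, 5.8          # percent of the US population diagnosed
count_1980, count_2006 = 5.6e6, 16.8e6  # number of diagnosed diabetics

ratio_pct = pct_2006 / pct_1980        # smaller value as reference (Example 1)
ratio_count = count_1980 / count_2006  # larger value as reference (Example 2)

# Derived, not quoted: population implied by count / (percent / 100).
pop_1980 = count_1980 / (pct_1980 / 100)
pop_2006 = count_2006 / (pct_2006 / 100)
pop_growth = pop_2006 / pop_1980

print(f"Percentage ratio 2006/1980: {ratio_pct:.2f}")
print(f"Count ratio 1980/2006:      {ratio_count:.2f}")
print(f"Implied population growth:  {pop_growth:.2f}x")
```

The count grew by a factor of three while the percentage only slightly more than doubled; the difference is the growth factor of the population itself.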
Percentage difference or percentage change | Example 1: The number of diagnosed diabetics ranged from 5.6 million in 1980 to 16.8 million in 2006, a gain of 200%. Example 2: The number of diagnosed diabetics in 1980 was 33% of the number of cases in 2006. Example 3: The BMI of the youngest group rose from 33.5 in 1995 to 37.1 in 2004, indicating an 11% gain in weight over the decade. |
As with ratios, the reporting scheme differs depending on whether you are using the larger value as the reference or the smaller value as the reference. Example 1 uses 5.6 million (the smaller) as the reference. The percentage difference is 100*(16.8-5.6)/5.6 = 200. Example 2 uses 16.8 million (the larger) as the reference. The percentage is not reported as a difference, but rather as 100*5.6/16.8, or about 33. Example 3 uses 33.5 (the smaller) as the reference. Notice the use of the word "decade" to draw attention to the length of the time period. |
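The two reporting schemes can be written as two small functions. This is a sketch using the values from Examples 1 to 3; the names `percent_change` and `percent_of` are introduced here for illustration.

```python
def percent_change(reference: float, new: float) -> float:
    """Percentage change from `reference` to `new`."""
    return 100 * (new - reference) / reference

def percent_of(part: float, reference: float) -> float:
    """`part` expressed as a percentage of `reference` (not a difference)."""
    return 100 * part / reference

# Example 1: smaller value (5.6 million) as the reference.
gain = percent_change(5.6, 16.8)
# Example 2: larger value (16.8 million) as the reference.
share = percent_of(5.6, 16.8)
# Example 3: BMI of the youngest group, 33.5 as the reference.
bmi_gain = percent_change(33.5, 37.1)

print(f"Example 1: gain of {gain:.0f}%")
print(f"Example 2: {share:.0f}% of the 2006 count")
print(f"Example 3: gain of {bmi_gain:.0f}%")
```

Keeping the reference value as an explicit argument makes it harder to mix up the two schemes when reporting.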
Rate of change | Example 1: "For each 0.01% increase in blood alcohol, performance decreased by 1.16%. Thus, at a mean blood alcohol concentration of 0.10%, mean relative performance on the tracking task decreased, on average, by 11.6%." Dawson and Reid, 1997. |
Rates of change for experimental data often oscillate due to noise. Sometimes researchers will fit a smooth curve (such as the linear fit of Dawson and Reid to blood alcohol versus performance). In Example 1, Dawson and Reid express the slope (change in y over change in x) in terms of the change in performance resulting from a specified change in blood alcohol. They pick the delta x to be a meaningful unit (0.01%). Dawson and Reid also give the overall change with respect to the endpoints. |
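The arithmetic behind the quoted sentence can be sketched as a slope applied over a chosen delta x. The sketch assumes a straight-line relationship starting from zero change at zero blood alcohol, which is what the quoted figures imply; the constant and function names are illustrative.

```python
# Slope as quoted: performance falls 1.16% per 0.01% increase in blood alcohol.
CHANGE_PER_STEP = -1.16  # percent change in performance per step
STEP = 0.01              # size of one step, in percent blood alcohol

def performance_change(blood_alcohol: float) -> float:
    """Predicted percent change in performance at `blood_alcohol` (in percent).

    Assumes a linear relationship with no change at zero blood alcohol.
    """
    return CHANGE_PER_STEP * (blood_alcohol / STEP)

change_at_010 = performance_change(0.10)
print(f"At 0.10% blood alcohol: {change_at_010:.1f}% change in performance")
```

Choosing the step (0.01%) as the unit of delta x is what lets the slope be stated in a single readable sentence.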
Trend | Example 1: In the decade 1997-2006, the number of diabetics increased, on average, by nearly three quarters of a million people every year. |
Simple linear trends (increasing or decreasing) are the most common. These are often detected by performing a linear fit and checking the coefficient of determination (R^{2}). Exponential or logarithmic trends are also common in science and can be verified by plotting the data after logarithmic transformation (Lesson 13). In describing trends, convey not only the direction, but also the type and magnitude. Often experimental data will show monotonic behavior over restricted ranges, with blips or dips. You have to use judgment as to how important these exceptions are in deciding what to emphasize in your discussion. |
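A least-squares fit with R^{2} can be sketched in plain Python as below. The data values are made up for illustration and are not taken from the diabetes study; `linear_fit` is a hypothetical helper, and R^{2} is computed as 1 minus the ratio of the residual sum of squares to the total sum of squares.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys): returns slope, intercept, R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

# Made-up, roughly linear series (illustration only).
years = [0, 1, 2, 3, 4]
values = [10.0, 10.8, 11.4, 12.3, 13.0]
slope, intercept, r2 = linear_fit(years, values)
print(f"Linear trend: {slope:.2f} units per year, R^2 = {r2:.3f}")

# Made-up exponential series: fit the logarithm of the values instead.
growth = [3 * 2 ** x for x in years]
log_slope, _, log_r2 = linear_fit(years, [math.log(v) for v in growth])
print(f"Log-scale slope: {log_slope:.3f} (doubling each year), R^2 = {log_r2:.3f}")
```

The slope supplies the magnitude of the trend ("units per year"), while a high R^{2} on the raw or log-transformed data indicates which type of trend describes the data better.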