2.13 TYPES OF GRAPHS

a. Frequency Polygons

A frequency polygon is a graphical display of a frequency table. The intervals are shown on the X-axis and the number of scores in each interval is represented by the height of a point located above the middle of the interval. The points are connected so that together with the X-axis they form a polygon.

A frequency table and a relative frequency polygon for response times in a study on weapons and aggression are shown below. The times are in hundredths of a second.

Lower Limit   Upper Limit   Count   Cumulative Count   Percent   Cumulative Percent
...
50            55            1       32                 3.12      100.00

Note: Values in each category are > the lower limit and ≤ the upper limit.

Frequency polygons can be based on the actual frequencies or the relative frequencies. When based on relative frequencies, the percentage of scores instead of the number of scores in each category is plotted.

In a cumulative frequency polygon, the number of scores (or the percentage of scores) up to and including the category in question is plotted. A cumulative frequency polygon is shown below.
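The quantities plotted in frequency, relative frequency, and cumulative frequency polygons can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table`, its parameters, and the response times are hypothetical and are not the data from the study. Intervals follow the convention in the note above (greater than the lower limit, less than or equal to the upper limit).

```python
def frequency_table(scores, lower, width, n_intervals):
    """Return rows of (lower, upper, count, cum_count, percent, cum_percent).

    A score x is counted in an interval when lower < x <= upper.
    """
    rows = []
    cum_count = 0
    total = len(scores)
    for i in range(n_intervals):
        lo = lower + i * width
        hi = lo + width
        count = sum(1 for x in scores if lo < x <= hi)
        cum_count += count
        rows.append((lo, hi, count, cum_count,
                     100 * count / total, 100 * cum_count / total))
    return rows


# Hypothetical response times in hundredths of a second (illustration only).
times = [27, 31, 33, 34, 36, 38, 38, 41, 44, 52]
table = frequency_table(times, lower=25, width=5, n_intervals=6)

for lo, hi, count, cum, pct, cum_pct in table:
    midpoint = (lo + hi) / 2  # X position of the polygon point for this interval
    print(f"{lo:>3} {hi:>3} {midpoint:>5.1f} {count:>3} {cum:>3} {pct:>6.2f} {cum_pct:>7.2f}")
```

A frequency polygon plots the count against the midpoint, a relative frequency polygon plots the percent, and a cumulative polygon plots the cumulative count or cumulative percent.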

b. Histograms

A histogram is constructed from a frequency table. The intervals are shown on the X-axis and the number of scores in each interval is represented by the height of a rectangle located above the interval. A histogram of the response times from the dataset Target RT is shown below.

[Figure: histogram of the Target RT response times]

The shape of a histogram depends on the size of the intervals chosen. A bar graph is much like a histogram, except that a small gap separates the columns from one another. Bar graphs are commonly used for qualitative variables.
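To see how the interval size changes the shape of a histogram, the sketch below counts the same scores with two different interval widths and prints a crude text histogram. The function `histogram_counts` and the data are hypothetical; intervals are again taken as greater than the lower limit and less than or equal to the upper limit, and empty intervals are simply not printed.

```python
import math
from collections import Counter


def histogram_counts(scores, width):
    """Map each interval's lower limit to the number of scores in (lower, lower + width]."""
    counts = Counter()
    for x in scores:
        lower = (math.ceil(x / width) - 1) * width
        counts[lower] += 1
    return dict(sorted(counts.items()))


# Hypothetical response times (illustration only).
times = [27, 31, 33, 34, 36, 38, 38, 41, 44, 52]

for width in (5, 10):
    print(f"interval width {width}")
    for lower, count in histogram_counts(times, width).items():
        print(f"  ({lower:>2}, {lower + width:>2}]  {'#' * count}")
```

With a width of 5 the scores spread over several short bars; with a width of 10 most of them collapse into a single tall bar.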

c. Stem and Leaf Displays

A stem and leaf plot is much like a histogram except that it portrays a little more information: the approximate value of each score can be read from the display. A stem and leaf plot of the tournament players from the dataset "chess", as well as the data themselves, are shown to the right.

The largest value, 85.3, is approximated as 10 × 8 + 5. This is represented in the plot as a stem of 8 and a leaf of 5. It is shown as the "5" in the first line of the plot. Similarly, 80.3 is approximated as 10 × 8 + 0; it has a stem of 8 and a leaf of 0. It is shown as the "0" in the first line of the plot.

Depending on the data, each stem is displayed 1, 2, or 5 times. When a stem is displayed only once (as on the plot shown here), the leaves can take on the values from 0-9. When a stem is displayed twice (as in the example on the right), one stem is associated with the leaves 5-9 and the other stem is associated with the leaves 0-4.

Finally, when a stem is displayed five times, the first has the leaves 8-9, the second 6-7, the third 4-5, and so on. If both positive and negative numbers are present, +0 and -0 are used as stems, as they are in the plot to the right. A stem of -0 and a leaf of 7 represents a value of (-0 × 1) + (-0.1 × 7) = -0.7.
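The simplest case, in which each stem is displayed once, can be sketched as follows. The function `stem_and_leaf` is hypothetical, it assumes non-negative values, and it drops the decimal part before splitting each value into a stem (tens) and a leaf (units). Only 85.3 and 80.3 come from the text; the other values are made up for illustration, and sorting the leaves in ascending order is a choice made here.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return display lines with each stem shown once, largest stem first.

    Assumes non-negative values. Each value is truncated to a whole number,
    so 85.3 is treated as 10 * 8 + 5: stem 8, leaf 5.
    """
    groups = defaultdict(list)
    for v in values:
        stem, leaf = divmod(int(v), 10)
        groups[stem].append(leaf)
    lines = []
    for stem in sorted(groups, reverse=True):
        leaves = "".join(str(leaf) for leaf in sorted(groups[stem]))
        lines.append(f"{stem} | {leaves}")
    return lines


# 85.3 and 80.3 are from the text; the remaining values are hypothetical.
values = [85.3, 80.3, 75.9, 74.2, 71.1, 68.7, 63.0]
for line in stem_and_leaf(values):
    print(line)
```

Displaying a stem two or five times would only change how the leaves are grouped (0-4 and 5-9, or in pairs), not how stems and leaves are computed.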

d. Box Plots

A box plot provides an excellent visual summary of many important aspects of a distribution. The box stretches from the lower hinge (defined as the 25th percentile) to the upper hinge (the 75th percentile) and therefore contains the middle half of the scores in the distribution.

The median is shown as a line across the box. Therefore 1/4 of the distribution is between this line and the top of the box and 1/4 of the distribution is between this line and the bottom of the box.

The "H-spread" is defined as the difference between the hinges and a "step" is defined as 1.5 times the H-spread.

Inner fences are 1 step beyond the hinges. Outer fences are 2 steps beyond the hinges.
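The definitions of the H-spread, the step, and the fences translate directly into arithmetic. In the sketch below the function `fences` and the scores are hypothetical. The hinges are taken as the 25th and 75th percentiles using the "inclusive" method of Python's `statistics.quantiles`, which is an assumption; other percentile conventions can give slightly different hinges.

```python
import statistics


def fences(lower_hinge, upper_hinge):
    """Return the H-spread, the step, and the (lower, upper) inner and outer fences."""
    h_spread = upper_hinge - lower_hinge
    step = 1.5 * h_spread
    inner = (lower_hinge - step, upper_hinge + step)
    outer = (lower_hinge - 2 * step, upper_hinge + 2 * step)
    return h_spread, step, inner, outer


# Hypothetical scores (illustration only).
scores = [1, 2, 3, 4, 5, 6, 7, 8, 30]

# Assumption: hinges computed with the "inclusive" percentile method.
lower_hinge, median, upper_hinge = statistics.quantiles(scores, n=4, method="inclusive")

h_spread, step, inner, outer = fences(lower_hinge, upper_hinge)
print("hinges:", lower_hinge, upper_hinge)
print("H-spread:", h_spread, "step:", step)
print("inner fences:", inner)
print("outer fences:", outer)
```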

There are two adjacent values: the largest value below the upper inner fence and the smallest value above the lower inner fence. For the data plotted in the figure, the minimum value is above the lower inner fence and is therefore the lower adjacent value. The maximum value is beyond the upper inner fence, so it is not the upper adjacent value.

As shown in the figure, a line is drawn from the upper hinge to the upper adjacent value and from the lower hinge to the lower adjacent value.
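Finding the adjacent values, and hence the ends of the two lines, amounts to keeping only the scores that lie inside the inner fences. The function `adjacent_values`, the scores, and the fence and hinge values below are hypothetical; the fences are passed in as already-computed numbers.

```python
def adjacent_values(scores, lower_inner_fence, upper_inner_fence):
    """Return (lower adjacent value, upper adjacent value).

    The lower adjacent value is the smallest score above the lower inner
    fence; the upper adjacent value is the largest score below the upper
    inner fence.
    """
    inside = [x for x in scores if lower_inner_fence < x < upper_inner_fence]
    return min(inside), max(inside)


# Hypothetical scores, hinges, and inner fences (illustration only).
scores = [1, 2, 3, 4, 5, 6, 7, 8, 30]
lower_hinge, upper_hinge = 3, 7
lower_adjacent, upper_adjacent = adjacent_values(scores, -3, 13)

print("lower line runs from", lower_hinge, "to", lower_adjacent)
print("upper line runs from", upper_hinge, "to", upper_adjacent)
```

In this made-up example the minimum (1) is the lower adjacent value, while the maximum (30) lies beyond the upper inner fence, so the upper adjacent value is 8.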

Every score between the inner and outer fences is indicated by an "o"; a score beyond the outer fences is indicated by a "*".

It is often useful to compare data from two or more groups by viewing box plots from the groups side by side. Plotted are data from Example 2a and Example 2b. The data from 2b are higher, more spread out, and have a positive skew. That the skew is positive can be determined from the fact that the mean is higher than the median and the upper whisker is longer than the lower whisker.

Some computer programs present their own variations on box plots. For example, SPSS does not include the mean. JMP distinguishes between "outlier" box plots, which are the same as those described here, and quantile box plots, which show the 10th, 25th, 50th, 75th, and 90th percentiles.
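The rule for marking individual scores with "o" or "*" can be written as a small function. This is a sketch under stated assumptions: the name `plot_symbol` and the fence values are hypothetical, and a score lying exactly on a fence is treated here as being inside it, which the text does not specify.

```python
def plot_symbol(score, inner, outer):
    """Return "*" beyond the outer fences, "o" between the inner and outer
    fences, and "" for a score that gets no individual mark.

    inner and outer are (lower, upper) pairs of fence values.
    Assumption: a score exactly on a fence counts as inside that fence.
    """
    lower_inner, upper_inner = inner
    lower_outer, upper_outer = outer
    if score < lower_outer or score > upper_outer:
        return "*"
    if score < lower_inner or score > upper_inner:
        return "o"
    return ""


# Hypothetical fences (illustration only).
inner = (-3, 13)
outer = (-9, 19)
for score in (8, 15, 30):
    print(score, repr(plot_symbol(score, inner, outer)))
```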