
# Presentation and display of quantitative data; distributions - Research methods

Description

Quantitative data occur as numbers. They are often presented through graphs and tables, giving viewers an easily understandable visual interpretation of the findings from a study.

Graphs

Graphs should be fully and clearly labelled, on both the x-axis and the y-axis, and be appropriately titled. They are best presented if the y-axis (vertical) is three-quarters the length of the x-axis (horizontal). Only 1 graph should be used to display a set of data. Inappropriate scales should not be used, as these convey misleading, biased impressions. Different types of graphs exist for different forms of data.

Bar charts display data as separate, comparable categories, for example findings from young and old participants. The bars should be the same width and separated by spaces to show that the variable on the x-axis is not continuous. The data are 'discrete', occurring, for example, as the mean scores of several groups. Percentages, totals and ratios can also be displayed.

Histograms display continuous data, such as test scores. Values increase along the x-axis, and the columns are drawn without spaces between them to show that the data are continuous. The frequency of the data is presented on the y-axis. Equal category intervals are given columns of equal width, so that the area of each column is proportional to the number of cases it represents.

Fig 7.4 An example of a bar chart

Fig 7.5 An example of a histogram
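The way a histogram groups continuous scores into equal intervals can be sketched in Python. This is only an illustration: the test scores, the interval width of 10 and the function name `frequency_table` are assumptions made for the example, not data from a study.

```python
from collections import Counter


def frequency_table(scores, width):
    """Count how many whole-number scores fall into each equal-width interval.

    Returns a list of (lower_bound, upper_bound, frequency) tuples in
    ascending order, which is the information a histogram displays.
    """
    # Map each score to the lower bound of its interval, e.g. 47 -> 40.
    counts = Counter((score // width) * width for score in scores)
    lowest, highest = min(counts), max(counts)
    # Include empty intervals so the x-axis stays continuous.
    return [(low, low + width - 1, counts.get(low, 0))
            for low in range(lowest, highest + width, width)]


# Hypothetical test scores, invented for illustration.
scores = [42, 47, 51, 55, 58, 60, 63, 64, 66, 71, 75, 93]

for low, high, freq in frequency_table(scores, width=10):
    # One '#' per score: every interval has the same width, so the
    # length of each bar is proportional to its frequency.
    print(f"{low:>3}-{high:<3} {'#' * freq}")
```

The empty 80-89 interval is kept on purpose: leaving it out would hide a gap in continuous data.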

Frequency polygons (line graphs) are similar to histograms in that the data presented on the x-axis are continuous. A frequency polygon is constructed by drawing a line that joins the mid-points of the tops of the columns in a histogram. This allows 2 or more frequency distributions to be displayed on the same graph, so that they can be directly compared with each other.

Fig 7.6 An example of a frequency polygon

Pie charts are used to show the frequency of categories of data as percentages. The pie is split into sections, each one representing the frequency of a category. Each section is colour coded, with an indication given as to what each section represents and its percentage score.

Fig 7.7 An example of a pie chart
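As a rough sketch of the arithmetic behind a pie chart, the following Python converts category frequencies into percentages and into the angle of each section. The survey categories, their counts and the function name `category_percentages` are hypothetical.

```python
def category_percentages(counts):
    """Convert category frequencies into percentages of the total."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 1)
            for category, n in counts.items()}


# Hypothetical category frequencies, invented for illustration.
responses = {"never": 33, "sometimes": 11, "often": 6}

for category, pct in category_percentages(responses).items():
    # Each section takes the same share of the 360-degree circle
    # as its category takes of the total.
    print(f"{category}: {pct}% ({pct * 3.6:.0f} degrees)")
```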

Correlational data are plotted on scattergrams, which show the degree to which 2 co-variables are related.

Fig 7.8 Scattergrams and strength of correlation
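The strength of the relationship shown on a scattergram is usually summarised as a correlation coefficient between -1 and +1. The text does not name a particular coefficient, so the sketch below assumes Pearson's r; the two co-variables and their values are invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of the products of deviations, and the sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical co-variables, invented for illustration.
hours_revised = [1, 2, 3, 4, 5, 6]
test_score = [35, 42, 50, 49, 61, 70]

r = pearson_r(hours_revised, test_score)
print(f"r = {r:.2f}")  # close to +1 means a strong positive correlation
```

A value near 0 would correspond to a scattergram with no clear pattern, and a value near -1 to points falling along a downward-sloping line.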

Tables

Tables do not present raw, unprocessed data such as individual scores. Rather they are used to present an appropriate summary of processed data, such as totals, means and ranges. The unprocessed data are given in the appendices of a study as a data table (i.e. a presentation of the raw scores). As with graphs, tables should be clearly labelled and titled.

Table 7.2 An example of a table

The average number of aggressive acts a week in children attending different hours of day care

| Number of hours' day care a week | Average number of aggressive acts per week |
|---|---|
| 0-5 | 1 |
| 6-10 | 3 |
| 11-15 | 2 |
| 16-20 | 4 |
| 21-25 | 2 |
| 26-30 | 3 |
| 31-35 | 9 |

Measures of central tendency

Measures of central tendency display the 'mid-point' values of sets of data.

• The mean is calculated by totalling scores and dividing by the number of scores. For example: 1 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 = 37; 37/9 = 4.1 (to one decimal place). Its strengths are that it is the most sensitive measure of central tendency, as it includes all scores. Its weaknesses are that it can be distorted by extreme scores and the mean may not actually be one of the scores.

• The median is the central value of scores in rank order. For example: for the set of data 1, 1, 2, 3, 4, 5, 6, 7, 8 — the median is 4. With an odd number of scores this is the middle number, while with an even number of scores it is the average of the 2 middle scores. Its strengths are that it is not affected by extreme scores and is easier to calculate than the mean. Its weaknesses are that it lacks the sensitivity of the mean and can be unrepresentative in a small set of data.

• The mode is the most common value. For example: for the set of data 2, 3, 6, 7, 7, 7, 9, 15, 16, 16, 20, the mode is 7. Its strengths are that it is less affected by extreme scores and, unlike the mean, is always one of the actual scores in the data set. Its weaknesses are that there can be more than one mode and it does not use all scores.
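The three measures can be checked with Python's standard `statistics` module. The sketch below reuses the example data sets from the bullets above; the variable names and the small data set passed to `multimode` are assumptions made for illustration.

```python
import statistics

scores = [1, 1, 2, 3, 4, 5, 6, 7, 8]
mode_scores = [2, 3, 6, 7, 7, 7, 9, 15, 16, 16, 20]

mean = statistics.mean(scores)        # total of the scores / number of scores
median = statistics.median(scores)    # middle value once scores are in rank order
mode = statistics.mode(mode_scores)   # most common value

print(f"mean = {mean:.1f}, median = {median}, mode = {mode}")

# multimode returns every value tied for most common, which illustrates
# the weakness that a data set can have more than one mode.
print(statistics.multimode([1, 1, 2, 3, 3]))
```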

Measures of dispersion

Measures of dispersion are measures of variability in a set of data.

• The range is calculated by subtracting the lowest from the highest value. Its strengths are that it is easy to calculate and includes extreme values, while its weaknesses are that it is distorted by extreme scores and does not indicate if data are clustered or spread evenly around the mean.

• The interquartile range displays the variability of the middle 50 per cent of a set of data. Its strengths are that it is easy to calculate and is not affected by extreme scores, while its weaknesses are that it does not include all scores and is inaccurate if there are big intervals between scores.

• Standard deviation measures the variability (spread) of a set of scores from the mean. Its strengths are that it is more sensitive than the range, as all values are included, and that it allows the interpretation of individual values, while its weaknesses are that it is more complex to calculate and is less meaningful if data are not normally distributed.
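The three measures of dispersion can be sketched in Python as follows. Two assumptions are worth flagging: quartiles can be calculated under several conventions, and the hypothetical helper `interquartile_range` uses the "median of each half" approach; and `statistics.stdev` gives the sample standard deviation (dividing by n - 1), whereas `statistics.pstdev` would treat the scores as a whole population.

```python
import statistics


def interquartile_range(scores):
    """IQR using the 'median of each half' convention (one of several).

    Needs at least two scores.
    """
    ordered = sorted(scores)
    half = len(ordered) // 2
    lower = ordered[:half]    # scores below the median position
    upper = ordered[-half:]   # scores above the median position
    return statistics.median(upper) - statistics.median(lower)


scores = [1, 1, 2, 3, 4, 5, 6, 7, 8]

value_range = max(scores) - min(scores)   # highest value minus lowest value
iqr = interquartile_range(scores)         # spread of the middle 50 per cent
sd = statistics.stdev(scores)             # sample standard deviation

print(f"range = {value_range}, IQR = {iqr}, SD = {sd:.2f}")
```

Replacing the final score with an extreme value such as 80 changes the range and the standard deviation considerably but leaves the interquartile range almost untouched, which matches the strengths and weaknesses listed above.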

Normal distribution

A normal distribution occurs when data have an equal number of scores either side of the mean. Normally distributed data are symmetrical: when such data are plotted on a graph they form a bell-shaped curve, with most scores clustered around the mean and as many scores below the mean as above. (See also page 58.)

Fig 7.9 Normal distribution of IQ scores

Checking data for normal distribution

• Examine visually: inspect the data to see if scores are mainly around the mean.

• Calculate measures of central tendency: work out the mean, median and mode to see if they are similar.

• Plot the frequency distribution: put the data into a histogram to see if they form a bell-shaped curve.
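The check on measures of central tendency can be sketched in Python. This is a rough screen rather than a formal test of normality: similar averages suggest symmetry but do not prove a bell-shaped curve. The function name `central_tendency_agrees`, the tolerance of 1 and both data sets are assumptions made for illustration.

```python
import statistics


def central_tendency_agrees(scores, tolerance):
    """Return True if the mean, median and mode(s) lie within `tolerance`."""
    values = [statistics.mean(scores), statistics.median(scores)]
    # multimode covers data sets with more than one mode.
    values.extend(statistics.multimode(scores))
    return max(values) - min(values) <= tolerance


# Hypothetical data sets, invented for illustration.
symmetric = [85, 90, 95, 100, 100, 100, 105, 110, 115]
lopsided = [1, 1, 1, 2, 2, 3, 4, 9, 22]

print(central_tendency_agrees(symmetric, tolerance=1))  # averages coincide
print(central_tendency_agrees(lopsided, tolerance=1))   # averages differ
```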

Skewed distribution

If data do not have a symmetrical distribution, the resulting graph is skewed and does not have an equal number of scores either side of the mean. Outliers ('freak' scores) can cause skewed distributions.

• A positively skewed distribution occurs when there is a high extreme score or group of scores.

• A negatively skewed distribution occurs when there is a low extreme score or group of scores.

So in a positively skewed distribution most scores cluster at the lower end, with a tail of a few high scores pulling the mean above the median, while in a negatively skewed distribution most scores cluster at the higher end, with a tail of a few low scores pulling the mean below the median.

Checking data for skewed distribution

Data are checked for a skewed distribution in the same ways as for a normal distribution. Plotting the data on a histogram will show whether a skew is positive or negative.
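Alongside a histogram, comparing the mean with the median gives a quick indication of the direction of a skew, because extreme scores pull the mean towards them while the median stays put. The sketch below is a rule of thumb, not a formal measure of skewness; the function name `skew_direction` and the data sets are hypothetical.

```python
import statistics


def skew_direction(scores):
    """Classify skew by comparing the mean with the median (rule of thumb)."""
    mean = statistics.mean(scores)
    median = statistics.median(scores)
    if mean > median:
        return "positive"   # high extreme scores pull the mean upwards
    if mean < median:
        return "negative"   # low extreme scores pull the mean downwards
    return "none"


# Hypothetical data sets, invented for illustration.
high_outlier = [1, 2, 2, 3, 3, 3, 4, 20]
low_outlier = [1, 17, 18, 18, 19, 19, 20]
balanced = [1, 2, 3, 4, 5]

for data in (high_outlier, low_outlier, balanced):
    print(data, "->", skew_direction(data))
```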

Fig 7.10 Skewed distributions
