The Study of Sex Differences - What's the Difference Anyway?

Gender, Nature, and Nurture - Richard A. Lippa 2014

To understand research on sex differences, it is important first to understand a bit about the statistical methods used to study sex differences. When psychologists study variations in human traits such as height, intelligence, or personality, they often plot people's scores in the form of frequency distributions. Such distributions show the proportion of people who take on various values for a given trait. Figure 1.1, for example, displays the distribution of height in a particular group of people.

For large populations, trait distributions often take the approximate form of an idealized curve called the normal distribution (see Fig. 1.1). This distribution takes the shape of the familiar bell-shaped curve. The normal distribution has a precise mathematical definition, but that need not concern us here. Normal distributions often arise in nature when a trait—height, for instance—results from many small, random factors that add up to produce the trait. For example, an individual's height depends on many factors, such as the effects of individual genes, nutritional factors, exposure to infectious diseases and environmental chemicals, and so on.


FIG. 1.1 Height as a normally distributed trait.

A normal distribution can be characterized by two important numbers: its mean and its standard deviation. The mean is the average value of the distribution. Because normal distributions are symmetric (the right side is the mirror image of the left side), the mean of a normal distribution is at its center. The standard deviation is a measure of how narrow or spread out a distribution is; in a rough sense, it can be thought of as the average distance individuals are from the mean of the distribution. Distributions that are very spread out have large standard deviations, whereas distributions that are very narrow have small standard deviations. In a normal distribution, about two thirds of all values are in a range between one standard deviation below the mean and one standard deviation above the mean (see Fig. 1.1).
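The "about two thirds" figure can be checked numerically. The following is a minimal sketch, not part of the original text; the helper names normal_cdf and fraction_within are illustrative, and the calculation assumes an ideal normal distribution.

```python
import math

def normal_cdf(z: float) -> float:
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return normal_cdf(k) - normal_cdf(-k)

within_one_sd = fraction_within(1.0)
print(f"{within_one_sd:.3f}")  # roughly 0.683, i.e. about two thirds
```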

Consider the following example. In a recent study (Lippa, 2003b), I found the average height of a sample of 313 Californian men to be 69.5 inches (5 feet, 9.5 inches), and the average height of a sample of 433 Californian women to be 64.5 inches (5 feet, 4.5 inches). Of course, these are just averages. Some women (half, to be exact) were taller than the average woman, and half were shorter than the average woman. In my study, the standard deviation (which, you will recall, is a measure of the spread of a distribution) was 3.15 inches for men and 2.74 inches for women. Because men's and women's heights were approximately normally distributed, about two thirds of all men were between 66.4 and 72.7 inches in height (between 5 feet, 6.4 inches and 6 feet, 0.7 inches), and two thirds of all women were between 61.8 and 67.2 inches in height (between 5 feet, 1.8 inches and 5 feet, 7.2 inches). If you look at the idealized normal distributions of men's and women's heights, which are shown in Fig. 1.2, you will see that most men were taller than most women.
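The height ranges quoted above are simply the mean minus and plus one standard deviation. A short sketch of that arithmetic, using the means and standard deviations reported in the text (the function name one_sd_range is illustrative):

```python
def one_sd_range(mean: float, sd: float) -> tuple[float, float]:
    """Return the interval from one SD below the mean to one SD above it."""
    return (mean - sd, mean + sd)

# Means and standard deviations (in inches) as reported in the text.
men_low, men_high = one_sd_range(69.5, 3.15)
women_low, women_high = one_sd_range(64.5, 2.74)

print(f"men:   {men_low:.2f} to {men_high:.2f} inches")
print(f"women: {women_low:.2f} to {women_high:.2f} inches")
```

Rounded to one decimal place, these intervals correspond to the 66.4 to 72.7 inch and 61.8 to 67.2 inch ranges given in the text.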


FIG. 1.2 Distributions of men's and women's height in inches.

The difference between the height of men and women can be quantified in the following way, which will prove to be very useful in subsequent discussions of sex differences: Subtract the mean of the women's height from the mean of the men's height, then divide this difference by the standard deviation of the distributions (if the two standard deviations are not equal, use their weighted mean). The resulting number is called the d statistic (or sometimes, Cohen's d statistic, in honor of the statistician Jacob Cohen, who advocated its use; Cohen, 1977). In my study, d = (men's mean height - women's mean height) / the weighted mean standard deviation = (69.5 - 64.5) / 2.91 = 1.73.
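The calculation can be sketched in a few lines. This is an illustration under stated assumptions, not the author's exact procedure: the text does not say how the standard deviations were weighted, so weighting by sample size is assumed here, and the function names are illustrative.

```python
def weighted_mean_sd(sd_a: float, n_a: int, sd_b: float, n_b: int) -> float:
    """Sample-size-weighted mean of two standard deviations (assumed weighting)."""
    return (n_a * sd_a + n_b * sd_b) / (n_a + n_b)

def d_statistic(mean_a: float, mean_b: float, sd: float) -> float:
    """Difference between two means expressed in standard deviation units."""
    return (mean_a - mean_b) / sd

# Rounded values reported in the text: 313 men, 433 women.
sd = weighted_mean_sd(3.15, 313, 2.74, 433)
d = d_statistic(69.5, 64.5, sd)
print(f"weighted SD = {sd:.2f}, d = {d:.2f}")
```

With the rounded figures printed in the text, this gives a weighted standard deviation of about 2.91 and a d of about 1.72, slightly below the reported 1.73. The small gap presumably reflects rounding in the published means and standard deviations.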

Note that the d statistic takes into account two things when estimating how big the difference is between two distributions: (a) the difference between the means of the two distributions, and (b) the standard deviations of the two distributions. Stated a bit differently, the d statistic considers the difference in the means of two distributions in relation to the standard deviations of those distributions.



FIG. 1.3 Male and female distributions for a hypothetical test of "baby pacification ability."

Why is it important to take the standard deviation (i.e., the spread of the distributions) into account? The following example should make this clear. Suppose I develop a new test that tries to measure how successful people are at pacifying crying babies. Each person who completes my test is given, in succession, five squalling babies to rock and cuddle, and I measure with a stopwatch how long it takes each comforted baby to stop crying. The person's score is the average time it takes the five babies to stop crying. After collecting data for 500 men and 500 women, I am interested in determining whether there is a meaningful sex difference in baby pacification ability. Suppose I find that, on average, women pacify babies more quickly than men do: 30 seconds more quickly, to be precise. Is this a big or a small difference? The key point to understand is that this difference does not mean much until it is compared to the standard deviations of the distributions (Fig. 1.3).

If the standard deviations are small (i.e., the distributions are narrow about their means), then a 30-second difference might be quite large and meaningful (Fig. 1.3, left). If the standard deviations are large (the distributions are spread out), however, the observed 30-second difference may not mean much at all (Fig. 1.3, right). In the first case, the two distributions do not overlap much and are quite distinct. The difference between them is quite apparent to the naked eye. In the second case, the two distributions overlap substantially and are not very different at all. In a sense, the d statistic assesses how much the two distributions overlap, not simply the degree to which the means of the two distributions differ.
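To make the contrast concrete, the sketch below computes d and the shared area of the two curves for the same 30-second difference under two spreads. The standard deviations of 15 and 120 seconds are hypothetical values chosen for illustration (the text gives none), and the overlap formula assumes two normal distributions with equal standard deviations.

```python
import math

def normal_cdf(z: float) -> float:
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def overlap(d: float) -> float:
    """Shared area of two equal-SD normal curves whose means differ by d SDs."""
    return 2.0 * normal_cdf(-abs(d) / 2.0)

mean_difference = 30.0          # seconds, from the example in the text
d_narrow = mean_difference / 15.0    # hypothetical small SD of 15 seconds
d_wide = mean_difference / 120.0     # hypothetical large SD of 120 seconds

print(f"narrow: d = {d_narrow:.2f}, overlap = {overlap(d_narrow):.0%}")
print(f"wide:   d = {d_wide:.2f}, overlap = {overlap(d_wide):.0%}")
```

Under these assumed spreads, the same 30-second gap yields d = 2.0 with roughly a third of the area shared in the narrow case, and d = 0.25 with about 90% shared in the wide case.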




FIG. 1.4 Small, moderate, and large differences between two groups. Note: To simplify the discussion, we have assumed that the normal distributions for both men and women have equal standard deviations. This assumption is not always warranted, however. For example, in measures of intellectual abilities, men's scores often have a greater spread (larger standard deviation) than women's scores do—that is, there are more very low-scoring and very high-scoring men than women. However, d can still be computed for such distributions.

In the study in which I measured the height of samples of Californian men and women, d proved to be 1.73. Is this large or small? Jacob Cohen (1977), the statistician who first promoted the use of the d statistic, offered the following rough guidelines for psychological research: values of around 0.2 are small, values of around 0.5 are moderate, and values of around 0.8 are large. (See Fig. 1.4 for an illustration of these different values of d.) Here is another way to think about this. When d = 0.2, the two distributions overlap substantially, and although the difference between the means may be statistically significant (i.e., not due to chance), the difference may nonetheless be small in terms of practical significance, and it is unlikely to be very noticeable in everyday life. When d = 0.5, however, the difference becomes large enough to be noticed in everyday life, and when d = 0.8, the difference is grossly apparent in everyday life; you don't have to do fancy studies to be aware of it. By Cohen's guidelines, the difference between men's and women's height in my study (d = 1.73) is very large, and you will probably agree that the height difference between men and women is readily apparent in everyday life. You do not need to be a scientist to know that men are generally taller than women.

Why is the d statistic important to researchers who study sex differences? First, it provides a standard way to compute sex differences. Second, as we shall see, it provides a way to average sex differences from different studies. Despite its usefulness to statisticians, however, the meaning of the d statistic may not always be obvious to lay people. Therefore, it is often useful to translate the d statistic into more commonsense kinds of information. In this chapter, I often do the following translation: I convert d statistics into the percentage of men who score higher than the average woman or the percentage of women who score higher than the average man on a particular trait or behavior.

How would this translation work for men's and women's heights? For a d value of 1.73, we want to know what percentage of women are taller than the average man. Assuming that height is normally distributed for men and for women, the answer for my sample is that only about 4% of women would be taller than the average man. Conversely, 96% of men would be taller than the average woman. (These statements are consistent with the fact that the two distributions do not overlap very much; see Fig. 1.2.)
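This translation can be sketched as a lookup on the normal curve. The code below is an illustration, not the author's own procedure: it assumes both groups are normally distributed with equal standard deviations, so that the other group's mean lies d standard deviations away, and the function name is illustrative.

```python
import math

def normal_cdf(z: float) -> float:
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def share_above_other_mean(d: float) -> float:
    """Share of the higher-scoring group that exceeds the other group's mean."""
    return normal_cdf(d)

d_height = 1.73  # value reported in the text
men_above_avg_woman = share_above_other_mean(d_height)
women_above_avg_man = 1.0 - men_above_avg_woman

print(f"men taller than the average woman: {men_above_avg_woman:.0%}")
print(f"women taller than the average man: {women_above_avg_man:.0%}")

# The same conversion for Cohen's small, moderate, and large benchmarks.
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: {share_above_other_mean(d):.0%}")
```

For d = 1.73 this reproduces the roughly 96% and 4% figures above; for the benchmark values of 0.2, 0.5, and 0.8 the corresponding shares are about 58%, 69%, and 79%.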

As this chapter reviews evidence on sex differences in personality traits, aggression, interests, and cognitive abilities, you will be presented with many d statistics. As I shall show, most psychological sex differences will prove to be much smaller than the difference between men's and women's height.