# Inferential testing: Research methods

Probability and significance

Probability involves deciding whether results are significant by setting a cut-off point that determines whether findings are beyond chance factors or not. Psychology uses a significance level of p ≤ 0.05, meaning results are accepted as significant only if there is a 5 per cent or smaller probability of their having occurred through chance factors alone. The cost is that, when no real effect exists, around 5 per cent of studies will still produce results that look significant (seemingly beyond the boundaries of chance) but are actually due to chance factors. This is seen as an acceptable level of error. In some instances, such as when testing new drugs, a more stringent level of p ≤ 0.01 is used, which cuts the risk of chance results being wrongly accepted as significant to 1 per cent.

Type I errors occur when findings are accepted as significant but are not, which is more likely when the significance level is too lenient (for example p ≤ 0.10). Type II errors occur when findings are accepted as insignificant but are in fact significant, which is more likely when the significance level is too stringent.
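The meaning of the 5 per cent error rate can be checked by simulation. The sketch below is illustrative only: it assumes a simple z-test on samples drawn from a population in which there is no real effect. The z-test, the sample size, the number of trials and the function name `false_positive_rate` are all assumptions made for the example, not part of the notes above. It counts how often chance alone produces a 'significant' result.

```python
import random
from statistics import NormalDist, mean


def false_positive_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Estimate how often a true null hypothesis is wrongly rejected."""
    rng = random.Random(seed)        # fixed seed so the run is repeatable
    std_normal = NormalDist()        # standard normal distribution
    false_positives = 0
    for _ in range(trials):
        # Scores come from a population with mean 0, so there is no real effect.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = mean(sample) * n ** 0.5              # z-test, population SD assumed to be 1
        p = 2 * (1 - std_normal.cdf(abs(z)))     # two-tailed probability
        if p <= alpha:
            false_positives += 1                 # significant by chance: a Type I error
    return false_positives / trials


rate_05 = false_positive_rate(alpha=0.05)
rate_01 = false_positive_rate(alpha=0.01)
print(f"p <= 0.05: {rate_05:.3f}   p <= 0.01: {rate_01:.3f}")
```

Under these assumptions the first rate should come out close to 0.05 and the second close to 0.01, which is why the stricter level is preferred when a false positive would be costly.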

Inferential testing

Research produces data which are analysed by inferential statistical tests to see whether differences and relationships found between sets of data are significant or not. Three criteria need to be considered when choosing an appropriate statistical test:

1 What design has been used — whether an independent groups design or a repeated measures design (including a matched pairs design) has been used.

2 What types of outcome are being tested for — is a difference or a relationship between two sets of data being sought?

3 What level of measurement has been used — were the data produced of nominal, ordinal or interval/ratio level?

Table 7.3 Choosing an appropriate statistical test

| Nature of hypothesis | Level of measurement | Independent (unrelated) design | Repeated (related) design |
|---|---|---|---|
| Difference | Nominal data | Chi-squared | Sign test |
| Difference | Ordinal data | Mann-Whitney U test | Wilcoxon (signed-matched ranks) |
| Difference | Interval data | Independent t-test | Related t-test |
| Correlation | Ordinal data | Spearman’s rho | |
| Correlation | Interval data | Pearson product moment | |

The correlation tests are not split by research design.
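The three criteria and Table 7.3 amount to a simple lookup. A minimal sketch follows; the dictionary `TESTS`, the function `choose_test` and the lower-case labels are names invented for this example.

```python
# Lookup keyed on (outcome, level of measurement, design), following Table 7.3.
TESTS = {
    ("difference", "nominal", "independent"): "Chi-squared",
    ("difference", "nominal", "repeated"): "Sign test",
    ("difference", "ordinal", "independent"): "Mann-Whitney U test",
    ("difference", "ordinal", "repeated"): "Wilcoxon signed-matched ranks",
    ("difference", "interval", "independent"): "Independent t-test",
    ("difference", "interval", "repeated"): "Related t-test",
    ("correlation", "ordinal", None): "Spearman's rho",
    ("correlation", "interval", None): "Pearson product moment",
}


def choose_test(outcome, level, design=None):
    """Return the test named in Table 7.3 for the given criteria."""
    if level == "ratio":
        level = "interval"       # interval and ratio data share the same tests
    if outcome == "correlation":
        design = None            # design is not used to choose a correlation test
    elif design == "matched pairs":
        design = "repeated"      # matched pairs counts as a related design
    key = (outcome, level, design)
    if key not in TESTS:
        raise ValueError(f"No test listed for {key}")
    return TESTS[key]


print(choose_test("difference", "ordinal", "independent"))
print(choose_test("correlation", "interval"))
```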

LEVELS OF MEASUREMENT

Nominal data — consist of frequencies, for example how many days of a week were rainy or not. Nominal data are relatively uninformative, for instance they would not tell us how rainy any particular day was.

Ordinal data — involve putting data into rank order, for example finishing places of runners in a race. This is not fully informative, as although we know who are the better runners, we do not know by how much they are better.

Interval/ratio data — involve data measured on a scale with equal, standardised units, such as time. This is the most informative type of data. Interval data have an arbitrary zero point, for instance zero degrees Celsius does not mean there is no temperature. Ratio data have an absolute zero point, for instance someone with zero pounds in their bank account has no money.

Statistical tests

Sign test — used when a difference is predicted between two sets of data, data are of at least nominal level and a repeated measures design (RMD) or matched pairs design (MPD) has been used. The sign test works by recording the direction of each participant’s difference as a plus or a minus, calculating the value of s (the number of times the less frequent sign occurs) and comparing this value to those in a critical values table to see whether the result is significant or not.
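As a rough worked example of the sign test, the sketch below calculates s from paired scores. The scores are made up for illustration and the function name `sign_test` is hypothetical. It also works out the exact two-tailed probability from the binomial distribution, which is an alternative to the critical values table and an addition to the method described above.

```python
from math import comb


def sign_test(cond_a, cond_b):
    """Return s, the number of non-tied pairs N, and an exact two-tailed p."""
    diffs = [b - a for a, b in zip(cond_a, cond_b)]
    plus = sum(d > 0 for d in diffs)
    minus = sum(d < 0 for d in diffs)
    n = plus + minus                 # pairs with no difference are dropped
    s = min(plus, minus)             # s is the count of the less frequent sign
    # Under chance, each sign is equally likely (binomial with probability 0.5).
    p = min(1.0, 2 * sum(comb(n, k) for k in range(s + 1)) / 2 ** n)
    return s, n, p


# Illustrative scores for ten participants tested in two conditions.
before = [5, 6, 4, 7, 5, 6, 5, 4, 6, 5]
after = [7, 8, 5, 9, 6, 8, 4, 6, 7, 5]
s, n, p = sign_test(before, after)
print(s, n, round(p, 3))
```

Here eight participants improved, one got worse and one did not change, so s = 1 with N = 9.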

Chi-squared — used when a difference is predicted between two sets of data, data are of at least nominal level and an independent groups design (IGD) has been used. Chi-squared can also be used as a test of association (relationship).
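The chi-squared statistic compares the observed frequencies with the frequencies expected if there were no difference between groups. A minimal sketch with made-up frequencies follows. The function name `chi_squared` is hypothetical, and no continuity correction is applied, which is a simplifying assumption.

```python
def chi_squared(observed):
    """Return the chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total   # expected frequency
            chi2 += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Illustrative 2 x 2 table: rows are the two groups, columns are the two outcomes.
observed = [[10, 20],
            [20, 10]]
chi2, df = chi_squared(observed)
print(round(chi2, 2), df)
```

With df = 1 the critical value at p ≤ 0.05 is 3.84, so an observed value above that would be significant.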

Mann-Whitney — used when a difference is predicted between two sets of data, data are at least of ordinal level and an IGD has been used.
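The Mann-Whitney calculation can be sketched as follows: rank all the scores together, sum the ranks for one group and convert that sum into U. The function names and scores are illustrative assumptions, and tied scores are given their average rank.

```python
def rank(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    ordered = sorted(values)
    ranks = {}
    for v in set(ordered):
        first = ordered.index(v) + 1          # lowest rank taken by this value
        count = ordered.count(v)              # how many scores share the value
        ranks[v] = first + (count - 1) / 2    # average of the ranks they occupy
    return [ranks[v] for v in values]


def mann_whitney_u(group_a, group_b):
    """Return the observed U: the smaller of the two groups' U values."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = rank(list(group_a) + list(group_b))   # rank both groups together
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Illustrative scores from two separate groups of participants.
group_a = [3, 5, 8, 9]
group_b = [1, 2, 4, 6, 7]
u = mann_whitney_u(group_a, group_b)
print(u)
```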

Wilcoxon signed-matched ranks — used when a difference between two sets of data has been predicted, data are of at least ordinal level and an RMD/MPD has been used.
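For the Wilcoxon test the differences between each pair of scores are ranked by size, ignoring their sign, and T is the smaller of the two rank totals. The sketch below uses made-up scores and a hypothetical function name `wilcoxon_t`. Pairs with no difference are dropped, as is usual for this test.

```python
def wilcoxon_t(cond_a, cond_b):
    """Return the observed T and N (the number of pairs with a non-zero difference)."""
    diffs = [a - b for a, b in zip(cond_a, cond_b) if a != b]
    sizes = sorted(abs(d) for d in diffs)

    def avg_rank(size):
        # Tied differences share the average of the ranks they occupy.
        first = sizes.index(size) + 1
        return first + (sizes.count(size) - 1) / 2

    positive = sum(avg_rank(abs(d)) for d in diffs if d > 0)
    negative = sum(avg_rank(abs(d)) for d in diffs if d < 0)
    return min(positive, negative), len(diffs)


# Illustrative scores for six participants tested in both conditions.
cond_a = [10, 12, 9, 14, 11, 13]
cond_b = [8, 13, 5, 11, 11, 7]
t, n = wilcoxon_t(cond_a, cond_b)
print(t, n)
```

One participant shows no difference, so N = 5, and the single negative difference has the lowest rank, giving T = 1.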

Independent (unrelated) t-test — used when a difference is predicted between two sets of data, data are normally distributed and of interval/ratio level, and an IGD has been used.

Repeated (related) t-test — used when a difference is predicted between two sets of data, data are normally distributed and of interval/ratio level, and an RMD/MPD has been used.
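Both t-tests divide a difference in means by an estimate of how much that difference would vary by chance. The sketch below is a simplified illustration with made-up scores. The function names are hypothetical, and the independent version assumes the two groups have similar variances (a pooled-variance t-test).

```python
from math import sqrt
from statistics import mean, stdev, variance


def independent_t(group_a, group_b):
    """Pooled-variance t for two separate groups; returns (t, degrees of freedom)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


def related_t(cond_a, cond_b):
    """t for paired scores, based on each pair's difference; returns (t, df)."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    t = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
    return t, len(diffs) - 1


# The same illustrative scores, treated first as two groups and then as pairs.
scores_a = [2, 4, 6]
scores_b = [1, 2, 3]
t_ind, df_ind = independent_t(scores_a, scores_b)
t_rel, df_rel = related_t(scores_a, scores_b)
print(round(t_ind, 3), df_ind)
print(round(t_rel, 3), df_rel)
```

The related version gives a larger t for the same scores because pairing removes the variation between participants.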

Spearman’s rho — used when a relationship (correlation) is predicted between two sets of data, data are of at least ordinal level and consist of pairs of scores from the same person or event.

Pearson product moment — used when a relationship is predicted between two sets of data, data are normally distributed, of interval/ratio level and consist of pairs of scores from the same person or event.
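The two correlation coefficients are closely related: Spearman’s rho can be calculated as a Pearson correlation carried out on the ranks of the scores rather than on the scores themselves. The sketch below is a minimal illustration with made-up data and hypothetical function names.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product moment correlation between paired scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranked scores."""
    return pearson_r(rank(x), rank(y))


# Illustrative pairs of scores: y always rises with x, but not in a straight line.
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]
r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r, 3), round(rho, 3))
```

Because the rank order of the two sets of scores matches exactly, rho reaches its maximum of 1, while r falls slightly short because the relationship is not linear.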

Interpretation of significance

Statistical analysis produces an observed value and this is compared to a critical value to see whether it is significant or not. To do this, critical value tables need to be referenced, taking into account whether a hypothesis was one- or two-tailed, how many participants or participant pairs were involved (or the degrees of freedom, for tests whose tables are organised that way, such as Chi-squared and the t-tests) and what level of significance was used. The Mann-Whitney, Wilcoxon and sign tests require an observed value to be equal to or less than a critical value to be significant. The Chi-squared, independent t-test, related t-test, Spearman’s rho and Pearson product moment require an observed value to be equal to or greater than a critical value to be significant.
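The direction of the comparison is the part most easily confused, so it is sketched here as a small helper. The names `is_significant`, `LOWER_IS_SIGNIFICANT` and `HIGHER_IS_SIGNIFICANT` are hypothetical. The critical value must still be looked up in the correct table. Taking the size of the observed value without its sign is an assumption that suits two-tailed tests; for a one-tailed test the direction of the result must also match the prediction.

```python
# Tests where a SMALLER observed value is needed for significance.
LOWER_IS_SIGNIFICANT = {"Mann-Whitney", "Wilcoxon", "Sign test"}

# Tests where a LARGER observed value is needed for significance.
HIGHER_IS_SIGNIFICANT = {
    "Chi-squared",
    "Independent t-test",
    "Related t-test",
    "Spearman's rho",
    "Pearson product moment",
}


def is_significant(test, observed, critical):
    """Compare an observed value with a critical value taken from a table."""
    if test in LOWER_IS_SIGNIFICANT:
        return observed <= critical
    if test in HIGHER_IS_SIGNIFICANT:
        # t and correlation values can be negative; only their size is compared.
        return abs(observed) >= critical
    raise ValueError(f"Unknown test: {test}")


# Chi-squared with df = 1 at p <= 0.05 has a critical value of 3.84.
print(is_significant("Chi-squared", 6.67, 3.84))
# Sign test with N = 9, two-tailed at p <= 0.05, has a critical value of 1.
print(is_significant("Sign test", 1, 1))
```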