Reliability and validity
Reliability concerns the extent to which a test or measurement produces consistent results. A study is reliable if repeating it exactly produces the same results. Reliability can be improved by developing more consistent forms of measurement.
Types of reliability
There are two main types of reliability:
1 Internal reliability concerns whether findings are consistent within themselves. For example, a measurement of height should measure the same distance between 2 metres and 4 metres as between 5 metres and 7 metres.
2 External reliability concerns whether findings are consistent over time. For example, an IQ test should produce the same score for an individual on different occasions, as long as their level of intelligence remains the same.
WAYS OF ASSESSING RELIABILITY
• The split-half method measures internal reliability by dividing a test in two and having the same participant do both halves. If the two halves of the test provide similar results, then the test is seen as having internal reliability.
• The test-retest method measures external reliability by giving the same test to the same participants on at least two occasions. If similar results are obtained, then external reliability is established.
• Inter-observer reliability measures whether different observers are viewing and rating behaviour similarly. This can be assessed by correlating the observers’ scores, with a high correlation indicating they are observing and categorising similarly. Inter-observer reliability can be improved by developing clearly defined and separate categories of behavioural criteria.
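All three methods above come down to correlating two sets of scores. The following is a minimal sketch of that idea rather than a standard procedure: the data are invented for illustration, the helper names `pearson_r` and `split_half_r` are hypothetical, and splitting the test into odd-numbered and even-numbered items is only one assumed way of dividing it in two.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_half_r(item_scores):
    """Split-half: correlate each participant's totals on the two halves.

    Assumption: the halves are the odd-numbered and even-numbered items.
    """
    odd_items = [sum(row[0::2]) for row in item_scores]
    even_items = [sum(row[1::2]) for row in item_scores]
    return pearson_r(odd_items, even_items)


# Hypothetical data: one row per participant, one column per test item.
item_scores = [
    [1, 1, 1, 1],
    [2, 2, 2, 3],
    [3, 3, 3, 3],
    [4, 4, 4, 5],
    [5, 5, 5, 5],
]
# Hypothetical scores for the same five participants on two occasions.
first_occasion = [100, 110, 95, 120, 105]
second_occasion = [102, 108, 97, 118, 106]
# Hypothetical ratings of the same five behaviours by two observers.
observer_a = [3, 5, 2, 4, 1]
observer_b = [3, 4, 2, 5, 1]

split_half = split_half_r(item_scores)
test_retest = pearson_r(first_occasion, second_occasion)
inter_observer = pearson_r(observer_a, observer_b)

print(f"Split-half r:     {split_half:.2f}")
print(f"Test-retest r:    {test_retest:.2f}")
print(f"Inter-observer r: {inter_observer:.2f}")
```

A correlation close to +1 in each case would suggest that the two halves, the two occasions or the two observers are producing similar results. How high a correlation needs to be before reliability is accepted is a judgement this sketch does not settle.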
The connection between reliability and validity
To be valid, results must first be reliable. However, results can be reliable without being valid. For example, if 1 + 1 is added up on several occasions and each time the answer is 3, then the findings are reliable (consistent) but not valid (accurate). If 1 + 1 is added up on several occasions and the answer is always 2, then the results are both reliable and valid.
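The 1 + 1 example can be written out as a toy check. This is only an illustrative sketch: it treats reliability as exact agreement between repetitions and validity as matching a known true value, which real measurements rarely allow. The function names and the `mixed` data set are hypothetical additions.

```python
def is_reliable(results):
    """Consistent: every repetition gives the same answer."""
    return len(set(results)) == 1


def is_valid(results, true_value):
    """Accurate: every repetition gives the true answer."""
    return all(r == true_value for r in results)


TRUE_ANSWER = 1 + 1

always_three = [3, 3, 3, 3]  # 1 + 1 answered as 3 every time
always_two = [2, 2, 2, 2]    # 1 + 1 answered as 2 every time
mixed = [2, 3, 2, 1]         # hypothetical inconsistent answers

for name, results in [
    ("always 3", always_three),
    ("always 2", always_two),
    ("mixed", mixed),
]:
    print(name, "reliable:", is_reliable(results),
          "valid:", is_valid(results, TRUE_ANSWER))
```

Under these definitions any set of results that counts as valid is automatically reliable, but not the reverse, which matches the relationship described above.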
Validity concerns accuracy: the degree to which something measures what it claims to measure. In research, it refers both to how accurately a study measures what it claims to and to the extent to which findings can be generalised beyond the research setting. These correspond to a study's internal and external validity (see below). Validity can be improved by increasing reliability and by improving internal and external validity.
Types of validity
There are two main types of validity:
1 Internal validity concerns whether findings are due to the manipulation of the independent variable (IV) rather than to confounding variables. Internal validity can be improved by reducing investigator effects, minimising demand characteristics and by use of standardised instructions and a random sample. The more a study is controlled, the more sure we are that findings are due to the effect of the IV and not to poor methodology.
2 External validity concerns the extent to which findings from a study have ecological validity (can be generalised to other settings), population validity (can be generalised to other people) and temporal validity (can be generalised over time). External validity is improved by carrying out studies in more naturalistic settings.
WAYS OF ASSESSING VALIDITY
• Face validity involves 'eyeballing' items to assess the extent to which they look like what a test claims to measure.
• Concurrent validity assesses validity by correlating scores on a test with another test that is accepted as being valid.
• Predictive validity assesses validity by seeing how well a test predicts future behaviour. For instance, do school entrance tests accurately predict later exam results?
• Temporal validity evaluates to what extent research findings remain true over time.
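Concurrent and predictive validity can both be assessed with a correlation, as in the sketch below. All scores are invented for illustration, and `pearson_r` is a hypothetical helper written out here so that the example is self-contained.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Concurrent validity: hypothetical scores of five people on a new test
# and on an established test that is already accepted as valid.
new_test = [10, 12, 14, 16, 18]
established_test = [20, 25, 27, 33, 35]

# Predictive validity: hypothetical school entrance test scores and the
# same five pupils' later exam results.
entrance_test = [55, 60, 65, 70, 80]
later_exam = [50, 58, 70, 66, 85]

concurrent_r = pearson_r(new_test, established_test)
predictive_r = pearson_r(entrance_test, later_exam)

print(f"Concurrent validity r: {concurrent_r:.2f}")
print(f"Predictive validity r: {predictive_r:.2f}")
```

A high positive correlation would suggest that the new test agrees with the established one, or that entrance scores do predict later results. Face validity and temporal validity rest on judgement and on repeating research over time, so they are not captured by a single calculation like this.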