Quantitative psychology - reliability and validity

Validity

Validity is the extent to which a procedure measures what it is intended to measure.

Internal - the degree to which an experiment is methodologically sound and confound-free. In an internally valid study, the researcher can be confident that the results, as measured by the dependent variable (DV), are directly attributable to the independent variable (IV) and are not the result of some other uncontrolled factor.

External - the degree to which research findings generalise beyond the specific context of the experiment being conducted. Results should generalise in three ways: to other populations, to other experiments, and to other times.

1 of 22

Internal validity

- Ability of the researcher to state that the relationship he or she predicted between the IV and DV does exist and is due to the effects of the IV, not to extraneous variables.

- The effects of these uncontrolled variables are known as "secondary variance".

- In other words, if the study is well designed and well controlled, and if alternative explanations can be ruled out, it is said to be internally valid.

2 of 22

External validity

The degree to which results generalise to other subjects and situations.

Need to select a representative sample

3 of 22

Internal vs external validity

By strengthening internal validity, external validity may be compromised.

More careful control = higher internal validity.

But the more control a lab setting imposes, the less the study resembles real life and the less generalisable the results are: low external validity.

There needs to be a balance, and it depends upon the aims of the study (basic vs applied research).

4 of 22

Factors that can affect validity

1) Non-random assignment of participants to conditions (all competent participants may end up in one condition)

2) Maturation: participants may change over the course of the study (practice, boredom)

3) Order effects: performance on a second task is affected by having completed the first (motivation, fatigue)

4) Regression towards the mean: extreme baseline scores tend to move closer to the average on follow-up, so baseline differences decrease

5) Time between multiple observations

6) Participant drop-out and low response rates

5 of 22

Factors that can affect validity

7) Different time and place.

8) Differential treatment: instructions, practice trials

9) Pre-test sensitization / demand characteristics.

10) Reactive effects: effects of being observed.

11) Multiple treatments effects: contamination across conditions.
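Factor 1 in the list above (non-random assignment) is usually avoided by letting chance decide which participant goes into which condition. The following is a minimal Python sketch of that idea, not a prescribed procedure. The function name `assign_randomly`, the participant IDs, the condition labels, and the seed are all illustrative assumptions.

```python
import random


def assign_randomly(participants, conditions, seed=None):
    """Shuffle participants, then deal them round-robin into conditions.

    Hypothetical helper for illustration; group sizes differ by at most one.
    """
    rng = random.Random(seed)      # a seed makes the allocation reproducible
    shuffled = list(participants)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, participant in enumerate(shuffled):
        condition = conditions[index % len(conditions)]
        groups[condition].append(participant)
    return groups


# Made-up participant IDs P01..P20
participants = [f"P{n:02d}" for n in range(1, 21)]
groups = assign_randomly(participants, ["treatment", "control"], seed=42)

for condition, members in groups.items():
    print(condition, len(members), members)
```

Because allocation depends only on the shuffle, participant characteristics such as competence cannot systematically pile up in one condition, although chance imbalances remain possible in small samples.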

6 of 22

Dealing with secondary variance

Eliminate it

- The best way to deal with it.

- Eliminate possible variance due to participant expectations by using a placebo (single-blind).

- Eliminate possible variance due to researcher expectations by double-blinding.

7 of 22

Dealing with secondary variance

Holding it constant

- Where elimination is not possible.

- Treat all participants the same and test under the same conditions.

8 of 22

Potential sources of secondary variance

- Did participants expect to improve? (placebo)

- Were the researchers also 'expecting' an effect?

- Double-blind placebo design - both researcher and participants are blind to the procedure.

- Single-blind placebo design - only the participant is blind to the procedure.

9 of 22

Well-designed experiments

Tend to have high internal validity.

May have lower external validity, as the sample and conditions are very specific to the experiment.

Strongest test of a causal hypothesis.

10 of 22

Poorly designed experiments and quasi-experiments

Quasi-experiments: conditions are based upon some inherent subject characteristic (e.g. age or sex).

Random assignment of subjects to conditions is not possible.

This greatly reduces internal validity, because the (quasi-) IV may be confounded with extraneous variables.

12 of 22

Intervention vs observational studies

- Correlation does not imply causation.

- Intervention studies are the only way to address cause and effect

- Used to test causal hypotheses

- Design principle:

    - Control all variables except IV

    - IV is manipulated by the investigator to determine effects on DV.

13 of 22

WHIMS - The Women's Health Initiative Memory Study

Design, Setting, and Participants: A randomized, double-blind, placebo-controlled clinical trial. 4532 postmenopausal women free of probable dementia, aged 65+ years were enrolled.

Intervention: Either 1 daily tablet of 0.625 mg of conjugated equine estrogen plus 2.5 mg of medroxyprogesterone acetate (n = 2229), or a matching placebo (n = 2303).

Results: 61 women were diagnosed with probable dementia, 40 (66%) in the estrogen plus progestin group compared with 21 (34%) in the placebo group. The hazard ratio (HR) for probable dementia was 2.05 (95% CI, 1.21-3.48; P = .01). This increased risk would result in an additional 23 cases of dementia per 10 000 women per year.

Conclusions: Estrogen plus progestin therapy increased the risk for probable dementia in postmenopausal women aged 65 years or older.
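The percentages in the WHIMS results can be checked by hand from the counts given above. The Python sketch below reproduces them and also computes a crude risk ratio. The crude ratio is only a rough cross-check and is not how the study derived its figure: the reported hazard ratio of 2.05 takes follow-up time into account, so the two numbers are not expected to match exactly. Variable names are illustrative.

```python
# Counts taken from the WHIMS summary above
cases_treatment, n_treatment = 40, 2229   # estrogen plus progestin group
cases_placebo, n_placebo = 21, 2303       # placebo group

total_cases = cases_treatment + cases_placebo  # 61 women with probable dementia

# Share of all dementia cases falling in each group
share_treatment = cases_treatment / total_cases
share_placebo = cases_placebo / total_cases

# Proportion of each group diagnosed, ignoring follow-up time
risk_treatment = cases_treatment / n_treatment
risk_placebo = cases_placebo / n_placebo
crude_risk_ratio = risk_treatment / risk_placebo

print(f"Share of cases, treatment: {share_treatment:.0%}")   # 66%
print(f"Share of cases, placebo:   {share_placebo:.0%}")     # 34%
print(f"Crude risk ratio:          {crude_risk_ratio:.2f}")  # 1.97
```

The crude ratio of about 1.97 points in the same direction as the reported HR of 2.05: roughly twice the risk in the treatment group.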

14 of 22

Observational studies

- To determine if one factor is related to another factor

- Often the background for intervention designs

- Design principle:

    - Observe natural variation in independent variable (IV) and dependent variable (DV)

    - Test correlation between both, controlling for other known determinants of DV, e.g. by including them as co-variates in multiple regression
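To illustrate what "controlling for other known determinants" means, the Python sketch below uses a partial correlation, a simpler relative of entering a single covariate in multiple regression. It is a sketch on invented numbers, not data from any study. The variable names (`age`, `hormone_level`, `memory_score`) and the helpers `pearson` and `partial_corr` are assumptions made for illustration. The data are constructed so that both measures depend on age and on nothing else they share.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented illustration data: both measures track age plus unrelated noise
age = [66, 68, 70, 72, 74, 76, 78, 80]
hormone_level = [2, 1, 2, 5, 6, 5, 6, 9]
memory_score = [2, 1, 4, 3, 4, 7, 6, 9]

raw = pearson(hormone_level, memory_score)
controlled = partial_corr(hormone_level, memory_score, age)

print(f"Raw correlation:            {raw:.2f}")  # 0.84 by construction
print(f"Controlling for age, about: {abs(controlled):.2f}")
```

In this constructed example the raw correlation of 0.84 vanishes once age is controlled for, which is why observational designs include known determinants of the DV as covariates. Unknown or unmeasured determinants cannot be handled this way.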

15 of 22

Associations between circulating sex steroid hormo

OBJECTIVE: Exploratory analyses of associations between levels of sex hormones and cognitive performance in elderly women.

METHOD: Measured circulating sex hormone levels in 39 highly educated, nondemented, elderly women. Levels were correlated with neuropsychological performance, controlling for age, education, use of estrogen replacement, and depression.

RESULTS: High estradiol levels were associated with better delayed verbal memory and retrieval efficiency, whereas low levels were associated with better immediate and delayed visual memory. Levels of testosterone were related positively to verbal fluency. Levels of progesterone and androstenedione were unrelated to cognitive performance.

CONCLUSIONS: Both estrogen and testosterone showed associations with cognitive performance.

16 of 22

Reliability and measurement variance

Within-group variance reduces the strength of the key effect.

Some variance is unavoidable, e.g. inherent limits to measurement precision and natural variance between people.

Possible sources of avoidable variance 

 - inappropriate or inconsistent measuring

 - incorrect data recording or analysis

 - inconsistent treatment of participants.

Always calibrate and test equipment

Be precise with measurement

Statistical procedures account for variance.

17 of 22

Reliability

Reliability is the extent to which a particular measure is consistent and reproducible.

Measures can include questionnaires, tests, a computer that records responses, a stopwatch, etc.

Reliability depends on measurement error.

Reliability means that the test will yield comparable results when repeated measures are taken under matched conditions.

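Test-retest reliability: consistency of scores when the same measure is repeated on the same people.

Inter-rater reliability: agreement between different raters scoring the same behaviour.

Both are based on correlation coefficients.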
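As a rough illustration of how a correlation coefficient expresses test-retest reliability, the Python sketch below correlates two sets of scores from the same six people. The scores are invented and the helper name `pearson` is an assumption. A real analysis would use far more participants.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented scores for the same six participants, tested twice
scores_time1 = [12, 15, 9, 20, 17, 11]
scores_time2 = [13, 14, 10, 19, 18, 10]

test_retest_r = pearson(scores_time1, scores_time2)
print(f"Test-retest correlation: {test_retest_r:.2f}")
```

A coefficient close to 1 means participants keep roughly the same rank order across sessions. The same calculation applied to two raters' scores of the same participants gives a simple index of inter-rater reliability.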

18 of 22

Observations

Naturalistic observation - behaviour is allowed to occur without interference or intervention by the researcher.

Weaknesses: Often not easy to observe without being intrusive.

                    Subject to observer effects.

                    Less externally reliable (problems with generalisation).

Strengths: Study behaviour in a real setting, with no observer effects when observation is unobtrusive.

Lab observation - more structured, usually follows an observation schedule - internally valid.

19 of 22

Case study

In-depth investigation of one individual, used to reconstruct major aspects of a person's life. Attempts to see what events led to the current situation. Usually involves interviews, observation, examination of records, and psychological testing.

Weaknesses: Very subjective. Like piecing together a puzzle, and often there are gaps - relies on the individual's memory, medical records, and valid observational techniques.

Strengths: Good for assessing psychological disorders - can see history and development.

20 of 22

Survey

- Either a written questionnaire, verbal interview, or combination of the two, used to gather information about specific aspects of behaviours

Weaknesses: self-report data (honesty is questionable)

Strengths: Gather a lot of information in a short time. 

                Gather information on issues that are not easily observable.

21 of 22

Questionnaires

Usually a paper-based test; the answers are scored to draw conclusions.

Weaknesses: Validity is always questionable; honesty/social desirability.

Strengths: Can be very predictive and useful if valid.

22 of 22
