Experimental design, types of data, sampling, displaying data: histograms

Basic experimental designs

Within and between subjects designs

- When we design an experiment, we must decide whether to use a within-subjects design or a between-subjects design.

Between subjects: Each participant takes part in only one condition of the study, and the results are compared between the two groups.

Within subjects: Participants take part in both conditions over consecutive days, and the results from the two conditions are compared with each other.


Drawbacks in between subjects design

- More participants are needed with a between-subjects (BS) design to get enough power for meaningful statistics.

- Differences in performance may be due to differences between the groups and not related to the IV (consider randomisation and baseline differences).

- There may be a problem in interpreting your results if the two groups are treated ‘differently’. The first group may react differently.


Drawbacks in within subjects design

Order effects: practice (learning) effects, boredom and fatigue.

Counter-balancing: Assign half of the group to complete condition A first and half to complete condition B first. This way practice/fatigue effects will be 'cancelled out', and they can be assessed statistically if the order is recorded.
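The counter-balancing idea above can be sketched in Python. This is only an illustration: the function name `counterbalance`, the order labels "AB" and "BA", and the participant IDs are assumptions made for the example, not part of the original notes.

```python
import random

def counterbalance(participants, seed=None):
    """Randomly split participants into two equal-sized order groups.

    Half complete condition A first ("AB"); the other half complete
    condition B first ("BA"). With an odd number of participants the
    "BA" group gets the extra person.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)          # random allocation to order groups
    half = len(shuffled) // 2
    orders = {}
    for i, person in enumerate(shuffled):
        orders[person] = "AB" if i < half else "BA"
    return orders

# Hypothetical participant IDs, for illustration only.
assignment = counterbalance(["P1", "P2", "P3", "P4", "P5", "P6"], seed=1)
for person, order in sorted(assignment.items()):
    print(person, order)
```

Recording the order for each participant, as the returned dictionary does, is what allows order effects to be checked statistically later.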


Sampling techniques

- ‘Sample’ – the individuals participating in your study

- ‘Population’ – the wider group about which you wish to learn

- Important for external validity

- If the sample is not ‘representative’ of the population, you cannot generalise.

Example:

- Attitudes of Irish people towards sex before marriage

- A sample of women from a Catholic church would not be representative of the population.

Example:

- University student sample – are they representative of the population of the UK? No: age, economic status, etc. differ from the population.


Sampling

Methods to select a study population (sample) that approximates a specific larger population that you aim to make inferences about.


Sampling types

Probability sampling: using methods to ensure that the sample is representative of the population we want to make inferences about.

Random sampling: every member of the population has an equal chance of being selected; the sample is not structured to approximate the population.

Stratified random sampling: select a sample that matches important population characteristics.

Quota sampling: a method of gathering representative data from a group by recruiting until a preset quota for each subgroup is filled.

Non-probability sampling: more widely used. Based on convenience. External validity may be reduced.

Opportunity sampling: The 'take whoever comes along' approach. Certain topics will attract certain people. Volunteers are more likely to be: female, more highly educated, of high SES, of higher intelligence, more curious.
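The difference between simple random and stratified random sampling can be sketched as follows. The population, the `age_group` field and both function names are hypothetical, made up for the example. The stratified version assumes proportional allocation with rounding, so the total can be off by one or two for awkward proportions.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Every member has the same chance of selection; no structure imposed."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_sample(population, key, n, seed=None):
    """Sample so each stratum's share of the sample matches its share of
    the population (proportional allocation, rounded to whole people)."""
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(key(person), []).append(person)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, k))   # random within each stratum
    return sample

# Hypothetical population: 30 people aged 18-29 and 70 aged 30+.
population = (
    [{"id": i, "age_group": "18-29"} for i in range(30)]
    + [{"id": i, "age_group": "30+"} for i in range(30, 100)]
)

random_sample = simple_random_sample(population, 10, seed=0)
strat_sample = stratified_sample(population, lambda p: p["age_group"], 10, seed=0)
print(sum(p["age_group"] == "18-29" for p in random_sample), "young in random sample")
print(sum(p["age_group"] == "18-29" for p in strat_sample), "young in stratified sample")
```

The stratified sample always contains 3 people aged 18-29 out of 10 here, mirroring the 30% in the population, whereas the simple random sample only matches that proportion on average.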


Types of data

Studies yield results, or data

- Data are usually numbers or category counts – quantitative

- Some data are words or interpretation – qualitative

- Appropriate analysis techniques (e.g., stats) depend on study design and data characteristics


Types

Quantitative data (measurement data): data obtained by assigning numbers to events or objects in a systematic way (measurement)

Categorical data:

- Unordered (nominal) versus ordered (ordinal)

- E.g. counts or numbers of observations in each of a number of categories

Degree        Male    Female
Psychology    31      74
Maths         268     71
Law           24      27

Quantitative data – levels of measurement

Categorical (discrete values)

Nominal

- The use of numbers as names for the category that an object or event belongs to.

- E.g., in a study, we may decide to label females as 1 and males as 2.

- These numbers have nothing to do with the object itself; males are not twice the value of females, and a footballer wearing number 2 is not half as good as one wearing number 4.

- Numbers are used only to distinguish between objects where all we know is that they are different.

Ordinal

- The size of the number does represent something about the quantity of whatever is being measured.

- However, neither the size of the difference between numbers nor the ratio of the numbers is informative.

- Numbers are used only to place objects in order.

- E.g., the ranking of degree results: a first is ‘better’ than a third, but it is impossible to quantify ‘how much better’ the first-class degree is.


Interval (continuous data)

- The differences between the numbers are equal, and that indicates that there are equal differences in whatever is being measured.

- Ratios between differences are meaningful.

- Ratios between values are not meaningful, as the zero point is arbitrary.

- E.g., degrees centigrade (Celsius). We know that the difference between 0 and 10 degrees is the same as the difference between 10 and 20 degrees. The lowest value can be less than 0: the range depends on the measurement type, and the zero point can be arbitrary.

- The Kelvin scale would be ratio data, as 0 K is a true zero point (0 K = -273.15 °C).

- Equal intervals between objects represent equal differences; the differences are meaningful.
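A short numerical check of the centigrade versus Kelvin point, using two made-up temperatures (10 and 20 degrees). The helper name `celsius_to_kelvin` is just for this sketch.

```python
def celsius_to_kelvin(c):
    """Shift the zero point: 0 K corresponds to -273.15 degrees C."""
    return c + 273.15

a_c, b_c = 10.0, 20.0                       # illustrative temperatures
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences agree on both scales (up to floating-point rounding).
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios do not agree: the Celsius ratio depends on an arbitrary zero.
ratio_c = b_c / a_c                         # 2.0, but not physically meaningful
ratio_k = b_k / a_k                         # about 1.035

print(f"difference: {diff_c:.2f} C vs {diff_k:.2f} K")
print(f"ratio:      {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (Kelvin)")
```

So 20 degrees centigrade is 10 degrees warmer than 10 degrees centigrade, but it is not 'twice as hot'; only the Kelvin ratio reflects the quantity being measured.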


Ratio (continuous data)

- If something is measured at 100 cm, it is ten times as long as something that is 10 cm.

- There is also a true zero point, where there is complete absence of whatever is being measured.

- Measurements of length, time, height, weight, etc. are true ratio scales.

- A ratio scale is one with a true zero point, where ratios between numbers are meaningful.


Frequency distributions

- Counts of values within the categories of a variable

- Can be shown as a table or in a bar chart

- Example: test scores of a sample of 10 students


Frequency, percentage and probability

- Absolute counts of values can be turned into probabilities by dividing by the total (n)

- Percentage = probability * 100
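As a worked illustration of turning counts into probabilities and percentages, here is a small sketch. The ten test scores are invented for the example, not data from the lecture.

```python
from collections import Counter

# Hypothetical test scores for a sample of n = 10 students.
scores = [4, 5, 5, 6, 6, 6, 7, 7, 8, 9]

counts = Counter(scores)    # absolute frequency of each score
n = len(scores)

table = {}
print("score  count  probability  percentage")
for value in sorted(counts):
    probability = counts[value] / n      # count divided by total (n)
    percentage = probability * 100       # percentage = probability * 100
    table[value] = (counts[value], probability, percentage)
    print(f"{value:>5}  {counts[value]:>5}  {probability:>11.2f}  {percentage:>9.0f}%")
```

The probabilities across all values sum to 1, and the percentages sum to 100.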


Histograms

A special bar chart for continuous variables, used to see the shape of the data distribution:

- Number of peaks

- Mode (most frequent measurement in a variable, next lecture)

- Spread of the data (variance)

- Symmetry:

- Is the main peak roughly in the middle?

- Do the positive and negative tails look similar?

- Related to skew


Histogram artefacts: bin size

- Number of categories (bins) determines the number of bars in the histogram

- The stats program sets a default number of bins, but you can override it

- Using more bins gives a finer grain, so deviations from symmetry usually become more apparent.
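The effect of bin count can be seen by binning the same data several ways. This is a minimal sketch with invented data and a hypothetical helper, `histogram_counts`, that uses equal-width bins. It assumes the data are not all the same value, since the bin width would then be zero.

```python
def histogram_counts(data, n_bins):
    """Count how many values fall into each of n_bins equal-width bins
    spanning the range from min(data) to max(data)."""
    low, high = min(data), max(data)
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for x in data:
        # The maximum value goes into the last bin rather than a new one.
        index = min(int((x - low) / width), n_bins - 1)
        counts[index] += 1
    return counts

# Hypothetical measurements with one isolated high value.
data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 9]

for n_bins in (2, 4, 8):
    counts = histogram_counts(data, n_bins)
    print(f"{n_bins} bins:")
    for c in counts:
        print("  " + "#" * c)   # one text bar per bin
```

With 2 bins the data look like two broad blocks; with 8 bins the empty gap before the isolated value at 9 becomes visible.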


Special rules for creating histograms

- Where categories are continuous (e.g., increasing continuous values), bars should be joined

- Bar height represents frequency or %

- Bar width represents width of a category, so equal categories have equal widths.

- When comparing two distributions with different sample sizes, use %, not n, to make the scales directly comparable.


More general rules for creating graphs

- Use the correct type of graph for the type of data

- E.g., bar charts for categorical data and histograms for continuous variables.

- Label the graph clearly and include units of measurement – title, axes

- Make the plot neat and clear – use same colours for same categories across graphs, include a key, don’t have too many categories.

- Include sources for external data

- Have a (numbered) caption that explains the graph concisely.

