Experimental design, types of data, sampling, displaying data: histograms
- Created by: Sam_dearnx
- Created on: 24-01-17 21:37
Basic experimental designs
Within and between subjects designs
- When we design an experiment, we must decide whether to use a within subjects design or a between subjects design.
Between subjects: Participants only take part in one condition of the study and the results are compared between the two groups.
Within subjects: Participants take part in both conditions, e.g., on consecutive days, and their results in the two conditions are compared.
Drawbacks in between subjects design
- More participants are needed with a between subjects (BS) design to get enough power for meaningful stats
- Differences in performance may be due to pre-existing differences between the groups (baseline differences) rather than the IV; random allocation to groups helps guard against this
- There may be a problem in interpreting your results if the 2 groups are treated ‘differently’. The first group may react differently.
Drawbacks in within subjects design
Order effects: practice learning effects, boredom and fatigue.
Counter-balancing: Half of the group complete condition A first and half complete condition B first. This way practice/fatigue effects will be 'cancelled out' and can be assessed statistically if the order is recorded.
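The counter-balancing idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the participant IDs, the function name `counterbalance` and the fixed seed are hypothetical and not part of the original notes.

```python
import random

def counterbalance(participants, seed=0):
    """Randomly split participants into two condition orders: A then B, or B then A."""
    rng = random.Random(seed)      # fixed seed so the allocation is reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)          # random order, so allocation is not systematic
    half = len(shuffled) // 2
    orders = {}
    for person in shuffled[:half]:
        orders[person] = ("A", "B")   # condition A first
    for person in shuffled[half:]:
        orders[person] = ("B", "A")   # condition B first
    return orders

participants = [f"P{i}" for i in range(1, 9)]   # hypothetical participant IDs
orders = counterbalance(participants)
n_a_first = sum(1 for order in orders.values() if order[0] == "A")
print(n_a_first, len(orders) - n_a_first)       # 4 4
```

Recording each participant's order (here the `orders` dictionary) is what makes it possible to test for order effects later.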
Sampling techniques
- ‘sample’ – individuals participating in your study
- ‘Population’ – the wider groups about which you wish to learn
- Important for external validity
- If the sample is not ‘representative’ of the population, we cannot generalise.
Example:
- Attitudes of Irish people to sex before marriage
- A sample of women from a Catholic church would not be representative of the population.
Example:
- University student sample – are they representative of the population of the UK? No: age, economic status etc. differ from the population.
Sampling
Methods to select a study population (sample) that approximates a specific larger population that you aim to make inferences about.
Sampling types
Probability sampling: using methods to ensure that sample is representative of the population we want to infer about.
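Random: Every member of the population has an equal chance of being selected; the sample is not structured to approximate the population.
Stratified random sampling: Select a sample that matches important population characteristics.
Quota sampling: A sampling method of gathering representative data from a group by filling set quotas for key subgroups.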
Non-probability sampling: more widely used. Based on convenience. External validity may be reduced.
Opportunity sampling: The 'take whoever comes along' approach. Certain topics will attract certain people. Volunteers more likely to be: female, higher educated, high SES, higher intelligence, more curious.
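As a rough sketch of the difference between simple random and stratified random sampling, the Python below draws both kinds of sample from a made-up population. The population, the stratum labels and the function names are assumptions for illustration only.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Every member has the same chance of selection; no structure is imposed."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_sample(population, key, n, seed=0):
    """Sample each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(key(person), []).append(person)
    sample = []
    for members in strata.values():
        # Rounding can make the total differ slightly from n for awkward proportions.
        k = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 students and 40 non-students.
population = [("student", i) for i in range(60)] + [("non-student", i) for i in range(40)]
sample = stratified_sample(population, key=lambda p: p[0], n=10)
n_students = sum(1 for p in sample if p[0] == "student")
print(n_students, len(sample) - n_students)   # 6 4
```

The stratified sample always reproduces the 60/40 split, whereas a simple random sample of 10 only matches it on average.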
Types of data
Studies yield results or data
- Data are usually number/category counts – quantitative
- Some words, interpretation – Qualitative
- Appropriate analysis techniques (e.g., stats) depend on study design and data characteristics
Types
Quantitative data (measurement data): data obtained by assigning numbers to events or objects in a systematic way (measurement)
Categorical data:
- Unordered (nominal) versus ordered (ordinal)
- E.g. counts or numbers of observations in each of a number of categories
Degree       Male   Female
Psychology   31     74
Maths        268    71
Law          24     27
Quant data – levels of measurement
Categorical (discrete values)
Nominal
- The use of numbers as names for the category that an object or event belongs to.
- E.g., in a study, we may decide to label females as 1, and males as 2
- These numbers have nothing to do with the object itself; boys are not twice the value of girls, and a footballer wearing number 2 is not half as good as one wearing number 4.
- Numbers are used only to distinguish between objects where all we know is that they are different
Ordinal
- The size of the number does represent something about the quantity of whatever is being measured.
- However, neither the size of the difference between numbers, nor the ratio of the numbers, is informative.
- Numbers are used only to place objects in order.
- E.g, the ranking of degree results: a first is ‘better’ than a third, but it is impossible to quantify ‘how much better’ the first class degree is.
Continuous data
Interval
- The differences between the numbers are equal, and that indicates that there are equal differences in whatever is being measured.
- Ratios between differences are meaningful
- Ratios between values are not meaningful, as the zero point is arbitrary.
- E.g., degrees centigrade. We know that the difference between 0 and 10 degrees is the same as the difference between 10 and 20 degrees. Lowest value can be less than 0 – range depends on measurement type and zero point can be arbitrary.
- The Kelvin scale would be ratio data, as 0 K is a true zero point (0 K = -273.15 °C)
- Equal intervals between objects represent equal differences; the differences are meaningful.
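A small worked example may make the interval/ratio distinction concrete. The sketch below converts the Celsius values from the notes to Kelvin and compares differences and ratios on both scales; the variable and function names are illustrative.

```python
import math

def c_to_k(celsius):
    """Convert degrees Celsius to Kelvin (0 K = -273.15 degrees Celsius)."""
    return celsius + 273.15

a_c, b_c, c_c = 0.0, 10.0, 20.0                       # degrees Celsius
a_k, b_k, c_k = (c_to_k(t) for t in (a_c, b_c, c_c))  # same temperatures in Kelvin

# Differences are equal on both scales: interval information survives the conversion.
diff_c = (b_c - a_c, c_c - b_c)
diff_k = (b_k - a_k, c_k - b_k)
print(math.isclose(diff_c[0], diff_c[1]), math.isclose(diff_k[0], diff_k[1]))  # True True

# Ratios of values differ between the scales, so "twice as hot" is not meaningful in Celsius.
ratio_c = c_c / b_c
ratio_k = c_k / b_k
print(ratio_c, round(ratio_k, 3))   # 2.0 1.035
```

20 °C looks like "twice" 10 °C only because of where the Celsius zero happens to sit; on the Kelvin scale, which has a true zero, the same two temperatures differ by about 3.5%.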
Ratio
- If something is measured at 100cm, it is ten times as long as something that is 10cm.
- There is also a true zero point, where there is complete absence of whatever is being measured.
- Measurements of length, time, height, and weight etc, are true ratio scales.
- A ratio scale is one with a true zero point, where ratios between numbers are meaningful.
Frequency distributions
- Counts of values within each category of a variable
- Can be shown as table or in a bar chart
- Example: test scores of a sample of 10 students
Frequency, percentage and probability
- Absolute counts of values can be turned into probabilities by dividing by the total (n)
- Percentage = probability * 100
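The count, probability and percentage calculation can be sketched as follows. The ten scores are made up for illustration; they are not the data from the original example.

```python
from collections import Counter

scores = [5, 6, 6, 7, 7, 7, 8, 8, 9, 10]   # made-up test scores for 10 students
n = len(scores)
counts = Counter(scores)                    # frequency of each score

table = {}
for value in sorted(counts):
    probability = counts[value] / n         # count divided by total n
    percentage = probability * 100          # percentage = probability * 100
    table[value] = (counts[value], probability, percentage)
    print(f"score {value}: count={counts[value]}, p={probability:.1f}, {percentage:.0f}%")
```

The probabilities across all categories sum to 1, and the percentages to 100, which is a quick check that no observation has been dropped.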
Histograms
Special bar chart for continuous variables to see the shape of data distribution
- Number of peaks
- Mode (most frequent measurement in a variable, next lecture)
- Spread of the data (variance)
- Symmetry:
- Is the main peak roughly in the middle?
- Do the positive and negative tails look similar?
- Related to skew
Histogram artefacts: bin size
- Number of categories (bins) determines the number of bars in the histogram
- Stats program sets defaults but you can override
- Using more bins gives a finer-grained picture, so deviations from symmetry usually become more apparent.
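To see how the number of bins changes the picture, here is a minimal sketch that bins the same made-up measurements twice. The helper `histogram_counts` is hypothetical and assumes equal-width bins and at least two distinct values; stats programs may place bin edges differently.

```python
def histogram_counts(values, n_bins):
    """Count values in n_bins equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins               # assumes hi > lo
    counts = [0] * n_bins
    for v in values:
        # The maximum value would fall just past the last bin, so clamp it into it.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts

values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]      # made-up measurements
coarse = histogram_counts(values, 2)
fine = histogram_counts(values, 8)
print(coarse)   # [8, 2]
print(fine)     # [1, 2, 3, 2, 1, 0, 0, 1]
```

With 2 bins the data just look lopsided; with 8 bins the single peak, the gap and the isolated value in the right tail all become visible.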
Special rules for creating histograms
- Where categories are continuous (e.g., increasing continuous values), bars should be joined
- Bar height represents frequency or %
- Bar width represents width of a category, so equal categories have equal widths.
- When comparing two distributions with different sample sizes (n), use % rather than counts so that the scales are directly comparable.
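The reason for plotting % rather than n can be shown with two made-up sets of bin counts that have the same shape but very different sample sizes. The counts and the helper name are assumptions for illustration.

```python
def to_percentages(counts):
    """Convert bin counts to percentages of the sample total."""
    total = sum(counts)
    return [100 * c / total for c in counts]

group_a = [5, 10, 5]       # made-up bin counts, n = 20
group_b = [50, 100, 50]    # made-up bin counts, n = 200

pct_a = to_percentages(group_a)
pct_b = to_percentages(group_b)
print(pct_a)   # [25.0, 50.0, 25.0]
print(pct_b)   # [25.0, 50.0, 25.0]
```

Plotted as raw counts, group B's bars would be ten times taller than group A's; plotted as percentages, the two distributions are seen to be identical in shape.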
More general rules for creating graphs
- Use the correct type of graph for the type of data
- E.g., bar charts for categorical data and histograms for continuous variables.
- Label the graph clearly and include units of measurement – title, axes
- Make the plot neat and clear – use same colours for same categories across graphs, include a key, don’t have too many categories.
- Include sources for external data
- Have a (numbered) caption that explains the graph concisely.