Univariate statistics

Univariate analysis involves examining each explanatory variable in the dataset separately. Most traditional methods of analysing data can be used for univariate analysis, including analysis of variance (ANOVA), tests for goodness-of-fit, and correlation and regression. In contrast, the aim of multivariate analyses is to find patterns of relationships in a dataset between two or more variables simultaneously. Some of the tests listed below can be described as univariate or multivariate, depending on the number of variables being examined. The patterns identified using univariate or multivariate statistics can then be used to generate additional hypotheses, which may be tested with further experimental or field survey work.

Learning materials for Design & Analysis of Biological Studies (SBI209) provide an overview of the techniques mentioned in this section. You can refer to these at (http://www.cdu.edu.au/faculties/science/sbes/resources/SBI209/S209L3-Topics8-11/index.htm).

Goodness-of-fit tests
Goodness-of-fit tests are used to compare observed frequency distributions of data with theoretical or expected distributions. Frequency distributions analysed using these tests include the distribution of population age classes, the distribution of phenotypes within a population, or the distribution of individuals in space within or between habitats. You may have heard of two widely used goodness-of-fit tests: the Chi-squared test and the Kolmogorov-Smirnov test.

In most cases, to conduct a goodness-of-fit test the data must be expressed in counts or frequencies (not percentage data).
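
As an illustration of how observed counts are compared with expected counts, here is a minimal sketch of the Chi-squared statistic in Python. The age-class counts and the expected proportions are hypothetical values invented for this example, and the function name is our own. The 5% critical value for 2 degrees of freedom (5.991) is taken from standard Chi-squared tables.

```python
# Hypothetical counts of individuals in three age classes (illustrative only).
observed = [18, 55, 27]
# Expected proportions under the hypothesis being tested (assumed for this sketch).
expected_props = [0.25, 0.50, 0.25]


def chi_squared_statistic(observed, expected_props):
    """Return the Chi-squared statistic and its degrees of freedom."""
    total = sum(observed)
    # Expected counts are the expected proportions scaled to the sample size.
    expected = [p * total for p in expected_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df


stat, df = chi_squared_statistic(observed, expected_props)

# 5% critical value of the Chi-squared distribution with 2 degrees of freedom.
CRITICAL_5PCT_DF2 = 5.991

print(f"chi-squared = {stat:.2f}, df = {df}")
if stat > CRITICAL_5PCT_DF2:
    print("Observed counts differ from expected at the 5% level")
else:
    print("No evidence that observed counts differ from expected")
```

Note that the calculation works on the raw counts. Converting the counts to percentages first would discard the sample size and change the value of the statistic.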

Analysis of variance (ANOVA)
ANOVA is used to test hypotheses about differences in means among groups. One- or multi-factor ANOVA allows differences in mean values of response variables (e.g. animal abundance or species richness) to be compared for different factors (e.g. vegetation type, season or site). For example, mean species richness (the response or dependent variable) could be compared among different vegetation types (the explanatory or independent variable), e.g. 'woodland', 'grassland' and 'rainforest'. These different vegetation categories are known as 'treatments' in the ANOVA. In comparing the means between treatments, the amount of variation in the distribution of the data about each mean is taken into account. In addition to the relatively straightforward ANOVA described above, more complex ANOVA designs also exist. For example, 'Repeated measures' ANOVA can be used to analyse data where samples have been drawn from the same plot on more than one occasion (e.g. monitoring change in a variable over time).

To conduct an ANOVA, the response variable data must be continuous, and the explanatory variable data must be categorical.

NB. The t-test is calculated in the same way as ANOVA; however, it can only test for a difference between two means.
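
The following is a minimal sketch of a one-way ANOVA calculation in Python, using the vegetation-type example above. The species richness values are hypothetical and the function names are our own. The F statistic compares the variation between treatment means with the variation within treatments. The sketch also computes an equal-variance t statistic for two treatments to show the link with ANOVA: with only two groups, the square of t equals F.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the treatment means about the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations about their own treatment mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def pooled_t(a, b):
    """Two-sample t statistic, assuming equal variances in both groups."""
    ss = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    pooled_var = ss / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / len(a) + 1 / len(b))) ** 0.5


# Hypothetical species richness per plot (illustrative values only).
woodland = [12, 15, 11, 14]
grassland = [8, 9, 7, 10]
rainforest = [20, 22, 19, 23]

f_three, df_b, df_w = one_way_anova_f([woodland, grassland, rainforest])
print(f"Three treatments: F = {f_three:.2f} on {df_b} and {df_w} df")

# With two treatments, ANOVA and the equal-variance t-test agree: t squared equals F.
f_two, _, _ = one_way_anova_f([woodland, grassland])
t_two = pooled_t(woodland, grassland)
print(f"Two treatments: F = {f_two:.2f}, t squared = {t_two ** 2:.2f}")
```

Turning F into a probability requires the F distribution, which is normally looked up in tables or supplied by a statistics package, so that step is left out of this sketch.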

Correlation
Correlation is a measure of the strength and direction of a linear association between two or more continuous variables. For example, it can be used to explore relationships between species abundance and an environmental attribute such as rainfall.
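
As a rough illustration, the Pearson correlation coefficient (one common measure of linear association) can be computed as below. The rainfall and abundance values are hypothetical, and the function name is our own. The coefficient ranges from -1 to 1: the sign gives the direction of the association and the magnitude gives its strength.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    mx, my = mean(x), mean(y)
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical annual rainfall (mm) and species abundance at five sites.
rainfall = [400, 600, 800, 1000, 1200]
abundance = [5, 9, 12, 18, 21]

r = pearson_r(rainfall, abundance)
print(f"r = {r:.3f}")
```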

Regression
Regression analysis also explores the relationship between two or more continuous variables. In addition, it provides a mathematical description of that relationship (in the form of a regression equation), which allows predictions to be made about the response variable.

Regression analysis is related to correlation analysis, but it also measures the extent to which variation in an explanatory variable explains variation in the response variable.
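
The sketch below fits a simple least-squares regression line in Python, assuming a single explanatory variable and a straight-line relationship. The data are hypothetical and the function names are our own. It returns the regression equation (slope and intercept), uses the equation to predict the response at a new value, and reports R squared, the proportion of variation in the response that the explanatory variable accounts for.

```python
from statistics import mean


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Return the proportion of variation in y explained by the fitted line."""
    my = mean(y)
    ss_total = sum((b - my) ** 2 for b in y)
    ss_resid = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return 1 - ss_resid / ss_total


# Hypothetical annual rainfall (mm) and species abundance at five sites.
rainfall = [400, 600, 800, 1000, 1200]
abundance = [5, 9, 12, 18, 21]

slope, intercept = fit_line(rainfall, abundance)
r2 = r_squared(rainfall, abundance, slope, intercept)

# Use the regression equation to predict abundance at 900 mm of rainfall.
predicted = intercept + slope * 900

print(f"abundance = {intercept:.2f} + {slope:.4f} * rainfall")
print(f"R squared = {r2:.3f}")
print(f"predicted abundance at 900 mm = {predicted:.2f}")
```

Predictions are most trustworthy within the range of the explanatory values used to fit the line (here 400 to 1200 mm).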
