The statistical methodology of design of experiments has a long history, dating back to the work of Fisher, Yates and other researchers. One of its main motivations is to make good use of available resources and to avoid design decisions whose consequences cannot be corrected at the analysis stage of an investigation.
The methodology takes a systematic approach to investigating the causes of variation in a system of interest: the experimenter controls those factors that can be controlled, while taking account of nuisance factors that can be measured but not controlled. Some general principles and common considerations apply to experiments run in a wide variety of areas.
The following considerations should be addressed when running an experiment:
- Absence of Systematic Error: when running an experiment the aim is to obtain a correct estimate of the metric of interest, e.g. treatment effect or difference. The design selected should avoid the introduction of bias into the subsequent analysis.
- Adequate Precision: an experimental design is chosen to allow estimation and comparison of effects of interest, e.g. differences between treatments. The experiment should therefore include sufficient replication for these effects to be estimated precisely and for meaningful differences to be detected.
- Range of Validity: the range of variables considered in the experiment should cover the range of interest, so that the results can be generalised without relying on extrapolation.
- Simplicity: ideally the choice of design should be simple to implement to ensure that it can be run as intended and to reduce the chance of missing data which could impact on the analysis of the results.
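As a rough illustration of how two of these considerations might be put into practice, the sketch below builds a run sheet in base R. Replication addresses precision, and randomising the run order is one common way to guard against systematic error. The factor names, their levels and the number of replicates are all hypothetical values chosen for the example, not recommendations.

```r
# Hypothetical two-factor experiment; names, levels and n_reps are illustrative.
set.seed(2024)  # fixed seed so the randomised run order is reproducible

# Every combination of the factor levels (a full factorial in two factors).
treatments <- expand.grid(
  temperature = c("low", "high"),
  catalyst    = c("A", "B", "C"),
  stringsAsFactors = FALSE
)

n_reps <- 3  # replicates per treatment combination (assumed value)

# Replication: repeat the full set of treatment combinations n_reps times.
design <- treatments[rep(seq_len(nrow(treatments)), times = n_reps), ]
design$replicate <- rep(seq_len(n_reps), each = nrow(treatments))

# Randomisation: shuffle the rows so that run order is unrelated to treatment.
design <- design[sample(nrow(design)), ]
design$run <- seq_len(nrow(design))
rownames(design) <- NULL

# Each treatment combination should appear n_reps times.
table(design$temperature, design$catalyst)
```

A completely randomised layout like this keeps the design simple to implement; blocking or other restrictions on the randomisation would need a different construction.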
Design of experiments methodology can serve various objectives, including (a) screening a large number of factors to reduce them to a more manageable subset that can be investigated in greater detail, (b) response surface methodology to understand the behaviour of a system and (c) optimisation of a process or reduction of uncontrollable variation (or noise) in a process.
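To make the screening objective a little more concrete, the following base R sketch constructs a half fraction of a two-level design in four factors, using the standard generator D = ABC. The factor labels A to D and the coded levels -1 and +1 are placeholders. This is only one of many possible screening designs, shown under the assumption that eight runs are affordable where the full sixteen are not.

```r
# Coded levels: -1 = low, +1 = high. Factor labels A-D are placeholders.
# Start from the full factorial in the first three factors (8 runs).
base_design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))

# Generator D = ABC, which gives the defining relation I = ABCD.
half_fraction <- transform(base_design, D = A * B * C)

# 8 runs instead of the 16 needed for the full factorial in four factors.
nrow(half_fraction)

# Each factor is balanced: equal numbers of low and high settings.
colSums(half_fraction)

# Main-effect columns are mutually orthogonal: off-diagonal entries are zero.
crossprod(as.matrix(half_fraction))
```

The price of halving the run count is aliasing: with this generator each main effect is confounded with a three-factor interaction, and two-factor interactions are confounded with one another in pairs.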
Relatively few R packages are devoted to design of experiments, and those of interest are covered by the corresponding Task View on CRAN.