Design of Experiments – Full Factorial Designs

December 1st, 2009

In designs with multiple factors, each with a discrete set of levels, the full enumeration of all combinations of factor levels is referred to as a full factorial design. As the number of factors increases, and potentially the number of levels per factor with it, the total number of experimental runs grows multiplicatively: k factors at two levels each already require 2^k runs.
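
As a rough illustration of how quickly the run count grows, the sketch below enumerates a small full factorial design with base R's `expand.grid()`. The factor names and levels are invented for the example and are not taken from the original post.

```r
# Hypothetical factors and levels, chosen only for illustration.
design <- expand.grid(
  temperature = c(150, 175, 200),
  pressure    = c("low", "high"),
  catalyst    = c("A", "B"),
  stringsAsFactors = FALSE
)

# One row per combination of factor levels: 3 * 2 * 2 = 12 runs.
n_runs <- nrow(design)

# Number of runs needed for k two-level factors, for k = 2..10.
k <- 2:10
growth <- data.frame(factors = k, runs = 2^k)
```

Printing `design` lists every run, and `growth` shows the doubling with each additional two-level factor.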

Design of Experiments – Optimal Designs

November 29th, 2009

When designing an experiment it is not always possible to generate a regular, balanced design such as a full or fractional factorial design. There are usually restrictions on the total number of experiments that can be undertaken, or constraints on the factor settings, either individually or in combination with each other.

Design of Experiments – Power Calculations

November 18th, 2009

Prior to conducting an experiment, researchers will often undertake power calculations to determine the sample size required to detect a scientifically meaningful effect with sufficient power. In R there are functions to calculate either the minimum sample size needed for a test to reach a specified power or the power of a test for a fixed sample size.
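
One such function in base R is `power.t.test()` from the stats package. The sketch below shows both directions of the calculation for a two-sample t test; the effect size, standard deviation and sample size are assumed values picked for illustration.

```r
# Assumed inputs: a true difference in means of 1 unit, a standard
# deviation of 2, and a two-sided two-sample t test at the 5% level.

# Leave n unspecified to solve for the sample size per group
# needed to reach 80% power.
size <- power.t.test(delta = 1, sd = 2, sig.level = 0.05, power = 0.80)
n_per_group <- ceiling(size$n)

# Leave power unspecified to solve for the power achieved
# with a fixed sample size of 30 per group.
achieved <- power.t.test(n = 30, delta = 1, sd = 2, sig.level = 0.05)
achieved$power
```

Exactly one of `n`, `delta`, `sd`, `power` and `sig.level` is left unspecified in each call, and the function solves for that one.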

Graph Examples from Visualizing Data by William Cleveland

November 12th, 2009

The trellis graphics approach was pioneered by various statistical researchers and the ideas are used extensively in the book “Visualizing Data” by William Cleveland. There are various resources on the website for trellis graphics, including S code for creating the majority of the graphs that appear in the book. Inspired by efforts on the Learning R blog to recreate the examples from Deepayan Sarkar’s book on lattice using ggplot2, I have decided to undertake a similar exercise based on the scripts that have been made available for creating the graphs from the book.

Using Faceting in ggplot2 to create Trellis-like Plots

November 9th, 2009

One of the main strengths of the Trellis graphics paradigm is the use of panelling to divide data into subsets to investigate whether patterns are consistent as the conditioning variables change. In the ggplot2 package the term for specifying these separate panels is faceting, which can be used to create similar displays.
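
A minimal sketch of faceting, assuming the ggplot2 package is installed and using the `mpg` dataset that ships with it (the dataset and variables are chosen here for illustration, not taken from the original post):

```r
library(ggplot2)

# mpg ships with ggplot2: displ is engine displacement (litres),
# hwy is highway fuel economy (miles per gallon).
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_wrap(~ class)  # one panel per vehicle class

print(p)
```

Each panel shows the same scatter plot restricted to one level of the conditioning variable, which is the ggplot2 counterpart of a Trellis conditioning formula.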

Creating scatter plots using ggplot2

November 6th, 2009

The ggplot2 package can be used as an alternative to lattice for producing high-quality graphics in R. Inspired by the grammar of graphics, the package provides a framework and a hopefully simple interface for producing graphs.

Investigating the relationship between two variables using a scatter plot

October 13th, 2009

A scatter plot gives a visual representation of the relationship between two variables and provides some insight into their correlation and into possible models to describe the relationship. There are different ways to produce scatter plots in R, making use of the base graphics system, the lattice graphics library, ggplot2 or other packages.
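
As a small sketch of the base graphics route, the example below uses the built-in `cars` dataset. The dataset and the straight-line model are assumptions made for illustration.

```r
# cars is a built-in dataset: speed (mph) and stopping distance (ft).
plot(dist ~ speed, data = cars,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")

# Overlay a least-squares line as one candidate model for the relationship.
fit <- lm(dist ~ speed, data = cars)
abline(fit, col = "red")

# Pearson correlation between the two variables.
r <- cor(cars$speed, cars$dist)
```

The same formula interface, `dist ~ speed`, carries over to `xyplot()` in lattice.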

Book Review – Interactive and Dynamic Graphics for Data Analysis: With R and GGobi by Dianne Cook and Deborah F. Swayne (Springer 2007)

October 8th, 2009

The two authors cover interactive graphics and their role in data analysis, working with the GGobi software package, an open source project for data visualisation, in addition to the R statistical environment.

Cleveland’s Dot Plots for Plotting Data

September 26th, 2009

The dot plot was introduced by Cleveland to provide a powerful visual display for comparing groups of data, and a function for this type of display is available in the lattice library for R. Data is divided into groups and dots indicate the value of a particular variable. The groups are arranged either horizontally or vertically to allow a visual comparison of their distributions.
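
A short sketch of the lattice function, `dotplot()`, using the `barley` dataset that ships with lattice (the choice of dataset and grouping is an assumption made for illustration):

```r
library(lattice)

# barley: yields of 10 barley varieties grown at 6 sites in 1931 and 1932.
p <- dotplot(variety ~ yield | site, data = barley,
             groups = year,     # different symbol for each year
             auto.key = TRUE,   # add a legend for the groups
             xlab = "Yield (bushels/acre)")

print(p)
```

Each panel holds one site, with one row of dots per variety, so yields can be compared across varieties, years and sites at a glance.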

Solving Linear Programming Problems with the GNU Linear Programming Kit

September 20th, 2009

The GNU Linear Programming Kit (GLPK) can be accessed from R and used for solving large-scale linear programming (LP), mixed integer programming (MIP) and other related problems. The Rglpk package provides a simple interface to GLPK.
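
A minimal sketch of the interface, assuming the Rglpk package is installed. The small problem below is made up for illustration and is not from the original post.

```r
library(Rglpk)

# Maximise 3x + 2y subject to:
#   x +  y <= 4
#   x + 3y <= 6
#   x      <= 3
# Variables use the default bounds [0, Inf).
obj <- c(3, 2)
mat <- matrix(c(1, 1,
                1, 3,
                1, 0), nrow = 3, byrow = TRUE)
dir <- c("<=", "<=", "<=")
rhs <- c(4, 6, 3)

result <- Rglpk_solve_LP(obj, mat, dir, rhs, max = TRUE)

result$status    # 0 indicates that an optimal solution was found
result$optimum   # objective value at the optimum
result$solution  # values of x and y at the optimum
```

Integer or binary variables can be requested through the `types` argument, which is how the same call covers mixed integer problems.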