Google Maps and ggmap
December 22nd, 2013
The ggmap package can be used to access maps from the Google Maps API and there are a number of examples on various statistics related blogs. These include here, here and here.
Theme Elements in ggplot2
May 3rd, 2012
Generalized Linear Models – Poisson Regression
June 26th, 2011
The Generalized Linear Model (GLM) allows us to model responses with distributions other than the Normal distribution, Normality being one of the assumptions underlying linear regression as used in many cases. When the data are counts of events (or items), a discrete distribution is usually more appropriate than approximating with a continuous distribution, especially as counts are bounded below at zero: negative counts do not make sense.
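As a rough illustration of the idea, here is a minimal sketch of a Poisson regression in base R. The data are simulated and the coefficient values (0.5 and 0.8) are arbitrary assumptions chosen for the example; they do not come from the original post.

```r
# Simulated count data (hypothetical, for illustration only)
set.seed(1)
x <- runif(100, min = 0, max = 2)
y <- rpois(100, lambda = exp(0.5 + 0.8 * x))  # counts are never negative

# Poisson GLM with the usual log link
fit <- glm(y ~ x, family = poisson(link = "log"))

# Estimated intercept and slope on the log scale
print(coef(fit))

# Fitted means are strictly positive because of the log link
print(range(fitted(fit)))
```

The log link is what keeps the fitted means above zero, which a straight-line fit to the raw counts would not guarantee.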
Plotting Time Series data using ggplot2
September 30th, 2010
There are various ways to plot data that is represented by a time series in R. The ggplot2 package has scales that can handle dates reasonably easily.
Charting the performance of cricket all-rounders – IT Botham
August 16th, 2010
Cricket is a sport that generates a large volume of performance data and corresponding debate about the relative qualities of various players over their careers and in relation to their contemporaries. The cricinfo website has an extensive database of statistics for professional cricketers that can be searched to access the information in various formats.
Summarising data using box and whisker plots
April 25th, 2010
A box and whisker plot is a type of graphical display that can be used to summarise a set of data based on the five number summary of this data. The summary statistics used to create a box and whisker plot are the median of the data, the lower and upper quartiles (25% and 75%) and the minimum and maximum values.
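The five number summary described above can be computed directly in base R. This is a small sketch on a made-up vector (the values are hypothetical, not data from the post), using fivenum() and, for comparison, quantile().

```r
# Hypothetical sample, for illustration only
x <- c(2, 4, 4, 5, 7, 9, 10, 12, 15)

# Minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(x)
print(five)

# The same five positions via quantiles (default type 7)
q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))
print(q)
```

Note that fivenum() uses Tukey's hinges, which can differ slightly from the quartiles returned by quantile() for some sample sizes, and that boxplot() by default draws points far beyond the quartiles as separate outliers rather than extending the whiskers to the extremes.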
Summarising data using scatter plots
April 18th, 2010
A scatter plot is a graph used to investigate the relationship between two variables in a data set. The x and y axes are used for the values of the two variables and a symbol on the graph represents the combination for each pair of values in the data set. This type of graph is used in many common situations and can convey a lot of useful information.
Summarising data using histograms
April 11th, 2010
The histogram is a standard type of graphic used to summarise univariate data where the range of values in the data set is divided into regions and a bar (usually vertical) is plotted in each of these regions with height proportional to the frequency of observations in that region. In some cases the proportion of data points in each region is shown instead of counts.
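To make the counts versus proportions distinction concrete, here is a minimal base R sketch. The data vector and the break points are invented for the example; hist() is called with plot = FALSE so that only the binning is computed.

```r
# Hypothetical sample and bin edges, for illustration only
x <- c(1.2, 1.9, 2.3, 2.8, 3.1, 3.4, 3.6, 4.2, 4.8, 5.5)
breaks <- 1:6

# Compute the binning without drawing the plot
h <- hist(x, breaks = breaks, plot = FALSE)

# Number of observations in each region
counts <- h$counts
print(counts)

# Proportion of observations in each region
props <- counts / sum(counts)
print(props)
```

Either vector could be used for the bar heights; the shape of the histogram is the same and only the scale of the vertical axis changes.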
Summarising data using dot plots
March 26th, 2010
A dot plot is a type of display that compares counts, frequencies, totals or other summary measures for a series of categories. The dot plot can be arranged with the categories on either the vertical or the horizontal axis of the display to allow comparison between the different categories, as well as comparison within categories where multiple symbols are used to denote, say, different years.