Vector Calculations to avoid Explicit Loops

May 23rd, 2009

The S programming language has facilities for applying a function to all the individual elements of a vector, matrix or data frame, which avoids the need for explicit loops. Explicit loops in R are often slower than the equivalent vectorised calculations and so are generally discouraged, although there will of course be some situations where they are unavoidable.
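
As a minimal sketch of the idea, the same calculation can be written with an explicit loop, with vectorised arithmetic and with sapply. The vector x and the matrix m are made-up illustration data, not taken from the original post.

```r
# Hypothetical example data
x <- c(1.5, 2.0, 3.5, 4.0, 5.5)

# Explicit loop: square each element one at a time
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# Vectorised arithmetic: the operator is applied to every element at once
squares_vec <- x^2

# sapply applies a function to each element and simplifies the result to a vector
squares_sapply <- sapply(x, function(v) v^2)

# apply works over the rows (1) or columns (2) of a matrix
m <- matrix(1:6, nrow = 2)
col_totals <- apply(m, 2, sum)
```

All three versions give the same squares; the vectorised form is usually the shortest and easiest to read.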

Transformations to Create New Variables

May 18th, 2009

There are many situations where we might be interested in creating a new variable by transforming one of the variables already in the data frame. The R programming language can be used for either simple transformations or more complicated mathematical expressions where necessary.
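
A short sketch of both approaches, assuming a small made-up data frame (the column names height_cm and weight_kg are hypothetical): a new column can be assigned directly, or several can be added at once with transform.

```r
# Hypothetical data frame for illustration
df <- data.frame(height_cm = c(150, 165, 180),
                 weight_kg = c(50, 65, 90))

# Simple transformation: assign a new column directly
df$height_m <- df$height_cm / 100

# transform adds several derived columns in one call,
# referring to existing columns by name
df <- transform(df,
                bmi = weight_kg / height_m^2,
                log_weight = log(weight_kg))
```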

Cross-tabulation of Data

May 15th, 2009

The contingency table is used to summarise data when there are factors in the data set and we are interested in counting the number of occurrences of each combination of factor levels. In R there are different ways that these types of table can be produced and manipulated as required.
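
Two of the ways to build such a table can be sketched as follows, using an invented survey data frame (gender and smoker are hypothetical factor names). The table and xtabs functions count the combinations, while addmargins and prop.table manipulate the result.

```r
# Hypothetical survey data with two factors
survey <- data.frame(
  gender = factor(c("F", "M", "F", "M", "F", "M")),
  smoker = factor(c("no", "no", "yes", "no", "no", "yes"))
)

# Count each combination of levels with table
tab <- table(survey$gender, survey$smoker)

# The same counts using the formula interface of xtabs
tab2 <- xtabs(~ gender + smoker, data = survey)

# Add row and column totals
with_totals <- addmargins(tab)

# Convert counts to proportions within each row
row_props <- prop.table(tab, margin = 1)
```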

Producing Data Summaries

May 11th, 2009

The first stage of most investigations is to produce summaries of the data to identify any unusual records and to get an overall feel for the contents of the data. This initial data analysis usually involves tabulation and plotting of data, and there are a variety of functions available in R to generate the required summaries.
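
A minimal sketch of the tabulation side, assuming a made-up data frame called records that contains one missing value and one suspiciously large value; the functions used are all part of base R.

```r
# Hypothetical records, including a missing value and a possible outlier
records <- data.frame(
  group = factor(c("a", "a", "b", "b", "b")),
  value = c(2.1, 3.4, 1.8, NA, 40.0)
)

# Column-by-column overview of the whole data frame
overall <- summary(records)

# Five-number summary plus mean (and NA count) for one variable
value_summary <- summary(records$value)

# Count the missing values explicitly
n_missing <- sum(is.na(records$value))

# Tabulate a factor and compute a mean for each group
group_counts <- table(records$group)
group_means <- tapply(records$value, records$group, mean, na.rm = TRUE)
```

Printing value_summary or group_means makes the large value in group b stand out, which is the kind of unusual record these summaries are meant to reveal.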

Working with Subsets of Data

May 8th, 2009

There are often situations where we might be interested in a subset of our complete data, and there are simple mechanisms in R for viewing and editing particular subsets of a data frame or other objects.
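
The main mechanisms can be sketched as logical indexing with square brackets, the subset function, and assignment to a selected subset. The data frame df and its columns are hypothetical.

```r
# Hypothetical data frame for illustration
df <- data.frame(
  id    = 1:6,
  site  = c("north", "south", "north", "east", "south", "north"),
  score = c(12, 7, 15, 9, 11, 4)
)

# Logical indexing: rows where site is "north", all columns
north <- df[df$site == "north", ]

# subset: rows meeting a condition, keeping only some columns
high <- subset(df, score > 10, select = c(id, score))

# Indexing by position and by column name
first_three <- df[1:3, c("id", "score")]

# Editing a subset: replace the score for the "east" site
df$score[df$site == "east"] <- 10
```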

Importing Data from other Statistical Software Packages

May 1st, 2009

There are a large number of software packages available to data analysts, and the foreign package in R has functions to read data from some of the most commonly used packages that have their own proprietary data formats. For other packages the user can often export data to a delimited text file, which can then be handled easily by R.
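
As a rough sketch, assuming the foreign package is installed, a Stata file can be read with read.dta. No real Stata file is available here, so the example first writes a small made-up data frame with write.dta to a temporary file and then reads it back.

```r
library(foreign)

# Hypothetical data, written to a temporary Stata file so that
# there is something to import
original <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))
dta_file <- tempfile(fileext = ".dta")
write.dta(original, dta_file)

# Import the Stata file into a data frame
imported <- read.dta(dta_file)
```

The same package also provides read.spss for SPSS files and read.xport for SAS transport files.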

Exporting Data from R to Text Files

April 27th, 2009

Exporting small or medium sized data sets from the R environment to text files is a straightforward task. The two most useful functions for this operation are write.csv and write.table, which export data to comma-separated values (CSV) format or to a text format in which a different character separates the columns of data.
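
A brief sketch of the two functions, writing a made-up data frame to temporary files (the file paths come from tempfile and are only for illustration):

```r
# Hypothetical data frame to export
df <- data.frame(id = 1:3, name = c("a", "b", "c"))

csv_file <- tempfile(fileext = ".csv")
tab_file <- tempfile(fileext = ".txt")

# Comma-separated output, without the row names column
write.csv(df, csv_file, row.names = FALSE)

# Tab-delimited output, without quotes around character values
write.table(df, tab_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the files back as plain text to inspect what was written
csv_lines <- readLines(csv_file)
tab_lines <- readLines(tab_file)
```

Setting row.names = FALSE is a common choice because the default row names (1, 2, 3, ...) are rarely wanted in the exported file.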

Importing Data into R from Text Files

April 25th, 2009

Reading data into a statistical software package is not always straightforward, and many different file formats are in use by different software systems. Text files are popular for sharing small or medium sized data sets, while full-blown relational databases are more appropriate for larger data sets. The R environment has functions for importing data stored in text format, and it is also possible to interact with external database systems.
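
A minimal sketch of importing a delimited text file with read.csv and read.table. The file contents are invented and written to a temporary file first so that the example is self-contained.

```r
# Create a small hypothetical comma-delimited file to read
txt_file <- tempfile(fileext = ".txt")
writeLines(c("id,height,group",
             "1,1.62,a",
             "2,1.75,b",
             "3,NA,a"),
           txt_file)

# read.csv assumes a header row and comma separators
dat <- read.csv(txt_file, stringsAsFactors = TRUE)

# read.table is the general function; the separator, header
# and missing-value code are stated explicitly
dat2 <- read.table(txt_file, header = TRUE, sep = ",", na.strings = "NA")
```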