One of the benefits of using R for statistical analysis is that its programming language lets users define their own functions, which is particularly useful for analyses that need to be repeated. For example, a monthly output from a database may be provided in a pre-determined format, and we might be interested in running the same initial analysis on each month's data.
The function keyword is used to define a function, followed by an optional list of arguments. Unlike some programming languages, R provides a certain degree of flexibility with setting defaults for particular arguments, and the way that arguments are matched (by exact name, then by partial name, then by position) can sometimes cause unexpected behaviour. As such, it is sensible to explicitly match a value to a particular argument, e.g. data = mydata, so that the matching is done as expected.
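As a small illustration of argument matching, consider the sketch below. The function ratio and its default of 1 for the denominator are hypothetical, made up purely for this example.

```r
# Hypothetical helper, used only to illustrate argument matching
ratio = function(numerator, denominator = 1) {
  numerator / denominator
}

ratio(10, 5)                            # positional matching: 10 / 5 = 2
ratio(denominator = 5, numerator = 10)  # named matching, so order no longer matters: 2
ratio(5, numerator = 10)                # named argument is matched first, so 5 becomes the denominator: 2
ratio(10)                               # denominator falls back to its default of 1: 10
```

Naming the arguments in the call makes the intent clear even when the values are supplied in a different order from the definition.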
Consider a simple example of a function that we could write to calculate the volume of a cylinder. The cylinder itself has a radius and height, which will be the two arguments to our function. The basic definition of our function is as follows:
cylinder.volume = function(height, radius) {
}
The volume of a cylinder is pi * radius * radius * height, which we add to our function and save as an object that is returned at the end of the function. (Edited based on comments – thanks for pointing out my blunder!) By default, the value of the last expression evaluated in a function is its return value.
cylinder.volume = function(height, radius) {
  volume = pi * radius * radius * height
  volume
}
This is a very simple example of a function. If we call it with a radius of 5 units and a height of 10 units, then the answer that is returned is:
> cylinder.volume(10, 5)
[1] 785.3982
There are a number of things that we can do to the function to improve it. For example, how should the function react if the user does not specify a height and/or radius? Also what happens if a negative value is submitted to either argument?
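One possible way to handle these cases is sketched below. The name cylinder.volume.checked, the default radius of 1 and the wording of the error messages are all assumptions made for illustration; other choices, such as returning NA instead of stopping, would be equally reasonable.

```r
# Sketch only: the default radius of 1 and the error messages are illustrative assumptions
cylinder.volume.checked = function(height, radius = 1) {
  # height has no default, so fail with a clear message if it is not supplied
  if (missing(height)) {
    stop("height must be supplied")
  }
  # reject non-numeric input
  if (!is.numeric(height) || !is.numeric(radius)) {
    stop("height and radius must be numeric")
  }
  # reject negative dimensions (missing values are ignored by this check)
  if (any(height < 0, na.rm = TRUE) || any(radius < 0, na.rm = TRUE)) {
    stop("height and radius must be non-negative")
  }
  pi * radius^2 * height
}

cylinder.volume.checked(height = 10, radius = 5)
```

Calling the function with a negative height, or without a height at all, now stops with an error rather than quietly returning a meaningless volume.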
Other useful resources are provided on the Supplementary Material page.
Obviously, one way to improve the function is to use the correct formula. The volume of a cylinder is given by pi * (radius ^2) * height.
A tiny correction: the volume of a cylinder is proportional to radius^2, not radius.
You’re on a roll, Ralph, but I think that formula needs to have the radius squared…
What happens if the user specifies VECTORS for height and radius?
First way to improve the function would be to use the correct formula, no?
V = pi * radius^2 * height
Trying some simple examples of vectorisation gives the following:
> cylinder.volume(10, 5)
[1] 785.3982
> cylinder.volume(10, 6)
[1] 1130.973
then
> cylinder.volume(10, c(5,6))
[1] 785.3982 1130.9734
Using a vector for the height as well we get:
> cylinder.volume(10, 5)
[1] 785.3982
> cylinder.volume(15, 6)
[1] 1696.46
and
> cylinder.volume(c(10, 15), c(5,6))
[1] 785.3982 1696.4600
So unless I have made any further schoolboy errors with my coding it looks like the function works as expected with vectors.
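One caveat worth keeping in mind is what happens when the two vectors have different lengths. The sketch below repeats the function definition so that it runs on its own; the particular input values are just examples.

```r
# Same definition as in the post, repeated so the snippet runs on its own
cylinder.volume = function(height, radius) {
  volume = pi * radius * radius * height
  volume
}

# Equal-length vectors are combined element by element
cylinder.volume(c(10, 15), c(5, 6))

# A single height is recycled across both radii
cylinder.volume(10, c(5, 6))

# Lengths 3 and 2 do not divide evenly: R still recycles the shorter
# vector, but issues a warning about the mismatched lengths
cylinder.volume(c(10, 15, 20), c(5, 6))
```

In the last call the result has three elements, with the first radius reused for the third height, so the warning is a hint that the inputs may not line up as intended.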