Using the diff() Function in R

Calculating the difference between consecutive elements of a sequence is a fundamental operation. It is easy to do by hand for a few values, but for large data sets, working through the elements one by one is not efficient.

In R, the diff() function computes the difference between consecutive elements of the vector passed to it and returns the result as a vector. For example:

x <- c(5,3,4,3,8,9,4,8,1)
diff(x)
[1] -2  1 -1  5  1 -5  4 -7

As the example shows, diff() returns each element minus the one before it (3 - 5 = -2, 4 - 3 = 1, ...). Notice that the result has one element fewer than the input: a vector of n elements contains only n - 1 consecutive pairs, so there is no difference to report for the first element.
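As a minimal sketch of what the default call does, the same values can be reproduced with plain vector indexing. The names n and manual are introduced here only for illustration; they are not part of diff().

```r
x <- c(5, 3, 4, 3, 8, 9, 4, 8, 1)
n <- length(x)

# Subtract each element from the one that follows it:
# x[2] - x[1], x[3] - x[2], ..., x[n] - x[n - 1]
manual <- x[2:n] - x[1:(n - 1)]

manual                        # -2  1 -1  5  1 -5  4 -7
all.equal(manual, diff(x))    # TRUE
length(diff(x)) == n - 1      # TRUE
```

Writing the subtraction out this way makes it clear why the result is one element shorter: there are only n - 1 neighbouring pairs to subtract.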

The diff() function also accepts two optional parameters: lag and differences.

The lag parameter specifies the gap between the elements whose difference is calculated. It is 1 by default. With lag = 2, diff() subtracts the first element from the third, the second from the fourth, and so on, so the result is two elements shorter than the input. The following example makes this concrete:

diff(x, lag = 2)
[1] -1  0  4  6 -4 -1 -3
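The lagged difference can also be sketched by hand, which shows where each value comes from. The names n, k, and lagged are illustrative only.

```r
x <- c(5, 3, 4, 3, 8, 9, 4, 8, 1)
n <- length(x)
k <- 2  # the lag

# Pair each element with the one k positions before it:
# x[3] - x[1], x[4] - x[2], ..., x[n] - x[n - k]
lagged <- x[(k + 1):n] - x[1:(n - k)]

lagged                                  # -1  0  4  6 -4 -1 -3
all.equal(lagged, diff(x, lag = k))     # TRUE
length(lagged) == n - k                 # TRUE
```

In general, a lag of k removes k elements from the result, because the first k elements have nothing k positions before them.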

The differences parameter specifies the order of differencing. It is 1 by default. With differences = 2, diff() first computes the differences between consecutive elements of the vector, then computes the differences between consecutive elements of that result. The following code snippet shows this:

diff(x)
[1] -2  1 -1  5  1 -5  4 -7
diff(x, differences = 2)
[1]   3  -2   6  -4  -6   9 -11
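To check that second-order differencing is simply diff() applied twice, the two steps can be written out separately. The names first_order, second_order, and squares are illustrative only.

```r
x <- c(5, 3, 4, 3, 8, 9, 4, 8, 1)

first_order  <- diff(x)             # -2  1 -1  5  1 -5  4 -7
second_order <- diff(first_order)   #  3 -2  6 -4 -6  9 -11

all.equal(second_order, diff(x, differences = 2))   # TRUE

# A familiar use: the second differences of a quadratic sequence are constant.
squares <- (1:6)^2                  # 1 4 9 16 25 36
diff(squares, differences = 2)      # 2 2 2 2
```

Each round of differencing shortens the vector by one element, so the second-order result has two elements fewer than x.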

Both parameters can be set at once. In the code below, lag and differences are both set to 2, so the lag-2 difference is applied twice.

diff(x, differences = 2, lag = 2)
[1]  5  6 -8 -7  1
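The combined call can be broken into its two passes as a rough check. The names step1, step2, and too_short are illustrative only.

```r
x <- c(5, 3, 4, 3, 8, 9, 4, 8, 1)

step1 <- diff(x, lag = 2)        # -1  0  4  6 -4 -1 -3
step2 <- diff(step1, lag = 2)    #  5  6 -8 -7  1

all.equal(step2, diff(x, lag = 2, differences = 2))   # TRUE

# Each pass drops `lag` elements, so the result has
# length(x) - lag * differences elements.
length(step2) == length(x) - 2 * 2                    # TRUE

# When lag * differences is at least length(x), nothing is left to compute
# and diff() returns an empty vector.
too_short <- diff(x, lag = 5, differences = 2)
length(too_short)                                     # 0
```

Keeping the length rule in mind helps when choosing lag and differences for short vectors, since a large product of the two can leave no values at all.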
