Calculating the difference between consecutive elements of a sequence is a fundamental operation. For a handful of values this is easy to do by hand, but for large data sets, working through the values one by one is not efficient.
In R programming, the diff() function computes the difference between consecutive elements of the vector passed to it. The result is also a vector. For example:
x <- c(5,3,4,3,8,9,4,8,1)
diff(x)
[1] -2  1 -1  5  1 -5  4 -7
As you can see in the above example, the diff() function returns the difference between consecutive elements (3 - 5 = -2, 4 - 3 = 1, ...). Also, notice that the resulting vector has one element fewer than the input; this is because a vector of n elements contains only n - 1 consecutive pairs.
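To make the pairing explicit, the same result can be reproduced with plain vector indexing. This is only an illustrative sketch; the variable name manual is introduced here and is not part of the original example.

```r
x <- c(5, 3, 4, 3, 8, 9, 4, 8, 1)

# Drop the first element, drop the last element, and subtract element-wise.
# This pairs every element with the one immediately before it.
manual <- x[-1] - x[-length(x)]

manual                      # same values as diff(x)
all(manual == diff(x))      # TRUE
length(x) - length(manual)  # 1: n elements give n - 1 consecutive pairs
```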
We can also pass two optional parameters to the diff() function: lag and differences. The lag parameter specifies the gap between the elements whose difference is calculated. It is 1 by default. When lag is 2, diff() subtracts the first element from the third, the second from the fourth, and so on. The following example makes this clear:
diff(x, lag = 2)
[1] -1  0  4  6 -4 -1 -3
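The effect of lag can be checked by shifting the vector against itself. The sketch below assumes a gap of 2 stored in a hypothetical variable k; head() and tail() with a negative count drop elements from the end and the start respectively.

```r
x <- c(5, 3, 4, 3, 8, 9, 4, 8, 1)
k <- 2  # hypothetical name for the gap, used only in this sketch

# tail(x, -k) drops the first k elements, head(x, -k) drops the last k,
# so position i of the result is x[i + k] - x[i].
manual_lag <- tail(x, -k) - head(x, -k)

manual_lag                             # same values as diff(x, lag = 2)
all(manual_lag == diff(x, lag = k))    # TRUE
length(manual_lag)                     # 7, that is length(x) - k
```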
The differences parameter specifies the order of the differences. For example, if we set it to 2, diff() first calculates the differences between consecutive elements of the given vector and then calculates the differences between consecutive elements of that result. The following code snippet shows this:
diff(x)
[1] -2  1 -1  5  1 -5  4 -7
diff(x, differences = 2)
[1]   3  -2   6  -4  -6   9 -11
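A second-order difference is simply diff() applied twice, which the following sketch confirms. The names first_order and second_order are chosen here for illustration.

```r
x <- c(5, 3, 4, 3, 8, 9, 4, 8, 1)

first_order  <- diff(x)            # differences of the original vector
second_order <- diff(first_order)  # differences of those differences

all(second_order == diff(x, differences = 2))  # TRUE
length(second_order)                           # 7: each pass drops one element
```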
We can also set both parameters at once. For example, in the code below, we set lag to 2 and differences to 2.
diff(x, differences = 2, lag = 2)
[1]  5  6 -8 -7  1
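When both parameters are given, the lagged difference is applied repeatedly, once per order. The sketch below reproduces the combined call in two explicit steps; step1 and step2 are illustrative names.

```r
x <- c(5, 3, 4, 3, 8, 9, 4, 8, 1)

step1 <- diff(x, lag = 2)      # first pass with a gap of 2
step2 <- diff(step1, lag = 2)  # second pass, again with a gap of 2

all(step2 == diff(x, lag = 2, differences = 2))  # TRUE
length(step2)                                    # 5, that is length(x) - lag * differences
```

Each pass shortens the vector by lag elements, so the input must have more than lag * differences elements for the result to be non-empty.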