Pandas Rolling Apply

  1. rolling.apply Using Pandas in Python
  2. rolling.apply With Lambda
  3. rolling.apply Without Lambda

We will learn how to fix the error only length-1 arrays can be converted to Python scalars when using rolling.apply with Pandas in Python.

rolling.apply Using Pandas in Python

Python is one of the most versatile programming languages, in large part because of the thousands of modules available for it.

Most dependencies can be installed with a single command, and NumPy is one of the most essential of these modules.

The only length-1 arrays can be converted to Python scalars error appears when certain calculations from the NumPy module receive an array where a single value is expected.

Consider the following code:

from sklearn.preprocessing import StandardScaler
import pandas as pd
import numpy as np

def test(df):
    return np.mean(df)

sc=StandardScaler()

tmp=pd.DataFrame(np.random.randn(2000,2)/10000,index=pd.date_range('2001-01-01',periods=2000),columns=['A','B'])

print("Test 1: ")
print(tmp.rolling(window=5,center=False).apply(lambda x: test(x)))

print("SC_Fit Transform: ")
tmp.rolling(window=5,center=False).apply(lambda x: sc.fit_transform(x))

Output:

Test 1:
                   A         B
2001-01-01       NaN       NaN
2001-01-02       NaN       NaN
2001-01-03       NaN       NaN
2001-01-04       NaN       NaN
2001-01-05 -0.000011  0.000027
...              ...       ...
2006-06-19 -0.000058 -0.000027
2006-06-20 -0.000069  0.000022
2006-06-21 -0.000058 -0.000002
2006-06-22 -0.000024  0.000035
2006-06-23 -0.000010  0.000043

[2000 rows x 2 columns]

SC_Fit Transform:

Traceback (most recent call last):
  ...
 
only length-1 arrays can be converted to Python scalars

The only length-1 arrays can be converted to Python scalars error typically appears when an array is passed where only a single value is accepted. Many NumPy methods, and Python's own scalar conversions, accept nothing but a scalar value.

Therefore, passing a one-dimensional or multidimensional array with more than one element to such a method raises this error. Because rolling.apply expects the applied function to return a single value for each window, it is an easy place to run into it.
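As a minimal sketch of the scalar-versus-array distinction described above, the snippet below tries to convert NumPy values to a Python float. The helper name try_float is hypothetical and exists only for this illustration; it reports the exception type rather than the message, since the exact wording can differ between NumPy versions.

```python
import numpy as np

def try_float(values):
    """Return float(values), or the name of the exception it raises."""
    try:
        return float(values)
    except TypeError as exc:
        return type(exc).__name__

single = np.float64(2.5)              # a NumPy scalar converts cleanly
several = np.array([1.0, 2.0, 3.0])   # three elements cannot become one float

print(try_float(single))          # conversion succeeds
print(try_float(several))         # conversion fails with a TypeError
print(try_float(several.mean()))  # reducing the array first makes it a scalar again
```

Reducing the array to one value (here with mean) before the conversion is the same idea that makes the test function work with rolling.apply.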

The function test was included to confirm that a function which reduces each window to a single value, such as a mean or a standard deviation, works with rolling.apply, and it does.

rolling.apply With Lambda

Consider the following code:

from sklearn.preprocessing import StandardScaler
import pandas as pd
import numpy as np

def test(df):
    return np.mean(df)

sc=StandardScaler()

tmp=pd.DataFrame(np.random.randn(2000,2)/10000,index=pd.date_range('2001-01-01',periods=2000),columns=['A','B'])

print("Test 1: ")
print(tmp.rolling(window=5,center=False).apply(lambda x: test(x)))

print("SC_Fit: ")
print(tmp.rolling(window=5,center=False).apply(lambda x: (x.iloc[-1] - x.mean()) / x.std(ddof=1)))

Output:

Test 1:
                   A         B
2001-01-01       NaN       NaN
2001-01-02       NaN       NaN
2001-01-03       NaN       NaN
2001-01-04       NaN       NaN
2001-01-05 -0.000039  0.000053
...              ...       ...
2006-06-19  0.000022 -0.000021
2006-06-20  0.000005 -0.000027
2006-06-21  0.000024 -0.000060
2006-06-22  0.000023 -0.000038
2006-06-23  0.000014 -0.000017
[2000 rows x 2 columns]

SC_Fit:

                   A         B
2001-01-01       NaN       NaN
2001-01-02       NaN       NaN
2001-01-03       NaN       NaN
2001-01-04       NaN       NaN
2001-01-05 -0.201991  0.349646
...              ...       ...
2006-06-19  1.035835 -0.688231
2006-06-20 -0.595888  1.057016
2006-06-21 -0.640150 -1.399535
2006-06-22 -0.535689  1.244345
2006-06-23  0.510958  0.614429

[2000 rows x 2 columns]

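Since x in the lambda function represents one rolling window (a Series by default, or an ndarray when raw=True is passed), the function can be written as follows, where the last element of the window, x[-1] by position or x.iloc[-1] on a Series, is the current data point.

lambda x: (x.iloc[-1] - x.mean()) / x.std(ddof=1)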
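To make the behavior of this lambda easier to check by hand, here is a small sketch on a five-element Series. The names last_point_zscore and prices are assumptions made for this illustration, and raw=True is used so that each window arrives as an ndarray and window[-1] is plain positional indexing.

```python
import numpy as np
import pandas as pd

def last_point_zscore(window):
    """Z-score of the newest value in the window; returns a single float."""
    return (window[-1] - window.mean()) / window.std(ddof=1)

prices = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# raw=True hands each window over as an ndarray instead of a Series
scores = prices.rolling(window=3).apply(last_point_zscore, raw=True)
print(scores)
```

Every full window here is an evenly spaced run such as 1, 2, 3, whose mean is the middle value and whose sample standard deviation is 1, so the newest point sits exactly one standard deviation above the mean. The first two rows stay NaN because the window is not yet full.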

rolling.apply Without Lambda

Consider the following code:

from sklearn.preprocessing import StandardScaler
import pandas as pd
import numpy as np

def test(df):
    return np.mean(df)

sc=StandardScaler()

tmp=pd.DataFrame(np.random.randn(2000,2)/10000,index=pd.date_range('2001-01-01',periods=2000),columns=['A','B'])

print("Test 1: ")
print(tmp.rolling(window=5,center=False).apply(lambda x: test(x)))

print("SC_Fit: ")
print((tmp - tmp.rolling(5).mean()) / tmp.rolling(5).std())

Output:

Test 1:
                   A         B
2001-01-01       NaN       NaN
2001-01-02       NaN       NaN
2001-01-03       NaN       NaN
2001-01-04       NaN       NaN
2001-01-05 -0.000063  0.000006
...              ...       ...
2006-06-19 -0.000047  0.000027
2006-06-20 -0.000059  0.000092
2006-06-21 -0.000008  0.000067
2006-06-22  0.000003  0.000069
2006-06-23  0.000030  0.000079

[2000 rows x 2 columns]

SC_Fit:
                   A         B
2001-01-01       NaN       NaN
2001-01-02       NaN       NaN
2001-01-03       NaN       NaN
2001-01-04       NaN       NaN
2001-01-05 -1.463814  0.168712
...              ...       ...
2006-06-19  0.466605  0.740628
2006-06-20  1.203184 -0.447355
2006-06-21  0.636700 -1.084805
2006-06-22  0.044646 -0.616070
2006-06-23 -0.784739  0.907959

[2000 rows x 2 columns]

Now look at the lambda function we declared earlier:

tmp.rolling(window=5,center=False).apply(lambda x: sc.fit_transform(x))

Conceptually, this translates to:

lambda x: (x - x.mean()) / x.std()

Here, x.mean() and x.std() each reduce the window to a single value, but x itself is not reduced because it is still an array. The whole expression therefore evaluates to an array, and since rolling.apply needs a single value for each window, this triggers the only length-1 arrays can be converted to Python scalars error.

The solution is to apply the rolling window only to the parts of the z-score calculation that need it (the mean and the standard deviation) and to leave the rest of the expression unrolled.

(tmp - tmp.rolling(5).mean()) / tmp.rolling(5).std()
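
As a rough check that this vectorized expression computes the same rolling z-score as the lambda version, the sketch below runs both on a small random frame and compares them. The names frame, via_apply, vectorized, and matches are assumptions made for this illustration; the comparison relies on rolling().std() using a sample standard deviation (ddof=1) by default, which matches the ddof=1 in the lambda.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
frame = pd.DataFrame(rng.standard_normal((200, 2)), columns=["A", "B"])

# Lambda version: one Python call per window, returning a single float each time
via_apply = frame.rolling(5).apply(
    lambda w: (w[-1] - w.mean()) / w.std(ddof=1), raw=True
)

# Vectorized version: only the mean and the standard deviation are rolled
vectorized = (frame - frame.rolling(5).mean()) / frame.rolling(5).std()

# equal_nan=True treats the leading NaN rows of both results as equal
matches = np.allclose(via_apply.to_numpy(), vectorized.to_numpy(), equal_nan=True)
print(matches)
```

If the two approaches agree, matches is True. The vectorized form also avoids calling a Python function for every window, which usually makes it the faster choice on large frames.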

Hello! I am Salman Bin Mehmood (Baum), a software developer, and I help organizations address complex problems. My expertise lies in back-end development, data science, and machine learning. I am a lifelong learner, currently working on the metaverse and enrolled in a course on building an AI application with Python. I love solving problems and developing bug-free software for people. I write content related to Python and emerging technologies.
