Welcome to 16892 Developer Community-Open, Learning,Share

I am working on a time-series dataset of a retailer's transactions over the previous 3 years. I want to remove any trend and seasonality before I apply machine learning.

These are the columns and dtypes of the DataFrame (ds1):

Int64Index: 538435 entries, 0 to 538642
Data columns (total 20 columns):

 #   Column                   Non-Null Count   Dtype         
---  ------                   --------------   -----         
 0   WEEK_END_DATE            538435 non-null  datetime64[ns]
 1   UNITS                    538435 non-null  int64         
 2   PRICE                    538435 non-null  float64       
 3   FEATURE                  538435 non-null  int64         
 4   DISPLAY                  538435 non-null  int64         
 5   TPR_ONLY                 538435 non-null  int64         
 6   DESCRIPTION              538435 non-null  object        
 7   MANUFACTURER             538435 non-null  object        
 8   CATEGORY                 538435 non-null  object        
 9   SUB_CATEGORY             538435 non-null  object        
 10  PRODUCT_SIZE             538435 non-null  object        
 11  ADDRESS_STATE_PROV_CODE  538435 non-null  object        
 12  MSA_CODE                 538435 non-null  int64         
 13  SEG_VALUE_NAME           538435 non-null  object        
 14  PARKING_SPACE_QTY        538435 non-null  float64       
 15  SALES_AREA_SIZE_NUM      538435 non-null  int64         
 16  AVG_WEEKLY_BASKETS       538435 non-null  float64       
 17  Units_a_visit            538435 non-null  float64       
 18  Visits_per_hhs           538435 non-null  float64       
 19  DISCOUNT                 538435 non-null  float64       
dtypes: datetime64[ns](1), float64(6), int64(6), object(7)
memory usage: 86.3+ MB

I have tried the following:

from pandas import Series
from matplotlib import pyplot
import statsmodels.api as sm 
from statsmodels.tsa.seasonal import seasonal_decompose
ds1.index = ds1.WEEK_END_DATE
result = seasonal_decompose(ds1, model='additive')
result.plot()
pyplot.show()

I have also tried:

res = sm.tsa.seasonal_decompose(ds1.interpolate())
res.plot()

For both I get the following error message:

TypeError                                 Traceback (most recent call last)
<ipython-input-76-322daa0c1fd6> in <module>()
      6 ds1.index = ds1.WEEK_END_DATE
      7 
----> 8 result = seasonal_decompose(ds1)
      9 result.plot()
     10 pyplot.show()

/usr/local/lib/python3.6/dist-packages/statsmodels/tsa/seasonal.py in seasonal_decompose(x, model, filt, freq, two_sided, extrapolate_trend)
    113     nobs = len(x)
    114 
--> 115     if not np.all(np.isfinite(x)):
    116         raise ValueError("This function does not handle missing values")
    117     if model.startswith('m'):

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
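
One likely reading of this traceback, offered as an assumption rather than a confirmed diagnosis: `np.isfinite` is only defined for numeric dtypes, and a DataFrame that still holds `object` columns (such as DESCRIPTION) is converted to an object array before that check runs. The small sketch below uses a made-up two-column frame (`toy`, not the real ds1) and a hypothetical helper `isfinite_check_fails` to reproduce the same kind of TypeError and to show that a purely numeric subset passes the check.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for ds1: one numeric and one object column.
toy = pd.DataFrame({
    "UNITS": [3, 5, 4],
    "DESCRIPTION": ["a", "b", "c"],
})

def isfinite_check_fails(frame):
    """Return True if np.isfinite raises TypeError on the frame's values."""
    try:
        np.isfinite(np.asarray(frame))
    except TypeError:
        return True
    return False

mixed_fails = isfinite_check_fails(toy)               # object-dtype array
numeric_fails = isfinite_check_fails(toy[["UNITS"]])  # int64 array
print(mixed_fails, numeric_fails)
```

If this reading is right, the TypeError is raised before any decomposition starts, so it says nothing yet about whether the data has a usable weekly frequency.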

I have also tried decomposing only the integer and float columns:

numeric = ['UNITS','PRICE','FEATURE','DISPLAY','TPR_ONLY','PARKING_SPACE_QTY','SALES_AREA_SIZE_NUM','AVG_WEEKLY_BASKETS','Units_a_visit','Visits_per_hhs','DISCOUNT']
from pandas import Series
from matplotlib import pyplot
import statsmodels.api as sm 
from statsmodels.tsa.seasonal import seasonal_decompose
result = seasonal_decompose(ds1[numeric], model='additive')
result.plot()
pyplot.show()

I also tried converting all the object columns into dummy variables before running the code.

None of these approaches works. Does anyone have any suggestions?
