timetk
Mission
To make it easy to visualize, wrangle, and feature engineer time series data for forecasting and machine learning prediction.
Installation
Download the development version with latest features:
remotes::install_github("business-science/timetk")
Or, download the CRAN-approved version:
install.packages("timetk")
Getting Started
- Full Time Series Machine Learning and Feature Engineering Tutorial: Showcases the (NEW) step_timeseries_signature() for building 200+ time series features using parsnip, recipes, and workflows.
- Visit the timetk website documentation for tutorials and a complete list of function references.
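As a rough illustration of the feature engineering step mentioned above, the sketch below expands a date column into calendar features with step_timeseries_signature(). The data frame `df` and its columns are hypothetical and exist only for this example; the sketch assumes the recipes and timetk packages are installed.

```r
library(recipes)
library(timetk)

# Hypothetical daily series used only for illustration
df <- tibble::tibble(
  date  = seq(as.Date("2021-01-01"), by = "day", length.out = 10),
  value = 1:10
)

# Expand the date column into calendar features (year, month, weekday, ...)
rec <- recipe(value ~ date, data = df) %>%
  step_timeseries_signature(date)

# Estimate the recipe and return the training data with the new columns
features <- bake(prep(rec), new_data = NULL)
names(features)
```

The generated columns are prefixed with the name of the date column (for example `date_year`), so they can be selected or dropped in later recipe steps.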
Package Functionality
There are many R packages for working with time series data. Here’s how timetk compares to the “tidy” time series R packages for data visualization, wrangling, and feature engineering (those that leverage data frames or tibbles).
Task | timetk | tsibble | feasts | tibbletime |
---|---|---|---|---|
Structure | ||||
Data Structure | tibble (tbl) | tsibble (tbl_ts) | tsibble (tbl_ts) | tibbletime (tbl_time) |
Visualization | ||||
Interactive Plots (plotly) | ✅ | ❌ | ❌ | ❌ |
Static Plots (ggplot) | ✅ | ❌ | ✅ | ❌ |
Time Series | ✅ | ❌ | ✅ | ❌ |
Correlation, Seasonality | ✅ | ❌ | ✅ | ❌ |
Anomaly Detection | ✅ | ❌ | ❌ | ❌ |
Data Wrangling | ||||
Time-Based Summarization | ✅ | ❌ | ❌ | ✅ |
Time-Based Filtering | ✅ | ❌ | ❌ | ✅ |
Padding Gaps | ✅ | ✅ | ❌ | ❌ |
Low to High Frequency | ✅ | ❌ | ❌ | ❌ |
Imputation | ✅ | ✅ | ❌ | ❌ |
Sliding / Rolling | ✅ | ✅ | ❌ | ✅ |
Feature Engineering (recipes) | ||||
Date Feature Engineering | ✅ | ❌ | ❌ | ❌ |
Holiday Feature Engineering | ✅ | ❌ | ❌ | ❌ |
Fourier Series | ✅ | ❌ | ❌ | ❌ |
Smoothing & Rolling | ✅ | ❌ | ❌ | ❌ |
Padding | ✅ | ❌ | ❌ | ❌ |
Imputation | ✅ | ❌ | ❌ | ❌ |
Cross Validation (rsample) | ||||
Time Series Cross Validation | ✅ | ❌ | ❌ | ❌ |
Time Series CV Plan Visualization | ✅ | ❌ | ❌ | ❌ |
More Awesomeness | ||||
Making Time Series (Intelligently) | ✅ | ✅ | ❌ | ✅ |
Handling Holidays & Weekends | ✅ | ❌ | ❌ | ❌ |
Class Conversion | ✅ | ✅ | ❌ | ❌ |
Automatic Frequency & Trend | ✅ | ❌ | ❌ | ❌ |
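To make a few of the Data Wrangling rows concrete, here is a minimal sketch of padding, filtering, and summarizing by time. The `sales` tibble is made up for illustration, and the sketch assumes dplyr and timetk are installed.

```r
library(dplyr)
library(timetk)

# Hypothetical daily data with a two-day gap (Jan 4 and Jan 5 are missing)
sales <- tibble(
  date  = as.Date(c("2021-01-01", "2021-01-02", "2021-01-03", "2021-01-06")),
  value = c(10, 20, 30, 60)
)

# Padding Gaps: insert the missing days; the new rows get NA values
padded <- sales %>%
  pad_by_time(date, .by = "day")

# Time-Based Filtering: keep rows from Jan 2 through Jan 3
filtered <- sales %>%
  filter_by_time(date, .start_date = "2021-01-02", .end_date = "2021-01-03")

# Time-Based Summarization: total value per month
monthly <- sales %>%
  summarise_by_time(date, .by = "month", total = sum(value))
```

All three functions take a data frame and a date column, so they chain with the rest of a dplyr pipeline.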
What can you do in 1 line of code?
Investigate a time series…
taylor_30_min %>%
    plot_time_series(date, value, .color_var = week(date),
                     .interactive = FALSE, .color_lab = "Week")
Visualize anomalies…
walmart_sales_weekly %>%
    group_by(Store, Dept) %>%
    plot_anomaly_diagnostics(Date, Weekly_Sales,
                             .facet_ncol = 3, .interactive = FALSE)
Make a seasonality plot…
taylor_30_min %>%
    plot_seasonal_diagnostics(date, value, .interactive = FALSE)
Inspect autocorrelation, partial autocorrelation (and cross correlations too)…
taylor_30_min %>%
    plot_acf_diagnostics(date, value, .lags = "1 week", .interactive = FALSE)
Acknowledgements
The timetk package wouldn’t be possible without other amazing time series packages.
- stats - Basically every timetk function that uses a period (frequency) argument owes it to ts().
  - plot_acf_diagnostics(): Leverages stats::acf(), stats::pacf() & stats::ccf()
  - plot_stl_diagnostics(): Leverages stats::stl()
- lubridate: timetk makes heavy use of floor_date(), ceiling_date(), and duration() for “time-based phrases”.
  - Add and Subtract Time (%+time% & %-time%): "2012-01-01" %+time% "1 month 4 days" uses lubridate to intelligently offset the day
- xts: Used to calculate periodicity and fast lag automation.
- forecast (retired): Possibly my favorite R package of all time. It’s based on ts, and its successor is the tidyverts (fable, tsibble, feasts, and fabletools).
  - The ts_impute_vec() function for low-level vectorized imputation using STL + Linear Interpolation uses na.interp() under the hood.
  - The ts_clean_vec() function for low-level vectorized outlier cleaning and imputation using STL + Linear Interpolation uses tsclean() under the hood.
  - The Box Cox transformation auto_lambda() uses BoxCox.lambda().
- tibbletime (retired): While timetk does not import tibbletime, it uses much of the innovative functionality to interpret time-based phrases:
  - tk_make_timeseries() - Extends seq.Date() and seq.POSIXt() using a simple phrase like “2012-02” to populate the entire time series from start to finish in February 2012.
  - filter_by_time(), between_time() - Uses innovative endpoint detection from phrases like “2012”
  - slidify() is basically rollify() using slider (see below).
- slider: A powerful R package that provides a purrr-syntax for complex rolling (sliding) calculations.
  - slidify() uses slider::pslide() under the hood.
  - slidify_vec() uses slider::slide_vec() for simple vectorized rolls (slides).
- padr: Used for padding time series from low frequency to high frequency and filling in gaps.
  - The pad_by_time() function is a wrapper for padr::pad().
  - See step_ts_pad() to apply padding as a preprocessing recipe step!
- TSstudio: This is the best interactive time series visualization tool out there. It leverages the ts system, which is the same system the forecast R package uses. A ton of inspiration for visuals came from using TSstudio.
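Several of the helpers credited above can be tried on plain vectors. The sketch below is illustrative only: the inputs are made up, and it assumes timetk is installed along with the packages it wraps.

```r
library(timetk)

# Add time with a phrase; the character date is parsed before the offset is applied
shifted <- "2012-01-01" %+time% "1 month 4 days"

# Populate every day in February 2012 from a partial date phrase
feb_2012 <- tk_make_timeseries("2012-02", by = "day")

# Rolling 3-period mean, aligned right (the first two values are NA)
rolled <- slidify_vec(1:5, mean, .period = 3, .align = "right")

# Fill a missing value; period = 1 requests plain linear interpolation
imputed <- ts_impute_vec(c(1, 2, NA, 4, 5), period = 1)
```

Because these helpers take and return vectors, they can also be used inside dplyr::mutate() on a date or value column.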
Learning More
My Talk on High-Performance Time Series Forecasting
Time series is changing. Businesses now need 10,000+ time series forecasts every day. This is what I call a High-Performance Time Series Forecasting System (HPTSF) - Accurate, Robust, and Scalable Forecasting.
High-Performance Forecasting Systems will save companies MILLIONS of dollars. Imagine what will happen to your career if you can provide your organization a “High-Performance Time Series Forecasting System” (HPTSF System).
I teach how to build an HPTSF System in my High-Performance Time Series Forecasting Course. If you are interested in learning Scalable High-Performance Forecasting Strategies, then take my course. You will learn:
- Time Series Machine Learning (cutting-edge) with Modeltime - 30+ Models (Prophet, ARIMA, XGBoost, Random Forest, & many more)
- NEW - Deep Learning with GluonTS (Competition Winners)
- Time Series Preprocessing, Noise Reduction, & Anomaly Detection
- Feature engineering using lagged variables & external regressors
- Hyperparameter Tuning
- Time series cross-validation
- Ensembling Multiple Machine Learning & Univariate Modeling Techniques (Competition Winner)
- Scalable Forecasting - Forecast 1000+ time series in parallel
- and more.