Time Series Forecasting with DeepAR

Nov 20, 2020

With the enormous sources and volume of time-series data available, detecting patterns in the data in a timely manner is becoming a crucial part of analysis and decision making in many businesses. Knowing the future beforehand helps decision-makers plan their strategies in accordance with the intent of customers, gaining a massive advantage. Forecasting models have grown in complexity, along with their capability to capture previously unnoticed correlations. In this blog, we are going to discuss the Deep Autoregressive model (DeepAR), one of the built-in algorithms of Amazon SageMaker.

Amazon SageMaker

First, let us get some background on Amazon SageMaker before we jump into DeepAR. Every developer and data scientist spends hours just installing the frameworks and libraries required for a specific project, and often ends up wrangling with the package manager anyway. Amazon SageMaker was built to take these hurdles out of the ML development workflow. It is a fully managed service that provides the ability to build, train, and deploy models quickly, through a web-based visual interface that makes the ML development steps easier and faster at a lower cost.

Amazon SageMaker notebooks are Jupyter notebooks running on a dedicated EC2 instance, hosting your environments and code for everything from feature engineering to model building. The notebook instance is automatically configured with the AWS credentials it needs to access other AWS services, and the underlying EC2 instance can be selected and resized according to the demands of your feature engineering and training jobs. The AWS console logs all training jobs, so we can see the status of each job, where the model artifacts are stored, and which dataset was used. SageMaker also creates a RESTful API for serving: no matter where a model was trained, a pre-trained model can be hosted on SageMaker using Docker containers. In addition, Amazon SageMaker Studio provides experiment management to easily view and track progress across projects.

DeepAR

In this blog post, we are going to forecast a time series based on its past trends with the help of the DeepAR algorithm. AWS's DeepAR algorithm is a time-series forecasting algorithm built on a Recurrent Neural Network (RNN), capable of producing both point and probabilistic forecasts.

The dataset we will be using is the electricity load profile of Nepal for the year 2016, recorded at one-hour intervals. It is an actual dataset obtained from the Nepal Electricity Authority, with the load measured in megawatts (MW). Since our dataset has only one time series, we will not be dealing with a "scale" problem, but DeepAR is also capable of handling multiple time series in a single model, creating a global model. It even has the potential to solve the cold-start problem, i.e., to generate forecasts for new time series that have no historical data of their own but are similar to the series the model has been trained on.

Let's get started.

Loading the dataset

The electricity dataset we found was stored in monthly .csv files organized according to the Nepali calendar. After cleaning the data and reshaping the granularity to hourly, the preprocessed data was stored in pickle format. Let's take a quick look at our dataset.

```python
import pandas as pd

data = pd.read_pickle("elec_df.pkl")
data.head()
```

```
                       Load
Date
2016-04-13 01:00:00  682.14
2016-04-13 02:00:00  717.49
2016-04-13 03:00:00  707.64
2016-04-13 04:00:00  706.44
2016-04-13 05:00:00  737.34
```
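The preprocessing that produced elec_df.pkl is ordinary pandas work. Below is a minimal sketch of that step; the raw file layout, column names, and interpolation choice are illustrative assumptions, not the exact code used for this dataset.

```python
import glob

import pandas as pd

# Illustrative sketch: merge the monthly raw CSVs (assumed layout),
# index by timestamp, and sort into a single chronological series.
frames = [pd.read_csv(path, parse_dates=["Date"]) for path in sorted(glob.glob("raw/*.csv"))]
data = pd.concat(frames).set_index("Date").sort_index()

# Regularize onto an hourly grid and bridge small gaps in the load curve.
data = data[["Load"]].resample("H").mean().interpolate(limit=3)

data.to_pickle("elec_df.pkl")
```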
Splitting the dataset

Splitting a dataset is different when working with time series. Unlike other datasets, we are not going to take a random 80-20 split as we usually would; instead, we split along the time axis.

```python
freq = "H"               # hourly data
prediction_length = 7 * 24  # forecast horizon in hours; one week is assumed here from the plot below

start_dataset = pd.Timestamp("2016-04-13 01:00:00", freq=freq)
end_training = pd.Timestamp("2017-04-06 01:00:00", freq=freq)

training_data = [
    {
        "start": str(start_dataset),
        # everything up to one step before end_training
        "target": data[start_dataset:end_training - 1].Load.values.round(2).tolist()
    }
]
```

The training dataset is prepared by removing the last data points from each sample, while for the test set the full dataset can be used.

```python
test_data = [
    {
        "start": str(start_dataset),
        # extends prediction_length steps past end_training for evaluation
        "target": data[start_dataset:end_training + prediction_length].Load.values.round(2).tolist()
    }
]
```

Writing in JSON format

DeepAR accepts input in JSON Lines format: one sample per line, such as

```
{"start": "2016-04-13 01:00:00", "target": [682.14, 717.49, 707.64, 706.44, 737.34, 1520.08, 1568.08, ...]}
```

```python
import json

def write_dicts_to_file(path, data):
    # Write each sample dict as one JSON object per line (JSON Lines).
    with open(path, 'wb') as fp:
        for d in data:
            fp.write(json.dumps(d).encode("utf-8"))
            fp.write("\n".encode("utf-8"))

write_dicts_to_file("train_hourly.json", training_data)
write_dicts_to_file("test_hourly.json", test_data)
```

The JSON files are then uploaded to an S3 bucket.

Training Job Configuration

For built-in algorithms, we need to select the algorithm container from the same region we are running in, and then create an estimator.

```python
import sagemaker

image_name = sagemaker.amazon.amazon_estimator.get_image_uri(
    region, "forecasting-deepar", "latest")

estimator = sagemaker.estimator.Estimator(
    sagemaker_session=sagemaker_session,
    image_name=image_name,
    role=role,
    train_instance_count=2,
    train_instance_type='ml.m4.xlarge',
    base_job_name='deepar-electricity-nea-hourly',
    output_path=s3_output_path
)
```

Training the model

Just fit the input data to the estimator that we configured. After training completes, the job reports metrics such as RMSE, mean_absolute_QuantileLoss, and the loss for each of the individual quantiles.

```python
data_channels = {
    "train": "{}/train/".format(s3_data_path),
    "test": "{}/test/".format(s3_data_path)
}

estimator.fit(inputs=data_channels, wait=True)
```

```
2020-06-03 ..:..:.. Starting - Starting the training job...
2020-06-03 ..:..:.. Starting - Launching requested ML instances...
2020-06-03 ..:..:.. Starting - Preparing the instances for training...
2020-06-03 ..:..:.. Downloading - Downloading input data...
2020-06-03 05:55:21 Training - Downloading the training image...
```

Deploying the model

Now we create an endpoint hosting our model, and a predictor to send requests to.

```python
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type='ml.m4.xlarge',
    predictor_cls=DeepARPredictor  # custom predictor class defined elsewhere in the notebook
)
```

Getting Prediction

To get a prediction, we send JSON-formatted samples along with a list of quantiles as an optional configuration. The inference format for DeepAR is described in the AWS documentation.

```python
# get_data_for_inference is a notebook helper that slices out a context
# window, its ground truth, and the day to forecast.
context_ts, target_ts, forecast_day = get_data_for_inference(test, 0)

predictions = predictor.predict(
    ts=context_ts,
    quantiles=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
)
```

Plotting predicted results

Everyone loves a plot, and a plot makes it easy to see what is happening. Let's plot our predictions along with the ground-truth values to see how well the trained model is doing.

[Figure: predicted quantiles against the actual load for 2017-04-06 through 2017-04-13.]
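As a reference for reproducing a figure like the one above, here is a minimal matplotlib sketch. It assumes `predictions` is a DataFrame with one column per quantile ("0.1" through "0.9") indexed by forecast timestamp, and that `target_ts` holds the ground truth for the same window; both assumptions go beyond what the original post shows.

```python
import matplotlib.pyplot as plt

# Minimal sketch, assuming `predictions` is a DataFrame of quantile
# columns ("0.1" ... "0.9") indexed by forecast timestamp and
# `target_ts` is the ground-truth load Series for the same window.
fig, ax = plt.subplots(figsize=(12, 5))

ax.plot(target_ts.index, target_ts.values, color="black", label="actual load")
ax.plot(predictions.index, predictions["0.5"], color="tab:blue", label="median forecast")
ax.fill_between(predictions.index, predictions["0.1"], predictions["0.9"],
                color="tab:blue", alpha=0.3, label="10%-90% quantile band")

ax.set_xlabel("Timestamp")
ax.set_ylabel("Load (MW)")
ax.legend()
plt.show()
```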
Conclusion

Using DeepAR, we can focus on experimenting with our time series to get the best possible results, without worrying about the underlying infrastructure. We can get the job done very quickly, since there is no need to write any training code: all we have to do is prepare the data and do the necessary tuning to refine the model if needed. The results are even better when we have hundreds of related time series, with a single global model serving all of them (that is not the case here, but it is how real-world data often looks).

I couldn't have written this article without the support of Ayush Bhattarai. Thanks!

Originally posted on https://www.gritfeat.com/time-series-forecasting-with-deepar/
