Time Series Forecasting of Internet Traffic in Greece with Prophet and PyCaret

Dimitris Chortarias
4 min read · Mar 30, 2022

This story forecasts internet traffic in Greece using data from the Greek Internet Exchange, published at https://data.gov.gr/. The models used are Prophet from Facebook and the beta version of PyCaret's Time Series Forecasting module.

Facebook Prophet

PyCaret Time Series Forecasting

Both are extremely easy to use. The only limitation is that Facebook Prophet doesn’t support Windows, although on Windows 11 you can work around this by using WSL2 with VS Code.

Dataset

The data can be retrieved by following the API documentation of data.gov.gr:

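import requests

# '--' is a placeholder: put your own data.gov.gr API token after 'Token '.
url = 'https://data.gov.gr/api/v1/query/internet_traffic?date_from=2022-03-23&date_to=2022-03-30'
headers = {'Authorization': 'Token --'}
response = requests.get(url, headers=headers)
response.raise_for_status()  # stop early on an HTTP error
print(response.json())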

Because each response is limited in the number of days it can return, I created a loop that downloads the data 30 days at a time.
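The 30-day loop could look roughly like the sketch below. It is not the exact code used for this story: `date_windows` and `fetch_traffic` are hypothetical helper names, and the sketch assumes that `date_from` and `date_to` are both inclusive and that the endpoint returns a JSON list of records.

```python
from datetime import date, timedelta


def date_windows(start, end, days=30):
    """Yield (window_start, window_end) pairs that cover start..end inclusive."""
    current = start
    while current <= end:
        window_end = min(current + timedelta(days=days - 1), end)
        yield current, window_end
        current = window_end + timedelta(days=1)


def fetch_traffic(start, end, token):
    """Download all records between start and end, one 30-day window at a time."""
    # Imported here so date_windows can be used without the dependency.
    import requests

    url = "https://data.gov.gr/api/v1/query/internet_traffic"
    headers = {"Authorization": f"Token {token}"}
    rows = []
    for date_from, date_to in date_windows(start, end):
        params = {"date_from": date_from.isoformat(), "date_to": date_to.isoformat()}
        response = requests.get(url, params=params, headers=headers, timeout=30)
        response.raise_for_status()
        rows.extend(response.json())  # assumption: the body is a list of records
    return rows


# The windows are plain date pairs, so they can be inspected before any request is made.
for window in date_windows(date(2022, 1, 1), date(2022, 3, 1)):
    print(window)
```

If the API turns out to treat one of the bounds as exclusive, the windows would need to overlap by one day and duplicates would have to be dropped afterwards.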

The only data processing applied was to reformat the dates so that pandas reads the date column as a datetime, and to apply a (pandas) rolling window with window=7 to smooth ("normalize") the data. (I assume that the outliers in the original dataset are real. :D )

Not Normalized Data
Normalized Data (7 Rolling Window)
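As a rough illustration of these two steps, the sketch below parses the dates and adds a 7-day rolling column. The column names `date` and `traffic`, the helper name `smooth_traffic`, and the use of a rolling mean are assumptions made for the example; the sample values are synthetic, not real traffic figures.

```python
import pandas as pd


def smooth_traffic(records, date_col="date", value_col="traffic", window=7):
    """Parse the date column and add a rolling-mean column to the records."""
    df = pd.DataFrame(records)
    df[date_col] = pd.to_datetime(df[date_col])  # strings -> datetime64
    df = df.sort_values(date_col).set_index(date_col)
    # The first window-1 rows have no full window and stay NaN.
    df[f"{value_col}_smoothed"] = df[value_col].rolling(window=window).mean()
    return df


# Synthetic example data: ten days with traffic values 1..10.
records = [{"date": f"2022-03-{day:02d}", "traffic": float(day)} for day in range(1, 11)]
smoothed = smooth_traffic(records)
print(smoothed.tail(3))
```

A rolling mean trades responsiveness for stability: single-day spikes are damped, which is why the smoothed plot looks cleaner than the raw one.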

Facebook Prophet

This package is one of the most “famous” in time series forecasting; in recent years I have read about its abilities everywhere. Below are the code for applying Prophet to the dataset and the results.

Code
Trends
Forecast
MAPE on 100 days horizon
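For readers without access to the embedded code, a minimal Prophet workflow might look like the sketch below. It is an assumption about the general shape of such code, not the original notebook: the helper names, the default model settings and the 365-day forecast length are illustrative, while the 100-day horizon matches the one reported above.

```python
import pandas as pd


def to_prophet_frame(df, date_col, value_col):
    """Return a new frame with the two columns Prophet expects: ds and y."""
    out = df[[date_col, value_col]].rename(columns={date_col: "ds", value_col: "y"})
    out["ds"] = pd.to_datetime(out["ds"])
    return out


def forecast_with_prophet(frame, periods=365, horizon="100 days"):
    """Fit Prophet, forecast `periods` days ahead and cross-validate the fit."""
    # Imported here so to_prophet_frame works without Prophet installed.
    from prophet import Prophet
    from prophet.diagnostics import cross_validation, performance_metrics

    model = Prophet()  # default settings; the original settings are unknown
    model.fit(frame)
    future = model.make_future_dataframe(periods=periods)
    forecast = model.predict(future)  # includes yhat, yhat_lower, yhat_upper
    cv_results = cross_validation(model, horizon=horizon)
    metrics = performance_metrics(cv_results)  # reports MAPE among other metrics
    return forecast, metrics
```

Calling `forecast_with_prophet(to_prophet_frame(df, "date", "traffic"))` on the smoothed data would return the forecast and a metrics table; `model.plot_components(forecast)` is the usual way to produce a trends plot.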

PyCaret

One of the most advanced features of PyCaret is that it compares models and selects the best one. (Next week I am going to write another article on classification.)

However, the results from PyCaret did not appear accurate. I ran many tests with different values of the input parameters fh and fold, but with a high fh on the full dataset the code did not return any value. I assume this will be fixed in the release version. Below is the relevant code with the most accurate results.

Because, as mentioned above, the results did not meet expectations, I resampled the data to months. In this case the number of periods was limited (only 30), but you can see the results below.

PyCaret Monthly Forecast
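The PyCaret steps described above can be sketched as follows. This is written against the functional API of the released PyCaret time series module, so the beta used here may differ in details; `to_monthly` and `compare_with_pycaret` are hypothetical helper names, and aggregating by monthly mean (rather than sum) is an assumption.

```python
import pandas as pd


def to_monthly(series):
    """Aggregate a daily series to monthly means, indexed by month start."""
    return series.resample("MS").mean()


def compare_with_pycaret(series, fh=28, fold=3):
    """Let PyCaret compare its forecasting models and predict fh steps ahead."""
    # Imported here so to_monthly works without PyCaret installed.
    from pycaret.time_series import setup, compare_models, predict_model

    setup(data=series, fh=fh, fold=fold, session_id=42)
    best = compare_models()  # trains the candidate models and returns the best one
    return best, predict_model(best)


# Synthetic daily series for January and February 2022.
index = pd.date_range("2022-01-01", "2022-02-28", freq="D")
daily = pd.Series(range(len(index)), index=index, dtype="float64")
print(to_monthly(daily))
```

With daily data the call would be `compare_with_pycaret(daily, fh=28)`; for the monthly variant, `compare_with_pycaret(to_monthly(daily), fh=7)` on a series long enough for the chosen number of folds.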

Results

From the visualizations alone we can already say that Facebook’s Prophet fits our dataset better. For reference, below are the MAPE and SMAPE by model.

Prophet Forecast for March 2023: ~380B with MAPE=0.036 and SMAPE=0.037 for 100 days.

PyCaret with Daily Data: ~440B with MAPE=0.046 and SMAPE=0.047 with fh=28.

PyCaret with Monthly Data: ~296B with MAPE=0.19 and SMAPE=0.21 with fh=7.

Greece's internet traffic is forecast to be ~60% higher in one year!

Reminder 🙂

MAPE

The mean absolute percentage error (MAPE) is a measure of how accurate a forecast system is. It expresses this accuracy as a percentage, and is calculated by taking, for each time period, the absolute difference between the actual and forecast values divided by the actual value, and then averaging over all periods.
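To make the definition concrete, here is a small sketch of MAPE and of one common definition of SMAPE, written in plain Python. Both return fractions, so 0.036 corresponds to 3.6%. The sample numbers are made up for illustration, and libraries may use slightly different SMAPE variants.

```python
def mape(actual, forecast):
    """Mean absolute percentage error as a fraction; undefined if an actual value is 0."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)


def smape(actual, forecast):
    """Symmetric MAPE: the error is divided by the mean of |actual| and |forecast|."""
    errors = [2 * abs(a - f) / (abs(a) + abs(f)) for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)


# Made-up values: each forecast is off by 10% of the actual value.
actual = [100.0, 200.0]
forecast = [110.0, 180.0]
print(mape(actual, forecast))
print(smape(actual, forecast))
```

SMAPE is often reported next to MAPE because it treats over- and under-forecasts more evenly and stays bounded when actual values are small.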

Full code can be found in my GitHub repository below
