By Pranay Dave, Principal Data Scientist | Youtuber | Creator of experiencedatascience.com.

Photo by Bryce Barker on Unsplash.

Time is present in most of the data around us. Retail product sales, financial stock prices, and IoT sensor readings all carry a notion of time. So mastering time-series analytics will make you a master of the data science world.

The top 5 analytics that I will demonstrate here are:

  • Seasonality Detection to find peaks in retail product sales
  • Dynamic Time Warp to find products with similar sales patterns
  • Auto-correlation to identify up-trending financial stocks
  • Change-point detection to better understand changes in stock price trends
  • Fast Fourier Transformation for workload planning

 

Seasonality Detection

 

In this example, I will be using data on retail product sales to demonstrate seasonality detection. Seasonality means that some specific months have higher sales than others.

The data consist of product, date, and sales quantity:

Retail sales data (Image by author).

Let us take as an example the product called RABBIT NIGHT LIGHT. This graph shows the sales of this product by month.

Sales data for product rabbit night light (Image by author).

The data covers 2019 and 2020. This product had peak sales in November 2019, and the peak repeated in November 2020. This means that there is seasonality in the sales data for RABBIT NIGHT LIGHT.

This way of manually looking at sales graphs and determining seasonality is good, but when you have thousands of products, you need an automated way to determine seasonality. And this is where seasonality detection algorithms can help.

Here is the result of the seasonality detection algorithm applied to multiple products:

Seasonality Detection for multiple products (image by author).

The X-axis shows the time taken for a sales peak to repeat itself, and the Y-axis indicates how high the peak is. Our product RABBIT NIGHT LIGHT is at the top right. The seasonality detection has estimated that its sales peak repeats every 12 to 13 months, which means that there is one sales peak in a year.

Now let us look at some other products, such as WORLD WAR 2 GLIDERS in the visualization above. This product has a sales peak every 6 months. This means that there will be two sales peaks in a year.

Here is the sales trend for the product WORLD WAR 2 GLIDERS ASSTD DESIGNS:

Sales data for product WW2 glider (Image by author).

We see that this product had sales peaks in April 2019 and Oct 2019, and again the sales peaks repeat in April and Oct 2020.

So, as you can see, seasonality detection is a very powerful way to detect the number of sales peaks automatically, without looking at individual sales graphs for each product. It also helps you avoid planning stock levels incorrectly.
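
The article does not say which algorithm produced the chart above. As a minimal sketch of one common approach, the Python below scores each candidate period by the autocorrelation of the monthly sales at that lag and reports the best-scoring lag as the season length. The sales figures and the function names `autocorrelation` and `estimate_season_length` are hypothetical, chosen only for illustration.

```python
from statistics import mean

def autocorrelation(series, lag):
    """Correlation of a series with a copy of itself shifted by `lag` steps."""
    mu = mean(series)
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0:
        return 0.0
    num = sum((series[i] - mu) * (series[i + lag] - mu)
              for i in range(len(series) - lag))
    return num / denom

def estimate_season_length(series, min_lag=2, max_lag=None):
    """Return (lag, strength) for the lag with the highest autocorrelation.

    Lag 1 is skipped by default because neighbouring months are often
    correlated even when there is no repeating peak.
    """
    max_lag = max_lag or len(series) // 2
    scores = {lag: autocorrelation(series, lag)
              for lag in range(min_lag, max_lag + 1)}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical monthly sales over two years (assumed numbers, not the article's data).
yearly_peak = [10, 9, 11, 10, 12, 11, 10, 12, 15, 30, 90, 40] * 2   # one peak per year
half_yearly_peak = [10, 12, 11, 80, 14, 10] * 4                      # a peak every 6 months

yearly_lag, yearly_strength = estimate_season_length(yearly_peak)
half_lag, half_strength = estimate_season_length(half_yearly_peak)
print(yearly_lag, half_lag)  # 12 6
```

Plotting the lag against the strength for every product would give a chart in the same spirit as the one above, although the author's tool may compute both axes differently.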

 

Dynamic Time Warp (DTW)

 

The name of the algorithm may sound like a time machine, but we are going to use it to find products that have similar sales patterns in our retail data.

The data consist of product, month, and sales for the month. Sample data are shown here:

Sample retail sales data (image by author).

Here is the result of Dynamic Time Warp applied to the retail data. Each dot represents a pair of products. The farther to the left a dot is, the more similar the sales patterns of the two products are. The products related to the two extreme points are shown below:

DTW Multi-product (image by author).

As per the algorithm, the products WOODLAND_CHARLOTTE_BAG and JUMBO_BAG_BAROQUE have similar sales patterns. And if we move to the right, we can see the products with completely dissimilar sales patterns. In this case, we have CAKE CASE and RABBIT NIGHT LIGHT, which have completely dissimilar sales patterns.

Let us first look at the sales for WOODLAND_CHARLOTTE_BAG and JUMBO_BAG_BAROQUE:

Sales of products with a similar pattern (image by author).

Both products have similar sales patterns. Both peaked in March, June, and August, after which sales declined.

Even though the sales quantities of the two products are not identical, the DTW algorithm correctly identifies them as having similar sales patterns.

Now let us look at the sales for CAKE CASE and RABBIT NIGHT LIGHT, which the algorithm has identified as having completely dissimilar sales patterns. The RABBIT NIGHT LIGHT, shown here in grey, has a sales peak just once a year, but the CAKE CASE, shown in orange, has multiple sales peaks.

Sales of products with a not similar pattern (image by author).

So, as you can see, DTW is highly intelligent and justifies its sophisticated name.
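
To make the idea concrete, here is a minimal sketch of the classic dynamic-programming form of DTW in plain Python. Two things are assumptions on my part rather than details from the analysis above: each series is z-score normalised first, so that products selling in different quantities can still match on shape, and the monthly figures are invented for illustration.

```python
from statistics import mean, pstdev

def zscore(series):
    """Rescale a series to mean 0 and standard deviation 1 (assumed preprocessing)."""
    mu, sd = mean(series), pstdev(series)
    return [(x - mu) / sd if sd else 0.0 for x in series]

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i points of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # stretch b
                                    cost[i][j - 1],      # stretch a
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]

# Hypothetical monthly sales, January to December (assumed numbers).
bag_a = [20, 25, 60, 30, 28, 65, 35, 70, 30, 22, 20, 18]          # peaks in Mar, Jun, Aug
bag_b = [55, 70, 190, 85, 90, 200, 100, 205, 95, 60, 58, 50]      # same shape, larger volume
night_light = [10, 9, 11, 10, 12, 11, 10, 12, 15, 30, 90, 40]     # single peak in Nov

similar = dtw_distance(zscore(bag_a), zscore(bag_b))
dissimilar = dtw_distance(zscore(bag_a), zscore(night_light))
print(similar < dissimilar)  # the two bags are closer to each other than to the night light
```

Computing this distance for every pair of products and sorting the pairs by it would give a ranking from most similar to most dissimilar, comparable to the left-to-right ordering in the chart.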

 

Auto-correlation

 

In this section, we will talk about auto-correlation. Let us start with a conceptual example. Say we have a stock whose price chart looks something like this.

Stock graph (image by author).

As you can see, the price on day 9 has increased, while the price on day 8 had decreased. This means that there is a negative correlation between day 9 and the previous day (a lag of 1). You can also see that the price on day 9 has increased, and the price on day 7 had also increased. This means that there is a positive correlation between day 9 and the day two days earlier (a lag of 2).

Now let us move to financial trading. In this example, we will take stocks from the CAC 40, the benchmark index of the French stock market. Here is the result of auto-correlation applied to some of these stocks.

Auto-correlation Multi-stock symbol (image by author).

Now let us take dots situated at the very top, which are in the color dark blue. They correspond to the stock ABIO.

The correlation value between any period and the previous period is 0.99. A correlation value close to 1 means a strong positive correlation: if the price on a given day has increased, the price on the previous days had generally increased as well. This shows that the price for ABIO has generally been increasing. In general, the higher a stock sits in the visualization, the more strongly it is up-trending.

The farther toward the bottom you go in this visualization, the less correlation there is between the different periods of the stock. Take, for example, AMUN.PA, which is situated at the bottom of the visualization. Its correlation values are around 0.30. This means there is not much correlation of stock price between periods, which indicates that there is no clear trend in this stock.

Let us now see the price trend for ABIO.PA and AMUN.PA. Here is the price trend for ABIO.PA. As you can see, this is an up-trending stock.

Stock chart for ABIO.PA (image by author).

Here is the price of AMUN.PA. As you can see, this stock has gone up, then down, and then up again. There is no clear trend.

Stock chart for AMUN.PA (image by author).
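
The lag-based correlation described in this section can be sketched in a few lines of Python. This is an illustrative implementation under my own assumptions, not the code behind the chart: the function name `lagged_correlation` and both price series are hypothetical.

```python
from statistics import mean

def lagged_correlation(prices, lag=1):
    """Pearson correlation between each price and the price `lag` periods earlier."""
    current, previous = prices[lag:], prices[:-lag]
    mc, mp = mean(current), mean(previous)
    cov = sum((c - mc) * (p - mp) for c, p in zip(current, previous))
    var_c = sum((c - mc) ** 2 for c in current)
    var_p = sum((p - mp) ** 2 for p in previous)
    if var_c == 0 or var_p == 0:
        return 0.0
    return cov / (var_c * var_p) ** 0.5

# Hypothetical daily closing prices (assumed numbers, not real quotes).
up_trending = [100, 102, 101, 104, 106, 105, 108, 110, 109, 112, 114, 113]
no_trend = [100, 104, 98, 103, 97, 105, 99, 102, 96, 104, 100, 101]

up_score = lagged_correlation(up_trending, lag=1)
flat_score = lagged_correlation(no_trend, lag=1)
print(round(up_score, 2), round(flat_score, 2))
```

One caveat worth keeping in mind: a steadily falling price would also score close to 1, because auto-correlation measures how persistent the movement is rather than its direction. The direction of the trend still has to be checked on the price chart.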

 

Change Point Detection

 

Change-point detection helps detect changes in trends in time-series data. Here I will show how we can use it to analyze changes in stock market trends.

In this example, we will again take a stock from the CAC 40, the benchmark index of the French stock market. Here is the price for the stock symbol AMUN between February 2020 and March 2021. We can see that this stock had a downtrend, then an uptrend, and again a downtrend. With the change-point detection algorithm, you can plot exactly where each change in trend happened.

Change point detection (image by author).

The algorithm has detected four change points, which are indicated by the vertical lines. With these vertical lines, we can easily identify the time period of each trend.

We can also read off the timing of the change points: one in March, one around June, and one around November. This means that each trend lasts around 3 to 4 months. With the last change point around November 2020, we should expect another trend change in the coming month.
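
There are many change-point algorithms, and the one used for the chart above is not specified. As a rough sketch of the idea, the Python below applies binary segmentation to the day-to-day price changes: it looks for the split that best separates the changes into two groups with different averages, then repeats on each side. The function names, the thresholds `min_size` and `min_gain`, and the price series are all assumptions for illustration.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values)

def best_split(values, min_size=3):
    """Return (index, gain) of the split that most reduces the squared error."""
    total = sse(values)
    best_idx, best_gain = None, 0.0
    for k in range(min_size, len(values) - min_size + 1):
        gain = total - sse(values[:k]) - sse(values[k:])
        if gain > best_gain:
            best_idx, best_gain = k, gain
    return best_idx, best_gain

def detect_trend_changes(prices, min_size=3, min_gain=1.0):
    """Binary segmentation on daily changes; returns indices into `prices`."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    found = []

    def recurse(lo, hi):
        idx, gain = best_split(changes[lo:hi], min_size)
        if idx is None or gain < min_gain:
            return  # segment is homogeneous enough, stop splitting
        found.append(lo + idx)
        recurse(lo, lo + idx)
        recurse(lo + idx, hi)

    recurse(0, len(changes))
    return sorted(found)

# Hypothetical price series: downtrend, then uptrend, then downtrend again.
down_1 = [100 - 2 * i for i in range(10)]            # 100 ... 82
up = [down_1[-1] + 3 * j for j in range(1, 11)]      # 85 ... 112
down_2 = [up[-1] - 2 * j for j in range(1, 11)]      # 110 ... 92
prices = down_1 + up + down_2

change_points = detect_trend_changes(prices)
print(change_points)  # [9, 19]: the bottom at 82 and the top at 112
```

Real price data is much noisier than this toy series, so `min_gain` would need tuning, and a dedicated library such as ruptures is likely a better choice in practice.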

 

Fast Fourier Transformation

 

Fast Fourier Transformation (FFT) is one of the most widely used algorithms in the world. Let us explore it through a use case of workload planning. We will take an example of 911 phone call data from Montgomery County in the USA. The data has a date, an hour, and the number of calls received. This is time-series data, as the number of calls depends upon the time.

Here is sample data:

911 sample data (image by author).

If you plot a line curve, it will look like this.

Time chart for 911 data (image by author).

On the X-axis, we have the time, expressed as the hour of each day. On the Y-axis, we have the number of calls received. For any workload planning, we need to know when the peak workload is expected so that we can plan sufficient resources.

Just looking at the graph above, it is difficult to make sense of it, as it is not very smooth and there are a lot of variations.

Now let us talk about applying the Fast Fourier magic. The Fast Fourier transformation helps discover a repeating and smooth signal in the time series data. Here is the result of FFT, which shows the repeating signals which the algorithm has found.

FFT signal (image by author).

The repeating signal starts at 6 to 7 in the morning, goes up, reaches max around 18 to 19h, and then goes down and then again restarts at 6 to 7 the next morning.

This signal is much smoother than the original data, and we can use it for workload planning. As per this signal, we should have a minimum number of people between 5h and 7h. We then ramp up our capacity to reach a maximum between 17h and 20h. After that, we can reduce the number of people taking calls, and the cycle restarts the next morning between 5h and 7h.

So, as you can see, FFT is very useful for converting noisy data into smooth signals that are easy to interpret and act on.
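
The smoothing step can be sketched as a simple frequency filter: transform the hourly counts, keep only the strongest repeating component, and transform back. FFT is a fast way of computing the discrete Fourier transform (DFT). To stay free of dependencies, the sketch below computes the DFT directly, which gives the same coefficients, only more slowly. The call counts are synthetic and the function names are hypothetical.

```python
import cmath
import math
import random

def dft(signal):
    """Discrete Fourier transform written out directly (O(n^2), not a fast FFT)."""
    n = len(signal)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
            for k in range(n)]

def inverse_dft(coeffs):
    """Inverse transform; returns the real part of each reconstructed sample."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(coeffs)).real / n
            for t in range(n)]

def smooth_signal(signal, keep=1):
    """Keep the average level plus the `keep` strongest repeating components."""
    n = len(signal)
    coeffs = dft(signal)
    strongest = sorted(range(1, n // 2 + 1),
                       key=lambda k: abs(coeffs[k]), reverse=True)[:keep]
    filtered = [0j] * n
    filtered[0] = coeffs[0]                  # average number of calls
    for k in strongest:
        filtered[k] = coeffs[k]
        filtered[n - k] = coeffs[n - k]      # mirrored term keeps the result real
    return inverse_dft(filtered)

# Synthetic hourly call counts for one week: a daily cycle peaking at 18h plus noise.
rng = random.Random(0)
calls = [20 + 10 * math.cos(2 * math.pi * ((h % 24) - 18) / 24) + rng.uniform(-4, 4)
         for h in range(24 * 7)]

smoothed = smooth_signal(calls, keep=1)
first_day = smoothed[:24]
peak_hour = max(range(24), key=first_day.__getitem__)
quiet_hour = min(range(24), key=first_day.__getitem__)
print(peak_hour, quiet_hour)
```

With NumPy available, `numpy.fft.rfft` and `numpy.fft.irfft` would do the same job far faster on real data. Raising `keep` retains more detail, such as a weekly pattern, at the cost of a less smooth curve.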

 

Original. Reposted with permission.
