Kurtosis — Definition and Attention to Low Kurtosis (Platykurtic)

Duy Lam Nguyen
Feb 17, 2021

What is Kurtosis?

It is easier to understand by looking at this picture:

or

Both skewness and kurtosis describe the shape of a data distribution. Skewness tells us whether the distribution is symmetric (as a normal distribution is), right-skewed or left-skewed, for example because of outliers on one side. Kurtosis, on the other hand, is all about the tails of the distribution.

A platykurtic distribution has light tails, which means the data lacks outliers, while a leptokurtic distribution has heavy tails because of large outliers.
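To make the three cases concrete, here is a minimal sketch that computes excess kurtosis (the fourth standardized moment minus 3, so a normal distribution scores about 0) on simulated samples. The helper name `excess_kurtosis`, the seed and the choice of uniform, normal and Laplace samples are assumptions made for illustration; they are not taken from the pictures above.

```python
import random

def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (about 0 for normal data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 50_000
samples = {
    "uniform (platykurtic)": [rng.uniform(-1, 1) for _ in range(n)],
    "normal (mesokurtic)": [rng.gauss(0, 1) for _ in range(n)],
    # The difference of two exponential draws follows a Laplace distribution.
    "Laplace (leptokurtic)": [rng.expovariate(1) - rng.expovariate(1) for _ in range(n)],
}

kurt = {name: excess_kurtosis(vals) for name, vals in samples.items()}
for name, k in kurt.items():
    print(f"{name}: {k:+.2f}")
```

In theory the excess kurtosis is -1.2 for a uniform distribution, 0 for a normal one and 3 for a Laplace one, so the printed values should land close to those numbers. If SciPy is available, `scipy.stats.kurtosis` with its default settings should report the same quantity.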

Why is kurtosis important?

As mentioned above, a leptokurtic distribution indicates large outliers in the data set. We know that outliers affect a model's performance (read this article for more details), especially for regression-based models. In this case, we should investigate these outliers before taking further action.

But is platykurtic a good guy?

Look at another picture:

A regression model can predict quite well if the data is mesokurtic (as a normal distribution is), and worst if it is leptokurtic, no doubt. But do not be reassured by a platykurtic distribution just because it lacks outliers. The model could miss the distribution's tails.

This picture also reminds us to review both skewness and kurtosis. Our distribution might look normal but is not necessarily mesokurtic.
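As a quick illustration of why both numbers are worth checking, the sketch below summarizes a simulated sample that is symmetric with a single central peak, yet is not mesokurtic. The helper `shape_summary`, the seed and the triangular sample are assumptions chosen for the example, not data from this article.

```python
import random

def shape_summary(values):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

rng = random.Random(7)  # fixed seed so the run is repeatable
# Symmetric triangular sample: one peak in the middle, no long tails.
data = [rng.triangular(-1, 1, 0) for _ in range(50_000)]

skew, kurt = shape_summary(data)
print(f"skewness = {skew:+.2f}, excess kurtosis = {kurt:+.2f}")
```

The skewness comes out close to 0, which on its own suggests a well-behaved symmetric shape, while the excess kurtosis is clearly negative (the theoretical value for a triangular distribution is -0.6), so the sample is platykurtic.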

How to handle high / low kurtosis?

High kurtosis occurs due to large outliers. There are several ways to handle outliers. You can read this article.

What about low kurtosis?

One thing is for sure: forget the flowchart above. It simply does not apply to this case because you do not have outliers. What we can do is take a good look at the records that are responsible for the low kurtosis and ask: why?

Our job is to understand the data and get insights from it, not to bend it to our will.

For example, your data includes many types of fruit, and you are trying to predict their prices (the y variable) from their sizes (the x variable). The bigger the fruit, the higher its price can be, and your regression model has an R-squared of 68%. After a thorough investigation, you find that for some types of fruit, size is not a factor in the price. Should you remove these fruits from the current data set, separate them, or do something else?

In conclusion, a normal distribution is good, and we try to mold a bell from the data. But understanding the data is the most important thing.
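One way to explore that question is to compare the fit on all the data with the fit on each fruit type. The sketch below does this on made-up data: the fruit names "melon" and "berry", the price formulas, the seed and the helpers `r_squared` and `fit` are all hypothetical, and the resulting numbers are not the 68% from the example above.

```python
import random

def r_squared(x, y):
    """R^2 of a least-squares line y ~ a + b*x (the squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable
rows = []  # (fruit type, size, price)
for _ in range(500):
    size = rng.uniform(5, 30)
    # Hypothetical "melon": price grows with size.
    rows.append(("melon", size, 2.0 + 0.4 * size + rng.gauss(0, 1)))
for _ in range(500):
    size = rng.uniform(5, 30)
    # Hypothetical "berry": price does not depend on size.
    rows.append(("berry", size, 6.0 + rng.gauss(0, 1)))

def fit(kind=None):
    """R^2 of price on size for one fruit type, or for all rows if kind is None."""
    subset = [(s, p) for k, s, p in rows if kind in (None, k)]
    sizes, prices = zip(*subset)
    return r_squared(sizes, prices)

r2_all, r2_melon, r2_berry = fit(), fit("melon"), fit("berry")
print(f"all: {r2_all:.2f}  melon: {r2_melon:.2f}  berry: {r2_berry:.2f}")
```

On data like this, the pooled R-squared sits between a high value for the type where size matters and a value near zero for the type where it does not. That does not settle whether to remove or separate the records, but it shows what the pooled model is hiding.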

References

Kurtosis by Anders Kallner

Skew and Kurtosis: 2 Important Statistics terms you need to know in Data Science by Diva Dugar

Kurtosis by CFI

Low kurtosis in a large sample, how it can be corrected?

Platykurtic by Jason Fernando

3 methods to deal with outliers by Alberto Quesada, Artelnics

Skewness — Definition, Problem and Reducing Methods by Duy Lam Nguyen
