Nishant Nagpal

Build more accurate decision tree models with Tsallis Entropy

Published Jan 22, 2021

We have been using decision trees for regression and classification problems for a good amount of time. In the training process, the growth of the tree depends on the split criterion applied after random selection of samples and features from the training data. Gini Index or Shannon Entropy has been the split criterion across the techniques developed around decision trees, and both are well accepted across time and domains.

It has been suggested that choosing between Gini Index and Shannon Entropy does not make a significant difference. In practice we choose Gini Index over Shannon Entropy just to avoid logarithmic computations.

The most methodical part of a decision tree is splitting the nodes, so the measure we choose for the split is critical. Gini Index has worked out for most solutions, but what's the harm in gaining a few additional points of accuracy?

A close alternative to Gini Index and Shannon Entropy is Tsallis Entropy. Actually, Tsallis is not an alternative but the parent of Gini and Entropy. Let's see how:

Tsallis Entropy

The formula for Tsallis Entropy is as follows, where p(x_i) is the probability of class i and q is the tuning parameter.

$$S_q(X) = \frac{1}{q - 1}\left(1 - \sum_{i=1}^{n} p(x_i)^q\right), \quad q \in \mathbb{R},\ q \neq 1$$

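As a minimal sketch, the standard definition S_q = (1 - sum of p(x_i)^q) / (q - 1) can be written in plain Python. The function name `tsallis_entropy` and the tolerance used to detect q = 1 are illustrative choices, not part of any library.

```python
import math

def tsallis_entropy(probs, q):
    """Tsallis entropy of a discrete distribution for a real parameter q."""
    if abs(q - 1.0) < 1e-12:
        # The formula is undefined at q = 1; use its limit, Shannon entropy
        # (natural logarithm).
        return -sum(p * math.log(p) for p in probs if p > 0)
    # Zero-probability classes are skipped so that q <= 0 does not divide by zero.
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)

# A balanced binary node evaluated at q = 2.
print(tsallis_entropy([0.5, 0.5], 2))
```

A pure node (a single class with probability 1) gives zero for any q, which is the behaviour expected of an impurity measure.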
Now to answer the open question from earlier: how is Tsallis the parent of Gini Index and Shannon Entropy?

Tsallis Entropy is a generalized parametric form of Gini Index and Shannon Entropy. As q approaches 1, it reduces to Shannon Entropy, as shown below:

$$\lim_{q \to 1} S_q(X) = -\sum_{i=1}^{n} p(x_i) \ln p(x_i)$$

And if the value of q is 2, the expression becomes the Gini Index, as shown below:

$$S_2(X) = 1 - \sum_{i=1}^{n} p(x_i)^2$$

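Both special cases can be checked numerically. The sketch below compares Tsallis Entropy at q = 2 with the Gini Index, and at q slightly above 1 with Shannon Entropy (natural logarithm). The class probabilities are made up for illustration, and the helper names are hypothetical.

```python
import math

def tsallis(probs, q):
    # General form; only valid for q != 1.
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)

def shannon(probs):
    # Shannon entropy with natural logarithm.
    return -sum(p * math.log(p) for p in probs if p > 0)

def gini(probs):
    # Gini index: one minus the sum of squared class probabilities.
    return 1.0 - sum(p * p for p in probs)

probs = [0.7, 0.2, 0.1]  # illustrative class probabilities

print("q = 2     :", tsallis(probs, 2.0), "vs Gini   ", gini(probs))
print("q -> 1    :", tsallis(probs, 1.0 + 1e-6), "vs Shannon", shannon(probs))
```

The q = 2 values agree up to floating-point rounding, while the q near 1 value only approaches Shannon Entropy, since it is a limit rather than an exact substitution.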
Although Gini Index and Shannon Entropy are special cases of Tsallis Entropy, there is a slight catch: the measures differ in their additivity. For two independent systems A and B, Shannon Entropy (the q approaching 1 case) is additive, as presented in the following equation.

$$S(A, B) = S(A) + S(B)$$

Tsallis Entropy for general q, on the other hand, is pseudo-additive. It carries an extra cross term that vanishes only as q approaches 1, so the Gini Index (q = 2) carries it as well:

$$S_q(A, B) = S_q(A) + S_q(B) + (1 - q)\, S_q(A)\, S_q(B)$$

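The pseudo-additivity rule for independent systems, S_q(A, B) = S_q(A) + S_q(B) + (1 - q) S_q(A) S_q(B), can be verified with a quick sketch. The two distributions `a` and `b` are arbitrary illustrative values, and the joint distribution is built under the assumption that A and B are independent.

```python
from itertools import product

def tsallis(probs, q):
    # General form; only valid for q != 1.
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)

a = [0.6, 0.4]        # illustrative distribution of system A
b = [0.5, 0.3, 0.2]   # illustrative distribution of system B

# Independence assumption: joint probabilities are products of the marginals.
joint = [pa * pb for pa, pb in product(a, b)]

for q in (0.5, 2.0, 3.0):
    lhs = tsallis(joint, q)
    rhs = tsallis(a, q) + tsallis(b, q) + (1.0 - q) * tsallis(a, q) * tsallis(b, q)
    print(f"q={q}: joint={lhs:.6f}, pseudo-additive sum={rhs:.6f}")
```

At q = 2 the cross term equals minus the product of the two Gini values, so plain addition would overstate the joint impurity.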
Since q is a real-valued parameter, finding the optimal q for a model relies on multiple iterations. There is no standard way of finding the value of q that gives the maximum accuracy. Generally, accuracy and complexity plots are created across different values of q to find the optimal one. This computationally expensive search is what hinders the adoption of Tsallis Entropy.
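To give a feel for that iteration, here is a rough sketch that sweeps a small grid of q values and, for each one, picks the threshold on a single numeric feature that minimises the weighted Tsallis impurity of the two child nodes. The toy data, the q grid, and the helper names `tsallis_impurity` and `best_threshold` are all hypothetical. A real search would fit a full tree per q and compare validation accuracy and tree complexity instead of looking at one split.

```python
from collections import Counter

def tsallis_impurity(labels, q):
    # Tsallis entropy of the class distribution in a node (q != 1).
    n = len(labels)
    probs = [count / n for count in Counter(labels).values()]
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)

def best_threshold(xs, ys, q):
    """Return (threshold, weighted impurity) of the best binary split on xs."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * tsallis_impurity(left, q)
                 + len(right) * tsallis_impurity(right, q)) / len(pairs)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best

# Toy, made-up data: one feature and two classes.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = ["a", "a", "a", "b", "a", "b", "b", "b"]

# q = 1 is left out of the grid because the general formula needs its limit form there.
for q in (0.5, 1.5, 2.0, 3.0, 5.0):
    threshold, score = best_threshold(xs, ys, q)
    print(f"q={q}: threshold={threshold}, weighted impurity={score:.4f}")
```

Each extra value of q repeats the whole split search, which is where the added computational cost comes from.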

Since we have now developed multiple ways to make such iterative processes fast, we can move toward adopting Tsallis Entropy.

Nishant Nagpal

I am a professional data scientist with 8 years of experience, currently working with a consultancy firm. I have worked on a wide tech stack including R, Python, Node.js, JavaScript, Tableau, etc., and have designed and executed projects end to end.