
Before looking at what the Decision Tree algorithm is, it helps to refresh the basics of Machine Learning, specifically the types of problems it handles.

There are three basic categories of problems that you will encounter in Machine Learning: **Classification, Regression and Clustering**.

**Classification problems**: In classification problems you have to decide between discrete categories, such as “Yes” or “No”, “good” or “bad”, “accepted” or “rejected”, “true” or “false”.

**Regression problems**: Here you predict a numerical value. For example, you know the stock value of a product today and want to predict its stock value 2 weeks from now. In other words, the data is continuous in nature.

**Clustering problems**: Here the data has to be grouped into clusters. One example is the most visited discount shelves in a supermarket. Another is online shops, where people who buy ties also buy cuff links and tie pins. The products are organized in a specific pattern, for example by purchase orders or by the choice preferences of customers.

Some commonly used classification algorithms are:

- Naive Bayes
- Logistic Regression
- Decision Tree
- Random Forest

The first two are employed for **non-complex data sets**, while the other two are used for **complex data sets**.

In this post we shall primarily focus on the Decision Tree algorithm, starting with its definition.

A Decision Tree is a tree-shaped diagram used to help settle on a course of action. A branch in a decision tree represents a decision.

**Types of Problems that Decision Trees can solve**

Decision Trees can solve Classification problems and Regression problems.

The **Classification Decision tree** determines the outcome of an If-Then condition. For example, If you work hard, Then you shall pass your exam. Another example is determining the best race car based on 1 km race timings.

The **Regression Decision tree**: This model is used when the target variable is continuous or numerical in nature.
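As a rough illustration of the difference, the two kinds of tree can be sketched as hand-written If-Then rules. The feature `hours_studied`, the thresholds and the returned scores below are hypothetical values chosen only for this sketch. A real tree would learn them from data.

```python
def classify_exam_result(hours_studied: float) -> str:
    """Classification tree: every path ends in a category."""
    # Hypothetical threshold: 5 hours or more counts as "working hard".
    if hours_studied >= 5:
        return "pass"
    return "fail"


def predict_exam_score(hours_studied: float) -> float:
    """Regression tree: every path ends in a number."""
    # Hypothetical thresholds and scores, for illustration only.
    if hours_studied < 2:
        return 40.0
    if hours_studied < 5:
        return 60.0
    return 85.0
```

Both functions ask the same kind of question at each step. The only difference is whether the answer at the end is a label or a numerical value.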

Below is a simple example of a Decision Tree. You intend to start a business and you have two proposals: one is to sell ladies' handbags, and the other is to sell ladies' shoes. If you were to sell handbags, the amount of money made on this model would be $1000. On the other hand, if you were to sell shoes, the amount of money made on this model would be $900.

Which of these models would you choose? Obviously, selling handbags. Why? Because the returns are higher. But is this the right decision?

The above example just illustrates the basics. What if selling handbags has a 50% chance of success and a 50% chance of failure, and similarly selling shoes has a 50% chance of success and a 50% chance of failure? What would the Decision Tree look like then?

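The decision is based on a formula: multiply each possible payoff by its probability, then add up the results for each option.

Once the chance of failure is taken into account, the choice would be selling shoes.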

The value that this formula produces is called the **Expected Value**.
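In symbols, the expected value is usually written as follows. The names $p_i$ (the probability of outcome $i$) and $v_i$ (the payoff of outcome $i$) are chosen here only for illustration:

$$
\text{Expected Value} = \sum_{i} p_i \, v_i
$$

For an option with only two outcomes, success and failure, this reduces to $p_{\text{success}} \cdot v_{\text{success}} + p_{\text{failure}} \cdot v_{\text{failure}}$.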

The expected value does not mean that you will make a profit of $400 every time in the shoe-selling business. It only means that if you ran the identical shoe-selling business very many times, your average earnings would probably be about $400 per run. Note the word "probably".
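The calculation can be sketched in a few lines of Python. The loss amounts are assumptions, because the text does not state them: a $100 loss for shoes is picked so that the result matches the $400 mentioned above, and a $300 loss for handbags is purely hypothetical. The function names are also made up for this sketch.

```python
import random


def expected_value(outcomes):
    """Add up probability * payoff over all (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)


# ASSUMPTION: the loss amounts (-100 and -300) are not given in the text.
# -100 is chosen so that shoes come out at $400; -300 is hypothetical.
shoes = [(0.5, 900), (0.5, -100)]
handbags = [(0.5, 1000), (0.5, -300)]

shoes_ev = expected_value(shoes)        # 400.0
handbags_ev = expected_value(handbags)  # 350.0
best = "shoes" if shoes_ev > handbags_ev else "handbags"


def simulate_average(outcomes, runs, seed=0):
    """Run the same business many times and return the average payoff."""
    rng = random.Random(seed)
    payoffs = [payoff for _, payoff in outcomes]
    weights = [p for p, _ in outcomes]
    draws = rng.choices(payoffs, weights=weights, k=runs)
    return sum(draws) / runs


# A single run pays either 900 or -100, never 400.
# Only the average over many runs settles close to 400.
average = simulate_average(shoes, runs=100_000)
```

With these assumed figures, handbags have the bigger payoff on success, but shoes have the higher expected value.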

Decision Trees have several advantages:

- They are simple to use.
- They provide a lucid understanding of complex routines.
- The model is visual, so it appeals to both the learner and the implementer.
- They don't require complex data preparation.
- Categorical data and numerical data are both handled with ease.
- Even data that doesn't fit neatly can still be used to make a prediction.

Decision Trees also have some disadvantages:

**Overfitting**

The tree focuses on one particular situation in the training data instead of a generalized solution.

**High Variance**

The Decision Tree model can become unstable due to small changes in the data. The balance is lost, and this in turn affects the decision arrived at.

**Low Bias**

A tree that follows its training data very closely has low bias. This can impair the decision tree because of its inability to work with new incoming data.

A few terms are used often when describing a Decision Tree.

**Entropy**

This is the measure of unpredictability in the data set.

**Information gain**

This is the measure of the decrease in unpredictability after the data set is split.
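Both measures can be sketched in Python, assuming the usual Shannon entropy measured in bits. The text does not say which entropy formula is meant, so treat that choice, the function names and the tiny "yes"/"no" data set as assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits: 0 for a pure set, 1 for a 50/50 two-class set."""
    total = len(labels)
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(labels).values()
    )


def information_gain(parent, subsets):
    """Entropy of the parent minus the size-weighted entropy of its subsets."""
    total = len(parent)
    weighted = sum(len(s) / total * entropy(s) for s in subsets)
    return entropy(parent) - weighted


labels = ["yes", "yes", "no", "no"]

# A 50/50 mix is as unpredictable as two classes can be.
mixed = entropy(labels)  # 1.0

# Splitting into two pure groups removes all of the unpredictability.
gain = information_gain(labels, [["yes", "yes"], ["no", "no"]])  # 1.0
```

When building a tree, the split with the highest information gain is the one that leaves the least unpredictability behind.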

**Leaf Node**

This carries the decision.

**Root Node**

The topmost decision node is known as the root node.
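To make the two terms concrete, here is a minimal sketch of a tree as a Python data structure. The `Node` class, its field names and the `worked_hard` question are hypothetical, and the example reuses the "work hard, pass the exam" rule from earlier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    question: Optional[str] = None  # key looked up in the answers; unused on a leaf
    yes: Optional["Node"] = None    # child followed when the answer is True
    no: Optional["Node"] = None     # child followed when the answer is False
    decision: Optional[str] = None  # set only on a leaf node

    def is_leaf(self) -> bool:
        return self.decision is not None


def decide(node: Node, answers: dict) -> str:
    """Start at the root and follow branches until a leaf is reached."""
    while not node.is_leaf():
        node = node.yes if answers[node.question] else node.no
    return node.decision


# The root node asks the first question; both of its children are leaf nodes.
root = Node(
    question="worked_hard",
    yes=Node(decision="pass"),
    no=Node(decision="fail"),
)
```

Calling `decide(root, {"worked_hard": True})` starts at the root node and stops at the leaf node that carries the decision `"pass"`.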
