
Decision Trees - Machine Learning

Author:   •  April 1, 2018  •  1,954 Words (8 Pages)  •  562 Views

...

Gains for Nodes

        Node              Gain
Node    N      Percent    N      Percent    Response    Index
1       394    61.0%      179    66.1%      45.4%       108.3%
2       252    39.0%      92     33.9%      36.5%       87.0%

Growing Method: CHAID

Dependent Variable: BINARY LOYALTY

---------------------------------------------------------------

Risk

Estimate    Std. Error
.420        .019

Growing Method: CHAID

Dependent Variable: BINARY LOYALTY

---------------------------------------------------------------

Classification

                      Predicted
Observed              LOW LOYALTY    HIGH LOYALTY    Percent Correct
LOW LOYALTY           375            0               100.0%
HIGH LOYALTY          271            0               0.0%
Overall Percentage    100.0%         0.0%            58.0%

Growing Method: CHAID

Dependent Variable: BINARY LOYALTY

---------------------------------------------------------------

TAKEAWAYS

The gain-percentile graph shows that node 1 captures 179/271 = 66.1% of the High Loyalty customers, whereas node 2 captures 92/271 = 33.9%. Because the tree has only two terminal nodes, the cumulative gain reaches 66.1% at about the 61st percentile (node 1 holds 61.0% of all cases) and 100% (66.1% + 33.9%) at the 100th percentile. The area under the gains curve is therefore not considerably greater than 0.5, which is the area under the baseline of a naive model with no predictive power.

Gain is the percentage of total cases in the target category in each node, computed as: (node target n / total target n) x 100. The gains chart is a line chart of cumulative percentile gains, computed as: (cumulative percentile target n / total target n) x 100.
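The gain formula and the area-under-the-curve remark can be checked with a short calculation. The following is a minimal Python sketch, not SPSS output: it uses only the node counts from the Gains for Nodes table, and it assumes the gains curve is drawn as straight segments from the origin through each node's cumulative point, so the area is a trapezoid sum. The variable names are illustrative.

```python
# Counts read from the "Gains for Nodes" table: (node, node n, High Loyalty n).
nodes = [(1, 394, 179), (2, 252, 92)]

total_n = sum(n for _, n, _ in nodes)                 # 646 customers in all
total_target = sum(target for _, _, target in nodes)  # 271 High Loyalty customers

# Gain per node: (node target n / total target n) x 100.
gains = {node: 100 * target / total_target for node, _, target in nodes}

# Cumulative gains curve, best node first (ordered by node response).
ordered = sorted(nodes, key=lambda t: t[2] / t[1], reverse=True)
points = [(0.0, 0.0)]  # (share of all cases, share of target cases)
cum_n = cum_target = 0
for _, n, target in ordered:
    cum_n += n
    cum_target += target
    points.append((cum_n / total_n, cum_target / total_target))

# Assumption: straight-line segments between points, so use the trapezoid rule.
area = sum((x1 - x0) * (y0 + y1) / 2
           for (x0, y0), (x1, y1) in zip(points, points[1:]))

for node, gain in gains.items():
    print(f"node {node}: gain = {gain:.1f}%")
print(f"area under gains curve = {area:.3f} (baseline diagonal = 0.500)")
```

Under these assumptions the area comes out at roughly 0.525, only marginally above the 0.5 of the diagonal baseline.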

The response-percentile graph shows that the response for the target category, High Loyalty, deviates little from the response of the sample as a whole. In a tree with good predictive power, the best terminal nodes have a response well above that of the root node (node 0). In this case the root node gives a response of 42% for High Loyalty, whereas nodes 1 and 2 give 45% and 37% respectively, which is not enough separation for good response prediction.

Response is the percentage of cases in the node in the specified target category. The response chart is a line chart of cumulative percentile response, computed as: (cumulative percentile target n/ cumulative percentile total n) x 100.
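As a quick check of the response figures, the sketch below recomputes them in Python from the node counts in the Gains for Nodes table. It is an illustration of the formula above rather than a reproduction of the SPSS procedure, and the variable names are hypothetical.

```python
# Counts read from the "Gains for Nodes" table: (node, node n, High Loyalty n).
nodes = [(1, 394, 179), (2, 252, 92)]

# Node response: share of the node's cases that fall in the target category.
responses = {node: 100 * target / n for node, n, target in nodes}

# Cumulative response, best node first:
# (cumulative target n / cumulative total n) x 100.
cumulative_response = []
cum_n = cum_target = 0
for node, n, target in sorted(nodes, key=lambda t: t[2] / t[1], reverse=True):
    cum_n += n
    cum_target += target
    cumulative_response.append(100 * cum_target / cum_n)

for node, resp in responses.items():
    print(f"node {node}: response = {resp:.1f}%")
print("cumulative response:", [round(r, 1) for r in cumulative_response])
```

The last cumulative value covers the whole sample, so it equals the root-node response of about 42%.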

The index-percentile graph shows that the ratio of the response of each terminal node to the response of the root node (node 0) is close to 1: 108% for node 1 and 87% for node 2. In a model with good predictive power, the index of the best nodes should be as far above 100% as possible, which shows that the tree concentrates the class of interest for a given set of predictor values.

Index is the ratio of the node response percentage for the target category compared to the overall target category response percentage for the entire sample. The index chart is a line chart of cumulative percentile index values. Cumulative percentile index is computed as: (cumulative percentile response percent / total response percent) x 100.
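The index values in the table follow directly from this definition. A minimal Python sketch, using only the counts from the Gains for Nodes table and illustrative variable names:

```python
# Counts read from the "Gains for Nodes" table: (node, node n, High Loyalty n).
nodes = [(1, 394, 179), (2, 252, 92)]

total_n = sum(n for _, n, _ in nodes)
total_target = sum(target for _, _, target in nodes)
overall_response = total_target / total_n  # root-node response, about 0.42

# Index: node response divided by overall response, as a percentage.
index = {node: 100 * (target / n) / overall_response
         for node, n, target in nodes}

# Sanity check: weighted by node size, the index values average to 100.
weighted_index = sum(index[node] * n for node, n, _ in nodes) / total_n

for node, value in index.items():
    print(f"node {node}: index = {value:.1f}%")
```

Because the node-size-weighted average of the index is always 100%, a two-node tree whose larger node sits at 108% cannot have a smaller node much below the 87% seen here.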

Both terminal nodes, 1 and 2, are predicted as Low Loyalty, because the misclassification costs are the same for the two response categories and more than 50% of the customers in each node have Low Loyalty. As a result, the model correctly classifies 0% of the High Loyalty customers, with a low overall correct prediction percentage of 58% and a high risk (misclassification percentage) of 42%.
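The classification table and risk estimate can be reconstructed from the node counts. The Python sketch below assigns each node its majority class, which is what equal misclassification costs imply, and tallies the result. The Low Loyalty counts are derived by subtraction from the Gains for Nodes table. The standard error uses the binomial formula sqrt(p(1 - p)/n), which is an assumption here: it reproduces the reported .019, but the source does not state how SPSS computes it.

```python
import math

# Per-node class counts. LOW is derived from the gains table:
# node 1: 394 - 179 = 215, node 2: 252 - 92 = 160.
nodes = {
    1: {"LOW": 215, "HIGH": 179},
    2: {"LOW": 160, "HIGH": 92},
}

# confusion[(observed, predicted)] = number of customers
confusion = {(o, p): 0 for o in ("LOW", "HIGH") for p in ("LOW", "HIGH")}
for counts in nodes.values():
    # Equal misclassification costs: predict the node's majority class.
    predicted = max(counts, key=counts.get)
    for observed, n in counts.items():
        confusion[(observed, predicted)] += n

total = sum(confusion.values())
correct = confusion[("LOW", "LOW")] + confusion[("HIGH", "HIGH")]
risk = 1 - correct / total  # misclassification rate

# Assumption: binomial standard error of a proportion.
std_error = math.sqrt(risk * (1 - risk) / total)

print(f"percent correct = {100 * correct / total:.1f}%")
print(f"risk = {risk:.3f}, std. error = {std_error:.3f}")
```

Since both nodes have a Low Loyalty majority, no customer is ever predicted as High Loyalty, and the risk equals the High Loyalty share of the sample.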

Hence, this is not a good model.

---------------------------------------------------------------

Marketing Model for Brand Advocacy

Customer Loyalty Assessment based on Demographics & Behavioral Responses

Model Summary

Specifications

...
