
space, the corresponding shapes are rectangular solids, and in any multidimensional space, there are hyper-rectangles.

The problem is that some things don't fit neatly into rectangular boxes. Figure 6.12 illustrates the difficulty: the two regions are really divided by a diagonal line, and it takes a deep tree to generate enough rectangles to approximate it adequately.

In this case, the true solution can be found easily by allowing linear combinations of the attributes to be considered. Some software packages attempt to tilt the hyperplanes by basing their splits on a weighted sum of the values of the fields. There are a variety of hill-climbing approaches for selecting the weights.
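To make the idea concrete, here is a minimal sketch in Python of a hill-climbing search for such an oblique split. The choice of Gini impurity as the score, the fixed step size, and the habit of recentering the threshold at the median of the weighted sum are illustrative assumptions, not any particular package's algorithm.

import numpy as np

def gini(labels):
    # Gini impurity of a vector of class labels
    if len(labels) == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - float(np.sum(p ** 2))

def split_impurity(X, y, w, threshold):
    # weighted average impurity of the two sides of the rule w . x < threshold
    left = X @ w < threshold
    n_left = int(left.sum())
    return (n_left * gini(y[left]) + (len(y) - n_left) * gini(y[~left])) / len(y)

def hill_climb_split(X, y, steps=500, step_size=0.1, seed=0):
    # perturb one weight at a time, keeping any change that lowers impurity
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    t = float(np.median(X @ w))
    best = split_impurity(X, y, w, t)
    for _ in range(steps):
        w_new = w.copy()
        w_new[rng.integers(X.shape[1])] += rng.choice([-step_size, step_size])
        t_new = float(np.median(X @ w_new))    # recenter the threshold
        score = split_impurity(X, y, w_new, t_new)
        if score < best:
            w, t, best = w_new, t_new, score
    return w, t, best

# two classes separated by a diagonal line, as in Figure 6.12
X = np.random.default_rng(1).uniform(size=(500, 2))
y = (X[:, 0] > X[:, 1]).astype(int)
w, t, impurity = hill_climb_split(X, y)

A single tilted split found this way can do the work of many axis-parallel rectangles.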

Of course, it is easy to come up with regions that are not captured easily even when diagonal lines are allowed. Regions may have curved boundaries and fields may have to be combined in more complex ways (such as multiplying length by width to get area). There is no substitute for the careful selection of fields to be inputs to the tree-building process and, where necessary, the creation of derived fields that capture relationships known or suspected by domain experts. These derived fields may be functions of several other fields. Such derived fields inserted manually serve the same purpose as automatically combining fields to tilt the hyperplane.
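In practice, derived fields are usually nothing more than new columns computed before the tree is built. A tiny sketch with pandas, using invented field names:

import pandas as pd

df = pd.DataFrame({"length": [2.0, 3.5, 4.0], "width": [1.0, 2.0, 0.5]})
df["area"] = df["length"] * df["width"]      # known physical relationship
df["aspect"] = df["length"] / df["width"]    # ratio a domain expert suspects matters
# the derived columns join the originals as inputs to the tree-building process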


Figure 6.12 The upper-left and lower-right quadrants are easily classified, while the other two quadrants must be carved up into many small boxes to approximate the boundary between the regions.



Neural Trees

One way of combining input from many fields at every node is to have each node consist of a small neural network. For domains where rectangular regions do a poor job describing the true shapes of the classes, neural trees can produce more accurate classifications, while being quicker to train and to score than pure neural networks.

From the point of view of the user, this hybrid technique has more in common with neural-network variants than it does with decision-tree variants because, in common with other neural-network techniques, it is not capable of explaining its decisions. The tree still produces rules, but these are of the form F(w₁x₁, w₂x₂, w₃x₃, . . .) < N, where F is the combining function used by the neural network. Such rules make more sense to neural network software than to people.
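The sketch below gives one plausible reading of such a node in Python, assuming the simplest possible combining function F, a plain weighted sum (a single perceptron); real neural trees may put a larger network at each node.

import numpy as np

class NeuralTreeNode:
    # a leaf carries only a label; an internal node carries weights and children
    def __init__(self, weights=None, bias=0.0, left=None, right=None, label=None):
        self.w = None if weights is None else np.asarray(weights, dtype=float)
        self.b, self.left, self.right, self.label = bias, left, right, label

    def classify(self, x):
        if self.label is not None:             # leaf: return its class
            return self.label
        # the node's rule has the form F(w1x1, w2x2, ...) < N,
        # with F a weighted sum and N folded into the bias
        if float(self.w @ x) + self.b < 0.0:
            return self.left.classify(x)
        return self.right.classify(x)

# a one-split tree that captures a diagonal boundary like that of Figure 6.12
root = NeuralTreeNode(weights=[1.0, -1.0],
                      left=NeuralTreeNode(label="circle"),
                      right=NeuralTreeNode(label="triangle"))
print(root.classify(np.array([0.2, 0.9])))     # 0.2 - 0.9 < 0, so "circle"

The rule at the root is perfectly precise, but "x1 - x2 < 0" is already harder to explain to a business user than "age < 27," and a genuine network function F is harder still.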

Piecewise Regression Using Trees

Another example of combining trees with other modeling methods is a form of piecewise linear regression in which each split in a decision tree is chosen so as to minimize the error of a simple regression model on the data at that node. The same method can be applied to logistic regression for categorical target variables.
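A rough sketch of how such a split might be chosen, scanning every field and threshold and scoring each candidate by the combined error of separate linear models on the two sides; the exhaustive scan and the minimum leaf size are assumptions made for the illustration.

import numpy as np

def regression_sse(X, y):
    # sum of squared errors of an ordinary least-squares fit with intercept
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def best_regression_split(X, y, min_leaf=10):
    best_field, best_threshold, best_sse = None, None, np.inf
    for j in range(X.shape[1]):                # each candidate field
        for t in np.unique(X[:, j])[1:]:       # each candidate threshold
            left = X[:, j] < t
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            sse = (regression_sse(X[left], y[left]) +
                   regression_sse(X[~left], y[~left]))
            if sse < best_sse:
                best_field, best_threshold, best_sse = j, t, sse
    return best_field, best_threshold, best_sse

Applied recursively, this yields a tree whose leaves each hold a small regression model, a piecewise linear approximation of the target.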

Alternate Representations for Decision Trees

The traditional tree diagram is a very effective way of representing the actual structure of a decision tree. Other representations are sometimes more useful when the focus is more on the relative sizes and concentrations of the nodes.

Box Diagrams

While the tree diagram and Twenty Questions analogy are helpful in visualizing certain properties of decision-tree methods, in some cases, a box diagram is more revealing. Figure 6.13 shows the box diagram representation of a decision tree that tries to classify people as male or female based on their ages and the movies they have seen recently. The diagram may be viewed as a sort of nested collection of two-dimensional scatter plots.

At the root node of the decision tree, the first split is a three-way split based on which of three groups the survey respondent's most recently seen movie falls into. In the outermost box of the diagram, the horizontal axis represents that field. The outermost box is divided into sections, one for each node at the next level of the tree. The size of each section is proportional to the number of records that fall into it. Next, the vertical axis of each box is used to represent the field that is used as the next splitter for that node. In general, this will be a different field for each box.



[Figure 6.13: box diagram. The outermost box is divided into sections for Last Movie in Group 1, 2, and 3; the sections are subdivided by age splits at 27 and 41.]

Figure 6.13 A box diagram represents a decision tree. Shading is proportional to the purity of the box; size is proportional to the number of records that land there.

There is now a new set of boxes, each of which represents a node at the third level of the tree. This process continues, dividing boxes until the leaves of the tree each have their own box. Since decision trees often have nonuniform depth, some boxes may be subdivided more often than others. Box diagrams make it easy to represent classification rules that depend on any number of variables on a two-dimensional chart.
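The layout rule just described is mechanical enough to sketch in a few lines of Python: each node's section is sized in proportion to its record count, and the splitting axis alternates with depth. The dictionary encoding of the tree and the record counts are invented for the example.

def layout(node, x0, y0, x1, y1, depth=0, boxes=None):
    # assign a rectangle (x0, y0, x1, y1) to every node of the tree
    if boxes is None:
        boxes = []
    boxes.append((node["name"], (x0, y0, x1, y1)))
    children = node.get("children", [])
    if not children:
        return boxes
    total = sum(c["count"] for c in children)
    pos = x0 if depth % 2 == 0 else y0
    for c in children:
        frac = c["count"] / total              # section size ~ record count
        if depth % 2 == 0:                     # divide along the horizontal axis
            step = (x1 - x0) * frac
            layout(c, pos, y0, pos + step, y1, depth + 1, boxes)
        else:                                  # divide along the vertical axis
            step = (y1 - y0) * frac
            layout(c, x0, pos, x1, pos + step, depth + 1, boxes)
        pos += step
    return boxes

# the root split of Figure 6.13 (movie group), with one section split again by age
tree = {"name": "root", "count": 100, "children": [
    {"name": "group 1", "count": 30},
    {"name": "group 2", "count": 20},
    {"name": "group 3", "count": 50, "children": [
        {"name": "age < 27", "count": 20},
        {"name": "27 to 41", "count": 15},
        {"name": "age > 41", "count": 15}]}]}
for name, rect in layout(tree, 0.0, 0.0, 1.0, 1.0):
    print(name, rect)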

The resulting diagram is very expressive. As we toss records onto the grid, they fall into a particular box and are classified accordingly. A box diagram allows us to look at the data at several levels of detail. Figure 6.13 shows at a glance that the bottom left contains a high concentration of males.

Taking a closer look, we find some boxes that seem to do a particularly good job at classification or collect a large number of records. Viewed this way, it is natural to think of decision trees as a way of drawing boxes around groups of similar points. All of the points within a particular box are classified the same way because they all meet the rule defining that box. This is in contrast to classical statistical classification methods such as linear, logistic, and quadratic discriminants that attempt to partition data into classes by drawing a line or elliptical curve through the data space. This is a fundamental distinction: Statistical approaches that use a single line to find the boundary between classes are weak when there are several very different ways for a record to become


