

Neural Networks for Directed Data Mining

The previous example illustrates the most common use of neural networks: building a model for classification or prediction. The steps in this process are:

1. Identify the input and output features.

2. Transform the inputs and outputs so they fall in a small range (-1 to 1).

3. Set up a network with an appropriate topology.

4. Train the network on a representative set of training examples.

5. Use the validation set to choose the set of weights that minimizes the error.

6. Evaluate the network using the test set to see how well it performs.

7. Apply the model generated by the network to predict outcomes for unknown inputs.
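Steps 2 and 5 in the list above lend themselves to a short illustration. The following is a minimal sketch rather than the method of any particular data mining package: it assumes simple min-max scaling to the range -1 to 1, and the function names (fit_scaler, scale, unscale, pick_best_weights) are hypothetical.

```python
def fit_scaler(values):
    """Step 2: record the minimum and maximum of one training column."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("column is constant; it cannot be scaled")
    return lo, hi


def scale(value, lo, hi):
    """Map a value from the range [lo, hi] to the range [-1, 1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def unscale(scaled, lo, hi):
    """Map a network output in [-1, 1] back to the original units."""
    return (scaled + 1.0) / 2.0 * (hi - lo) + lo


def pick_best_weights(snapshots, validation_error):
    """Step 5: keep the set of weights with the lowest validation error.

    `snapshots` is any collection of candidate weight sets saved during
    training; `validation_error` is a function that scores one of them
    on the validation set (both are assumptions of this sketch).
    """
    return min(snapshots, key=validation_error)
```

With made-up living areas of 1,000, 1,500, and 3,000 square feet, fit_scaler returns (1000, 3000) and scale(1500, 1000, 3000) gives -0.5. The same minimum and maximum from the training set should be reused when scaling the validation set, the test set, and any unknown inputs.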

Fortunately, data mining software now performs most of these steps automatically. Although an intimate knowledge of the internal workings is not necessary, there are some keys to using networks successfully. As with all predictive modeling tools, the most important issue is choosing the right training set. The second is representing the data in such a way as to maximize the ability of the network to recognize patterns in it. The third is interpreting the results from the network. Finally, understanding some specific details about how neural networks work, such as network topology and the parameters controlling training, can help build better-performing networks.

One of the dangers with any model used for prediction or classification is that the model becomes stale as it gets older, and neural network models are no exception to this rule. For the appraisal example, the neural network has learned historical patterns from the training set that allow it to predict the appraised value from descriptions of houses. There is no guarantee that current market conditions match those of last week, last month, or 6 months ago, when the training set might have been made. New homes are bought and sold every day, creating and responding to market forces that are not present in the training set. A rise or drop in interest rates, or an increase in inflation, may rapidly change appraisal values. The problem of keeping a neural network model up to date is made more difficult by two factors. First, the model does not readily express itself in the form of rules, so it may not be obvious when it has grown stale. Second, when neural networks degrade, they tend to degrade gracefully, making the reduction in performance less obvious. In short, the model gradually expires and it is not always clear exactly when to update it.



The solution is to incorporate more recent data into the neural network. One way is to take the same neural network back to training mode and start feeding it new values. This is a good approach if the network only needs small adjustments, such as when it is already close to being accurate but more recent examples might improve its accuracy further. Another approach is to start over again by adding new examples into the training set (perhaps removing older examples) and training an entirely new network, perhaps even with a different topology (there is further discussion of network topologies later). This is appropriate when market conditions may have changed drastically and the patterns found in the original training set are no longer applicable.
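The second approach, rebuilding the training set, can be sketched as follows. This is only an illustration under stated assumptions: each example is assumed to be a dictionary with a "sale_date" key, and the function name refresh_training_set and the one-year default window are hypothetical choices, not recommendations from the text.

```python
from datetime import date, timedelta


def refresh_training_set(old_examples, new_examples, today, max_age_days=365):
    """Add recent examples and drop those older than the chosen window.

    Each example is assumed to be a dict with a "sale_date" entry holding
    a datetime.date; the window length is an arbitrary illustrative value.
    """
    cutoff = today - timedelta(days=max_age_days)
    merged = list(old_examples) + list(new_examples)
    return [ex for ex in merged if ex["sale_date"] >= cutoff]
```

A new network would then be trained on the returned list. How far back the window should reach depends on how quickly the market being modeled changes.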

The virtuous cycle of data mining described in Chapter 2 puts a premium on measuring the results from data mining activities. These measurements help in understanding how susceptible a given model is to aging and when a neural network model should be retrained.
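One simple way to measure aging is to compare the error on recently observed outcomes with the error the model showed on its test set when it was built. The sketch below assumes mean absolute error as the measure and a tolerance of 25 percent; both are illustrative choices, and the function names are hypothetical.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predictions and outcomes."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def needs_retraining(baseline_error, recent_predicted, recent_actual,
                     tolerance=1.25):
    """Flag the model when recent error exceeds the baseline by the tolerance.

    `baseline_error` is the error measured on the test set at build time;
    `tolerance` is an assumed threshold (1.25 means 25 percent worse).
    """
    recent_error = mean_absolute_error(recent_predicted, recent_actual)
    return recent_error > tolerance * baseline_error
```

Because neural networks tend to degrade gracefully, a check like this is best run on a regular schedule so that a slow upward drift in error does not go unnoticed.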

A neural network is only as good as the training set used to generate it. The model is static and must be explicitly updated by adding more recent examples into the training set and retraining the network (or training a new network) in order to keep it up to date and useful.

What Is a Neural Net?

Neural networks consist of basic units that mimic, in a simplified fashion, the behavior of biological neurons found in nature, whether comprising the brain of a human or of a frog. It has been claimed, for example, that there is a unit within the visual system of a frog that fires in response to fly-like movements, and that there is another unit that fires in response to things about the size of a fly. These two units are connected to a neuron that fires when the combined value of these two inputs is high. This neuron is an input into yet another which triggers tongue-flicking behavior.

The basic idea is that each neural unit, whether in a frog or a computer, has many inputs that the unit combines into a single output value. In brains, these units may be connected to specialized nerves. Computers, though, are a bit simpler; the units are simply connected together, as shown in Figure 7.3, so the outputs from some units are used as inputs into others. All the examples in Figure 7.3 are examples of feed-forward neural networks, meaning there is a one-way flow through the network from the inputs to the outputs and there are no cycles in the network.
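The idea of units feeding other units can be made concrete with a small sketch. It assumes that each unit combines its inputs as a weighted sum plus a bias and then squashes the result with the hyperbolic tangent; this passage does not specify how the combination is done, so that choice, along with the names unit_output and feed_forward, is an assumption made for illustration.

```python
import math


def unit_output(inputs, weights, bias):
    """Combine many inputs into a single output value.

    Assumed combination: weighted sum plus bias, squashed by tanh so the
    result stays between -1 and 1.
    """
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return math.tanh(total)


def feed_forward(inputs, layers):
    """Pass values one way through the network, layer by layer.

    `layers` is a list of layers; each layer is a list of (weights, bias)
    pairs, one pair per unit. The outputs of one layer become the inputs
    of the next, and nothing ever flows backward, so there are no cycles.
    """
    values = list(inputs)
    for layer in layers:
        values = [unit_output(values, weights, bias) for weights, bias in layer]
    return values
```

For example, a network with four inputs, a hidden layer of two units, and one output unit is described by two layers: the first holds two (weights, bias) pairs with four weights each, and the second holds one pair with two weights.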



This simple neural network takes four inputs and produces an output. The result of training this network is equivalent to the statistical technique called logistic regression.

This network has a middle layer called the hidden layer, which makes the network more powerful by enabling it to recognize more patterns.

Increasing the size of the hidden layer makes the network more powerful but introduces the risk of overfitting. Usually, only one hidden layer is needed.

A neural network can produce multiple output values.

Figure 7.3 Feed-forward neural networks take inputs on one end and transform them into outputs.
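The connection between the simplest network in Figure 7.3 and logistic regression can be sketched in a few lines. The sketch assumes the single unit uses the logistic function and is trained by gradient descent on the cross-entropy error, which is one common way of fitting a logistic regression; the function names, learning rate, and number of passes are illustrative assumptions.

```python
import math


def logistic(z):
    """The logistic function, which maps any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs):
    """Output of a single unit with no hidden layer.

    This is the same formula a logistic regression model uses.
    """
    return logistic(bias + sum(w * x for w, x in zip(weights, inputs)))


def train_single_unit(examples, n_inputs, rate=0.5, epochs=500):
    """Fit the unit by gradient descent on the cross-entropy error.

    `examples` is a list of (inputs, target) pairs with targets 0 or 1.
    """
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            # Gradient of the cross-entropy error with respect to the
            # weighted sum is simply (prediction - target).
            error = predict(weights, bias, inputs) - target
            weights = [w - rate * error * x for w, x in zip(weights, inputs)]
            bias -= rate * error
    return weights, bias
```

The trained weights play the role of the regression coefficients. Adding a hidden layer is what lets a network go beyond the patterns a logistic regression can represent.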


