

Notice that the time-series network is not limited to data from just a single time series. It can take multiple inputs. For instance, to predict the value of the Swiss franc to U.S. dollar exchange rate, other time-series information might be included, such as the volume of the previous day's transactions, the U.S. dollar to Japanese yen exchange rate, the closing value of the stock exchange, and the day of the week. In addition, non-time-series data, such as the reported inflation rate in the countries involved over the period under investigation, might also be candidate features.

The number of historical units controls the length of the patterns that the network can recognize. For instance, keeping 10 historical units on a network predicting the closing price of a favorite stock allows the network to recognize patterns that occur within 2-week periods (since prices are quoted only on weekdays, 10 trading days span two calendar weeks). Relying on such a network to predict the value 3 months in the future is not recommended.

Actually, by modifying the input, a feed-forward network can be made to work like a time-delay neural network. Consider the time series with 10 days of history, shown in Table 7.5. The network will include two features: the day of the week and the closing price.

Creating a time series with a time lag of three requires adding new features for the historical, lagged values. (Day-of-the-week does not need to be copied, since it does not really change.) The result is Table 7.6. This data can now be input into a feed-forward, back propagation network without any special support for time series; a short sketch of this transformation follows Table 7.6.

Table 7.5 Time Series

DATA ELEMENT   DAY-OF-WEEK   CLOSING PRICE
1                            $40.25
2                            $41.00
3                            $39.25
4                            $39.75
5                            $40.50
6                            $40.50
7                            $40.75
8                            $41.25
9                            $42.00
10                           $41.50

Table 7.6 Time Series with Time Lag

DATA ELEMENT   DAY-OF-WEEK   CLOSING PRICE   PREVIOUS CLOSING PRICE   PREVIOUS-1 CLOSING PRICE
1                            $40.25
2                            $41.00          $40.25
3                            $39.25          $41.00                   $40.25
4                            $39.75          $39.25                   $41.00
5                            $40.50          $39.75                   $39.25
6                            $40.50          $40.50                   $39.75
7                            $40.75          $40.50                   $40.50
8                            $41.25          $40.75                   $40.50
9                            $42.00          $41.25                   $40.75
10                           $41.50          $42.00                   $41.25
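To make the transformation concrete, here is a minimal Python sketch (not from the book) of how the raw series in Table 7.5 can be turned into the lagged rows of Table 7.6. The function name lag_features and the use of None for the missing early values are illustrative choices, not part of the original example.

# Build lagged rows, as in Table 7.6, from the raw closing prices of Table 7.5.
closing_prices = [40.25, 41.00, 39.25, 39.75, 40.50,
                  40.50, 40.75, 41.25, 42.00, 41.50]

def lag_features(series, lag=3):
    """One row per data element: the current value plus lag-1 earlier values."""
    rows = []
    for i in range(len(series)):
        # series[i] is the current closing price; series[i-1], series[i-2], ...
        # are the lagged values, None where no history exists yet.
        rows.append([series[i - k] if i - k >= 0 else None for k in range(lag)])
    return rows

for element, row in enumerate(lag_features(closing_prices), start=1):
    print(element, row)

Each row then becomes a fixed-length input vector for an ordinary feed-forward, back propagation network; the first two rows, which lack a full history, would normally be dropped.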

How to Know What Is Going on Inside a Neural Network

Neural networks are opaque. Even knowing all the weights on all the nodes throughout the network does not give much insight into why the network produces the results that it produces. This lack of understanding has some philosophical appeal; after all, we do not understand how human consciousness arises from the neurons in our brains. As a practical matter, though, opaqueness impairs our ability to understand the results produced by a network.

If only we could ask the network to explain, in the form of rules, how it makes its decisions. Unfortunately, the same nonlinear characteristics of neural network nodes that make them so powerful also make them unable to produce simple rules. Eventually, research into rule extraction from networks may bring unequivocally good results. Until then, the trained network itself is the rule, and other methods are needed to peer inside to understand what is going on.

A technique called sensitivity analysis can be used to get an idea of how opaque models work. Sensitivity analysis does not provide explicit rules, but it does indicate the relative importance of the inputs to the result of the network. Sensitivity analysis uses the test set to determine how sensitive the output of the network is to each input. The following are the basic steps:

1. Find the average value for each input. We can think of this average value as the center of the test set.



2. Measure the output of the network when all inputs are at their average value.

3. Measure the output of the network when each input is modified, one at a time, to be at its minimum and maximum values (usually -1 and 1, respectively).

For some inputs, the output of the network changes very little for the three values (minimum, average, and maximum). The network is not sensitive to these inputs (at least when all other inputs are at their average value). Other inputs have a large effect on the output; the network is sensitive to these inputs. The amount of change in the output measures the sensitivity of the network to each input, and comparing these measures across all the inputs gives a relative measure of the importance of each feature. Of course, this method is entirely empirical and looks at each variable only independently. Neural networks are interesting precisely because they can take interactions between variables into account.
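As a rough illustration, the three steps above could be coded as follows. Here predict stands for whatever function evaluates the trained network on one input vector, and the -1 to 1 input range follows the text; none of these names refer to a specific library.

import numpy as np

def sensitivity_analysis(predict, test_set, low=-1.0, high=1.0):
    """Return one sensitivity score per input feature.

    Step 1: average each input over the test set (the "center").
    Step 2: evaluate the network at the center.
    Step 3: move one input at a time to its minimum and maximum and
            record how much the output moves.
    """
    center = test_set.mean(axis=0)              # step 1
    base_output = predict(center)               # step 2
    scores = []
    for j in range(test_set.shape[1]):          # step 3, one feature at a time
        probe = center.copy()
        probe[j] = low
        out_low = predict(probe)
        probe[j] = high
        out_high = predict(probe)
        scores.append(max(abs(out_low - base_output),
                          abs(out_high - base_output)))
    return np.array(scores)

# Example with a stand-in "network": the output depends strongly on input 0,
# weakly on input 1, and not at all on input 2.
rng = np.random.default_rng(0)
test_set = rng.uniform(-1, 1, size=(100, 3))
predict = lambda x: np.tanh(2.0 * x[0] + 0.1 * x[1])
print(sensitivity_analysis(predict, test_set))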

There are variations on this procedure. It is possible to modify the values of two or three features at the same time to see whether combinations of features have a particular importance. Sometimes it is useful to start from a location other than the center of the test set. For instance, the analysis might be repeated with the features at their minimum and maximum values to see how sensitive the network is at the extremes. If sensitivity analysis produces significantly different results for these three starting points, then there are higher-order effects in the network that take advantage of combinations of features.
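The first of these variations, probing two features at a time, can be sketched as a small extension of the earlier function; again, predict and the -1 to 1 input range are assumptions carried over from that sketch.

import numpy as np
from itertools import combinations

def pairwise_sensitivity(predict, test_set, low=-1.0, high=1.0):
    """Return {(i, j): score}, where score is the largest output change seen
    when inputs i and j are pushed to their extremes together."""
    center = test_set.mean(axis=0)
    base = predict(center)
    scores = {}
    for i, j in combinations(range(test_set.shape[1]), 2):
        worst = 0.0
        for vi in (low, high):
            for vj in (low, high):
                probe = center.copy()
                probe[i], probe[j] = vi, vj
                worst = max(worst, abs(predict(probe) - base))
        scores[(i, j)] = worst
    return scores

# Reusing predict and test_set from the earlier sketch:
# print(pairwise_sensitivity(predict, test_set))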

When using a feed-forward, back propagation network, sensitivity analysis can take advantage of the error measures calculated during the learning phase instead of having to test each feature independently. The validation set is fed into the network to produce the output, which is compared to the actual value being predicted to calculate the error. The network then propagates the error back through the units, not to adjust any weights but to keep track of the sensitivity of each input. The error serves as a proxy for sensitivity, showing how much each input affects the output of the network. Accumulating these sensitivities over the entire validation set determines which inputs have the largest effect on the output. In our experience, though, the values produced in this fashion are not particularly useful for understanding the network.
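One way to realize this idea is sketched below, under the assumption of a single hidden layer of tanh units with placeholder random weights standing in for a trained network. The sketch pushes the derivative of the output back through the network and accumulates the magnitude of the resulting input gradients over the data set; it uses the output gradient directly rather than the training error, but the bookkeeping is the same.

import numpy as np

rng = np.random.default_rng(1)
n_inputs, n_hidden = 3, 5
W1 = rng.normal(size=(n_hidden, n_inputs))   # input -> hidden weights
W2 = rng.normal(size=(n_hidden,))            # hidden -> output weights

def output_and_input_gradient(x):
    h = np.tanh(W1 @ x)                      # hidden activations
    y = np.tanh(W2 @ h)                      # network output
    # Back-propagate d(output)/d(input) through both layers.
    dy_dh = (1 - y**2) * W2                  # derivative through the output unit
    dh_dx = (1 - h**2)[:, None] * W1         # derivative through the hidden units
    return y, dy_dh @ dh_dx

data = rng.uniform(-1, 1, size=(100, n_inputs))
sensitivity = np.zeros(n_inputs)
for x in data:
    _, grad = output_and_input_gradient(x)
    sensitivity += np.abs(grad)              # accumulate over the data set

print(sensitivity / len(data))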

Neural networks do not produce easily understood rules that explain how they arrive at a given result. Even so, it is possible to understand the relative importance of inputs into the network by using sensitivity analysis. Sensitivity analysis can be a manual process in which each feature is tested one at a time relative to the others, or it can be more automated, using the sensitivity information generated by back propagation. In many situations, understanding the relative importance of inputs is almost as good as having explicit rules.


