

Completing the Cycle

With the help of data mining, the right group of prospects was contacted for the new product offering. That is not the end of the story, though. Once the results of the new campaign were in, data mining techniques could help to get a better picture of the actual responders. Armed with a profile of the buyers in the initial test market and a usage profile covering the first several months of the new service, the company was able to do an even better job of targeting prospects in the next five markets where the product was rolled out.

Neural Networks and Decision Trees Drive SUV Sales

In 1992, before any of the commercial data mining tools available today were on the market, one of the big three U.S. auto makers asked a group of researchers at the Pontikes Center for Management at Southern Illinois University in Carbondale to develop an expert system to identify likely buyers of a particular sport-utility vehicle. (We are grateful to Wei-Xiong Ho, who worked with Joseph Harder of the College of Business and Administration at Southern Illinois on this project.)

Traditional expert systems consist of a large database of hundreds or thousands of rules collected by observing and interviewing human experts who are skilled at a particular task. Expert systems have enjoyed some success in certain domains such as medical diagnosis and answering tax questions, but the difficulty of collecting the rules has limited their use.

The team at Southern Illinois decided to solve this problem by generating the rules directly from historical data. In other words, they would replace expert interviews with data mining.

The Initial Challenge

The initial challenge that Detroit brought to Carbondale was to improve response to a direct mail campaign for a particular model. The campaign involved sending an invitation to a prospect to come test-drive the new model. Anyone accepting the invitation would find a free pair of sunglasses waiting at the dealership. The problem was that very few people were returning the response card or calling the toll-free number for more information, and few of those who did ended up buying the vehicle. The company knew it could save itself a lot of money by not sending the offer to people unlikely to respond, but it didn't know who those were.



How Data Mining Was Applied

As is often the case when the data to be mined is from several different sources, the first challenge was to integrate data so that it could tell a consistent story.

The Data

The first file, the mail file, was a mailing list containing names and addresses of about a million people who had been sent the promotional mailing. This file contained very little information likely to be useful for selection.

Data based on zip codes from the commercially available PRIZM database was appended to the mail file. This database contains demographic and psychographic characterizations of the neighborhoods associated with the zip codes.

Two additional files contained information on people who had sent back the response card or called the toll-free number for more information. Linking the response cards back to the original mailing file was simple because the mail file contained a nine-character key for each address that was printed on the response cards. Telephone responders presented more of a problem since their reported name and address might not exactly match their address in the database, and there is no guarantee that the call even came from someone on the mailing list since the recipient may have passed the offer on to someone else.
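The card-linking step described above amounts to an exact join on the printed key. The following is a minimal sketch of that idea only; the record layout, the field names (`key`, `name`, `zip`), and the sample keys are hypothetical and are not taken from the project's actual files.

```python
def link_card_responses(mail_file, response_keys):
    """Link response-card keys back to mail-file records by exact key match.

    Returns (linked_records, unmatched_keys). Field names are hypothetical.
    """
    by_key = {record["key"]: record for record in mail_file}
    linked, unmatched = [], []
    for key in response_keys:
        record = by_key.get(key)
        if record is None:
            unmatched.append(key)   # card key not found in the mail file
        else:
            linked.append(record)
    return linked, unmatched


# Made-up sample data with nine-character keys.
mail_file = [
    {"key": "A00000001", "name": "Pat Example", "zip": "62901"},
    {"key": "A00000002", "name": "Lee Sample", "zip": "48201"},
]
linked, unmatched = link_card_responses(mail_file, ["A00000002", "Z99999999"])
print([r["name"] for r in linked], unmatched)
```

Telephone responders have no such key, which is why they need the looser, error-prone matching on name and address.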

Of 1,000,003 people who were sent the mailing, 32,904 responded by sending back a card and 16,453 responded by calling the toll-free number, for a total initial response rate of about 5 percent. The auto maker's primary interest, of course, was in the much smaller number of people who both responded to the mailing and bought the advertised car. These were to be found in a sales file, obtained from the manufacturer, that contained the names, addresses, and model purchased for all car buyers in the 3-month period following the mailing.
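As a quick check of the response-rate arithmetic, using only the counts quoted above and assuming that card and telephone responders do not overlap (the text does not say whether anyone did both):

```python
mailed = 1_000_003
card_responders = 32_904
phone_responders = 16_453

# Assumption: no one is counted in both response channels.
responders = card_responders + phone_responders
response_rate = responders / mailed

print(f"{responders:,} responders, {response_rate:.2%} response rate")
```

The rate comes to a little under 5 percent, which is consistent with the rounded figure in the text.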

An automated name-matching program with loosely set matching standards discovered around 22,000 apparent matches between people who bought cars and people who had received the mailing. Hand editing reduced the intersection to 4,764 people who had received the mailing and bought a car. About half of those had purchased the advertised model. See Figure 2.5 for a comparison of all these data sources.
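The text does not say how the name-matching program worked. As an illustration of why loose standards produce many apparent matches that later need hand editing, here is a sketch that uses Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure. The thresholds, the restriction to identical zip codes, and the sample names are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, drop periods and commas, and collapse whitespace."""
    cleaned = text.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def candidate_matches(buyers, mailed, threshold):
    """Pair buyers with mailed prospects in the same zip code whose names
    are at least `threshold` similar. A lower threshold is a looser match."""
    pairs = []
    for buyer in buyers:
        for prospect in mailed:
            if buyer["zip"] != prospect["zip"]:
                continue  # assumption: only compare within a zip code
            score = similarity(buyer["name"], prospect["name"])
            if score >= threshold:
                pairs.append((buyer["name"], prospect["name"], round(score, 2)))
    return pairs


# Made-up records for illustration.
buyers = [{"name": "Robert J. Smith", "zip": "62901"}]
mailed = [
    {"name": "Robert Smith", "zip": "62901"},
    {"name": "Roberta Smythe", "zip": "62901"},
    {"name": "Robert J Smith", "zip": "48201"},
]

loose = candidate_matches(buyers, mailed, threshold=0.7)
strict = candidate_matches(buyers, mailed, threshold=0.9)
print(len(loose), len(strict))
```

With the loose threshold, both same-zip names come back as candidates and a person has to decide which is real. The stricter threshold keeps only the closer one, at the risk of missing genuine matches that were recorded differently.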

Down the Mine Shaft

The experimental design called for the population to be divided into exactly two classes: success and failure. This is certainly a questionable design since it obscures interesting differences. Surely, people who come into the dealership to test-drive one model but end up buying another should be in a different class than nonresponders or people who respond but buy nothing. For that matter, people who weren't considered good enough prospects to be sent a mailing, but who nevertheless bought the car, are an even more interesting group.




Figure 2.5 Prospects in the training set have overlapping relationships.

Be that as it may, success was defined as "received a mailing and bought the car" and failure was defined as "received the mailing, but did not buy the car." A series of trials was run using decision trees and neural networks. The tools were tested on various kinds of training sets. Some of the training sets reflected the true proportion of successes in the database, while others were enriched to have up to 10 percent successes, and higher concentrations might have produced better results.
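One common way to build an enriched training set is to keep every success and randomly sample just enough failures to reach the target concentration. The sketch below does that for a 10 percent target. The text does not say how the researchers enriched their sets, so the undersampling approach, the `bought` field, and the toy population are assumptions.

```python
import random


def enrich(records, target_rate=0.10, seed=0):
    """Keep all successes and sample failures so that successes make up
    roughly `target_rate` of the returned training set."""
    successes = [r for r in records if r["bought"]]
    failures = [r for r in records if not r["bought"]]
    # Failures needed so that successes / (successes + failures) == target_rate.
    keep = round(len(successes) * (1 - target_rate) / target_rate)
    keep = min(keep, len(failures))
    rng = random.Random(seed)          # fixed seed for a repeatable sample
    sample = successes + rng.sample(failures, keep)
    rng.shuffle(sample)
    return sample


# Toy population: 50 successes among 10,000 records (0.5 percent).
population = [{"id": i, "bought": i < 50} for i in range(10_000)]
training_set = enrich(population, target_rate=0.10)
n_success = sum(r["bought"] for r in training_set)
print(len(training_set), n_success)
```

A model trained on such a set sees successes far more often than it would in the full population, so its raw scores no longer reflect true purchase rates and should be read as rankings unless they are rescaled.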

The neural network did better on the sparse training sets, while the decision tree tool appeared to do better on the enriched sets. The researchers decided on a two-stage process. First, a neural network determined who was likely to buy a car, any car, from the company. Then, the decision tree was used to predict which of the likely car buyers would choose the advertised model. This two-step process proved quite successful. The hybrid data mining model combining decision trees and neural networks missed very few buyers of the targeted model while at the same time screening out many more nonbuyers than either the neural net or the decision tree was able to do on its own.
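The control flow of the two-stage process can be sketched as a simple cascade: a first model scores every prospect for the likelihood of buying any car, and a second model is applied only to those who pass. The two functions below are hand-written placeholders standing in for the trained neural network and decision tree. Their rules, the field names, and the threshold are invented for illustration and say nothing about what the real models learned.

```python
def buyer_score(prospect):
    """Placeholder for the neural network: a score in [0, 1] for
    'likely to buy some car from the company'. Weights are made up."""
    return 0.6 * prospect["owns_brand"] + 0.4 * prospect["recent_inquiry"]


def picks_advertised_model(prospect):
    """Placeholder for the decision tree: nested rules predicting whether
    a likely buyer would choose the advertised model. Rules are made up."""
    if prospect["household_size"] >= 3:
        return prospect["cluster"] in {"suburban", "rural"}
    return False


def two_stage_select(prospects, threshold=0.5):
    """Stage 1 keeps likely car buyers; stage 2 keeps those predicted
    to choose the advertised model."""
    likely_buyers = [p for p in prospects if buyer_score(p) >= threshold]
    return [p for p in likely_buyers if picks_advertised_model(p)]


# Made-up prospects for illustration.
prospects = [
    {"id": 1, "owns_brand": 1, "recent_inquiry": 0, "household_size": 4, "cluster": "suburban"},
    {"id": 2, "owns_brand": 1, "recent_inquiry": 1, "household_size": 1, "cluster": "urban"},
    {"id": 3, "owns_brand": 0, "recent_inquiry": 1, "household_size": 5, "cluster": "rural"},
]
selected_ids = [p["id"] for p in two_stage_select(prospects)]
print(selected_ids)
```

Splitting the work this way lets each stage answer a narrower question, and the second stage only ever sees the population that the first stage has already concentrated.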

The Resulting Actions

Armed with a model that could effectively reach responders, the company decided to take the money saved by mailing fewer pieces and put it into improving the lure offered to get likely buyers into the showroom. Instead of sunglasses for the masses, they offered a nice pair of leather boots to the far


