

auxiliary information, 569-571
availability of data, determining, 515-516
average member technique, neural networks, 252
averages, estimation, 81

back propagation, feed-forward neural networks, 228-232
backfitting, defined, 170
bad customers, customer relationship management, 18
bad data formats, data transformation, 28
balance transfer programs, industry revolution, 18
balanced datasets, model sets, 68
balanced sampling, 68
bathtub hazards, 397-398
behaviors
  behavioral segments, marketing campaigns, 111-113
  behavior-based variables
    ad hoc questions, 585
    aggression, 18
    convenience users, 580, 587-589
    declining usage, 577-579
    estimated revenue, segmenting, 581-583
    ideals, comparisons to, 585-587
    potential revenue, 583-585
    purchasing frequency, 575-576
    revolvers, 580
    transactions, 580
  future customer behaviors, predicting, 10
bell-shaped distribution, 132
benefit, point of maximum, 101
Bernoulli, Jacques (binomial formula), 191
biased sampling
  confidence intervals, statistical analysis, 146
  neural networks, 227
  response, methods of, 146
  untruthful learning sources, 46-47
BILL MASTER file, customer signatures, 559
binary churn models, 119
binary classification
  decision trees, 168
  misclassification rates, 98
binary data, 557
binning, 237, 551
binomial formula (Jacques Bernoulli), 191
biological neural networks, 211
births, household-level data, 96
bizocity scores, 112-113
Bonferroni, Carlo (Bonferroni's correction), 149
box diagrams, as alternative to decision trees, 199-201
brainstorming meetings, 37
branching nodes, decision trees, 176
budgets, fixed, marketing campaigns, 97-100
building models, data mining, 8, 77
Building the Data Warehouse (Bill Inmon), 474
Business Modeling and Data Mining (Dorian Pyle), 60
businesses
  challenges of, identifying, 23-24
  customer relationship management, 2-6
  customer-centric, 514-515
  forward-looking, 2
  home-based, 56
  large-business relationships, 3-4
  opportunities, identifying
    virtuous cycle, 27-28
    wireless communication industries, 34-35
  product-focused, 2
  recommendation-based, 16-17
  small-business relationships, 2



calculations, probabilities, 133-135
call detail databases, 37
call-center records, useful data sources, 60
campaigns, marketing. See also advertising
  acquisitions-time data, 108-110
  canonical measurements, 31
  champion-challenger approach, 139
  credit risks, reducing exposure to, 113-114
  cross-selling, 115-116
  customer response, tracking, 109
  customer segmentation, 111-113
  differential response analysis, 107-108
  discussed, 95
  fixed budgets, 97-100
  loyalty programs, 111
  new customer information, gathering, 109-110
  people most influenced by, 106-107
  planning, 27
  profitability, 100-104
  proof-of-concept projects, 600
  response modeling, 96-97
  as statistical analysis
    acuity of testing, 147-148
    confidence intervals, 146
    proportion, standard error of, 139-141
    results, comparing, using confidence bounds, 141-143
    sample sizes, 145
  targeted acquisition campaigns, 31
  types of, 111
  up-selling, 115-116
  usage stimulation, 111
candidates, link analysis, 333
canonical measurements, marketing campaigns, 31
capture trends, data transformation, 75
car ownership, household-level data, 96
CART (Classification and Regression Trees) algorithm, decision trees, 185, 188-189
case studies
  automatic cluster detection, 374-378
  chi-square tests, 155-158
  decision trees, 206, 208
  genetic algorithms, 440-443
  link analysis, 343-346
  MBR (memory-based reasoning), 259-262
  neural networks, 252-254
catalogs
  response models, decision trees for, 175
  retailers, historical customer behavior data, 5
categorical variables
  automatic cluster detection, 359
  data correction, 73
  marriages, 239-240
  measures of, 549
  neural networks, 239-240
  propensity, 242
  splits, decision trees, 174
censored data
  hazards, 399-403
  statistics, 161
census data
  proportional scoring, 94-95
  useful data sources, 61
Central Limit Theorem, statistics, 129-130
central repository, 484, 488, 490
centroid distance, automatic cluster detection, 369
C5 pruning algorithm, decision trees, 190-191
CHAID (Chi-square Automatic Interaction Detector), 182-183
challenges, business challenges, identifying, 23-24



champion-challenger approach, marketing campaigns, 139
change processes, feedback, 34
charts
  concentration, 101
  cumulative gains, 101
  lift charts, 82, 84
  time series, 128-129
CHIDIST function, 152
child nodes, classification, 167
children, number of, household-level data, 96
chi-square tests
  case study, 155-158
  CHAID (Chi-square Automatic Interaction Detector), 182-183
  CHIDIST function, 152
  degrees of freedom values, 152-153
  difference of proportions versus, 153-154
  discussed, 149
  expected values, calculating, 150-151
  splits, decision trees, 180-183
churn
  as binary outcome, 119
  customer longevity, predicting, 119-120
  EBCF (existing base churn forecast), 469
  expected, 118
  forced attrition, 118
  importance of, 117-118
  involuntary, 118-119, 521
  recognizing, 116-117
  retention and, 116-120
  voluntary, 118-119, 521
class labels, probability, 85
classification
  accuracy, 79
  binary
    decision trees, 168
    misclassification rates, 98
  business goals, formulating, 605
  child nodes, 167
  correct classification matrix, 79
  data transformation, 57
  decision trees, 166-168
  directed data mining, 57
  discrete outcomes, 9
  estimation, 9
  leaf nodes, 167
  memory-based reasoning, 90-91
  overview, 8-9
  performance, 12
Classification and Regression Trees (CART) algorithm, decision trees, 185, 188-189
classification codes
  discussed, 266
  precision measurements, 273-274
  recall measurements, 273-274
clustering
  automatic cluster detection
    agglomerative clustering, 368-370
    case study, 374-378
    categorical variables, 359
    centroid distance, 369
    complete linkage, 369
    data preparation, 363-365
    dimension, 352
    directed clustering, 372
    discussed, 12, 91, 351
    distance and similarity, 359-363
    divisive clustering, 371-372
    evaluation, 372-373
    Gaussian mixture model, 366-367
    geometric distance, 360-361
    hard clustering, 367
    Hertzsprung-Russell diagram, 352-354
    luminosity, 351
    scaling, 363-364
    single linkage, 369
    soft clustering, 367
    SOM (self-organizing map), 372
    vectors, angles between, 361-362
    weighting, 363-365
    zone boundaries, adjusting, 380


