

business goals, formulating, 605

customer attributes, 11

data transformation, 57

overview, 11

profiling tasks, 12

undirected data mining, 57

coding, special-purpose code, 595

collaborative filtering

estimated ratings, 284-285

grouping customers, 90

predictions, 284-285

profiles, building and comparing, 283-284

social information filtering, 282

word-of-mouth advertising, 283

collections, credit risks, 114

columns, data

cost, 548

derived variables, 542

discussed, 542

identification, 548

ignored, 547

input, 547

with one value, 544-546

target, 547

with unique values, 546-547

weight, 548

combination function

attrition history, 280

MBR (memory-based reasoning), 258, 265

neural networks, 222

weighted voting, 281-282

commercial software products, 15

communication channels, prospecting, 89

companies. See businesses

comparisons

comparing models, using lift ratio, 81-82

data, 83

statistical analysis, 148-149

competing risks, hazards, 403

competitive advantage, information as, 14

complete linkage, automatic cluster detection, 369

computational issues, customer signatures, 594-596

concentration

concentration charts, 101

cumulative response, 82-83

confidence intervals

hypothesis testing, 148

statistical analysis, 146, 148-149

confusion

aggregation and, 48

confusion matrix, 79

data transformation, 28

conjugate gradient, 230

constant hazards

changing over time hazards versus, 416-417

discussed, 397

continuous variables

data preparation, 235-237

neural networks, 235-237

statistics, 137-138

control group response

marketing campaigns, 106

target market response versus, 38

controlled experiments, hypothesis testing, 51

convenience users, behavior-based variables, 580, 587-589

cookies, Web servers, 109

correct classification matrix, 79

correlation ranges, statistics, 139

costs

cost columns, 548

decision tree considerations, 195

countervailing errors, 81

counts, converting to proportions, 75-76

coverage of values, neural networks, 232-233

Cox proportional hazards, 410-411



creative process, data mining as, 33

credit

credit applications

classification tasks, 9

prediction tasks, 10

useful data sources, 60

credit risks, reducing exposure to, 113-114

crossover, genetic algorithms, 430

cross-selling opportunities

affinity grouping, 11

customer relationships, 467

marketing campaigns, 111, 115-116

reasons for, 17

cross-tabulations, 136, 567-568

cumulative gains, 36, 101

cumulative response

concentration, 82-83

results, assessing, 85

customers

attributes, clustering, 11

behaviors of, gaining insight, 56

customer relationships

bad customers, weeding out, 18

building businesses around, 2

customer acquisition, 461-464

customer activation, 464-466

customer-centric enterprises, 3

data mining role in, 5-6

data warehousing, 4-5

deep intimacy, 449, 451

event-based relationships, 458-459

good customers, holding on to, 17-18

in-between relationships, 453

indirect relationships, 453-454

interests in, 13-14

large-business relationships, 3-4

levels of, 448

life stages, 455-456

lifetime customer value, 32

mass intimacy, 451-453

retention, 467-469

service business sectors, 13-14

small-business relationships, 2

stages, 457

strategies for, 6

stratification, 469

subscription-based relationships, 459-460

survival analysis, 413-415

transaction processing systems, 3-4

up-selling, 467

winback approach, 470

customer-centric businesses, 514-515, 516-521

demographic profiles, 31

grouping, collaborative filtering and, 90

interactions, learning opportunities, 520-521

loyalty, 520

marginal, 553

new customer information

gathering, 109-110

memory-based reasoning, 277

profiles, building, 283

prospective customer value, 115

responses

to marketing campaigns, 109

prediction, MBR, 258

retrospective customer value, 115

segmentation, marketing campaigns, 111-113

sequential patterns, identifying, 24

signatures

assembling, 68

business versus residential customers, 561

columns, pivoting, 563

computational issues, 594-596

considerations, 564

customer identification, 560-562

data for, cataloging, 559-560

discussed, 540-541

model set creation, 68

snapshots, 562

time frames, identifying, 562

single views, 517-518




sorting, by scores, 8

telecommunications, market basket analysis, 288

cutoff scores, 98

cyclic graphs, 330-331

data

acquisition-time, 108-110

as actionable information, 516

availability, determining, 515-516

binary, 557

business versus scientific, statistical analysis, 159

censored, 161

by census tract, 94

central repository, 484, 488, 490

columns

cost, 548

derived variables, 542

discussed, 542

identification, 548

ignored, 547

input, 547

with one value, 544-546

target, 547

with unique values, 546-547

weight, 548

comparisons, 83

for customer signatures, cataloging, 559-560

data correction

categorical variables, 73

encoding, inconsistent, 74

missing values, 73-74

numeric variables, 73

outliers, 73

overview, 72

skewed distributions, 73

values with meaning, 74

data exploration

assumptions, validating, 67

descriptions, comparing values with, 65

discussed, 64

distributions, examining, 65

histograms, 565-566

intuition, 65

question asking, 67-68

data marts, 485, 491-492

data selection

contents of, outcomes of interest, 64

data locations, 61-62

density, 62-63

history of, determining, 63

scarce data, 61-62

variable combinations, 63-64

data transformation

capture trends, 75

counts, converting to proportions, 75-76

discussed, 74

information technology and user roles, 58-60

problems, identifying, 56-57

ratios, 75

results, deliverables, 58

results, how to use, 57-58

summarization, 44

virtuous cycle, 28-30

dirty, 592-593

dumping, flat files, 594

enterprise-wide, 33

ETL (extraction, transformation, and load) tools, 487

gigabytes, 5

as graphs, 337

historical

customer behaviors, 5

MBR (memory-based reasoning), 262-263

neural networks, 219

prediction tasks, 10

household level, 96

imperfections in, 34

inconsistent, 593-594

as information, 22

metadata repository, 484, 491


