

expected churn, 118
experimentation
    hypothesis testing, 51
    statistics, 160-161
exploration tools, decision trees as, 203-204
exponential decay, retention, 389-390, 393
expressive power, descriptive models, 78
extraction, transformation, and load (ETL) tools, 487, 595

F tests (Ronald A. Fisher), 183-184
fax machines, link analysis, 337-341
Federal Express, transaction processing systems, 3-4
feedback
    change processes, 34
    operational, 485, 492
    relevance feedback, MBR, 267-268
feed-forward neural networks
    back propagation, 228-232
    hidden layer, 227
    input layer, 226
    output layer, 227
field values, statistics, 128
Fisher, Ronald A. (F tests), 183-184
fixed budgets, marketing campaigns, 97-100
fixed positions, genetic algorithms, 435
fixed-length character strings, 552-554
flat files, dumping data, 594
forced attrition, 118
forecasting
    EBCF (existing base churn forecast), 469
    NSF (new start forecast), 469
    survival analysis, 415-416
former customers, customer relationships, 457
forward-looking businesses, 2
fraud detection, MBR, 258
fraudulent insurance claims, classification, 9
free text response, memory-based reasoning, 285
functionality, lack of, data transformation, 28
functions
    activation, 222
    CHIDIST, 152
    combination
        attrition history, 280
        MBR (memory-based reasoning), 258, 265
        neural networks, 272
        weighted voting, 281-282
    density, 133
    distance
        defined, 271-272
        discussed, 258, 265
        hidden distance fields, 278
        identity distance, 271
        numeric fields, 275
        triangle inequality, 272
        zip codes, 276-277
    hyperbolic tangent, 223
    NORMDIST, 134
    NORMSINV, 147
    sigmoid, 225
    summation, 272
    tangent, 223
    transfer, 223
future attrition, 49
future customer behaviors, predicting, 10

gains, cumulative, 36, 101
Gaussian mixture model, automatic cluster detection, 366-367
gender
    as categorical value, 239
    profiling example, 12
generalized delta rules, 229
genetic algorithms
    case study, 440-443
    crossover, 430
    data representation, 432-433
    genome, 424
    implicit parallelism, 438
    maximum values, of simple functions, 424
    mutation, 431-432
    neural networks and, 439-440
    optimization, 422
    overview, 421-422
    resource optimization, 433-435
    response modeling, 440-443
    schemata, 434, 436-438
    selection step, 429
    statistical regression techniques, 423
Genetic Algorithms in Search, Optimization, and Machine Learning (Goldberg), 445
geographic attributes, market basket analysis, 293
geographic information system (GIS), 536
geographical resources, 555-556
geometric distance, automatic cluster detection, 360-361
gigabytes, 5
Gini, Corrado (Gini splitting criterion, decision trees), 178
GIS (geographic information system), 536
goals, formulating, 605-606
Goldberg (Genetic Algorithms in Search, Optimization, and Machine Learning), 445
good customers, holding on to, 17-18
good prospects, identifying, 88-89
Goodman, Marc (projective visualization), 206-208
graphical user interface (GUI), 535
graphs
    acyclic, 331
    cyclic, 330-331
    data as, 337
    directed, 330
    edges, 322
    graph-coloring algorithm, 340-341
    Hamiltonian path, 328
    linkage, 77
    nodes, 322
    planar, 323
    traveling salesman problem, 327-329
    vertices, 322
grouping. See clustering
GUI (graphical user interface), 535

Hamiltonian path, graph theory, 328
hard clustering, automatic cluster detection, 367
hazards
    bathtub, 397-398
    censoring, 399-403
    constant, 397, 416-417
    probabilities, 394-396
    proportional
        Cox, 410-411
        discussed, 408
        examples of, 409
        limitations of, 411-412
    real-world example, 398-399
    retention, 404-405
    stratification, 410
Hertzsprung-Russell diagram, automatic cluster detection, 352-354
hidden distance fields, distance function, 278
hidden layer, feed-forward neural networks, 221, 227
hierarchical categories, products, 305
histograms
    data exploration, 565-566
    discussed, 543
    statistics and, 127
historical data
    customer behaviors, 5
    documentation as, 61
    MBR (memory-based reasoning), 262-263
    neural networks, 219
    prediction tasks, 10
hobbies, house-hold level data, 96
holdout groups, marketing campaigns, 106
home-based businesses, 56
house-hold level data, 96
hubs, link analysis, 332-334
hyperbolic tangent function, 223
hypothesis testing
    confidence levels, 148
    considerations, 51
    decision-making process, 50-51
    generating, 51
    market basket analysis, 51
    null hypothesis, statistics and, 125-126

IBM, relational database management software, 13
ID and key variables, 554
ID3 (Iterative Dichotomiser 3), 190
identification
    columns, 548
    customer signatures, 560-562
    good prospects, 88-89
    problem management, 43
    proof-of-concept projects, 599-601
identified versus anonymous transactions, association rules, 308
identity distance, distance function, 271
ignored columns, 547
images, binary data, 557
imperfections, in data, 34
implementation
    neural networks, 212
    proof-of-concept projects, 601-605
implicit parallelism, 438
in-between relationships, customer relationships, 453
income, house-hold level data, 96
inconclusive survey responses, 46
inconsistent data, 593-594
index-based scores, 92-95
indicator variables, 554
indirect relationships, customer relationships, 453-454
industry revolution, 18
inexplicable rules, association rules, 297-298
information
    competitive advantages, 14
    data as, 22
    infomediaries, 14
    information brokers, supermarket chains as, 15-16
    information gain, entropy, 178-180
    information technology, data transformation, 58-60
    as products, 14
    recommendation-based businesses, 16-17
Inmon, Bill (Building the Data Warehouse), 474
input columns, 547
input layer, feed-forward neural networks, 226
input variables, target fields, 37
inputs/outputs, neural networks, 215
insourcing data mining, 524-525
insurance claims, classification, 9
interactive systems, response times, 33
Internet resources
    customer response to marketing campaigns, tracking, 109
    RuleQuest, 190
    U.S. Census Bureau, 94
interval variables, 549, 552
interviews
    business opportunities, identifying, 27
    proof-of-concept projects, 600
intrinsic information, splits, decision trees, 180
introduction, of products, 27


