

nearest neighbor techniques

classification, 9

collaborative filtering

estimated ratings, 284-285

grouping customers, 90

predictions, 284-285

profiles, building and comparing, 283-284

social information filtering, 282

word-of-mouth advertising, 283

memory-based reasoning (MBR) case study, 259-262 challenges of, 262-265 classification codes, 266, 273-274 combination function, 258, 265 customer classification, 90-91 customer response prediction, 258 democracy approach, 279-281 distance function fraud detection, 258 free text responses, 258 historical records, selecting, 262-263

medical treatment applications, 258

new customers, 277

relevance feedback, 267-268

similarity measurements, 271-272

training data, 263-264

weighted voting, 281-282 negative correlation, 139 neighborliness parameters, neural

networks, 250 neural networks activation function, 222 AND value, 222 automation, 213 average member technique, 252 bias sampling, 227 biological, 211 building models, 8 case study, 252-254 categorical variables, 239-240

classification, 9 combination function, 222 components of, 220-221 continuous values, features with,

235-237 coverage of values, 232-233 data preparation

categorical values, 239-240

continuous values, 235-237 decision trees, 199 discussed, 211 estimation tasks, 10, 215 feed-forward

back propagation, 228-232

hidden layer, 227

input layer, 226

output layer, 227 genetic algorithms and, 439-440 hidden layers, 221, 227 historical data, 219 history of, 212-213 implementation, 212 inputs/outputs, 215 neighborliness parameters, 250 nonlinear behaviors, 222 OR value, 222 overfitting, 234 parallel coordinates, 253 prediction, 215 real estate appraisal example,

213-217 results, interpreting, 241-243 sensitivity analysis, 247-248 sigmoid activation functions, 225 SOM (self-organizing map), 249-251 time series analysis, 244-247 training sets, selection consideration,

232-234 transfer function, 223 validation sets, 218 variable selection problem, 233 variance, 199



new customer information

gathering, 109-110

memory-based reasoning, 277

profiles, building, 283 new start forecast (NSF), 469 nodes, graphs, 322 nonlinear behaviors, neural

networks, 222 non-response models, mass

mailings, 35 normal distribution, statistics, 130-132 normalization, numeric variables, 550 normalized absolute value, distance

function, 275 NORMDIST function, 134 NORMSINV function, 147 NSF (new start forecast), 469 null hypothesis, statistics and, 125-126 NULL values, missing data, 590 numeric variables

data correction, 73

distance function, 275

measure of, 550-551

splits, decision trees, 173

Occam's Razor, 124-125 ODBC (Open Database

Connectivity), 496 one-tailed distribution, 134 Online Analytic Processing (OLAP)

additive facts, 501

data mining and, 507-508

decision-support summary data, 477-478

dimension tables, 502-503

discussed, 31

levels of, 475

logical schema, 478

metadata, 483-484, 491

operational summary data, 477

physical schema, 478

reporting requirements, 495-496

transaction data, 476-477

Open Database Connectivity

(ODBC), 496 operational errors, 159 operational feedback, 485, 492 operational summary data, OLAP, 477 opportunistic sample, defined, 25 opportunities, good response

scores, 34 optimization genetic algorithms, 422 resources, genetic algorithms,

433-435 training as, 230 OR value, neural networks, 222 Oracle, relational database

management software, 13 order characteristics, market basket

analysis, 292 ordered lists, 239

ordered variables, measure of, 549 organizations. See businesses out of time tests, 72 outliers

data correction, 73

data transformation, 74 output layer, feed-forward neural

networks, 227 outputs, neural networks, 215 outsourcing data mining, 522-524 overfitting, neural networks, 234

parallel coordinates, neural

networks, 253 parsing variables, 569 patterns

meaningful discoveries, 56

prediction, 45

untruthful learning sources, 45-46 peg values, 236 penetration, proportion, 203 percent variations, 105 perceptrons, defined, 212



performance, classification, 12

physical schema, OLAP, 478

pilot projects, 598

planar graphs, 323

planned processes, proof-of-concept

projects, 599 platforms, data mining, 527 point of maximum benefit, 101 point-of-sale data

association rules, 288

scanners, 3

as useful data source, 60 population diversity, 178 positive ratings, voting, 284 postcards, as communication

channel, 89 potential revenue, behavior-based

variables, 583-585 precision measurements, classification

codes, 273-274 preclassified tests, 79 predictions

accuracy, 79

association rules, 70

business goals, formulating, 605

collaborative filtering, 284-285

credit risks, 113-114

customer longevity, 119-120

data transformation, 57

defined, 52

directed data mining, 57 errors, 191 future behaviors, 10 historical data, 10 model sets for, 70-71 neural networks, 215 patterns, 45

prediction task examples, 10 profiling versus, 52-53 response, MBR, 258 uses for, 54 probabilities calculating, 309 class labels, 85

distribution and, 135

hazards, 394-396

statistics, 133-135 probation periods, 518 problem management

data transformation, 56-57

identification, 43

lift ratio, 83

profiling as, 53-54

rule-oriented problems, 176

variable selection problems, neural networks, 233 products

clustering by usage, market basket analysis, 294-295

co-occurrence of, 299

hierarchical categories, 305

information as, 14

introduction, planning for, 27

product codes, as categorical value, 239

product-focused businesses, 2

taxonomy, 305 profiling

business goals, formulating, 605

collaborative filtering, 283-284

data transformation, 57

decision trees, 12

demographic profiles, 31

descriptive, 52

directed, 52

examples of, 54

gender example, 12

new customer information, 283

overview, 12

prediction versus, 52-53

as problem management, 53-54

survey response, 53

profitability marketing campaigns, 100-104 proof-of-concept projects, 599 results, assessing, 85

projective visualization (Marc Goodman), 206-208


