  sorting customers by, 8
  z-scores, 551
search programs, link analysis, 331
searchable criteria, relevance feedback, 268
sectional center facility (SCF), 553
selection step, genetic algorithms, 429
self-organizing map (SOM), 249-251, 372
sensitivity analysis, neural networks, 247-248
sequential analysis, association rules, 318-319
sequential events, applying decision trees to, 205
sequential patterns, identifying, 24
server platforms, affordability, 13
service business sectors, customer relationships, 13-14
shared labels, fax machines, 341
short form, census data, 94
short-term trends, 75
sigmoid action functions, neural networks, 225
signatures, customers
  assembling, 68
  business versus residential customers, 561
  columns, pivoting, 563
  computational issues, 594-596
  considerations, 564
  customer identification, 560-562
  data for, cataloging, 559-560
  discussed, 540-541
  model set creation, 68
  snapshots, 562
  time frames, identifying, 562
similarity and distance, automatic cluster detection, 359-363
similarity matrix, 368
similarity measurements, MBR, 271-272
Simplifying Assumptions Corporation (SAC), 97, 100
simulated annealing, 230
single linkage, automatic cluster detection, 369
single response rates, 141
single views, customers, 517-518
sites. See Web sites
skewed distributions, data correction, 73
SKUs (stock-keeping units), 305
small-business relationships, customer relationship management, 2
SMP (symmetric multiprocessor), 485
snapshots, customer signatures, 562
social information filtering, 282
soft clustering, automatic cluster detection, 367
SOI (sphere of influence), 38
sole proprietors, 3
solicitation, marketing campaigns, 96
SOM (self-organizing map), 249-251, 372
source systems, 484, 486-487, 594
special-purpose code, 595
sphere of influence (SOI), 38
spiders, web crawlers, 331
splits, decision trees
  on categorical input variables, 174
  chi-square testing, 180-183
  discussed, 170
  diversity measures, 177-178
  entropy, 179
  finding, 172
  Gini splitting criterion, 178
  information gain ratio, 178, 180
  intrinsic information of, 180
  missing values, 174-175
  multiway, 171
  on numeric input variables, 173
  population diversity, 178
  purity measures, 177-178
  reduction in variance, 183
  surrogate, 175
spreadsheets, results, assessing, 85
SQL data, time series analysis, 572-573
stability-based pruning, decision trees, 191-192
staffing, data mining, 525-526
standard deviation
  estimation, 81
  statistics, 132, 138
  variance and, 138
standard error of proportion, statistical analysis, 139-141
standardization, numeric values, 551
standardized values, statistics, 129-133
star schema structure, relational databases, 505
statistical analysis
  business data versus scientific data, 159
  censored data, 161
  Central Limit Theorem, 129-130
  chi-square tests
    case study, 155-158
    degrees of freedom values, chi-square tests, 152-153
    difference of proportions versus, 153-154
    discussed, 149
    expected values, calculating, 150-151
  continuous variables, 137-138
  correlation ranges, 139
  cross-tabulations, 136
  density function, 133
  as disciplinary technique, 123
  discrete values, 127-131
  experimentation, 160-161
  field values, 128
  histograms and, 127
  marketing campaign approaches
    acuity of testing, 147-148
    confidence intervals, 146
    proportion, standard error of, 139-141
    sample sizes, 145
  mean values, 137
  median values, 137
  mode values, 137
  multiple comparisons, 148-149
  normal distribution, 130-132
  null hypothesis and, 125-126
  probabilities, 133-135
  p-values, 126
  q-values, 126
  range values, 137
  regression ranges, 139
  sample variation, 129
  standard deviation, 132, 138
  standardized values, 129-133
  sum of values, 137-138
  time series analysis, 128-129
  truncated data, 162
  variance, 138
  z-values, 131, 138
statistical regression techniques, genetic algorithms, 423
status codes, as categorical value, 239
stemming, link analysis, 333
stock-keeping units (SKUs), 305
store comparisons, association rules for, 315-316
stratification
  customer relationships and, 469
  hazards, 410
strings, fixed-length characters, 552-554
subgroups
  automatic cluster detection
    agglomerative clustering, 368-370
    case study, 374-378
    categorical variables, 359
    centroid distance, 369
    complete linkage, 369
    data preparation, 363-365
    dimension, 352
    directed clustering, 372
    discussed, 12, 91, 351
    distance and similarity, 359-363
    divisive clustering, 371-372
    evaluation, 372-373
    Gaussian mixture model, 366-367
    geometric distance, 360-361
    hard clustering, 367
    Hertzsprung-Russell diagram, 352-354
    luminosity, 351
    scaling, 363-364
    single linkage, 369
    soft clustering, 367
    SOM (self-organizing map), 372
    vectors, angles between, 361-362
    weighting, 363-365
    zone boundaries, adjusting, 380
  business goals, formulating, 605
  customer attributes, 11
  data transformation, 57
  overview, 11
  profiling tasks, 12
  undirected data mining, 57
subscription-based relationships, customer relationships, 459-460
subtrees, decision trees, 189
sum of values, statistics, 137-138
summarization, data transformation, 44
summation function, 272
supermarket chains, as information brokers, 15-16
supervised learning, 57
support, market basket analysis, 301
surrogate splits, decision trees, 175
survey responses
  customer classification, 91
  inconclusive, 46
  profiling, 53
  survey-based market research, 113
  useful data sources, 61
survival analysis
  attrition, handling different types of, 412-413
  customer relationships, 413-415
  estimation tasks, 10
  forecasting, 415-416
symmetric multiprocessor (SMP), 489-490
tables, lookup, auxiliary information, 570-571
tainted results, 72
tangent function, 223
target columns, 547
target fields, input variables, 37
target market versus control group response, 38
targeted acquisition campaigns, 31
targeting
  good prospects, identifying, 88-89
  prospecting, 88
taxonomy, products, 305
telecommunications customers, market basket analysis, 288
telephone switches, transaction processing systems, 3
terabytes, 5
Teradata, relational database management software, 13
termination of services, 114
testing
  acuity of, statistical analysis, 147-148
  chi-square tests
    case study, 155-158
    CHIDIST function, 152
    degrees of freedom values, 152-153
    difference of proportions versus, 153-154
    discussed, 149
    expected values, calculating, 150-151
    splits, decision trees, 180-183
  F tests, 183-184
  hypothesis testing
    confidence levels, 148
    considerations, 51
    decision-making process, 50-51
    generating, 51
    market basket analysis, 51
    null hypothesis, statistics and, 125-126