Table 5.3 The 95 Percent Confidence Interval Bounds for the Difference between the Champion and Challenger Groups
Size of Sample

The formulas for the standard error of a proportion and for the standard error of a difference of proportions both include the sample size. There is an inverse relationship between the sample size and the width of the confidence interval: the larger the sample, the narrower the confidence interval. So, if you want to have more confidence in results, it pays to use larger samples.

Table 5.4 shows the confidence interval for different sizes of the challenger group, assuming the challenger response rate is observed to be 5 percent. For very small sizes, the confidence interval is very wide, often too wide to be useful. Earlier, we said that the normal distribution is an approximation for the estimate of the actual response rate; with small sample sizes, the approximation is not a very good one. Statistics has several methods for handling such small sample sizes. However, these are generally not of much interest to data miners because our samples are much larger.

Table 5.4 The 95 Percent Confidence Interval for Different Sizes of the Challenger Group
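The inverse relationship between sample size and interval width can be checked directly. The following is a minimal sketch, assuming the usual normal-approximation formula for the standard error of a proportion, sqrt(p(1 - p)/n), and a z-score of 1.96 for 95 percent confidence. The group sizes and the name `proportion_interval` are illustrative assumptions, not the rows of Table 5.4.

```python
import math

Z_95 = 1.96  # z-score for a two-sided 95 percent confidence interval


def proportion_interval(p, n, z=Z_95):
    """Return (lower, upper) bounds for an observed proportion p on n samples."""
    se = math.sqrt(p * (1 - p) / n)  # standard error of a proportion
    return p - z * se, p + z * se


# Hypothetical challenger group sizes (assumed for illustration only).
sizes = [1_000, 10_000, 100_000]
widths = []
for n in sizes:
    lower, upper = proportion_interval(0.05, n)
    widths.append(upper - lower)
    print(f"n={n:>7,}  95% CI: {lower:.4%} to {upper:.4%}  width: {upper - lower:.4%}")
```

Because the standard error scales with one over the square root of the sample size, quadrupling the sample only halves the width of the interval.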
What the Confidence Interval Really Means

The confidence interval is a measure of only one thing, the statistical dispersion of the result. Assuming that everything else remains the same, it measures the amount of inaccuracy introduced by the process of sampling. It also assumes that the sampling process itself is random; that is, that any of the one million customers could have been offered the challenger offer with an equal likelihood. Random means random. The following are examples of what not to do:

- Use customers in California for the challenger and everyone else for the champion.
- Use the 5 percent lowest and 5 percent highest value customers for the challenger, and everyone else for the champion.
- Use the 10 percent most recent customers for the challenger, and everyone else for the champion.
- Use the customers with telephone numbers for the telemarketing campaign; everyone else for the direct mail campaign.

All of these are biased ways of splitting the population into groups. The previous results all assume that there is no such systematic bias. When there is systematic bias, the formulas for the confidence intervals are not correct.

Using the formula for the confidence interval presumes that there is no systematic bias in deciding whether a particular customer receives the champion or the challenger message. For instance, perhaps there was a champion model that predicts the likelihood of customers responding to the champion offer. If this model were used, then the challenger sample would no longer be a random sample. It would consist of the leftover customers from the champion model. This introduces another form of bias.

Or, perhaps the challenger model is only available to customers in certain markets or with certain products. This introduces other forms of bias. In such a case, these customers should be compared to the set of customers receiving the champion offer with the same constraints.

Another form of bias might come from the method of response.
The challenger may only accept responses via telephone, but the champion may accept them by telephone or on the Web. In such a case, the challenger response may be dampened because of the lack of a Web channel. Or, there might need to be special training for the inbound telephone service reps to handle the challenger offer. At certain times, this might mean that wait times are longer, which is another form of bias.

The confidence interval is simply a statement about statistics and dispersion. It does not address all the other forms of bias that might affect results, and these forms of bias are often more important to results than sample variation. The next section talks about setting up a test and control experiment in marketing, diving into these issues in more detail.
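The requirement that every customer be equally likely to receive the challenger offer can be sketched in code. This is a minimal illustration, not a prescribed method: the function name `split_champion_challenger`, the customer IDs, and the group sizes are all hypothetical. It draws the challenger group with a uniform random sample, without reference to geography, customer value, tenure, or channel.

```python
import random


def split_champion_challenger(customer_ids, challenger_size, seed=None):
    """Randomly assign customers so each is equally likely to be a challenger."""
    rng = random.Random(seed)  # a seed makes the split reproducible
    # Uniform sample without replacement: no customer attribute influences it.
    challenger = set(rng.sample(customer_ids, challenger_size))
    champion = [c for c in customer_ids if c not in challenger]
    return champion, sorted(challenger)


# Hypothetical population of 1,000 customer IDs with a 10 percent challenger group.
customer_ids = list(range(1, 1001))
champion, challenger = split_champion_challenger(customer_ids, 100, seed=42)
print(len(champion), len(challenger))
```

A split like this removes the selection bias described above, but it does nothing about differences in response channels or service handling, which still have to be controlled separately.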