
of loyalty (that the longer customers stay around, the less likely they are to stop at any particular point in time) is really a statement about hazards.

The world of marketing is a bit different from the world of medical research. For one thing, the consequences of our actions are much less dire: a patient may die from poor treatment, whereas the consequences in marketing are merely measured in dollars and cents. Another important difference is the volume of data. The largest medical studies have a few tens of thousands of participants, and many draw conclusions from just a few hundred. When trying to determine mean time between failure (MTBF) or mean time to failure (MTTF), which is manufacturing lingo for how long to wait until an expensive piece of machinery breaks down, conclusions are often based on no more than a few dozen failures.

In the world of customers, tens of thousands is the lower limit, since customer databases often contain data on millions of customers and former customers. Much of the statistical background of survival analysis is focused on extracting every last bit of information out of a few hundred data points. In data mining applications, the volumes of data are so large that statistical concerns about confidence and accuracy are replaced by concerns about managing large volumes of data.

The importance of survival analysis is that it provides a way of understanding time-to-event characteristics, such as:

When a customer is likely to leave

The next time a customer is likely to migrate to a new customer segment

The next time a customer is likely to broaden or narrow the customer relationship

The factors in the customer relationship that increase or decrease likely tenure

The quantitative effect of various factors on customer tenure

These insights into customers feed directly into the marketing process. They make it possible to understand how long different groups of customers are likely to be around, and hence how profitable these segments are likely to be. They make it possible to forecast numbers of customers, taking into account both new acquisition and the decline of the current base. Survival analysis also makes it possible to determine which factors, both those at the beginning of customer relationships and later experiences, have the biggest effect on customers staying around the longest. The analysis can also be applied to things other than the end of the customer tenure, making it possible to determine when another event (such as a customer returning to a Web site) is no longer likely to occur.

A good place to start with survival is with visualizing customer retention, which is a rough approximation of survival. After this discussion, we move on to hazards, the building blocks of survival. These are in turn combined into survival curves, which are similar to retention curves but more useful. The chapter ends with a discussion of Cox Proportional Hazard Regression and other applications of survival analysis. Along the way, the chapter provides particular applications of survival in the business context. As with all statistical methods, there is a depth to survival that goes far beyond this introductory chapter, which is consciously trying to avoid the complex mathematics underlying these techniques.

Customer Retention

Customer retention is a concept familiar to most businesses that are concerned about their customers, so it is a good place to start. Retention is actually a close approximation to survival, especially when considering a group of customers who all start at about the same time. Retention provides a familiar framework to introduce some key concepts of survival analysis such as customer half-life and average truncated customer tenure.

Calculating Retention

How long do customers stay around? This seemingly simple question becomes more complicated when applied to the real world. Understanding customer retention requires two pieces of information:

When each customer started

When each customer stopped

The difference between these two values is the customer tenure, a good measurement of customer retention.
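As a minimal sketch of this calculation, the function below subtracts the start date from the stop date. The function name, the sample dates, and the handling of customers who have not yet stopped (measuring up to an `as_of` date) are assumptions made for illustration, not something the text prescribes.

```python
from datetime import date
from typing import Optional


def tenure_days(start: date, stop: Optional[date], as_of: date) -> int:
    """Return customer tenure in days.

    Assumption: a customer with no stop date is still active, so the
    tenure is measured up to the as_of date instead.
    """
    end = stop if stop is not None else as_of
    if end < start:
        raise ValueError("stop date precedes start date")
    return (end - start).days


# Hypothetical customers: (start date, stop date or None if still active)
customers = [
    (date(2020, 1, 1), date(2020, 3, 1)),
    (date(2020, 1, 1), None),
]
tenures = [tenure_days(s, e, as_of=date(2020, 1, 31)) if e is None
           else tenure_days(s, e, as_of=date(2021, 1, 1))
           for s, e in customers]
```

Tenures for active customers are only lower bounds on the eventual tenure, which is one reason a plain difference of dates gets more complicated in real data.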

Any reasonable database that purports to be about customers should have this data readily accessible. Of course, marketing databases are rarely simple. There are two challenges with these concepts. The first challenge is deciding on what is a start and stop, a decision that often depends on the type of business and available data. The second challenge is technical: finding these start and stop dates in available data may be less obvious than it first appears.

For subscription and account-based businesses, start and stop dates are well understood. Customers start magazine subscriptions at a particular point in time and end them when they no longer want to pay for the magazine. Customers sign up for telephone service, a banking account, ISP service, cable service, an insurance policy, or electricity service on a particular date and cancel on another date. In all of these cases, the beginning and end of the relationship are well defined.

Other businesses do not have such a continuous relationship. This is particularly true of transactional businesses, such as retailing, Web portals, and catalogers, where each customer's purchases (or visits) are spread out over time, or may be one-time only. The beginning of the relationship is clear: usually the first purchase or visit to a Web site. The end is more difficult but is sometimes created through business rules. For instance, a customer who has not made a purchase in the previous 12 months may be considered lapsed. Customer retention analysis can produce useful results based on these definitions. A similar area of application is determining the point in time after which a customer is no longer likely to return (there is an example of this later in the chapter).
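The 12-month lapse rule mentioned above can be sketched as follows. The function name and the way months are counted (a month counts as elapsed only once the same day of the month is reached) are illustrative assumptions; a real business rule may count differently.

```python
from datetime import date


def is_lapsed(last_purchase: date, as_of: date, months: int = 12) -> bool:
    """Return True if the customer made no purchase in the previous `months` months.

    Assumption: a month counts as elapsed only when as_of has reached
    the same day of the month as the last purchase.
    """
    elapsed = ((as_of.year - last_purchase.year) * 12
               + (as_of.month - last_purchase.month))
    if as_of.day < last_purchase.day:
        elapsed -= 1  # the current month is not yet complete
    return elapsed >= months
```

Under this rule, the date on which a customer becomes lapsed can serve as the stop date for a transactional business.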

The technical side can be more challenging. Consider magazine subscriptions. Do customers start on the date when they sign up for the subscription? Do customers start when the magazine first arrives, which may be several weeks later? Or do they start when the promotional period is over and they start paying?

Although all three questions are interesting aspects of the customer relationship, the focus is usually on the economic aspects of the relationship. Costs and/or revenue begin when the account starts being used (that is, on the issue date of the magazine) and end when the account stops. For understanding customers, it is definitely interesting to have the original contact date and time, in addition to the first issue date (are customers who sign up on weekdays different from customers who sign up on weekends?), but this is not the beginning of the economic relationship. As for the end of the promotional period, this is really an initial condition or time-zero covariate on the customer relationship. When the customer signs up, the initial promotional period is known. Survival analysis can take advantage of such initial conditions for refining models.

What a Retention Curve Reveals

Once tenures can be calculated, they can be plotted on a retention curve, which shows the proportion of customers that are retained for a particular period of time. This is actually a cumulative histogram, because customers who have tenures of 3 months are included in the proportions for 1 month and 2 months. Hence, a retention curve always starts at 100 percent.
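The cumulative nature of the curve can be illustrated with a short sketch. It assumes that tenures are given in whole periods (for example, months), that all customers started at the same time, and that every customer has been observed for at least `max_period` periods; the function name and the sample data are hypothetical.

```python
from typing import List, Sequence


def retention_curve(tenures: Sequence[int], max_period: int) -> List[float]:
    """Return the proportion of customers retained for at least p periods,
    for p = 0, 1, ..., max_period.

    A customer with a tenure of 3 counts toward periods 0, 1, 2, and 3,
    which is what makes the curve cumulative.
    """
    n = len(tenures)
    if n == 0:
        raise ValueError("at least one tenure is required")
    return [sum(1 for t in tenures if t >= p) / n
            for p in range(max_period + 1)]


# Four hypothetical customers with tenures of 1, 2, 3, and 3 months
curve = retention_curve([1, 2, 3, 3], max_period=4)
# curve == [1.0, 1.0, 0.75, 0.5, 0.0]
```

The first value is always 1.0 because every customer has a tenure of at least zero periods, and the values can never increase from one period to the next.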

For now, let's assume that all customers start at the same time. Figure 12.1, for instance, compares the retention of two groups of customers who started at about the same point in time 10 years ago. The points on the curve show the proportion of customers who were retained for 1 year, for 2 years, and so on. Such a curve starts at 100 percent and gradually slopes downward. When a retention curve represents customers who all started at about the same time, as in this case, it is a close approximation to the survival curve.

Differences in retention among different groups are clearly visible in the chart. These differences can be quantified. The simplest measure is to look at retention at particular points in time. After 10 years, for instance, 24 percent of the regular customers are still around, and only about a third of them even make it to 5 years. Premium customers do much better. Over half make it to 5 years, and 42 percent have a customer lifetime of at least 10 years.
