
Sequential Analysis Using Association Rules

Association rules find things that happen at the same time: which items are purchased together at a given time. The next natural question concerns sequences of events and what they mean. Examples of results in this area are:

New homeowners purchase shower curtains before purchasing furniture.

Customers who purchase new lawnmowers are very likely to purchase a new garden hose in the following 6 weeks.

When a customer goes into a bank branch and asks for an account reconciliation, there is a good chance that he or she will close all his or her accounts.

Time-series data usually requires some way of identifying the customer over time. Anonymous transactions cannot reveal that new homeowners buy shower curtains before they buy furniture. This requires tracking each customer, as well as knowing which customers recently purchased a home. Since larger purchases are often made with credit cards or debit cards, identifying the customer is less of a problem for those purchases. For problems in other domains, such as investigating the effects of medical treatments or customer behavior inside a bank, all transactions typically include identity information.

To run time-series analyses on your customers, there has to be some way of identifying them. Without a way of tracking individual customers, there is no way to analyze their behavior over time.

For the purposes of this section, a time series is an ordered sequence of items. It differs from a transaction only in being ordered. In general, the time series contains identifying information about the customer, since this information is used to tie the different transactions together into a series. Although there are many techniques for analyzing time series, such as ARIMA (a statistical technique) and neural networks, this section discusses only how to manipulate time-series data so that market basket analysis can be applied to it.

In order to use time series, the transaction data must have two additional features:

A timestamp or sequencing information to determine when transactions occurred relative to each other

Identifying information, such as account number, household ID, or customer ID that identifies different transactions as belonging to the same customer or household (sometimes called an economic marketing unit)
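As a minimal sketch of these two requirements, the following Python groups line-item records by a customer identifier and orders each customer's purchases by timestamp. The record layout (customer ID, purchase date, item), the function name build_sequences, and the sample rows are illustrative assumptions, not data from the text.

```python
from collections import defaultdict
from datetime import date

# Hypothetical line-item records: (customer_id, purchase date, item).
transactions = [
    ("C1", date(2024, 3, 1), "shower curtain"),
    ("C2", date(2024, 3, 2), "lawnmower"),
    ("C1", date(2024, 4, 15), "furniture"),
    ("C2", date(2024, 3, 20), "garden hose"),
]

def build_sequences(records):
    """Group records by customer and sort each group by timestamp."""
    by_customer = defaultdict(list)
    for customer_id, timestamp, item in records:
        by_customer[customer_id].append((timestamp, item))
    # Sorting (timestamp, item) tuples orders each history chronologically.
    return {cid: sorted(events) for cid, events in by_customer.items()}

sequences = build_sequences(transactions)
```

Without the customer ID the records could not be tied together at all, and without the timestamp they could be grouped but not ordered.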



Building sequential rules is similar to the process of building association rules:

1. All items purchased by a customer are treated as a single order, and each item retains the timestamp indicating when it was purchased.

2. Groups of items that appear together are then found in the same way as for ordinary association rules.

3. When developing the rules, consider only those in which the items on the left-hand side were purchased before the items on the right-hand side.

The result is a set of association rules that can reveal sequential patterns.
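The three steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it considers only rules with a single item on each side, it uses the time of each customer's first purchase of an item, and the names histories, sequential_rules, and min_support are hypothetical.

```python
from collections import Counter
from itertools import permutations

# Step 1 (assumed input): one combined "order" per customer,
# mapping each item to the day it was first purchased.
histories = {
    "C1": {"lawnmower": 1, "garden hose": 20},
    "C2": {"lawnmower": 5, "garden hose": 30, "gloves": 2},
    "C3": {"garden hose": 3, "lawnmower": 40},
    "C4": {"lawnmower": 7},
}

def sequential_rules(histories, min_support=0.5):
    """Return {(lhs, rhs): (support, confidence)} for rules 'lhs, then rhs'."""
    n = len(histories)
    item_counts = Counter()     # customers who bought each item
    ordered_counts = Counter()  # customers who bought lhs strictly before rhs
    for first_seen in histories.values():
        item_counts.update(first_seen.keys())
        # Step 2: look at pairs of items that appear together for a customer.
        for lhs, rhs in permutations(first_seen, 2):
            # Step 3: keep the pair only if lhs was purchased before rhs.
            if first_seen[lhs] < first_seen[rhs]:
                ordered_counts[(lhs, rhs)] += 1
    rules = {}
    for (lhs, rhs), count in ordered_counts.items():
        support = count / n
        if support >= min_support:
            rules[(lhs, rhs)] = (support, count / item_counts[lhs])
    return rules

rules = sequential_rules(histories)
```

With this toy data, the only rule that meets the support threshold is "lawnmower, then garden hose". Customer C3 bought both items but in the opposite order, so that customer does not count toward the rule.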

Lessons Learned

Market basket data describes what customers purchase. Analyzing this data is complex, and no single technique is powerful enough to provide all the answers. The data itself typically describes the market basket at three different levels. The order is the event of the purchase; the line items are the items in the purchase; and the customer connects orders together over time.

Many important questions about customer behavior can be answered by looking at product sales over time. Which are the best-selling items? Which items that sold well last year are no longer selling well this year? Inventory curves, which plot product sales over time, do not require transaction-level data. Perhaps the most important insight they provide is the effect of marketing interventions: did sales go up or down after a particular event?

However, inventory curves are not sufficient for understanding relationships among items in a single basket. One technique that is quite powerful is association rules. This technique finds products that tend to sell together in groups. Sometimes the groups themselves are sufficient for insight. Other times, the groups are turned into explicit rules: when certain items are present, we expect to find certain other items in the basket.

There are three measures of association rules. Support tells how often the rule is found in the transaction data. Confidence tells how often the "then" part is also true when the "if" part is true. Lift tells how much better the rule is at predicting the "then" part than having no rule at all.
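A small worked example may make the three measures concrete. The sketch below computes them for one rule over a toy set of baskets; the basket contents and the function name rule_measures are illustrative assumptions. Lift is computed as confidence divided by the fraction of baskets that contain the "then" part.

```python
# Hypothetical baskets, each a set of items.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"milk"},
]

def rule_measures(baskets, lhs, rhs):
    """Return (support, confidence, lift) for the rule 'if lhs then rhs'.

    Assumes lhs and rhs each occur in at least one basket.
    """
    n = len(baskets)
    n_lhs = sum(1 for b in baskets if lhs <= b)            # "if" part present
    n_rhs = sum(1 for b in baskets if rhs <= b)            # "then" part present
    n_both = sum(1 for b in baskets if (lhs | rhs) <= b)   # whole rule present
    support = n_both / n
    confidence = n_both / n_lhs
    lift = confidence / (n_rhs / n)
    return support, confidence, lift

support, confidence, lift = rule_measures(baskets, {"bread"}, {"butter"})
```

Here the rule "if bread then butter" holds in 2 of 4 baskets (support 0.5) and in 2 of the 3 baskets containing bread (confidence about 0.67). Butter appears in half of all baskets, so the lift is about 1.33: knowing that bread is in the basket makes butter more likely than it is overall.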

The rules so generated fall into three categories. Useful rules explain a relationship that was perhaps unexpected. Trivial rules explain relationships that are known (or should be known) to exist. And inexplicable rules simply do not make sense. Inexplicable rules often have weak support.



Market basket analysis and association rules provide ways to analyze item-level detail, where the relationships between items are determined by the baskets they fall into. In the next chapter, we'll turn to link analysis, which generalizes the idea of items linked by relationships, drawing on an area of mathematics called graph theory.


