
Hubs and Authorities in Practice

The strongest case for the advantage of adding link analysis to text-based searching comes from the marketplace. Google, a search engine developed at Stanford by Sergey Brin and Lawrence Page using an approach very similar to Kleinberg's, was the first of the major search engines to make use of link analysis to find hubs and authorities. It quickly surpassed long-entrenched search services such as AltaVista and Yahoo! The reason was qualitatively better searches.

The authors noticed that something was special about Google back in April of 2001, when we studied the web logs from our company's site, www.data-miners.com. At that time, industry surveys gave Google and AltaVista approximately equal 10 percent shares of the market for web searches, and yet Google accounted for 30 percent of the referrals to our site while AltaVista accounted for only 3 percent. Apparently, Google was better able to recognize our site as an authority for data mining consulting: it was less confused by the large number of sites that use the phrase "data mining" even though they actually have little to do with the topic.

Case Study: Who Is Using Fax Machines from Home?

Graphs appear in data from other industries as well. Mobile, local, and long-distance telephone service providers have records of every telephone call that their customers make and receive. This data contains a wealth of information about the behavior of their customers: when they place calls, who calls them, and whether they benefit from their calling plan, to name a few. As this case study shows, link analysis can be used to analyze the records of local telephone calls to identify which residential customers have a high probability of having fax machines in their home.

Why Finding Fax Machines Is Useful

What is the use of knowing who owns a fax machine? How can a telephone provider act on this information? In this case, the provider had developed a package of services for residential work-at-home customers. Targeting such customers for marketing purposes was a revolutionary concept at the company. In the tightly regulated local phone market of not so long ago, local service providers lost revenue from work-at-home customers, because these customers could have been paying higher business rates instead of lower residential rates. Far from targeting such customers for marketing campaigns, the local telephone providers would deny them residential rates, punishing them for behaving like a small business. For this company, developing and selling work-at-home packages represented a new foray into customer service. One question remained: Which customers should be targeted for the new package?



There are many approaches to defining the target set of customers. The company could effectively use neighborhood demographics, household surveys, estimates of computer ownership by zip code, and similar data. Although this data improves the definition of a market segment, it is still far from identifying individual customers with particular needs. A team, including one of the authors, suggested that the ability to find residential fax machine usage would improve this marketing effort, since fax machines are often (but not always) used for business purposes. Knowing who uses a fax machine would help target the work-at-home package to a very well-defined market segment, and this segment should have a better response rate than a segment defined by less precise segmentation techniques based on statistical properties.

Customers with fax machines offer other opportunities as well. Customers who are sending and receiving faxes should have at least two lines; if they have only one, there is an opportunity to sell them a second line. To provide better customer service, the customers who use faxes on a line with call waiting should know how to turn off call waiting to avoid annoying interruptions on fax transmissions. There are other possibilities as well: perhaps owners of fax machines would prefer receiving their monthly bills by fax instead of by mail, saving both postage and printing costs. In short, being able to identify who is sending or receiving faxes from home is valuable information that provides opportunities for increasing revenues, reducing costs, and increasing customer satisfaction.

The Data as a Graph

The raw data used for this analysis was composed of selected fields from the call detail data fed into the billing system to generate monthly bills. Each record contains 80 bytes of data, with information such as:

The 10-digit telephone number that originated the call: three digits for the area code, three digits for the exchange, and four digits for the line

The 10-digit telephone number of the line where the call terminated

The 10-digit telephone number of the line being billed for the call

The date and time of the call

The duration of the call

The day of the week when the call was placed

Whether the call was placed at a pay phone

In the graph in Figure 10.8, the data has been narrowed to just three fields: duration, originating number, and terminating number. The telephone numbers are the nodes of the graph, and the calls themselves are the edges, weighted by the duration of the calls. A sample of telephone calls is shown in Table 10.1.




Figure 10.8 Five calls link together seven telephone numbers.

Table 10.1 Five Telephone Calls

ORIGINATING NUMBER    TERMINATING NUMBER    DURATION
353-3658              350-5166              00:00:41
353-3068              350-5166              00:00:23
353-4271              353-3068              00:00:01
353-3108              555-1212              00:00:42
353-3108              350-6595              00:01:22
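
As a minimal sketch of how call records like those in Table 10.1 might be turned into a graph, the Python below builds a directed graph whose nodes are telephone numbers and whose edges are weighted by call duration in seconds. The names `CALLS`, `to_seconds`, and `build_call_graph` are illustrative and do not come from the original analysis. Summing the durations of repeated calls between the same pair of numbers is also an assumption made here, since the sample contains no repeated pairs.

```python
from collections import defaultdict

# The five calls from Table 10.1:
# (originating number, terminating number, duration as HH:MM:SS).
CALLS = [
    ("353-3658", "350-5166", "00:00:41"),
    ("353-3068", "350-5166", "00:00:23"),
    ("353-4271", "353-3068", "00:00:01"),
    ("353-3108", "555-1212", "00:00:42"),
    ("353-3108", "350-6595", "00:01:22"),
]


def to_seconds(hms):
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def build_call_graph(calls):
    """Return (graph, nodes) for a list of call records.

    graph[origin][destination] is the total duration in seconds of calls
    from origin to destination. Summing repeated calls is an assumption
    of this sketch, not something the source specifies.
    """
    graph = defaultdict(dict)
    nodes = set()
    for origin, destination, duration in calls:
        seconds = to_seconds(duration)
        graph[origin][destination] = graph[origin].get(destination, 0) + seconds
        nodes.update((origin, destination))
    return dict(graph), nodes


graph, nodes = build_call_graph(CALLS)
edge_count = sum(len(targets) for targets in graph.values())
print(len(nodes), "numbers linked by", edge_count, "calls")
```

A nested dictionary keeps the sketch free of external dependencies. With full call detail data, a graph library or a database table keyed on the two numbers would likely be a more practical choice.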

The Approach

Finding fax machines is based on a simple observation: Fax machines tend to call other fax machines. A set of known fax numbers can be expanded based on the calls made to or received from the known numbers. If an unclassified telephone number calls known fax numbers and doesn't hang up quickly, then there is evidence that it can be classified as a fax number. This simple characterization


