Up To Date Local Data (or not)

My partner Jason just told me about a Google Local search experience he and Ryan McIntyre had. They were out guitar shopping last weekend. Jason is looking for a custom shop Gibson ES-335 and can’t find one anywhere. (If you know where to find one, please email me.) They had planned a day of hitting every guitar store in Denver. On Google Local, they found a listing for Cadillac Guitars. It was near the Denver Guitar Center, so they decided to visit it.

They went to the address, but instead of a guitar shop, they found Rupps Drums. They figured that they must have made a mistake, because clearly this store had been around a while. After driving around the block and checking the web again, they were clueless and decided to go into the drum store. They asked about Cadillac Guitars and were informed that they moved out over 15 years ago.

15 years ago? Bad data – oops. Now – I can’t bash Google because Judy’s Book shows the same listing, as does Yahoo, as does Yelp, as does …  Someone needs to run a dedupe algorithm on addresses – oh – and fix the underlying data that everyone is using to seed their databases.
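For the dedupe half of that wish, here is a minimal sketch of the kind of pass I have in mind, assuming each listing is just a name and an address. The abbreviation table, the function names, and the sample listings are all made up for illustration; none of this is how Google or the data providers actually do it. Note that it only collapses duplicates. It does nothing for a listing that is 15 years stale.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; a real one would be far longer.
ABBREVIATIONS = {
    "street": "st", "avenue": "ave", "boulevard": "blvd",
    "road": "rd", "drive": "dr", "suite": "ste",
    "north": "n", "south": "s", "east": "e", "west": "w",
}

def normalize(text):
    """Lowercase, replace punctuation with spaces, and shorten common words."""
    words = re.sub(r"[^a-z0-9 ]", " ", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)

def dedupe(listings):
    """Keep the first listing seen for each normalized (name, address) pair."""
    groups = defaultdict(list)
    for listing in listings:
        key = (normalize(listing["name"]), normalize(listing["address"]))
        groups[key].append(listing)
    return [group[0] for group in groups.values()]

# Made-up sample data: the first three rows describe the same store.
sample = [
    {"name": "Example Guitars", "address": "123 Main Street"},
    {"name": "EXAMPLE GUITARS", "address": "123 Main St."},
    {"name": "Example Guitars", "address": "123 main st"},
    {"name": "Sample Drums", "address": "456 North Oak Avenue"},
]

unique = dedupe(sample)
```

Exact matching on a normalized key is the simplest thing that could work. Misspelled names or missing suite numbers would need fuzzy matching on top of it.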

  • From experience, I am positive that yellow pages data providers like, cough, Acxiom know full well that the databases they sell by size contain listings in triplicate.

    after all, if the data was clean, they could only charge 1/3 as much. Oops!

  • Ugh. I’m too familiar with the crappiness of this data. The two data providers have little incentive to improve their data quality (they come from the ‘consumer is secondary’ Yellow Pages world).

    I have to believe that the industry is ripe for disruption, but I haven’t figured out how it’ll be disrupted yet.

  • No kidding. Just yesterday I drove to a nonexistent carwash in Denver because Google said it was there. At least Google had an easy way to let them know the data was wrong. Speaking of which, I think the interface they’ve got going on Google Maps and other Googly places will encourage others to help correct the data. Still, Jimbo might be right. Search is broken.

  • More likely you should be blaming InfoUSA for this, as I believe they supply many, if not most, of these sites with their info. However, I’m still a bit surprised, because they and Acxiom keep up with various sources for change of address info, newly incorporated businesses, bankruptcies and so on. Perhaps there’s some slippage and a small number of business changes fall through the cracks.

  • It is simply impossible to keep up with every single business opening/move/closure.

  • P-Air – my experience with several of these data providers is that they have huge problems with the underlying source data (dupes and errors). The dupes are something the portal / search provider can solve for; the errors are much harder (but there should be a way for users to correct the data when they discover that it is wrong.) Unfortunately, in the cases where the data correction is available (e.g. Judy’s Book) there is no way to get it back to the source data provider in a consistent fashion (e.g. no open API.) Blech. There’s a huge data issue here that is being replicated throughout multiple services – big interesting opportunity (duh).

  • evansink

    Up to date, well-geocoded, small business listing data has been a tough problem for at least a decade. The key players, InfoUSA, Acxiom and Amacai all source their primary records from telecom directories. They all have various techniques for verification, enhancement and updating – ranging from merging with dozens of other digital source records to outbound tele-services.

    The major challenge goes back to the intent behind these data collection efforts – these databases are created for direct marketing database licensing. While licensing for consumer web services has become commonplace, the data collection methods have not evolved to align with consumer search needs. So, while the data is pretty high quality for print mail campaigns, and the providers have often substantially enhanced the files with executive contact names, etc., they do little to update the content value and currency for retail mom-and-pop stores, which have high-frequency churn.

    No easy fix – while the instinct is to light a Web 2.0 fire under the problem to collect, crawl, and refine the data, the lack of an up-to-date internet presence for small businesses is a major stumbling block. While many SMEs have websites, the majority have not been touched for a year or more, so the reliability of the data is problematic.

    The growing importance of consumer web services will certainly create motivation and incentive for business profile publishing, but it’s a very large problem. SMEs are not racing to the web; they are generally stuck in their own real world dealing with other problems and opportunities.

  • I tried AOL Local by doing a couple of searches and the results look like phone book listings with a setup for reviews in the future. The information is from InfoUSA. My question is: What’s so new about that? Name change? New Tab? etc. I’m not impressed with the following fact: 90% of all local search results on major search engines or portals are derived from the same directory information sources. The difference is how it’s ranked and displayed. WOW! So my next question is what has really changed for visitors that are searching locally in the last 3 to 5 years? Not phone book listings, that’s for sure. People don’t really need a computer for local phone numbers and a physical address. Let me know what you think.

  • Todd

    Check out freebase.com…they’re trying to solve this problem. (No, I don’t work there).