
Prominence of APIs == death of “Web Scraping”?

October 11, 07 by Bharani

I did some brief research on the sites that expose their information to the world (to other websites, that is). Leading websites such as Yahoo, Google, Technorati, Amazon, CNET, and eBay, to name a few, are at the forefront when it comes to exposing information through APIs.

What is the rationale for exposing information that is supposed to be an asset or a “Competitive Advantage”? The rationale becomes evident when you observe how the information is exposed and how that exposure drives traffic back to the originating site!

This article on Read/WriteWeb summarizes the thought and concept beautifully: the internet is transforming into one giant structured database…

ProgrammableWeb neatly summarizes the APIs from various websites. As of today, there are 523 APIs available.

In the Indian context, APIs haven’t caught on at a large scale, so the practice of “Web Scraping” looms large. The huge number of Yellow Pages companies boasting lakhs of business details illustrates the effect of “Web Scraping”. The same data problems show up at each Yellow Pages company for the simple reason that everyone copies the same set of data and repackages it. No one gives credit back to the other site.

When Bixee.com offered a one-stop search for all jobs in India by crawling and extracting information from all the leading job portals, including Naukri, Naukri reacted negatively by filing a lawsuit against Bixee. Later, Naukri realized the complementary nature of Bixee in generating traffic back to them and decided to expose the data through legal channels. If the information is scraped for the purpose of “Search and Redirection” to the original site, “Scraping” has a positive effect.
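
To make the “Search and Redirection” idea concrete, here is a minimal Python sketch of how an aggregator might keep only a short title in its own index and send every click back to the originating site. This is an illustration under my own assumptions, not how Bixee or any real service is implemented: the names `Listing`, `search`, and `redirect_url`, the `ref` query parameter, and the example URLs are all hypothetical.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


@dataclass(frozen=True)
class Listing:
    """One entry in the aggregator's index (hypothetical structure)."""
    title: str        # short text shown on the aggregator's result page
    source_site: str  # name of the site that owns the full record
    source_url: str   # where the full record lives


def search(index: List[Listing], term: str) -> List[Listing]:
    """Case-insensitive title match over the aggregator's own index."""
    needle = term.lower()
    return [item for item in index if needle in item.title.lower()]


def redirect_url(listing: Listing, referrer: str) -> str:
    """Return the original URL with a referrer tag appended, so the
    originating site can see the traffic the aggregator sends back.
    The 'ref' parameter name is an assumption for this sketch."""
    parts = urlsplit(listing.source_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("ref", referrer))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Usage: the visitor searches on the aggregator, but the click lands
# on the site that owns the data.
index = [
    Listing("Java Developer, Chennai", "example-jobs",
            "https://jobs.example.com/view?id=42"),
    Listing("Accountant, Pune", "example-jobs",
            "https://jobs.example.com/view?id=43"),
]
for hit in search(index, "java"):
    print(hit.title, "->", redirect_url(hit, "aggregator"))
```

The point of the sketch is the last step: the aggregator never serves the full record itself, so the original site gets both the visit and the credit.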

I cannot recall a single service that has succeeded through “Web Scraping” without redirecting back to the original site.
