MIH India Launches OneFamily

October 29, 07 by Bharani

OneFamily

MIH India has launched a family networking product called OneFamily, which aims to provide an online collaboration and socialization platform for family members and close friends. Users can build their family tree, add anecdotes about family members, share photos and messages with family members, and announce events and important occasions.

Enabling users to handle their family tree easily and intuitively is not an easy job, and OneFamily has made it as simple as possible. I really liked the “Take a Tour” section, which summarizes what the product is all about.

Text Books Online!

October 14, 07 by Bharani

The Department of School Education, Tamil Nadu has done an excellent job. They have made all the textbooks (State Board syllabus) from Standard 1 to Standard 12 available in PDF form. Each chapter is available as a separate download. The books are available in both English and Tamil. These kinds of initiatives from the government bring a great deal of benefit to students.

Going through the books of Standard 11 and 12 brought back memories of the good old days!

The syllabus has changed. Working with Office documents (Excel, Word, Presentation), databases and object-oriented programming are emphasized in the current Computer Science syllabus. It was BASIC and FORTRAN programming during my days (1995) :)

Prominence of APIs == death of “Web Scraping”?

October 11, 07 by Bharani

I did some brief research on the sites that are exposing their information to the world (to other websites, that is). Leading websites such as Yahoo, Google, Technorati, Amazon, CNET and eBay, to name a few, are at the forefront when it comes to exposing information through APIs.

What is the rationale for exposing information that is supposed to be an asset or a “Competitive Advantage”? The rationale is pretty evident when you observe the way the information is exposed and the way the exposure drives traffic back to the originating site!

This article on Read/WriteWeb summarizes the thought and concept beautifully: the internet is transforming into one giant structured database…

ProgrammableWeb neatly summarizes the APIs from various websites. As of today, there are 523 APIs available.

In the Indian context, APIs haven’t caught on at a big scale, so the practice of “Web Scraping” looms large. The huge number of Yellow Pages companies boasting lakhs of business details illustrates the effect of “Web Scraping”. The same data problems can be seen with each Yellow Pages company, for the simple reason that everyone copies the same set of data and repackages it. No one gives credit back to the other site.

When Bixee.com offered a one-stop search for all jobs in India by crawling and extracting information from all the leading job portals, including Naukri, Naukri reacted negatively by filing a lawsuit against Bixee. Later, Naukri realized the complementary nature of Bixee in generating traffic back to them and decided to expose the data through legal channels. If the information is scraped for the purpose of “Search and Redirection” to the original site, “Scraping” has a positive effect.

I cannot remember a single service that is successful through “Web Scraping” without redirecting back to the original site.

Better way to search Trains!

October 10, 07 by Bharani

One of my colleagues mentioned a mash-up called eRail.in, a service for searching trains in a better way. The site simply hides the difficulties and the unpleasant user experience that one faces on the official website of Indian Railways.

The way eRail.in accomplishes that is simple. Most of the information about Indian trains is static: train numbers, station codes, timings, routes, prices, classes and so on. So they have scraped the data and made the search experience faster. “Availability” and the like come into the picture only after narrowing down the aforementioned parameters. At that point, the site simply connects you to the Indian Railways website.
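The idea of searching a local copy of the static data and handing over to the official site only for live details can be sketched roughly as below. This is only an illustration under stated assumptions, not how eRail.in is actually built: the train numbers, names, station codes and the `railways.example` URL are all made-up placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Train:
    number: str
    name: str
    stops: tuple  # station codes in route order


# Hypothetical snapshot of static data; a real one would hold scraped records.
TRAINS = [
    Train("00001", "Example Express", ("AAA", "BBB", "CCC")),
    Train("00002", "Sample Mail", ("CCC", "BBB", "AAA")),
    Train("00003", "Demo Passenger", ("AAA", "CCC")),
]


def find_trains(trains, source, destination):
    """Return trains that call at `source` before `destination`."""
    matches = []
    for train in trains:
        stops = train.stops
        if source in stops and destination in stops:
            if stops.index(source) < stops.index(destination):
                matches.append(train)
    return matches


def availability_link(train, base_url="https://railways.example/availability"):
    """Build a placeholder link that hands the user over to the official site."""
    return f"{base_url}?train={train.number}"


# The search runs entirely on the local snapshot; only the final link leaves it.
for train in find_trains(TRAINS, "AAA", "CCC"):
    print(train.number, train.name, availability_link(train))
```

Because the search never touches the official site, it stays fast; the slow, dynamic part (availability) is deferred to a single redirect at the end.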

Unlike airline fares, railway fares remain fairly static. So there is no overhead of real-time fare extraction, as there is when fares have to be pulled from different airlines (like ixigo does!).

I couldn’t figure out the person (or people) behind it. My guess is a techie who did this to exhibit his mash-up skills! Nevertheless, it is a good utility (given the relatively bad experience on the Indian Railways websites, Train Enquiry & IRCTC).

UPDATE: 5Map is the company behind this initiative

People Search!

September 29, 07 by Bharani

I realized that Linkedin.com, a leading professional networking platform, and Orkut.com, a leading social networking site, are working towards exposing their respective data through APIs. This is great news for “People Search Engines” like Pipl.com and Spock.com. Until now, these engines have been crawling all the public URLs of social networking sites, name databases and blogs (Technorati does host the details of the people behind registered blogs). Even with this limited information extraction, the quality of search has been good. Once the social networking sites expose their data, the quality of search should improve considerably because of the rich and structured information available.

Try a search against your name in Pipl.com or Spock.com.

Break….

September 26, 07 by Bharani

What I have done to my blog is absolute injustice. A post a week, a fortnight or even a month would have been acceptable. Anyway, this time the lack of posts is not attributable to a busy schedule, but to a lack of ‘ideas’ to ponder and, to an extent, a lack of interest.

I promised myself at the beginning of the year that I would publish 150 posts…I have hardly published 15 posts. Just 10% of the target! But fortunately the game is not over yet :) Unfortunately the equation is tough: with 3 months to go, I have 135 posts to publish..

A few updates from my side…I have decided to take a break from my career after long thought. Now that DWAAR is launched, there is no better time for a logical exit. I will spend 2-3 months before choosing and accepting my next responsibility. As of now, I will try to do the things that I wanted to do but couldn’t…Try my hand at new things, Travel, Fitness, Family, …and of course blogging…

Looking forward to the break in a month’s time.

DWAAR Enhanced

August 14, 07 by Bharani

Over the last 2 months, we have worked behind the scenes to significantly enhance DWAAR. The new version of DWAAR will be released online by this evening. Watch out! More later….

Human-powered Search!

July 25, 07 by Bharani

Killerstartups.com, a user-driven internet start-up community, profiles a whopping 249 search-related products! There are various flavours of Search engines originating from different parts of the globe.

Wikipedia, the free encyclopedia, lists around 100 search engines. It’s interesting to see the way they have classified these search engines into 29 categories.

Of late, I have been fascinated by the concept of human-powered search engines like www.Mahalo.com & www.chacha.com. They focus on serving high-quality, relevant results using human intelligence. This is accomplished through a people-centric approach, whereby experts hand-pick the best sites and difficult-to-find sites under each category and add them to the search engine. This approach clearly overcomes a problem that mega search engines like Google, Yahoo et al. pose. For example, Google’s PageRank algorithm keeps new, niche, less SEO’ed and not-so-often-quoted sites from figuring in the top results. Many a time, such sites turn out to be the most relevant ones.

Until “artificial intelligence”, “machine learning”, “natural-language processing”, “neural networks” and other esoteric concepts gather some traction and shape, the human-powered approach is a dependable way of solving the problem. But questions will be raised, like “How do you scale it up?”, “How many years and how many people do you need to accomplish this?” and “How will you eliminate the subjectivity and bias of the experts?”. This is where I think Mahalo and Chacha are doing a good job. They invite users/volunteers to hand-pick the results. They also have a good reputation system, which enforces the sanctity and validity of user contributions. The focus is on the top search terms, with long-tail terms added as and when demand is seen. The approach eliminates spam sites, copy-paste sites, SEO-oriented blogs et al. The game of “online advertising and marketing” has clearly spoilt the quality of search results on the major search engines. It is refreshing to see search engines that are powered by humans and that are clean of advertising gamers.

Mahalo.com: Mahalo is the world’s first human-powered search engine powered by an enthusiastic and energetic group of Guides.

Chacha.com: The first search engine that uses the brainpower of really smart people to find anything you want on the Internet.

More on DWAAR…

June 12, 07 by Bharani

DWAAR.com is available for public view now, but under “Labs”.

I am handling the “cities” vertical and, to an extent, the “Restaurants” vertical as well. Both of these verticals have similar requirements as far as product features and data are concerned. Both of them face common problems too! There are two key aspects of this product that should come out very strongly, but don’t: the Search quality (the ability to pull up relevant results intelligently based on the query keywords; the efficiency lies in avoiding both false positives and false negatives!) and the Data quality.
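To make “false positives” and “false negatives” concrete, here is a minimal sketch that scores one query’s results against a hand-labelled set of relevant records. The record ids and the `search_quality` helper are hypothetical and are not taken from DWAAR; precision and recall are the standard measures that correspond to the two kinds of error.

```python
def search_quality(returned, relevant):
    """Compare the ids a search returned with the ids judged relevant."""
    returned, relevant = set(returned), set(relevant)
    true_positives = returned & relevant
    false_positives = returned - relevant   # shown, but not relevant
    false_negatives = relevant - returned   # relevant, but not shown
    precision = len(true_positives) / len(returned) if returned else 0.0
    recall = len(true_positives) / len(relevant) if relevant else 0.0
    return {
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        "precision": precision,
        "recall": recall,
    }


# Hypothetical query: four records returned, three records actually relevant.
report = search_quality(
    returned=["r1", "r2", "r3", "r4"],
    relevant=["r1", "r2", "r5"],
)
print(report)
```

Here two of the four returned records are irrelevant (false positives) and one relevant record is missing (a false negative), so precision is 0.5 and recall is 2/3. Good search quality means pushing both numbers towards 1.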

Though the Search quality is at manageable levels and can be solved in the short term at the technology level, the Data quality still remains an unsolved mystery. The following are some of the major problems that we face with the Data:

1. Inaccurate Data: Wrong phone numbers, wrong addresses, wrong names etc.
2. Misclassification of Data: Business records being classified into wrong categories.
3. Lack of Depth: The records do not have enriched information. Examples: What products or services do the businesses offer? What are the special features?
4. Lack of Coverage: There aren’t sufficient records for some categories and cities.
5. Incomplete Data: Necessary fields are missing.
6. Duplication of Data.
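Two of the problems above, incomplete records and duplicates, lend themselves to simple automated checks. The sketch below is only an illustration: the field names, the sample records and the matching rule (normalized name plus phone digits) are assumptions, not DWAAR’s actual schema or method.

```python
import re

# Assumed set of mandatory fields; a real schema would define its own.
REQUIRED_FIELDS = ("name", "phone", "address", "category")


def missing_fields(record):
    """List the required fields that are absent or blank in a record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f) or "").strip()]


def dedup_key(record):
    """Normalize name and phone so trivially different copies compare equal."""
    name = re.sub(r"[^a-z0-9]", "", (record.get("name") or "").lower())
    phone = re.sub(r"\D", "", record.get("phone") or "")
    return (name, phone)


def find_duplicates(records):
    """Return (first_index, duplicate_index) pairs for records sharing a key."""
    seen = {}
    duplicates = []
    for i, record in enumerate(records):
        key = dedup_key(record)
        if key in seen:
            duplicates.append((seen[key], i))
        else:
            seen[key] = i
    return duplicates


# Made-up sample records.
records = [
    {"name": "Sample Bakery", "phone": "000-111-2222",
     "address": "1 Example Street", "category": "Bakery"},
    {"name": "SAMPLE BAKERY ", "phone": "0001112222",
     "address": "", "category": "Bakery"},
    {"name": "Demo Tailors", "phone": "",
     "address": "2 Example Street", "category": "Tailor"},
]

for i, record in enumerate(records):
    print(i, missing_fields(record))
print(find_duplicates(records))
```

Checks like these catch only the mechanical problems. Inaccurate or misclassified data still needs verification against a trusted source, which is why it is the harder part.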

Our immediate goal is to rectify the above problems and we are doing it earnestly through various approaches.

The problems mentioned above are industry-wide, especially in India. That doesn’t justify the problems that we have in the product, but at least it gives us a chance to be a differentiated product in this space (with a high focus on data quality).

I am so desperate to increase the high-quality data corpus for “local” that I am digitizing my own visiting cards and those of my friends during my free time. Nowadays, when I go shopping, I don’t forget to collect the visiting card. The businesses/vendors/service providers that I come across will definitely be available in DWAAR in a few weeks’ time :) Once this data gets into Production, I won’t even have to worry about missing visiting cards. I can just go to DWAAR and search for the information :)…I definitely take care not to digitize personal visiting cards (of working professionals)! If any of you would like to add good businesses/vendors to the database, please ping me (@ bharani at gmail.com)…I would be happy to talk with you and very grateful for your help.

I envision that “DWAAR” would be the place to go for finding any business and associated contact details for Indians.

Internal Launch

April 09, 07 by Bharani

The product that we have been working on for the past few months was finally demonstrated to the South African top management. We call this the “Internal Launch”. The reception to the demo was very positive, and the team was excited and upbeat.

There are still issues to be sorted out. There are still changes/features to be implemented. But the product is shaping up well. We will be launching the product shortly as a Public Beta. The marketing plans are being charted out. Since this is my first involvement with a commercial internet product, I am excited and looking forward to all the action.

Commercial-grade internet products appear simple and easy to manage from the outside. But the amount of technology involved is huge, especially on the hardware, network and data center side. An excellent Technology Operations team is key to the success of an internet product. Not just that, there are umpteen other issues to be taken care of. The devil is definitely in the details!

Scalability [Ability to accommodate more traffic, more users, more features, more data], Performance [Ability to load pages faster, do transactions faster], Ease of use [Ability to complete the intent with minimal effort and with little learning from end-users], Tolerance to Failure [Fault-tolerance, graceful degradation, business continuity etc.] and Discoverability [Ability to be identified by Search Engines (SEO)] are some of the things that must be thought through!

The Business Development aspect of our product merits in-depth scrutiny. I will dedicate a post to this later.