Search engines meet data-mining applications

Submitted by Karthik on 15 March, 2004 - 22:24

I came across Vivisimo yesterday, and was quite impressed by its clustering system. With my curiosity piqued, I googled (sic) around for other sites that offered new and improved searching capabilities, and was (pleasantly) surprised by what was on offer :) Read on for a list of what's out there, and my (rather brief) opinions on each of them.

  • Vivisimo: Probably has the most Google-esque interface, with fast load times and a purely text-based approach. From what I can tell, Vivisimo, like most new-breed search sites, does more cataloging/grouping than actual indexing of pages. That bit is left to third-party sites such as Lycos, MSN et al. Surprisingly, though, the results are great. What might have needed about 3-4 refined search queries in Google is usually reduced to a single click, as your query is broken down, grouped into separate sections, and listed sequentially. This works especially well when you submit an ambiguous query.

    Pros: Fast and efficient, with a clean, user-friendly layout.

    Cons: High dependence on third-party sites, and a surprising number of sponsored listings, which are probably scraped from Overture.

  • Grokker: This is the software that has been promising much better results than Google for a while now, and I decided to give the (whopping) 19.2 MB trial download a go. While Vivisimo is purely text-based, Grokker follows a highly visual mode of cataloging/mapping. The interface is quite intuitive for an experienced user, but I'm not entirely certain that neophytes will be able to use it effectively. I personally found that, if used as recommended, Grokker shaves off a lot of the time that you would spend refining your searches in Google; however, it takes an unbelievable amount of time to download its results and catalog them, and it takes a big bite out of your CPU. That's definitely not what you want happening at all.

    Pros: Very intuitive, innovative and logical. Good fun too :) There also seem to be a lot more "advanced" features that I didn't test out, such as the mapping systems, various plug-ins etc.

    Cons: Extremely slow; high CPU usage; 19.2MB download; prohibitively expensive at USD 49 (minimum).

  • Kartoo: Kartoo has been around for a couple of years now, and caught my eye initially as it is written entirely in Flash. It essentially seems to be a less intensive, web-based form of Grokker, but with a mixture of text and visual interfaces. Both of them have a similar cataloging system using visual and interactive maps. While Kartoo makes excellent use of Flash in accomplishing this, it is held back by Flash's inherent limitations.

    Pros: Besides the advantages of cataloging and mapping systems, it offers visual elements similar to Grokker's on a web-based system, without the 19.2MB download; reasonably fast, with a simple interface.

    Cons: Inherent limitations of any intensive implementation using Flash - the processor gets hammered; while the interface works, it is very tacky; still have to wait a while to get your results.

  • Mooter: While Grokker requires a download and Kartoo requires Flash, Mooter offers something similar in a standard browser (with heavy use of JavaScript, though). Still in its beta stages, Mooter doesn't seem to be very intensive, producing far fewer results than the other options.

    Pros: Text based; simple and user-friendly interface, with a clean "clustering" map that increases its ease of use.

    Cons: Not intensive enough for some purposes; Still have to wait a while to get your results.

  • Eurekster: Another product still in its beta stages, Eurekster moves away from the grouping/mapping/cataloging/clustering (sic) concept, and towards a more personalized system, by keeping track of your selections and weighting your choices. Another innovative idea is that similar people will need similar results. Therefore, a network of people (e.g. your work colleagues) can influence the search results of other members of the network (by way of the aforementioned tracking system). For example, Flash might mean Macromedia Flash to web designers, and DC Comics' Flash (the superhero) to comic book enthusiasts.

    Pros: Text based; Clean and user-friendly interface; Interesting and possibly useful concepts, which might make Eurekster invaluable for some people.

    Cons: Requires signups, logins and all that muck to be used effectively; Lack of the grouping concept (especially if you don't have/use any friends :P) puts it on the same rung as Google; Google however is aeons ahead and produces far more comprehensive results.

  • Dipsie: I came across these guys in my web logs. As per their website: "Our goal is to deliver more relevant results from a more complete index of the Web and enable our users to find what they're looking for within 2 clicks."

    Pros: N/A

    Cons: N/A
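
To make two of the ideas above a bit more concrete, here is a rough Python sketch of (a) grouping results from a third-party engine under a shared term, in the spirit of what Vivisimo does, and (b) re-ordering results by what other members of your network clicked on, in the spirit of Eurekster. None of this is how either site actually works: the sample results, the example.com URLs, the stopword list and the function names are all made up for illustration, and the "clustering" is just a naive count of shared title words.

```python
from collections import Counter, defaultdict

# Hypothetical (title, url) pairs, as a third-party engine might return them
# for the ambiguous query "flash". Purely illustrative data.
RESULTS = [
    ("Macromedia Flash tutorial for web designers", "http://example.com/a"),
    ("Flash comics issue guide", "http://example.com/b"),
    ("Macromedia Flash animation tips", "http://example.com/c"),
    ("The Flash: DC comics hero", "http://example.com/d"),
]

# A tiny, assumed stopword list; a real system would need a far larger one.
STOPWORDS = {"the", "for", "a", "of", "and"}


def tokens(title, query):
    """Lower-cased title words, minus stopwords and the query term itself."""
    words = [w.strip(":,.").lower() for w in title.split()]
    return {w for w in words if w and w not in STOPWORDS and w != query.lower()}


def cluster_results(results, query):
    """Group URLs under the title term that is shared by the most results."""
    freq = Counter()
    for title, _ in results:
        freq.update(tokens(title, query))

    clusters = defaultdict(list)
    for title, url in results:
        terms = tokens(title, query)
        # Most widely shared term wins; ties are broken alphabetically.
        label = min(terms, key=lambda t: (-freq[t], t)) if terms else "other"
        clusters[label].append(url)
    return dict(clusters)


def rerank_by_network(results, clicks):
    """Put results that the user's network clicked most often first.

    `clicks` maps a URL to the number of times network members chose it.
    sorted() is stable, so results nobody clicked keep their original order.
    """
    return sorted(results, key=lambda r: -clicks.get(r[1], 0))


if __name__ == "__main__":
    for label, urls in cluster_results(RESULTS, "flash").items():
        print(label, urls)

    # Assumed click history for a network of comic book enthusiasts.
    comic_fans = {"http://example.com/d": 3, "http://example.com/b": 1}
    for title, url in rerank_by_network(RESULTS, comic_fans):
        print(title, url)
```

With the sample data, the ambiguous "flash" query splits into a "macromedia" group and a "comics" group, which is the single-click refinement described for Vivisimo. The same result list, re-ranked with the comic fans' click counts, puts the superhero pages first; a network of web designers with a different click history would see the Macromedia pages on top instead.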

However, all said and done, for an average internet user, nothing comes even vaguely close to the power and reliability of Google. But I'm glad to see the competition making some innovative advances. Hopefully, Google will introduce its own comprehensive grouping system, rather than relying on its present rudimentary directory system. But for the moment, Google is still the Numero Uno in search engines. For the moment...