Hi, I’m Brad Feld, a managing director at the Foundry Group who lives in Boulder, Colorado. I invest in software and Internet companies around the US, run marathons and read a lot.

The Growth of Lucene

Comments (15)

All of a sudden I’m seeing Lucene (and Nutch and Solr) being adopted by a bunch of my companies.  While several have been using it for a while, it’s spreading rapidly.  I’m curious about the experience folks have had with it – everyone I’ve talked to is thrilled; I’ve yet to hear from someone that said “it didn’t work for me and I decided to use ‘X’ instead.”  Any feedback?

  • http://15meanings.com Will

    Brad,
    We have used it for a couple of projects and it works extremely well. It let us index data that we previously could not access. I do not know all the tech details, but from a product guy's side of things, it works well!

    Will

  • Ian Spivey

    I used Lucene for a bioinformatics tool platform back in 2004 and it worked very well. There was a bit of a learning curve from a developer’s perspective, which I imagine has flattened after a couple years of documentation. Saved a lot of time and effort overall, however.

  • Josh

    At IBM we’ve incorporated Lucene into our no-charge enterprise search offering: http://omnifind.ibm.yahoo.com/productinfo.php

  • Chris

    There’s not really anything else you could use instead :-) The Lucene API isn’t the most, shall we say, intuitive, but it gets the job done.

  • http://fishdujour.typepad.com Gavin

    Lucene is the de-facto freetext indexing toolkit in the Java space.

    By itself it only accepts text. This means you have to use adapters to extract text from files (e.g. Word documents). Typically the Jakarta POI project is recommended (for most common MS doc formats), but it doesn’t seem too active these days so I would have concerns with how it would play with Office Vista docs for example. The commercial offerings have good document format support, but are not friendly from a dev point of view (their Java toolkits suck – at least the ones I’m familiar with).

  • http://blog.simpy.com/ Otis Gospodnetic

    Brad:
    Lucene is good, if I may say so. It’s been around for nearly a decade. One of the companies you invested in (and I worked at) circa 2000 used Lucene – Neomeo. I make *heavy* use of Lucene in Simpy.

    Josh:
    Not really true – you could use a number of similar (and free and/or open-source) tools – have a look at http://www.simpy.com/user/otis/search/retrieval for some of them. You will also find a few alternatives mentioned in my Lucene in Action book – see http://lucenebook.com/ .

    Regarding the API not being friendly – please provide more feedback. I’m one of the Lucene developers, so I’m curious about what’s not intuitive.

  • http://blog.simpy.com/ Otis Gospodnetic

    Another one for Josh and anyone else interested in Lucene and its alternatives:
    http://searchcafe.blogspot.com/2007/03/open-source-search-engines-in-java-and.html

  • Peter Delahunty

    You might also be interested in http://www.opensymphony.com/compass/

    This sits on top of Lucene and ORM frameworks like Hibernate. Basically it gives you Lucene search for your database. So when you search, it returns objects through Hibernate, etc. Also, when you update the domain model through Hibernate, it updates the search index.

    If Lucene is like JDBC, then Compass is like Hibernate for search.

  • http://feh.holsman.net Ian Holsman

    You might also want to look at some of the ‘newer’ technologies coming out of the same space.

    Hadoop http://lucene.apache.org/hadoop a framework for running apps on large clusters of hardware

    UIMA – http://incubator.apache.org/uima/ a framework for analyzing large amounts of unstructured data

  • http://www.clickability.com Jeff Freund

    Lucene is a fantastic project – it is the core search technology within our SaaS Web Content Management platform. We have deployed Lucene in the form of standalone “search servers” – like Solr, but built long before Solr was out. This provides a service-based approach for both in-application search and published website search, and it is excellent for both.

    This replaced a database-based full-text search, providing a much higher degree of performance, scalability, customization and flexibility.

  • mnorusis

    Even a mid-level Java developer can do some great stuff with Lucene. In the hands of a more advanced Java developer, you can get just about anything you could need out of an indexer.

    We love its flexibility.

  • Jeff Rodenburg

    We have gone through both Lucene.Net and Solr, mixing and matching the technologies as we needed them. We’re eventually migrating everything to Solr, as it keeps us very flexible and lets us take advantage of a bigger community on the Java/Linux side of Lucene than what’s presently found on the .NET/Windows side.

  • Robert Selvaraj

    SearchBlox is an enterprise search product that uses Lucene. SearchBlox provides out-of-the-box search capabilities. http://www.searchblox.com/

  • http://loudervoice.com Conor O'Neill

    We just rebuilt our search backend for the LouderVoice launch using Lucene + PyLucene. It's working out very well so far for review search. The language stemming and multi-lingual capabilities really help us hugely.
