What is Lucene

  • Field.Keyword – The data is stored and indexed but not tokenized. This is most useful for data that should be stored unchanged such as a date. In fact, the Field.Keyword can take a Date object as input.
  • Field.Text – The data is stored, indexed, and tokenized. Field.Text fields should not be used for large amounts of data such as the article itself because the index will get very large since it will contain a full copy of the article plus the tokenized version.
  • Field.UnStored – The data is not stored but it is indexed and tokenized. Large amounts of data such as the text of the article should be placed in the index unstored.
  • Field.UnIndexed – The data is stored but not indexed or tokenized. This is used with data that you want returned with the results of a search but you won’t actually be searching on this data. In our example, since we won’t allow searching for the URL there is no reason to index it but we want it returned to us when a search result is found.
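The four field types above boil down to three yes/no flags: stored, indexed, tokenized. Here is a small sketch that only models those flags as plain data so the combinations are easier to compare. It is not the Lucene API; `FieldFlags`, `FIELD_TYPES`, `searchable`, and `returned_with_hits` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldFlags:
    """Hypothetical model of what happens to a field's data (not a Lucene class)."""
    stored: bool     # the original value is kept and returned with search hits
    indexed: bool    # the value can be searched on
    tokenized: bool  # the value is split into terms before indexing


# Flags exactly as described in the list above.
FIELD_TYPES = {
    "Keyword":   FieldFlags(stored=True,  indexed=True,  tokenized=False),
    "Text":      FieldFlags(stored=True,  indexed=True,  tokenized=True),
    "UnStored":  FieldFlags(stored=False, indexed=True,  tokenized=True),
    "UnIndexed": FieldFlags(stored=True,  indexed=False, tokenized=False),
}


def searchable(field_type: str) -> bool:
    """True if data in this field type can be matched by a query."""
    return FIELD_TYPES[field_type].indexed


def returned_with_hits(field_type: str) -> bool:
    """True if the original value comes back with a search result."""
    return FIELD_TYPES[field_type].stored
```

For the article example, the URL would go in an `UnIndexed` field (`returned_with_hits` is true, `searchable` is false) and the article body in an `UnStored` field (the reverse).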

Specifying Search Criteria

Lucene supports a wide array of possible searches, including AND, OR, and NOT, fuzzy searches, proximity searches, wildcard searches, and range searches. Let’s take a look at a couple of examples:

Find all of Professor Henry’s articles that contain relativity and quantum physics:

author:Henry relativity AND "quantum physics"

Find all the articles that contain the phrase “string theory” and don’t contain Einstein:

"string theory" NOT Einstein

Find all the articles that contain Kepler within five words of Galileo:

"Galileo Kepler"~5
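To get a feel for what "within five words" means, here is a rough sketch that checks word distance in plain text. It is a simplified reading of the proximity idea, not how Lucene actually computes it; `within_n_words` is a hypothetical helper, and splitting on `\w+` is an assumption about tokenization.

```python
import re


def within_n_words(text: str, first: str, second: str, n: int) -> bool:
    """Return True if `first` and `second` occur at most n word positions apart."""
    # Naive tokenization: lowercase, keep runs of word characters.
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    # Compare every occurrence of one word against every occurrence of the other.
    return any(abs(i - j) <= n for i in first_positions for j in second_positions)
```

With this helper, a sentence like "Galileo and later Kepler studied the planets" passes for n = 5, while a text where the two names are eight words apart does not.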

Find all the articles that Professor Johnson wrote in January of this year:

author:Johnson date:[01/01/2004 TO 01/31/2004]
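Typing a date range by hand gets old fast, so here is a minimal sketch that builds the same kind of query string for any month. It only does string formatting and does not call Lucene; `month_range_query` is a hypothetical helper, and the MM/DD/YYYY format is simply copied from the example above (whether a given index accepts it depends on how the dates were indexed).

```python
import calendar


def month_range_query(author: str, year: int, month: int) -> str:
    """Build an 'author + whole month' query string in the style shown above."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    start = f"{month:02d}/01/{year}"
    end = f"{month:02d}/{last_day:02d}/{year}"
    return f"author:{author} date:[{start} TO {end}]"


print(month_range_query("Johnson", 2004, 1))
# prints: author:Johnson date:[01/01/2004 TO 01/31/2004]
```

Using `calendar.monthrange` means the end date stays right for short months and leap years without hard-coding day counts.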

Reading the last part, there's one sentence that's just too funny: Like many of the Jakarta projects, the documentation for Lucene is not very good, but with a little trial and error you should be able to get Lucene working.


Working on all this indexing and searching stuff, I've now learned there's Jakarta Lucene, there's Tokyo Cabinet, so how come Hanoi doesn't have anything ;)), something like Hanoi Teen Style, say =))

A buddy who works with me, Phùng Văn Huy, was messing around the other day and put the line “hacked by huyphungvan :-s” on the site, then forgot about it a few days later. Yesterday Giang came into the office and told us that while he was on the road, our partner at VDC phoned to ask “did the website get hacked?”. Giang asked “who hacked it?”, and the answer was “some guy called huy phùng văn” =))

And then, if you hit the upload button without choosing a file, a message pops up saying “go on and pick a file, my good friend” =)), plus smiley faces and pouty faces all over the site's messages =))

We're building a site for VDC, but it feels like building a teen site for '90s kids =))

PS: it's all been fixed now, so if you go test the site you won't see any of those fun bits anymore ;))
