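I would use a SpanFirstQuery, which matches terms near the beginning of a field. Like all span queries, it relies on term positions, which Lucene indexes by default.

Let's test it on its own: you only need to provide your SpanTermQuery and the maximum position at which the term may be found (one in my example).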

    SpanTermQuery spanTermQuery = new SpanTermQuery(new Term("title", "lucene"));
    SpanFirstQuery spanFirstQuery = new SpanFirstQuery(spanTermQuery, 1);

Given your two documents, this query finds only the first one, titled "Lucene: Homepage", provided you analyzed it with the StandardAnalyzer.
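
To make the position rule concrete, here is a minimal plain-Java sketch of what "the term must end within the first N positions" means. It is not how Lucene implements SpanFirstQuery: the class FirstPositionSketch, the method matchesNearStart, and the naive tokenizer (lowercase, split on anything that is not a letter or digit) are assumptions made purely for illustration, and they only approximate what the StandardAnalyzer does.

```java
import java.util.Locale;

public class FirstPositionSketch {

    // Hypothetical helper: true if `term` occurs as a token whose end
    // position (token index + 1) is less than or equal to `end`.
    public static boolean matchesNearStart(String fieldText, String term, int end) {
        // Naive tokenizer, assumed for illustration only.
        String[] tokens = fieldText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        int position = 0;
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // a leading separator produces an empty first token
            }
            if (position + 1 > end) {
                break; // every remaining token ends beyond the allowed position
            }
            if (token.equals(term)) {
                return true;
            }
            position++;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matchesNearStart("Lucene: I have a really hard question about it", "lucene", 1));
        System.out.println(matchesNearStart("I have a question about lucene", "lucene", 1));
    }
}
```

With an end value of 1, only a title whose very first token is the term qualifies, which is why the second title above is rejected.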
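
Now we can combine the SpanFirstQuery above with a regular text query, so that the span query only influences the score. This is easy to do with a BooleanQuery, adding the span query as a SHOULD clause, like this: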

    Term term = new Term("title", "lucene");
    TermQuery termQuery = new TermQuery(term);
    SpanFirstQuery spanFirstQuery = new SpanFirstQuery(new SpanTermQuery(term), 1);
    BooleanQuery booleanQuery = new BooleanQuery();
    booleanQuery.add(new BooleanClause(termQuery, BooleanClause.Occur.MUST));
    booleanQuery.add(new BooleanClause(spanFirstQuery, BooleanClause.Occur.SHOULD));

There are probably other ways to achieve the same result, perhaps with a CustomScoreQuery or custom scoring code, but this seems to me the simplest one.
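
As a rough illustration of why the SHOULD clause changes only the ranking and not the result set, the following plain-Java sketch mimics the MUST + SHOULD combination. It is a simplified model, not Lucene's scoring: the class MustShouldSketch, the Hit type, the rank method, and the placeholder scores of 1.0 are all hypothetical, and real Lucene scores also depend on term statistics and normalization.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class MustShouldSketch {

    // Simplified stand-in for a scored hit; not a Lucene type.
    public static final class Hit {
        public final String title;
        public final double score;

        Hit(String title, double score) {
            this.title = title;
            this.score = score;
        }
    }

    // Hypothetical ranking: the term must occur (MUST); occurring within
    // the first `end` positions only adds a bonus to the score (SHOULD).
    public static List<Hit> rank(List<String> titles, String term, int end) {
        List<Hit> hits = new ArrayList<>();
        for (String title : titles) {
            List<String> tokens = new ArrayList<>(
                    Arrays.asList(title.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")));
            tokens.removeIf(String::isEmpty);
            int position = tokens.indexOf(term);
            if (position < 0) {
                continue; // MUST clause not satisfied: document is excluded
            }
            double score = 1.0; // placeholder score for the MUST match
            if (position + 1 <= end) {
                score += 1.0; // SHOULD clause matched: bonus only
            }
            hits.add(new Hit(title, score));
        }
        hits.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return hits;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList(
                "I have a question about lucene",
                "Lucene: I have a really hard question about it");
        for (Hit hit : rank(titles, "lucene", 1)) {
            System.out.println(hit.title + " - score: " + hit.score);
        }
    }
}
```

Both titles are returned because both satisfy the required clause; the one that starts with the term simply moves to the top.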

The code I used for testing prints the following output (scores included), running the TermQuery alone first, then the SpanFirstQuery alone, and finally the combined BooleanQuery:

    ------ TermQuery --------
    Total hits: 2
    title: I have a question about lucene - score: 0.26010898
    title: Lucene: I have a really hard question about it - score: 0.22295055
    ------ SpanFirstQuery --------
    Total hits: 1
    title: Lucene: I have a really hard question about it - score: 0.15764984
    ------ BooleanQuery: TermQuery (MUST) + SpanFirstQuery (SHOULD) --------
    Total hits: 2
    title: Lucene: I have a really hard question about it - score: 0.26912516
    title: I have a question about lucene - score: 0.09196242

Here is the complete code:
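
    import java.io.File;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.FieldType;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.FieldInfo;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.IndexableField;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.search.spans.SpanFirstQuery;
    import org.apache.lucene.search.spans.SpanTermQuery;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    // Wrapper class added so the snippet compiles; the name is arbitrary.
    public class SpanFirstQueryDemo {

        public static void main(String[] args) throws Exception {
            Directory directory = FSDirectory.open(new File("data"));
            index(directory);
            IndexReader indexReader = DirectoryReader.open(directory);
            IndexSearcher indexSearcher = new IndexSearcher(indexReader);
            Term term = new Term("title", "lucene");

            // Plain term query: matches the term anywhere in the title.
            System.out.println("------ TermQuery --------");
            TermQuery termQuery = new TermQuery(term);
            search(indexSearcher, termQuery);

            // Span query: matches only when the term is at the start of the title.
            System.out.println("------ SpanFirstQuery --------");
            SpanFirstQuery spanFirstQuery = new SpanFirstQuery(new SpanTermQuery(term), 1);
            search(indexSearcher, spanFirstQuery);

            // Combined: the term is required; the span match only affects the score.
            System.out.println("------ BooleanQuery: TermQuery (MUST) + SpanFirstQuery (SHOULD) --------");
            BooleanQuery booleanQuery = new BooleanQuery();
            booleanQuery.add(new BooleanClause(termQuery, BooleanClause.Occur.MUST));
            booleanQuery.add(new BooleanClause(spanFirstQuery, BooleanClause.Occur.SHOULD));
            search(indexSearcher, booleanQuery);
        }

        private static void index(Directory directory) throws Exception {
            IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_41, new StandardAnalyzer(Version.LUCENE_41));
            IndexWriter writer = new IndexWriter(directory, config);

            // Positions must be indexed for span queries to work.
            FieldType titleFieldType = new FieldType();
            titleFieldType.setIndexOptions(FieldInfo.IndexOptions.DOCS_AND_FREQS_AND_POSITIONS);
            titleFieldType.setIndexed(true);
            titleFieldType.setStored(true);

            Document document = new Document();
            document.add(new Field("title", "I have a question about lucene", titleFieldType));
            writer.addDocument(document);

            document = new Document();
            document.add(new Field("title", "Lucene: I have a really hard question about it", titleFieldType));
            writer.addDocument(document);

            writer.close();
        }

        private static void search(IndexSearcher indexSearcher, Query query) throws Exception {
            TopDocs topDocs = indexSearcher.search(query, 10);
            System.out.println("Total hits: " + topDocs.totalHits);
            for (ScoreDoc hit : topDocs.scoreDocs) {
                Document result = indexSearcher.doc(hit.doc);
                for (IndexableField field : result) {
                    System.out.println(field.name() + ": " + field.stringValue() + " - score: " + hit.score);
                }
            }
        }
    }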


