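I don't think you'll find what you're looking for by hunting for a method that returns a String. You need to work with Attributes instead.
It should work something like this: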
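
    import java.io.Reader;
    import java.io.StringReader;
    import org.apache.lucene.analysis.ngram.NGramTokenizer;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

    Reader reader = new StringReader("This is a test string");
    // Emit character n-grams of length 1 to 3
    NGramTokenizer gramTokenizer = new NGramTokenizer(reader, 1, 3);
    // The attribute is updated in place on every call to incrementToken()
    CharTermAttribute charTermAttribute = gramTokenizer.addAttribute(CharTermAttribute.class);
    gramTokenizer.reset();
    while (gramTokenizer.incrementToken()) {
        String token = charTermAttribute.toString();
        // Do something
    }
    gramTokenizer.end();
    gramTokenizer.close();

If you need to reuse the Tokenizer afterwards, though, be sure to call reset() on it first.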
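
To make it clearer which tokens the loop above hands you, here is a minimal plain-Java sketch of character n-grams with lengths 1 to 3. `CharNGramSketch` and `charNGrams` are hypothetical names made up for illustration, not Lucene API, and the order in which Lucene itself emits the grams may differ from the order shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; this is not part of Lucene.
final class CharNGramSketch {

    // Collects every substring whose length is between minGram and maxGram,
    // shortest grams first, scanning left to right within each length.
    static List<String> charNGrams(String text, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int size = minGram; size <= maxGram; size++) {
            for (int start = 0; start + size <= text.length(); start++) {
                grams.add(text.substring(start, start + size));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // prints [t, e, s, t, te, es, st, tes, est]
        System.out.println(charNGrams("test", 1, 3));
    }
}
```

Note that spaces count as characters too, so on a full sentence many of the grams will contain or consist of whitespace.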
Per the comments, to group tokens by words rather than by characters:
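
    Reader reader = new StringReader("This is a test string");
    TokenStream tokenizer = new StandardTokenizer(Version.LUCENE_36, reader);
    // ShingleFilter rejects a minimum shingle size below 2;
    // single words are still emitted because unigram output is on by default
    tokenizer = new ShingleFilter(tokenizer, 2, 3);
    CharTermAttribute charTermAttribute = tokenizer.addAttribute(CharTermAttribute.class);
    tokenizer.reset();
    while (tokenizer.incrementToken()) {
        String token = charTermAttribute.toString();
        // Do something
    }
    tokenizer.end();
    tokenizer.close();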
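
As a rough picture of what word-level grouping yields, the following plain-Java sketch builds single words plus runs of up to `maxSize` adjacent words. `WordShingleSketch` and `wordShingles` are hypothetical names for illustration, and the sketch splits on whitespace only, which is an assumption: StandardTokenizer applies its own word-break rules, so real output can differ on punctuation and similar input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; this is not part of Lucene.
final class WordShingleSketch {

    // For each word position, emits the word alone and then each longer run
    // of adjacent words starting there, up to maxSize words.
    static List<String> wordShingles(String text, int maxSize) {
        List<String> shingles = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return shingles;
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start++) {
            StringBuilder current = new StringBuilder();
            for (int size = 1; size <= maxSize && start + size <= words.length; size++) {
                if (size > 1) {
                    current.append(' ');
                }
                current.append(words[start + size - 1]);
                shingles.add(current.toString());
            }
        }
        return shingles;
    }

    public static void main(String[] args) {
        // prints one shingle per line, starting with "This", "This is", "This is a", "is", ...
        for (String shingle : wordShingles("This is a test string", 3)) {
            System.out.println(shingle);
        }
    }
}
```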


