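This will do nicely. My definition of a sentence: a sentence begins with a non-whitespace character and ends with a period, exclamation point, or question mark (or the end of the string). A closing quote may follow the ending punctuation.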
[^.!?\s][^.!?]*(?:[.!?](?!['"]?\s|$)[^.!?]*)*[.!?]?['"]?(?=\s|$)
import java.util.regex.*;

public class TEST {
    public static void main(String[] args) {
        String subjectString =
            "This is a sentence. " +
            "So is \"this\"! And is \"this?\" " +
            "This is 'stackoverflow.com!' " +
            "Hello World";
        // COMMENTS mode lets the pattern carry whitespace and # comments;
        // each fragment ends in \n so that a comment stops at its own line.
        Pattern re = Pattern.compile(
            "# Match a sentence ending in punctuation or EOS.\n" +
            "[^.!?\\s]    # First char is non-punct, non-ws\n" +
            "[^.!?]*      # Greedily consume up to punctuation.\n" +
            "(?:          # Group for unrolling the loop.\n" +
            "  [.!?]      # (special) inner punctuation ok if\n" +
            "  (?!['\"]?\\s|$)  # not followed by ws or EOS.\n" +
            "  [^.!?]*    # Greedily consume up to punctuation.\n" +
            ")*           # Zero or more (special normal*)\n" +
            "[.!?]?       # Optional ending punctuation.\n" +
            "['\"]?       # Optional closing quote.\n" +
            "(?=\\s|$)",
            Pattern.MULTILINE | Pattern.COMMENTS);
        Matcher reMatcher = re.matcher(subjectString);
        // Print each matched sentence on its own line.
        while (reMatcher.find()) {
            System.out.println(reMatcher.group());
        }
    }
}

Here is the output:
This is a sentence.
So is "this"!
And is "this?"
This is 'stackoverflow.com!'
Hello World
Correctly matching all of these (note that the last sentence has no ending punctuation) turns out to be not as easy as it seems!
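If you would rather get the sentences back as a list than print them, the same regex can be wrapped in a small helper. This is only a minimal sketch: the class name `SentenceSplitter` and the method `split` are names made up for illustration, and the pattern is written on one line, so the `Pattern.COMMENTS` flag is not needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption, not part of the answer above.
public class SentenceSplitter {
    // The sentence regex from above on a single line (no COMMENTS mode).
    private static final Pattern SENTENCE = Pattern.compile(
        "[^.!?\\s][^.!?]*(?:[.!?](?!['\"]?\\s|$)[^.!?]*)*[.!?]?['\"]?(?=\\s|$)",
        Pattern.MULTILINE);

    /** Returns every match of the sentence pattern, in order of appearance. */
    public static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        Matcher m = SENTENCE.matcher(text);
        while (m.find()) {
            sentences.add(m.group());
        }
        return sentences;
    }
}
```

Calling `SentenceSplitter.split(subjectString)` with the test string from the program above should give the same five sentences as the printed output. Keep in mind that this inherits the definition given at the top, so an abbreviation such as "Dr." followed by a space will still be treated as the end of a sentence.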



