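First of all, the regex [aA-zZ]* doesn't do what you think it does. It means "match zero or more occurrences of a, of any character in the range between ASCII A and ASCII z (which also includes [, ], \ and others), or of Z". It therefore also matches the empty string.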
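The range problem can be checked directly. The following is a minimal sketch, not part of the original answer: the class name CharClassDemo and its helper methods are hypothetical, and [a-zA-Z]+ is only an assumption about what the character class was meant to be.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // The character class from the question, unchanged: 'a', the range A..z, and 'Z'.
    static final Pattern BROAD = Pattern.compile("[aA-zZ]*");
    // Assumed intent: ASCII letters only, at least one of them.
    static final Pattern LETTERS = Pattern.compile("[a-zA-Z]+");

    // True if the whole string is matched by the original pattern.
    static boolean matchesBroad(String s) {
        return BROAD.matcher(s).matches();
    }

    // True if the whole string consists of one or more ASCII letters.
    static boolean matchesLetters(String s) {
        return LETTERS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesBroad("[]_^"));   // prints true: punctuation between 'Z' and 'a' is in the range
        System.out.println(matchesBroad(""));       // prints true: * allows zero characters
        System.out.println(matchesLetters("[]_^")); // prints false
    }
}
```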
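Assuming that you are only looking for duplicate words that consist solely of ASCII letters, compared case-insensitively, and that you want to keep the first occurrence (which means that you would not want to match "it's it's" or "olé olé!"), you can do that in a single regex operation: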
String result = subject.replaceAll("(?i)\\b([a-z]+)\\b(?:\\s+\\1\\b)+", "$1");
which will change
Hello hello Hello there there past pastures
into
Hello there past pastures
Explanation:
(?i)     # Mode: case-insensitive
\b       # Match the start of a word
([a-z]+) # Match one ASCII "word", capture it in group 1
\b       # Match the end of a word
(?:      # Start of non-capturing group:
 \s+     # Match at least one whitespace character
 \1      # Match the same word as captured before (case-insensitively)
 \b      # and make sure it ends there.
)+       # Repeat that as often as possible
See it live on regex101.com.
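To try the replacement locally rather than online, here is a small self-contained sketch. The class name DedupeWords and the method dedupe are hypothetical wrappers added for illustration. The regex itself is the one from the answer, with every backslash doubled because it sits inside a Java string literal.

```java
class DedupeWords {
    // Regex from the answer; backslashes are doubled for the Java string literal.
    private static final String DUPLICATE_WORDS = "(?i)\\b([a-z]+)\\b(?:\\s+\\1\\b)+";

    // Collapses each run of a repeated ASCII word into its first occurrence.
    static String dedupe(String subject) {
        return subject.replaceAll(DUPLICATE_WORDS, "$1");
    }

    public static void main(String[] args) {
        // prints: Hello there past pastures
        System.out.println(dedupe("Hello hello Hello there there past pastures"));
        // prints: it's it's  (left alone, because the apostrophe splits the word)
        System.out.println(dedupe("it's it's"));
    }
}
```

Note that "past pastures" is left alone because the trailing \b requires the repeated word to end right after "past", which it does not in "pastures".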



