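You can do this easily by writing your own Tokenizer.
For example, the following Tokenizer behaves the same as the default one, but skips any line that doesn't have the expected number of columns.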

    import java.io.IOException;
    import java.io.Reader;
    import java.util.ArrayList;
    import java.util.List;

    import org.supercsv.io.Tokenizer;
    import org.supercsv.prefs.CsvPreference;

    public class SkipBadColumnCountTokenizer extends Tokenizer {

        private final int expectedColumns;

        // line numbers of every row that was skipped
        private final List<Integer> ignoredLines = new ArrayList<>();

        public SkipBadColumnCountTokenizer(Reader reader,
                CsvPreference preferences, int expectedColumns) {
            super(reader, preferences);
            this.expectedColumns = expectedColumns;
        }

        @Override
        public boolean readColumns(List<String> columns) throws IOException {
            boolean moreInputExists;
            // keep reading until a row has the expected column count
            // or the input is exhausted
            while ((moreInputExists = super.readColumns(columns))
                    && columns.size() != this.expectedColumns) {
                System.out.println(String.format(
                    "Ignoring line %s with %d columns: %s",
                    getLineNumber(), columns.size(), getUntokenizedRow()));
                ignoredLines.add(getLineNumber());
            }
            return moreInputExists;
        }

        public List<Integer> getIgnoredLines() {
            return this.ignoredLines;
        }
    }

And a simple test using this Tokenizer...

    // imports needed: java.io.IOException, java.io.StringReader,
    // org.junit.Test, org.supercsv.io.CsvBeanReader,
    // org.supercsv.io.ICsvBeanReader, org.supercsv.prefs.CsvPreference

    @Test
    public void testInvalidRows() throws IOException {

        String input = "column1,column2,column3\n" +
                "has,three,columns\n" +
                "only,two\n" +
                "one\n" +
                "three,columns,again\n" +
                "one,too,many,columns";

        CsvPreference preference = CsvPreference.EXCEL_PREFERENCE;
        int expectedColumns = 3;
        SkipBadColumnCountTokenizer tokenizer = new SkipBadColumnCountTokenizer(
            new StringReader(input), preference, expectedColumns);

        try (ICsvBeanReader beanReader = new CsvBeanReader(tokenizer, preference)) {
            // the first row is the header and maps columns to bean properties
            String[] header = beanReader.getHeader(true);
            TestBean bean;
            while ((bean = beanReader.read(TestBean.class, header)) != null) {
                System.out.println(bean);
            }
            System.out.println(String.format("Ignored lines: %s",
                tokenizer.getIgnoredLines()));
        }
    }

prints the following output (note how all of the invalid lines are skipped):

    TestBean{column1='has', column2='three', column3='columns'}
    Ignoring line 3 with 2 columns: only,two
    Ignoring line 4 with 1 columns: one
    TestBean{column1='three', column2='columns', column3='again'}
    Ignoring line 6 with 4 columns: one,too,many,columns
    Ignored lines: [3, 4, 6]
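
The test refers to a `TestBean` class that isn't shown. As a rough sketch, it could look something like the following. This is an assumption reconstructed from the header names and the printed output, not the original class: a plain public bean with a no-argument constructor, one String property per column, and a `toString()` that matches the format above.

```java
// Hypothetical TestBean (would live in TestBean.java), inferred from the
// header "column1,column2,column3" and the printed output; the real class
// may differ.
public class TestBean {

    private String column1;
    private String column2;
    private String column3;

    // The bean reader populates the bean through setters named after the
    // header columns, so each property needs a public setter.
    public String getColumn1() { return column1; }
    public void setColumn1(String column1) { this.column1 = column1; }

    public String getColumn2() { return column2; }
    public void setColumn2(String column2) { this.column2 = column2; }

    public String getColumn3() { return column3; }
    public void setColumn3(String column3) { this.column3 = column3; }

    @Override
    public String toString() {
        // Mirrors the format seen in the output above.
        return String.format("TestBean{column1='%s', column2='%s', column3='%s'}",
                column1, column2, column3);
    }
}
```

Because the header row is passed straight to `read()` as the name mapping, the property names have to match the header names exactly.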


