The constructors of ParquetWriter are deprecated (as of 1.8.1), but not ParquetWriter itself; you can still create a ParquetWriter by extending the abstract Builder subclass inside it.
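The reason the abstract Builder declares a `self()` method is the self-typed ("curiously recurring") builder pattern: setters inherited from the base class can return the concrete subclass type, so fluent chaining keeps working. A minimal, self-contained sketch of that pattern (all names here are illustrative, not Parquet's actual API):

```java
// Self-typed builder sketch: SELF is the concrete subclass type,
// so inherited setters can return it and chaining is preserved.
abstract class AbstractBuilder<T, SELF extends AbstractBuilder<T, SELF>> {
    protected String path;

    protected AbstractBuilder(String path) { this.path = path; }

    // Subclasses return "this"; this is what Parquet's Builder.self() does too.
    protected abstract SELF self();
    protected abstract T build();

    // An inherited setter that still returns the subclass type.
    public SELF withPath(String path) {
        this.path = path;
        return self();
    }
}

class ExampleBuilder extends AbstractBuilder<String, ExampleBuilder> {
    private String schema = "default";

    ExampleBuilder(String path) { super(path); }

    ExampleBuilder withSchema(String schema) { this.schema = schema; return this; }

    @Override protected ExampleBuilder self() { return this; }
    @Override protected String build() { return path + ":" + schema; }
}

class BuilderDemo {
    public static void main(String[] args) {
        // withPath() is inherited yet returns ExampleBuilder,
        // so withSchema() can still be chained after it.
        String result = new ExampleBuilder("/tmp/out")
                .withPath("/tmp/out2")
                .withSchema("mySchema")
                .build();
        System.out.println(result);  // /tmp/out2:mySchema
    }
}
```

Without the `SELF` type parameter and `self()`, calling an inherited setter would return the base type and break the fluent chain, which is why Parquet's `ExampleParquetWriter.Builder` below overrides `self()` to return `this`.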
Here is an example from Parquet's own ExampleParquetWriter:
public static class Builder extends ParquetWriter.Builder&lt;Group, Builder&gt; {
    private MessageType type = null;
    private Map&lt;String, String&gt; extraMetaData = new HashMap&lt;String, String&gt;();

    private Builder(Path file) {
        super(file);
    }

    public Builder withType(MessageType type) {
        this.type = type;
        return this;
    }

    public Builder withExtraMetaData(Map&lt;String, String&gt; extraMetaData) {
        this.extraMetaData = extraMetaData;
        return this;
    }

    @Override
    protected Builder self() {
        return this;
    }

    @Override
    protected WriteSupport&lt;Group&gt; getWriteSupport(Configuration conf) {
        return new GroupWriteSupport(type, extraMetaData);
    }
}

If you don't want to use Group and GroupWriteSupport (bundled in Parquet, but meant only as an example of a data-model implementation), you can go with the Avro, Protocol Buffers, or Thrift in-memory data models. Here is an example of writing Parquet using Avro:
try (ParquetWriter&lt;GenericData.Record&gt; writer = AvroParquetWriter
        .&lt;GenericData.Record&gt;builder(fileToWrite)
        .withSchema(schema)
        .withConf(new Configuration())
        .withCompressionCodec(CompressionCodecName.SNAPPY)
        .build()) {
    for (GenericData.Record record : recordsToWrite) {
        writer.write(record);
    }
}

You will need the following dependencies:
&lt;dependency&gt;
    &lt;groupId&gt;org.apache.parquet&lt;/groupId&gt;
    &lt;artifactId&gt;parquet-avro&lt;/artifactId&gt;
    &lt;version&gt;1.8.1&lt;/version&gt;
&lt;/dependency&gt;
&lt;dependency&gt;
    &lt;groupId&gt;org.apache.parquet&lt;/groupId&gt;
    &lt;artifactId&gt;parquet-hadoop&lt;/artifactId&gt;
    &lt;version&gt;1.8.1&lt;/version&gt;
&lt;/dependency&gt;
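If your build uses Gradle instead of Maven, the equivalent dependency declarations (same coordinates, just translated to Gradle's notation) would look roughly like this:

```groovy
// Gradle equivalent of the Maven dependencies above
dependencies {
    compile 'org.apache.parquet:parquet-avro:1.8.1'
    compile 'org.apache.parquet:parquet-hadoop:1.8.1'
}
```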
The full example is here.



