org.apache.parquet.avro.package-info

/* 
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 * 
 *   http://www.apache.org/licenses/LICENSE-2.0
 * 
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */
/**
 *
 * <p>
 * Provides classes to store Avro data in Parquet files. Avro schemas are converted to
 * Parquet schemas as follows. Only record schemas are converted;
 * other top-level schema types are not converted, and attempting to convert one results
 * in an error. Avro types are converted to Parquet types using the mapping shown here:
 * </p>
 *
 * <table>
 *   <caption>Avro to Parquet type mapping</caption>
 *   <tr><th>Avro type</th><th>Parquet type</th></tr>
 *   <tr><td>null</td><td>no type (the field is not encoded in Parquet), unless a null union</td></tr>
 *   <tr><td>boolean</td><td>boolean</td></tr>
 *   <tr><td>int</td><td>int32</td></tr>
 *   <tr><td>long</td><td>int64</td></tr>
 *   <tr><td>float</td><td>float</td></tr>
 *   <tr><td>double</td><td>double</td></tr>
 *   <tr><td>bytes</td><td>binary</td></tr>
 *   <tr><td>string</td><td>binary (with original type UTF8)</td></tr>
 *   <tr><td>record</td><td>group containing nested fields</td></tr>
 *   <tr><td>enum</td><td>binary (with original type ENUM)</td></tr>
 *   <tr><td>array</td><td>group (with original type LIST) containing one repeated group field</td></tr>
 *   <tr><td>map</td><td>group (with original type MAP) containing one repeated group
 *       field (with original type MAP_KEY_VALUE) of (key, value)</td></tr>
 *   <tr><td>fixed</td><td>fixed_len_byte_array</td></tr>
 *   <tr><td>union</td><td>an optional type, in the case of a null union, otherwise not supported</td></tr>
 * </table>
 *
 * <p>
 * Parquet files that were not written with classes from this package have no
 * Avro write schema stored in the Parquet file metadata. To read such files using
 * classes from this package, either provide an Avro read schema
 * or let a default Avro schema be derived using the following mapping.
 * </p>
 *
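 * <table>
 *   <caption>Parquet to Avro type mapping</caption>
 *   <tr><th>Parquet type</th><th>Avro type</th></tr>
 *   <tr><td>boolean</td><td>boolean</td></tr>
 *   <tr><td>int32</td><td>int</td></tr>
 *   <tr><td>int64</td><td>long</td></tr>
 *   <tr><td>int96</td><td>not supported</td></tr>
 *   <tr><td>float</td><td>float</td></tr>
 *   <tr><td>double</td><td>double</td></tr>
 *   <tr><td>fixed_len_byte_array</td><td>fixed</td></tr>
 *   <tr><td>binary (with no original type)</td><td>bytes</td></tr>
 *   <tr><td>binary (with original type UTF8)</td><td>string</td></tr>
 *   <tr><td>binary (with original type ENUM)</td><td>string</td></tr>
 *   <tr><td>group (with original type LIST) containing one repeated group field</td><td>array</td></tr>
 *   <tr><td>group (with original type MAP) containing one repeated group
 *       field (with original type MAP_KEY_VALUE) of (key, value)</td><td>map</td></tr>
 * </table>
 *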
 * <p>
 * Parquet fields that are optional are mapped to an Avro null union.
 * </p>
 *
 * <p>
 * Some conversions are lossy. Avro nulls are not represented in Parquet,
 * so they are lost when converted back to Avro. Similarly, a Parquet enum does not
 * store its values, so it cannot be converted back to an Avro enum
 * and is mapped to an Avro string instead. Type names for nested records, enums,
 * and fixed types are lost in the conversion to Parquet.
 * Avro aliases, default values, field ordering, and documentation strings are all
 * dropped in the conversion to Parquet.
 * </p>
 *
 * <p>
 * Parquet maps can have keys of any type, whereas Avro map keys
 * are assumed to be strings.
 * </p>
 */
package org.apache.parquet.avro;
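
The two mapping tables in the package documentation above can be checked against each other with a small lookup sketch. The class below is purely illustrative: `AvroParquetMappingSketch`, its two maps, and `roundTrip` are hypothetical names, not part of parquet-avro, and the type names are plain strings copied from the tables rather than real Avro or Parquet schema objects. Only the scalar rows are included; records, arrays, maps, unions, and int96 are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative lookup tables only; hypothetical, not part of parquet-avro. */
class AvroParquetMappingSketch {

    // Avro type name -> Parquet type, copied from the "Avro to Parquet" table.
    static final Map<String, String> AVRO_TO_PARQUET = new LinkedHashMap<>();
    // Parquet type -> Avro type name, copied from the "Parquet to Avro" table.
    static final Map<String, String> PARQUET_TO_AVRO = new LinkedHashMap<>();

    static {
        AVRO_TO_PARQUET.put("boolean", "boolean");
        AVRO_TO_PARQUET.put("int", "int32");
        AVRO_TO_PARQUET.put("long", "int64");
        AVRO_TO_PARQUET.put("float", "float");
        AVRO_TO_PARQUET.put("double", "double");
        AVRO_TO_PARQUET.put("bytes", "binary");
        AVRO_TO_PARQUET.put("string", "binary (UTF8)");
        AVRO_TO_PARQUET.put("enum", "binary (ENUM)");
        AVRO_TO_PARQUET.put("fixed", "fixed_len_byte_array");

        PARQUET_TO_AVRO.put("boolean", "boolean");
        PARQUET_TO_AVRO.put("int32", "int");
        PARQUET_TO_AVRO.put("int64", "long");
        PARQUET_TO_AVRO.put("float", "float");
        PARQUET_TO_AVRO.put("double", "double");
        PARQUET_TO_AVRO.put("fixed_len_byte_array", "fixed");
        PARQUET_TO_AVRO.put("binary", "bytes");
        PARQUET_TO_AVRO.put("binary (UTF8)", "string");
        PARQUET_TO_AVRO.put("binary (ENUM)", "string");
    }

    /** Maps an Avro type name to Parquet and back; throws if a table has no entry. */
    static String roundTrip(String avroType) {
        String parquetType = AVRO_TO_PARQUET.get(avroType);
        if (parquetType == null) {
            throw new IllegalArgumentException("No Parquet mapping listed for Avro type: " + avroType);
        }
        String back = PARQUET_TO_AVRO.get(parquetType);
        if (back == null) {
            throw new IllegalArgumentException("No Avro mapping listed for Parquet type: " + parquetType);
        }
        return back;
    }

    public static void main(String[] args) {
        // Print each round trip and flag the rows where the Avro type changes.
        for (String avroType : AVRO_TO_PARQUET.keySet()) {
            String back = roundTrip(avroType);
            String note = back.equals(avroType) ? "" : "  (lossy)";
            System.out.println(avroType + " -> " + AVRO_TO_PARQUET.get(avroType) + " -> " + back + note);
        }
    }
}
```

Running `main` prints one line per Avro type and marks `enum` as lossy, since it is the only scalar row whose round trip yields a different type (`string`).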



