org.apache.drill.exec.physical.resultSet.package-info Maven / Gradle / Ivy
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/**
* Provides a second-generation row set (AKA "record batch") writer used
* by client code to
* - Define the schema of a result set.
* - Write data into the vectors backing a row set.
*
*
Terminology
* The code here follows the "row/column" naming convention rather than
* the "record/field" convention.
*
* - Result set
* - A set of zero or more row sets that hold rows of data.
-
*
- Row set
* - A collection of rows with a common schema. Also called a "row
* batch" or "record batch." (But, in Drill, the term "record batch" also
* usually means an operator on that set of records. Here, a row set is
* just the rows &nash; separate from operations on that data.
* - Row
* - A single row of data, in the usual database sense. Here, a row is
* a kind of tuple (see below) allowing both name and index access to
* columns.
* - Tuple
* - In relational theory, a row is a tuple: a collection of values
* defined by a schema. Tuple values are indexed by position or name.
* - Column
* - A single value within a row or row set. (Generally, the context
* makes clear if the term refers to single value or all values for a
* column for a row set. Columns are backed by value vectors.
* - Map
* - In Drill, a map is what other systems call a "structure". It is,
* in fact, a nested tuple. In a Java or Python map, each map instance has
* a distinct set of name/value pairs. But, in Drill, all map instances have
* the same schema; hence the so-called "map" is really a tuple. This
* implementation exploits that fact and treats the row, and nested maps,
* almost identically: both provide columns indexed by name or position.
* - Row Set Mutator
* - An awkward name, but retains the "mutator" name from the previous
* generation. The mechanism to build a result set as series of row sets.
* - Tuple Loader
* - Mechanism to build a single tuple (row or map) by providing name
* or index access to columns. A better name would b "tuple writer", but
* that name is already used elsewhere.
* - Column Loader
* - Mechanism to write values to a single column.
-
*
* Building the Schema
* The row set mutator works for two cases: a known schema or a discovered
* schema. A known schema occurs in the case, such as JDBC, where the
* underlying data source can describe the schema before reading any rows.
* In this case, client code can build the schema and pass that schema to
* the mutator directly. Alternatively, the client code can build the
* schema column-by-column before the first row is read.
*
* Readers that discover schema can build the schema incrementally: add
* a column, load data for that column for one row, discover the next
* column, and so on. Almost any kind of column can be added at any time
* within the first batch:
* - Required columns are "back-filled" with zeros in the active batch,
* if that value
* makes sense for the column. (Date and Interval columns will throw an
* exception if added after the first row as there is no good "zero"
* value for that column. Varchar columns are back-filled with blanks.
-
*
- Optional (nullable) columns can be added at any time; they are
* back-filled with nulls in the active batch. In general, if a column is
* added after the first row, it should be nullable, not required, unless
* the data source has a "missing = blank or zero" policy.
* - Repeated (array) columns can be added at any time; they are
* back-filled with empty entries in the first batch. Arrays can also be
* safely added at any time.
* Client code must be aware of the semantics of adding columns at various
* times.
* - Columns added before or during the first row are the trivial case;
* this works for all data types and modes.
* - Required (non-nullable0 structured columns (Date, Period) cannot be
* added after the first row (as there is no good zero-fill value.)
* - Columns added within the first batch appear to the rest of Drill as
* if they were added before the first row: the downstream operators see the
* same schema from batch to batch.
* - Columns added after the first batch will trigger a
* schema-change event downstream.
* - The above is true during an "overflow row" (see below.) Once
* overflow occurs, columns added later in that overflow row will actually
* appear in the next batch, and will trigger a schema change when that
* batch is returned. That is, overflow "time shifts" a row addition from
* one batch to the next, and so it also time-shifts the column addition.
*
* Use the {@link org.apache.drill.exec.record.metadata.TupleBuilder} class
* to build the schema. The schema class is part of the
* {@link org.apache.drill.exec.physical.resultSet.RowSetLoader} object available from the
* {@link org.apache.drill.exec.physical.resultSet.ResultSetLoader#writer()} method.
* Using the Schema
* Presents columns using a physical schema. That is, map columns appear
* as columns that provide a nested map schema. Presumes that column
* access is primarily structural: first get a map, then process all
* columns for the map.
*
* If the input is a flat structure, then the physical schema has a
* flattened schema as the degenerate case.
*
* In both cases, access to columns is by index or by name. If new columns
* are added while loading, their index is always at the end of the existing
* columns.
*
Writing Data to the Batch
* Each batch is delimited by a call to {@link org.apache.drill.exec.physical.resultSet.ResultSetLoader#startBatch()}
* and a call to {@link org.apache.drill.exec.physical.resultSet.impl.VectorState#harvestWithLookAhead()}
* to obtain the completed batch. Note that readers do not
* call these methods; the scan operator does this work.
*
* Each row is delimited by a call to {@code startValue()} and a call to
* {@code saveRow()}. startRow() performs initialization necessary
* for some vectors such as repeated vectors. saveRow() moves the
* row pointer ahead.
*
* A reader can easily reject a row by calling startRow(), begin
* to load a row, but omitting the call to saveRow() In this case,
* the next call to startRow() repositions the row pointer to the
* same row, and new data will overwrite the previous data, effectively erasing
* the unwanted row. This also works for the last row; omitting the call to
* saveRow() causes the batch to hold only the rows actually
* saved.
*
* Readers then write to each column. Columns are accessible via index
* ({@link org.apache.drill.exec.physical.resultSet.RowSetLoader#column(int)} or by name
* ({@link org.apache.drill.exec.physical.resultSet.RowSetLoader#column(String)}.
* Indexed access is much faster.
* Column indexes are defined by the order that columns are added. The first
* column is column 0, the second is column 1 and so on.
*
* Each call to the above methods returns the same column writer, allowing the
* reader to cache column writers for additional performance.
*
* All column writers are of the same class; there is no need to cast to a
* type corresponding to the vector. Instead, they provide a variety of
* setType methods, where the type is one of various Java
* primitive or structured types. Most vectors provide just one method, but
* others (such as VarChar) provide two. The implementation will throw an
* exception if the vector does not support a particular type.
*
* Note that this class uses the term "loader" for row and column writers
* since the term "writer" is already used by the legacy record set mutator
* and column writers.
*
Handling Batch Limits
* The mutator enforces two sets of batch limits:
* - The number of rows per batch. The limit defaults to 64K (the Drill
* maximum), but can be set lower by the client.
* - The size of the largest vector, which is capped at 16 MB. (A future
* version may allow adjustable caps, or cap the memory of the entire
* batch.
* Both limits are presented to the client via the
* {@link org.apache.drill.exec.physical.resultSet.RowSetLoader#isFull()} method.
* After each call to {@code saveRow()},
* the client should call isFull() to determine if the client can add another row. Note
* that failing to do this check will cause the next call to
* {@link org.apache.drill.exec.physical.resultSet.ResultSetLoader#startBatch()} to throw an exception.
*
* The limits have subtle differences, however. Row limits are simple: at
* the end of the last row, the mutator notices that no more rows are possible,
* and so does not allow starting a new row.
*
* Vector overflow is more complex. A row may consist of columns (a, b, c).
* The client may write column a, but then column b might trigger a vector
* overflow. (For example, b is a Varchar, and the value for b is larger than
* the space left in the vector.) The client cannot stop and rewrite a. Instead,
* the client simply continues writing the row. The mutator, internally, moves
* this "overflow" row to a new batch. The overflow row becomes the first row
* of the next batch rather than the first row of the current batch.
*
* For this reason, the client can treat the two overflow cases identically,
* as described above.
*
* There are some subtle differences between the two cases that clients may
* occasionally may need to expect:
* - When a vector overflow occurs, the returned batch will have one
* fewer rows than the client might expect if it is simply counting the rows
* written.
* - A new column added to the batch after overflow occurs will appear in
* the next batch, triggering a schema change between the current and
* next batches.
*/
package org.apache.drill.exec.physical.resultSet;