com.databricks.sdk.service.sql.StatementExecutionService Maven / Gradle / Ivy


There is a newer version: 0.38.0

// Code generated from OpenAPI specs by Databricks SDK Generator. DO NOT EDIT.
package com.databricks.sdk.service.sql;

import com.databricks.sdk.support.Generated;

/**
 * The Databricks SQL Statement Execution API can be used to execute SQL statements on a SQL
 * warehouse and fetch the result.
 *
 * **Getting started**
 *
 * We suggest beginning with the [Databricks SQL Statement Execution API tutorial].
 *
 * **Overview of statement execution and result fetching**
 *
 * Statement execution begins by issuing a :method:statementexecution/executeStatement request
 * with a valid SQL statement and warehouse ID, along with optional parameters such as the data
 * catalog and output format. If no other parameters are specified, the server will wait for up to
 * 10s before returning a response. If the statement has completed within this timespan, the
 * response will include the result data as a JSON array and metadata. Otherwise, if no result is
 * available after the 10s timeout expired, the response will provide the statement ID that can be
 * used to poll for results by using a :method:statementexecution/getStatement request.
 *
 * You can specify whether the call should behave synchronously, asynchronously or start
 * synchronously with a fallback to asynchronous execution. This is controlled with the
 * `wait_timeout` and `on_wait_timeout` settings. If `wait_timeout` is set between 5-50 seconds
 * (default: 10s), the call waits for results up to the specified timeout; when set to `0s`, the
 * call is asynchronous and responds immediately with a statement ID. The `on_wait_timeout` setting
 * specifies what should happen when the timeout is reached while the statement execution has not
 * yet finished. This can be set to either `CONTINUE`, to fall back to asynchronous mode, or it can
 * be set to `CANCEL`, which cancels the statement.
 *
 * In summary:
 *
 * - Synchronous mode - `wait_timeout=30s` and `on_wait_timeout=CANCEL` - The call waits up to 30
 * seconds; if the statement execution finishes within this time, the result data is returned
 * directly in the response. If the execution takes longer than 30 seconds, the execution is
 * canceled and the call returns with a `CANCELED` state.
 *
 * - Asynchronous mode - `wait_timeout=0s` (`on_wait_timeout` is ignored) - The call doesn't wait
 * for the statement to finish but returns directly with a statement ID. The status of the
 * statement execution can be polled by issuing :method:statementexecution/getStatement with the
 * statement ID. Once the execution has succeeded, this call also returns the result and metadata
 * in the response.
 *
 * - Hybrid mode (default) - `wait_timeout=10s` and `on_wait_timeout=CONTINUE` - The call waits for
 * up to 10 seconds; if the statement execution finishes within this time, the result data is
 * returned directly in the response. If the execution takes longer than 10 seconds, a statement
 * ID is returned. The statement ID can be used to fetch status and results in the same way as in
 * the asynchronous mode.
 *
 * Depending on the size, the result can be split into multiple chunks. If the statement
 * execution is successful, the statement response contains a manifest and the first chunk of the
 * result. The manifest contains schema information and provides metadata for each chunk in the
 * result. Result chunks can be retrieved by index with
 * :method:statementexecution/getStatementResultChunkN which may be called in any order and in
 * parallel. For sequential fetching, each chunk, apart from the last, also contains a
 * `next_chunk_index` and `next_chunk_internal_link` that point to the next chunk.
 *
 * A statement can be canceled with :method:statementexecution/cancelExecution.
 *
 * **Fetching result data: format and disposition**
 *
 * To specify the format of the result data, use the `format` field, which can be set to one of
 * the following options: `JSON_ARRAY` (JSON), `ARROW_STREAM` ([Apache Arrow Columnar]), or `CSV`.
 *
 * There are two ways to receive statement results, controlled by the `disposition` setting,
 * which can be either `INLINE` or `EXTERNAL_LINKS`:
 *
 * - `INLINE`: In this mode, the result data is directly included in the response. It's best
 * suited for smaller results. This mode can only be used with the `JSON_ARRAY` format.
 *
 * - `EXTERNAL_LINKS`: In this mode, the response provides links that can be used to download the
 * result data in chunks separately. This approach is ideal for larger results and offers higher
 * throughput. This mode can be used with all the formats: `JSON_ARRAY`, `ARROW_STREAM`, and `CSV`.
 *
 * By default, the API uses `format=JSON_ARRAY` and `disposition=INLINE`.
 *
 * **Limits and limitations**
 *
 * Note: The byte limit for INLINE disposition is based on internal storage metrics and will not
 * exactly match the byte count of the actual payload.
 *
 * - Statements with `disposition=INLINE` are limited to 25 MiB and will fail when this limit is
 * exceeded.
 *
 * - Statements with `disposition=EXTERNAL_LINKS` are limited to 100 GiB. Result sets larger than
 * this limit will be truncated. Truncation is indicated by the `truncated` field in the result
 * manifest.
 *
 * - The maximum query text size is 16 MiB.
 *
 * - Cancelation might silently fail. A successful response from a cancel request indicates that
 * the cancel request was successfully received and sent to the processing engine. However, an
 * outstanding statement might have already completed execution when the cancel request arrives.
 * Polling for status until a terminal state is reached is a reliable way to determine the final
 * state.
 *
 * - Wait timeouts are approximate, occur server-side, and cannot account for things such as
 * caller delays and network latency from caller to service.
 *
 * - The system will auto-close a statement after one hour if the client stops polling and thus
 * you must poll at least once an hour.
 *
 * - The results are only available for one hour after success; polling does not extend this.
 *
 * [Apache Arrow Columnar]: https://arrow.apache.org/overview/
 * [Databricks SQL Statement Execution API tutorial]: https://docs.databricks.com/sql/api/sql-execution-tutorial.html
 *
 * This is the high-level interface that contains generated methods.
 *
 * Evolving: this interface is under development. Method signatures may change.
 */
@Generated
public interface StatementExecutionService {
  /**
   * Cancel statement execution.
   *
   * Requests that an executing statement be canceled. Callers must poll for status to see the
   * terminal state.
   */
  void cancelExecution(CancelExecutionRequest cancelExecutionRequest);

  /** Execute a SQL statement. */
  ExecuteStatementResponse executeStatement(ExecuteStatementRequest executeStatementRequest);

  /**
   * Get status, manifest, and result first chunk.
   *
   * This request can be used to poll for the statement's status. When the `status.state` field
   * is `SUCCEEDED` it will also return the result manifest and the first chunk of the result data.
   * When the statement is in the terminal states `CANCELED`, `CLOSED` or `FAILED`, it returns HTTP
   * 200 with the state set. After at least 12 hours in terminal state, the statement is removed
   * from the warehouse and further calls will receive an HTTP 404 response.
   *
   * **NOTE** This call currently might take up to 5 seconds to get the latest status and result.
   */
  GetStatementResponse getStatement(GetStatementRequest getStatementRequest);

  /**
   * Get result chunk by index.
   *
   * After the statement execution has `SUCCEEDED`, this request can be used to fetch any chunk
   * by index. Whereas the first chunk with `chunk_index=0` is typically fetched with
   * :method:statementexecution/executeStatement or :method:statementexecution/getStatement, this
   * request can be used to fetch subsequent chunks. The response structure is identical to the
   * nested `result` element described in the :method:statementexecution/getStatement request, and
   * similarly includes the `next_chunk_index` and `next_chunk_internal_link` fields for simple
   * iteration through the result set.
   */
  ResultData getStatementResultChunkN(
      GetStatementResultChunkNRequest getStatementResultChunkNRequest);
}
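
The interface documentation describes two client-side loops: polling until the statement reaches a terminal state, and walking the result chunk by chunk through `next_chunk_index`. The following is a minimal sketch of both loops, not SDK code. It deliberately avoids the SDK types so that it runs on its own. `StatementPollingSketch`, `State`, `Chunk`, `pollUntilTerminal`, and `readAllRows` are hypothetical names invented for illustration. The `PENDING` and `RUNNING` states are assumed here as the non-terminal states. The supplier and function arguments stand in for calls to `getStatement` and `getStatementResultChunkN`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.IntFunction;
import java.util.function.Supplier;

/** Illustrative sketch only; none of these types belong to the Databricks SDK. */
public class StatementPollingSketch {

  /** Terminal states come from the Javadoc; PENDING and RUNNING are assumed non-terminal. */
  public enum State { PENDING, RUNNING, SUCCEEDED, FAILED, CANCELED, CLOSED }

  /** Hypothetical stand-in for a result chunk. nextChunkIndex is null on the last chunk. */
  public static final class Chunk {
    final List<String> rows;
    final Integer nextChunkIndex;

    public Chunk(List<String> rows, Integer nextChunkIndex) {
      this.rows = rows;
      this.nextChunkIndex = nextChunkIndex;
    }
  }

  public static boolean isTerminal(State s) {
    return s == State.SUCCEEDED || s == State.FAILED || s == State.CANCELED || s == State.CLOSED;
  }

  /**
   * Calls getState (a stand-in for getStatement) until it reports a terminal state or
   * maxAttempts calls have been made, sleeping between calls. Returns the last state seen.
   */
  public static State pollUntilTerminal(Supplier<State> getState, int maxAttempts, long sleepMs) {
    State state = getState.get();
    for (int attempt = 1; attempt < maxAttempts && !isTerminal(state); attempt++) {
      try {
        Thread.sleep(sleepMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt and stop polling
        return state;
      }
      state = getState.get();
    }
    return state;
  }

  /**
   * Collects rows sequentially: starts from the first chunk and follows nextChunkIndex,
   * using fetchChunk (a stand-in for getStatementResultChunkN) until no next index is given.
   */
  public static List<String> readAllRows(Chunk first, IntFunction<Chunk> fetchChunk) {
    List<String> rows = new ArrayList<>(first.rows);
    Integer next = first.nextChunkIndex;
    while (next != null) {
      Chunk chunk = fetchChunk.apply(next);
      rows.addAll(chunk.rows);
      next = chunk.nextChunkIndex;
    }
    return rows;
  }

  public static void main(String[] args) {
    // Fake service: the statement is pending, then running, then done.
    Iterator<State> states =
        Arrays.asList(State.PENDING, State.RUNNING, State.SUCCEEDED).iterator();
    State finalState = pollUntilTerminal(states::next, 10, 1L);
    System.out.println("final state: " + finalState);

    // Fake result split into three chunks linked by nextChunkIndex.
    List<Chunk> chunks = Arrays.asList(
        new Chunk(Arrays.asList("row-0", "row-1"), 1),
        new Chunk(Arrays.asList("row-2"), 2),
        new Chunk(Arrays.asList("row-3"), null));
    System.out.println("rows: " + readAllRows(chunks.get(0), chunks::get));
  }
}
```

In real use, the polling interval and attempt budget would need to respect the limits listed in the Javadoc, for example polling at least once an hour and fetching results within one hour of success. Because chunks may also be fetched in any order and in parallel, the sequential loop is only one option.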