All Downloads are FREE. Search and download functionalities are using the official Maven repository.

lluxio-shaded-hadoop3-client.305.source-code.streams.adoc Maven / Gradle / Ivy

There is a newer version: 313
Show newest version

== Streams

There are several objects in Vert.x that allow items to be read from and written.

In Vert.x, write calls return immediately, and writes are queued internally.

It's not hard to see that if you write to an object faster than it can actually write the data to
its underlying resource, then the write queue can grow unbounded - eventually resulting in
memory exhaustion.

To solve this problem a simple flow control (_back-pressure_) capability is provided by some objects in the Vert.x API.

Any flow control aware object that can be _written-to_ implements {@link io.vertx.core.streams.WriteStream},
while any flow control object that can be _read-from_ is said to implement {@link io.vertx.core.streams.ReadStream}.

Let's take an example where we want to read from a `ReadStream` then write the data to a `WriteStream`.

A very simple example would be reading from a {@link io.vertx.core.net.NetSocket} then writing back to the
same `NetSocket` - since `NetSocket` implements both `ReadStream` and `WriteStream`. Note that this works
between any `ReadStream` and `WriteStream` compliant object, including HTTP requests, HTTP responses,
async files I/O, WebSockets, etc.

A naive way to do this would be to directly take the data that has been read and immediately write it
to the `NetSocket`:

[source,$lang]
----
{@link examples.StreamsExamples#pipe1(io.vertx.core.Vertx)}
----

There is a problem with the example above: if data is read from the socket faster than it can be
written back to the socket, it will build up in the write queue of the `NetSocket`, eventually
running out of RAM. This might happen, for example if the client at the other end of the socket
wasn't reading fast enough, effectively putting back-pressure on the connection.

Since `NetSocket` implements `WriteStream`, we can check if the `WriteStream` is full before
writing to it:

[source,$lang]
----
{@link examples.StreamsExamples#pipe2(io.vertx.core.Vertx)}
----

This example won't run out of RAM but we'll end up losing data if the write queue gets full. What we
really want to do is pause the `NetSocket` when the write queue is full:

[source,$lang]
----
{@link examples.StreamsExamples#pipe3(io.vertx.core.Vertx)}
----

We're almost there, but not quite. The `NetSocket` now gets paused when the file is full, but we also need to unpause
it when the write queue has processed its backlog:

[source,$lang]
----
{@link examples.StreamsExamples#pipe4(io.vertx.core.Vertx)}
----

And there we have it. The {@link io.vertx.core.streams.WriteStream#drainHandler} event handler will
get called when the write queue is ready to accept more data, this resumes the `NetSocket` that
allows more data to be read.

Wanting to do this is quite common while writing Vert.x applications, so we added the
{@link io.vertx.core.streams.ReadStream#pipeTo} method that does all of this hard work for you.
You just feed it the `WriteStream` and use it:

[source,$lang]
----
{@link examples.StreamsExamples#pipe5(io.vertx.core.Vertx)}
----

This does exactly the same thing as the more verbose example, plus it handles stream failures and termination: the
destination `WriteStream` is ended when the pipe completes with success or a failure.

You can be notified when the operation completes:

[source,$lang]
----
{@link examples.StreamsExamples#pipe6(io.vertx.core.net.NetServer)}
----

When you deal with an asynchronous destination, you can create a {@link io.vertx.core.streams.Pipe} instance that
pauses the source and resumes it when the source is piped to the destination:

[source,$lang]
----
{@link examples.StreamsExamples#pipe7(io.vertx.core.net.NetServer, io.vertx.core.file.FileSystem)}
----

When you need to abort the transfer, you need to close it:

[source,$lang]
----
{@link examples.StreamsExamples#pipe8(io.vertx.core.Vertx, io.vertx.core.file.FileSystem)}
----

When the pipe is closed, the streams handlers are unset and the `ReadStream` resumed.

As seen above, by default the destination is always ended when the stream completes, you can control this behavior
on the pipe object:

* {@link io.vertx.core.streams.Pipe#endOnFailure(boolean)} controls the behavior when a failure happens
* {@link io.vertx.core.streams.Pipe#endOnSuccess(boolean)} controls the behavior when the read stream ends
* {@link io.vertx.core.streams.Pipe#endOnComplete(boolean)} controls the behavior in all cases

Here is a short example:

[source,$lang]
----
{@link examples.StreamsExamples#pipe9(io.vertx.core.file.AsyncFile, io.vertx.core.file.AsyncFile)}
----

Let's now look at the methods on `ReadStream` and `WriteStream` in more detail:

=== ReadStream

`ReadStream` is implemented by {@link io.vertx.core.http.HttpClientResponse}, {@link io.vertx.core.datagram.DatagramSocket},
{@link io.vertx.core.http.HttpClientRequest}, {@link io.vertx.core.http.HttpServerFileUpload},
{@link io.vertx.core.http.HttpServerRequest}, {@link io.vertx.core.eventbus.MessageConsumer},
{@link io.vertx.core.net.NetSocket}, {@link io.vertx.core.http.WebSocket}, {@link io.vertx.core.TimeoutStream},
{@link io.vertx.core.file.AsyncFile}.

- {@link io.vertx.core.streams.ReadStream#handler}:
set a handler which will receive items from the ReadStream.
- {@link io.vertx.core.streams.ReadStream#pause}:
pause the stream. When paused no items will be received in the handler.
- {@link io.vertx.core.streams.ReadStream#fetch}:
fetch a specified amount of item from the stream. The handler will be called if any item arrives. Fetches
are cumulative.
- {@link io.vertx.core.streams.ReadStream#resume}:
resume the stream. The handler will be called if any item arrives. Resuming is equivalent of fetching `Long.MAX_VALUE` items.
- {@link io.vertx.core.streams.ReadStream#exceptionHandler}:
called when an exception occurs on the ReadStream.
- {@link io.vertx.core.streams.ReadStream#endHandler}:
called when the end of stream is reached. This might be when EOF is reached if the ReadStream represents a file,
or when end of request is reached if it's an HTTP request, or when the connection is closed if it's a TCP socket.

A read stream is either in _flowing_ or _fetch_ mode

* initially the stream is in flowing mode
* when the stream is in _flowing_ mode, elements are delivered to the handler
* when the stream is in _fetch_ mode, only the number of requested elements will be delivered to the handler

{@link io.vertx.core.streams.ReadStream#pause()}, {@link io.vertx.core.streams.ReadStream#resume()} and {@link io.vertx.core.streams.ReadStream#fetch}
change the mode

* `resume()` sets the _flowing_ mode
* `pause()` sets the _fetch_ mode and resets the demand to `0`
* `fetch(long)` requests a specific amount of elements and adds it to the actual demand

=== WriteStream

`WriteStream` is implemented by {@link io.vertx.core.http.HttpClientRequest}, {@link io.vertx.core.http.HttpServerResponse}
{@link io.vertx.core.http.WebSocket}, {@link io.vertx.core.net.NetSocket} and {@link io.vertx.core.file.AsyncFile}.

Functions:

- {@link io.vertx.core.streams.WriteStream#write}:
write an object to the WriteStream. This method will never block. Writes are queued internally and asynchronously
written to the underlying resource.
- {@link io.vertx.core.streams.WriteStream#setWriteQueueMaxSize}:
set the number of object at which the write queue is considered _full_, and the method {@link io.vertx.core.streams.WriteStream#writeQueueFull()}
returns `true`. Note that, when the write queue is considered full, if write is called the data will still be accepted
and queued. The actual number depends on the stream implementation, for {@link io.vertx.core.buffer.Buffer} the size
represents the actual number of bytes written and not the number of buffers.
- {@link io.vertx.core.streams.WriteStream#writeQueueFull}:
returns `true` if the write queue is considered full.
- {@link io.vertx.core.streams.WriteStream#exceptionHandler}:
Will be called if an exception occurs on the `WriteStream`.
- {@link io.vertx.core.streams.WriteStream#drainHandler}:
The handler will be called if the `WriteStream` is considered no longer full.




© 2015 - 2024 Weber Informatics LLC | Privacy Policy