/*
* Copyright (c) 2008-2024, Hazelcast, Inc. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.hazelcast.jet.core;

import com.hazelcast.jet.JetException;
import com.hazelcast.jet.Traverser;
import com.hazelcast.jet.core.function.ObjLongBiFunction;
import com.hazelcast.jet.impl.execution.WatermarkCoalescer;
import com.hazelcast.jet.pipeline.Sources;

import javax.annotation.Nonnull;
import javax.annotation.Nullable;
import java.lang.reflect.Array;
import java.util.Arrays;
import java.util.function.Supplier;
import java.util.function.ToLongFunction;

import static com.hazelcast.internal.util.Preconditions.checkNotNegative;
import static com.hazelcast.jet.core.SlidingWindowPolicy.tumblingWinPolicy;
import static java.util.concurrent.TimeUnit.MILLISECONDS;

/**
* A utility that helps a source emit events according to a given {@link
* EventTimePolicy}. Generally this class should be used if a source needs
* to emit {@link Watermark watermarks}. The mapper deals with the
* following concerns:
*
* <h4>1. Reading partition by partition</h4>
*
* Upon restart, it can happen that partition P1 has one very
* recent event and P2 has an old one. If we poll P1
* first and emit its recent event, it will advance the watermark. When we
* poll P2 later on, its event will be behind the watermark and
* can be dropped as late. This utility tracks the event timestamps for
* each source partition individually and allows the processor to emit the
* watermark that is correct with respect to all the partitions.
*
* <h4>2. Some partition having no data</h4>
*
* It can happen that some partition does not have any events at all while
* others do, or the processor doesn't get any external partitions assigned
* to it. If we simply wait for the timestamps in all partitions to advance
* to some point, we won't be emitting any watermarks. This utility
* supports the idle timeout: if there's no new data from a
* partition after the timeout elapses, it will be marked as idle,
* allowing the processor's watermark to advance as if that partition
* didn't exist. If all partitions are idle or there are no partitions, the
* processor will emit a special idle message and the downstream
* will exclude this processor from watermark coalescing.
*
* <h4>3. Wrapping of events</h4>
*
* Events may need to be wrapped with the extracted timestamp if {@link
* EventTimePolicy#wrapFn()} is set.
*
* <h4>4. Throttling of Watermarks</h4>
*
* Watermarks are only consumed by windowing operations and emitting
* watermarks more frequently than the given {@link
* EventTimePolicy#watermarkThrottlingFrameSize()} is wasteful since they
* are broadcast to all processors. The mapper ensures that watermarks are
* emitted according to the throttling frame size.
*
* <h4>Usage</h4>
*
* The API is designed to be used as a flat-mapping step in the {@link
* Traverser} that holds the output data. Your source can follow this
* pattern:
*
* <pre>{@code
* public boolean complete() {
*     if (traverser == null) {
*         List<Record> records = poll();
*         if (records.isEmpty()) {
*             traverser = eventTimeMapper.flatMapIdle();
*         } else {
*             // no native event time is available here, so pass NO_NATIVE_TIME
*             traverser = traverseIterable(records)
*                 .flatMap(event -> eventTimeMapper.flatMapEvent(
*                      event, event.getPartition(), NO_NATIVE_TIME));
*         }
*         traverser = traverser.onFirstNull(() -> traverser = null);
*     }
*     emitFromTraverser(traverser, event -> {
*         if (!(event instanceof Watermark)) {
*             // store your offset after event was emitted
*             offsetsMap.put(event.getPartition(), event.getOffset());
*         }
*     });
*     return false;
* }
* }</pre>
*
* Other methods:
* <ul><li>
*     Call {@link #addPartitions} and {@link #removePartition} to set your
*     partition count initially and to update it whenever the count changes.
* <li>
*     If you support state snapshots, save the value returned by {@link
*     #getWatermark} for all partitions to the snapshot. When restoring the
*     state, call {@link #restoreWatermark}.
*     <br>
*     You should save the value under your external partition key so that the
*     watermark value can be restored to the correct processor instance. The
*     key should also be wrapped using {@link BroadcastKey#broadcastKey
*     broadcastKey()}, because the external partitions don't match Hazelcast
*     partitions. This way, all processor instances will see all keys and they
*     can restore the partitions they handle and ignore others.
* </ul>
*
*
* @param <T> the event type
* @since Jet 3.0
*/
public class EventTimeMapper<T> {

    /**
     * Value to use as the {@code nativeEventTime} argument when calling
     * {@link #flatMapEvent(Object, int, long)} when there's no native event
     * time to supply.
     */
    public static final long NO_NATIVE_TIME = Long.MIN_VALUE;

    private static final WatermarkPolicy[] EMPTY_WATERMARK_POLICIES = {};
    private static final long[] EMPTY_LONGS = {};

    private final byte wmKey;
    private final long idleTimeoutNanos;
    @Nullable
    private final ToLongFunction<? super T> timestampFn;
    private final Supplier<? extends WatermarkPolicy> newWmPolicyFn;
    private final ObjLongBiFunction<? super T, ?> wrapFn;
    @Nullable
    private final SlidingWindowPolicy watermarkThrottlingFrame;
    private final AppendableTraverser