io.hekate.coordinate.CoordinationService Maven / Gradle / Ivy

Go to download
/*
 * Copyright 2022 The Hekate Project
 *
 * The Hekate Project licenses this file to you under the Apache License,
 * version 2.0 (the "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at:
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
 * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
 * License for the specific language governing permissions and limitations
 * under the License.
 */

package io.hekate.coordinate;

import io.hekate.core.HekateBootstrap;
import io.hekate.core.service.DefaultServiceFactory;
import io.hekate.core.service.Service;
import java.util.List;

/**
 * « start hereMain entry point to distributed coordination API.
 *
 * Overview
 * 
 * {@link CoordinationService} provides support for implementing different coordination protocols among a set of cluster members. Such
 * protocols can perform application-specific rebalancing of a distributed data, node roles assignment or any other logic that requires a
 * coordinated agreement among multiple cluster members based on the consistent cluster topology (i.e. when all members have the same
 * consistent view of the cluster topology).
 * 
 *
 * 
 * Coordination process is triggered by the {@link CoordinationService} every time when cluster topology changes (i.e. when a new node
 * joins or an existing node leaves the cluster). Upon such event one of the cluster members is selected to be the process coordinator and
 * starts exchanging messages with all other participating nodes until coordination process is finished or is interrupted by a concurrent
 * cluster event.
 * 
 *
 * 
 * Service Configuration
 * Coordination Handler
 * Messaging
 * Topology Changes
 * Awaiting for Initial Coordination
 * Thread Management
 * 
 *
 * 
 * Service Configuration
 * 
 * {@link CoordinationService} can be registered and configured in {@link HekateBootstrap} with the help of {@link
 * CoordinationServiceFactory} as shown in the example below:
 * 
 *
 * 
 * 
 * Java
 * Spring XSD
 * Spring bean
 * 
 * 
 * ${source: coordinate/CoordinationServiceJavadocTest.java#configure}
 * 
 * 
 * Note: This example requires Spring Framework integration
 * (see HekateSpringBootstrap).
 * ${source: coordinate/service-xsd.xml#example}
 * 
 * 
 * Note: This example requires Spring Framework integration
 * (see HekateSpringBootstrap).
 * ${source: coordinate/service-bean.xml#example}
 * 
 * 
 *
 * 
 * Coordination Handler
 * 
 * Application-specific logic of a distributed coordination process must be encapsulated into an implementation of {@link
 * CoordinationHandler} interface.
 * 
 *
 * 
 * When {@link CoordinationService} start a new coordination process, it selects one of cluster nodes to be the coordinator and calls its
 * {@link CoordinationHandler#coordinate(CoordinatorContext)} method together with a coordination context object. This object provides
 * information about the coordination participants and provides methods for sending/receiving coordination requests to/from them.
 * All other nodes besides the coordinator will stay idle and will wait for requests from the coordinator.
 * 
 *
 * 
 * Once coordination is completed and each of coordination participants reaches its final state (according to an application logic) then
 * coordinator must explicitly notify the {@link CoordinationService} by calling the {@link CoordinatorContext#complete()} method.
 * 
 *
 * Example
 * 
 * The code example below shows how {@link CoordinationHandler} can be implemented in order to perform distributed coordination of cluster
 * members. For the sake of brevity the coordination scenario is very simple and has the goal of executing some application-specific logic
 * on each of the coordinated members with each member holding an exclusive local lock. The coordination protocol can be described as
 * follows:
 * 
 * 
 * Ask all members to acquire local lock and await for confirmation from each member
 * Ask all members to execute their application-specific logic and await for confirmation from each member
 * Ask all members to release their local locks
 * 
 *
 * 
 * ${source: coordinate/CoordinationServiceJavadocTest.java#handler}
 * 
 *
 * 
 * Messaging
 * 
 * {@link CoordinationService} provides support for asynchronous message exchange among the coordination participants. It can be done via
 * the following methods:
 * 
 * 
 * 
 * {@link CoordinationContext#broadcast(Object, CoordinationBroadcastCallback)} - broadcast the same request to all members at once
 * 
 * {@link CoordinationMember#request(Object, CoordinationRequestCallback)} - send request to individual member
 * 
 *
 * 
 * When some node receives a request (either from the coordinator or from some other node) then its
 * {@link CoordinationHandler#process(CoordinationRequest, CoordinationContext)} method gets called. Implementations of this method must
 * perform their application-specific logic based on the request payload and send back a response via the
 * {@link CoordinationRequest#reply(Object)} method.
 * 
 *
 * 
 * All messages of a coordination process are guaranteed to be send and received with the same consistent cluster topology (i.e.
 * both sender and receiver has exactly the same cluster topology view). If topology mismatch is detected between the sender and the
 * receiver then {@link CoordinationService} will transparently send a retry response back to the sender so that it could retry sending
 * later once its topology gets consistent with the receiver or cancel the coordination process and restart it with a more up to date
 * cluster topology.
 * 
 *
 * 
 * Topology Changes
 * 
 * If topology change happens while coordination process is still running then {@link CoordinationService} will try to cancel the current
 * process via {@link CoordinationHandler#cancel(CoordinationContext)} method and will start a new coordination process.
 * Implementations of the {@link CoordinationHandler} interface are required to stop all activities of the current coordination process as
 * soon as possible.
 * 
 *
 * 
 * In order to perform early detection of a cancelled coordination process please consider using the {@link
 * CoordinationContext#isCancelled()} method. If this method returns {@code true} then this context is not valid and should not be used
 * any more.
 * 
 *
 * 
 * In order to simplify handling of concurrent coordination processes it is recommended for implementations of the {@link
 * CoordinationHandler} interface to minimize state that should be held in each handler instance. If coordination logic requires some
 * transitional state to be kept during the coordination process then please consider keeping it as an attachment object of {@link
 * CoordinationContext} instance (see {@link CoordinationContext#setAttachment(Object)}/{@link CoordinationContext#getAttachment()}).
 * 
 *
 * 
 * Awaiting for Initial Coordination
 * 
 * Sometimes it is required for applications to await for initial coordination process to complete before proceeding to their main
 * tasks (f.e. if application needs to know which data partitions or roles were assigned to its node by some imaginary
 * coordination process when node joined the cluster).
 * 
 *
 * 
 * This can be done by obtaining a future object via {@link #futureOf(String)} method. This future object will be notified right after the
 * coordination process gets executed for the first time and can be used to {@link CoordinationFuture#get()} await} for its completion as
 * in the example below:
 * ${source: coordinate/CoordinationServiceJavadocTest.java#future}
 * 
 *
 * 
 * Thread Management
 * 
 * Each {@link CoordinationHandler} instance is bound to a single thread that is managed by the {@link CoordinationService}. All
 * coordination and messaging callbacks get processed on that thread sequentially in order to simplify asynchronous operations handling and
 * prevent concurrency issues.
 * 
 *
 * 
 * If particular {@link CoordinationHandler}'s operation takes long time to complete then it is recommended to use a separate thread
 * pool to offload such operations from the main coordination thread. Otherwise such operations will block subsequent notification from the
 * {@link CoordinationService} and will negatively impact on the overall coordination performance.
 * 
 *
 * @see CoordinationServiceFactory
 */
@DefaultServiceFactory(CoordinationServiceFactory.class)
public interface CoordinationService extends Service {
    /**
     * Returns all processes that are {@link CoordinationServiceFactory#setProcesses(List) registered} within this service.
     *
     * @return Processes or an empty list if there are no registered processes.
     */
    List allProcesses();

    /**
     * Returns a coordination process for the specified name.
     *
     * @param name Process name (see {@link CoordinationProcessConfig#setName(String)}).
     *
     * @return Coordination process.
     */
    CoordinationProcess process(String name);

    /**
     * Returns {@code true} if this service has a coordination process with the specified name.
     *
     * @param name Process name (see {@link CoordinationProcessConfig#setName(String)}).
     *
     * @return {@code true} if process exists.
     */
    boolean hasProcess(String name);

    /**
     * Returns an initial coordination future for the specified process name. The returned future object will be completed once the
     * coordination processes gets executed for the very first time on this node.
     *
     * @param process Process name (see {@link CoordinationProcessConfig#setName(String)}).
     *
     * @return Initial coordination future.
     */
    CoordinationFuture futureOf(String process);
}