/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/**

 This package contains classes, interfaces and exceptions to launch
 YARN services from the command line.

 Key Features

 
 General purpose YARN service launcher:

 The {@link org.apache.hadoop.service.launcher.ServiceLauncher} class parses
 a command line, then instantiates and launches the specified YARN service. It
 then waits for the service to finish, converting any exceptions raised or
 exit codes returned into an exit code for the (then exited) process. 
 

 This class is designed to be invokable from the static 
 {@link org.apache.hadoop.service.launcher.ServiceLauncher#main(String[])}
 method, or from {@code main(String[])} methods implemented by
 other classes which provide their own entry points.
  

 

 Extended YARN Service Interface:

 The {@link org.apache.hadoop.service.launcher.LaunchableService} interface
 extends {@link org.apache.hadoop.service.Service} with methods to pass
 down the CLI arguments and to execute an operation without having to
 spawn a thread in the  {@link org.apache.hadoop.service.Service#start()} phase.
  

 

 Standard Exit codes:

 {@link org.apache.hadoop.service.launcher.LauncherExitCodes}
 defines a set of exit codes that can be used by services to standardize
 exit causes.

 

 Escalated shutdown:

 The {@link org.apache.hadoop.service.launcher.ServiceShutdownHook}
 shuts down any service via the Hadoop shutdown mechanism.
 The {@link org.apache.hadoop.service.launcher.InterruptEscalator} can be
 registered to catch interrupts, triggering the shutdown and forcing a JVM
 exit if it times out or a second interrupt is received.

 
Tests:
 test cases include interrupt handling and
 lifecycle failures.

 
Launching a YARN Service

 The Service Launcher can launch any YARN service.
 It will instantiate the service classname provided, using the no-args
 constructor, or if no such constructor is available, it will fall back
 to a constructor with a single {@code String} parameter,
 passing the service name as the parameter value.
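
 The constructor fallback described above can be sketched with plain
 reflection. This is a minimal illustration, not the launcher's code: the
 class {@code InstantiationSketch}, its nested stand-in services and the
 {@code instantiate()} helper are all hypothetical names, and passing the
 class name as the service name is an assumption of the sketch.

```java
import java.lang.reflect.Constructor;

// Hypothetical sketch: none of these names are Hadoop APIs.
class InstantiationSketch {

  // Stand-in service offering a no-args constructor.
  public static class NoArgs {
    public NoArgs() { }
  }

  // Stand-in service offering only a (String name) constructor.
  public static class NamedOnly {
    final String name;
    public NamedOnly(String name) { this.name = name; }
  }

  // Try the no-args constructor first; fall back to (String name).
  static Object instantiate(String classname) {
    try {
      Class<?> clazz = Class.forName(classname);
      try {
        Constructor<?> noArgs = clazz.getConstructor();
        return noArgs.newInstance();
      } catch (NoSuchMethodException noEmptyConstructor) {
        // Assumption: the class name doubles as the service name.
        Constructor<?> withName = clazz.getConstructor(String.class);
        return withName.newInstance(classname);
      }
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate " + classname, e);
    }
  }

  public static void main(String[] args) {
    Object first = instantiate(NoArgs.class.getName());
    Object second = instantiate(NamedOnly.class.getName());
    System.out.println(first.getClass().getSimpleName());
    System.out.println(((NamedOnly) second).name);
  }
}
```

 A class with neither constructor ends up in the outer {@code catch} and is
 reported as an instantiation failure.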
 

 The launcher will initialize the service via
 {@link org.apache.hadoop.service.Service#init(Configuration)},
 then start it via its {@link org.apache.hadoop.service.Service#start()} method.
 It then waits indefinitely for the service to stop.
 
 
 After the service has stopped, a non-null value of
 {@link org.apache.hadoop.service.Service#getFailureCause()} is interpreted
 as a failure and, if it did not happen during the stop phase (i.e. when
 {@link org.apache.hadoop.service.Service#getFailureState()} is not
 {@code STATE.STOPPED}), escalated into a non-zero return code.
 

 
 The workflow, in sequence, is:
 

 1. (prepare configuration files, covered later)
 2. instantiate the service via its empty or string constructor
 3. call {@link org.apache.hadoop.service.Service#init(Configuration)}
 4. call {@link org.apache.hadoop.service.Service#start()}
 5. call
    {@link org.apache.hadoop.service.Service#waitForServiceToStop(long)}
 6. If an exception was raised: propagate it.
 7. If an exception was recorded in
    {@link org.apache.hadoop.service.Service#getFailureCause()}
    while the service was running: propagate it.
 

 For a service to be fully compatible with this launch model, it must:
 
 1. Start worker threads, processes and executors in its
    {@link org.apache.hadoop.service.Service#start()} method.
 2. Terminate itself via a call to
    {@link org.apache.hadoop.service.Service#stop()}
    in one of these asynchronous methods.
 

 If a service does not stop itself, ever, then it can be launched
 as a long-lived daemon.
 The service launcher will never terminate, but neither will the service.
 The service launcher does register signal handlers to catch {@code kill}
 and control-C signals, calling {@code stop()} on the service when
 signaled.
 This means that a daemon service may get a warning and time to shut
 down.

 
 To summarize: provided a service launches its long-lived threads in its Service
 {@code start()} method, the service launcher can create it, configure it
 and start it, triggering shutdown when signaled.

 What these services cannot do is get at the command line parameters or easily
 propagate exit codes (there is a way, covered later). These features require
 some extensions to the base {@code Service} interface: the Launchable
 Service.

 
Launching a Launchable YARN Service

 A Launchable YARN Service is a YARN service which implements the interface
 {@link org.apache.hadoop.service.launcher.LaunchableService}. 
 
 It adds two methods to the service interface, and hence two new features:

 1. Access to the command line passed to the service launcher.
 2. A blocking {@code int execute()} method which can return the exit
    code for the application.
 

 This design is ideal for implementing services which parse the command line,
 and which execute short-lived applications. For example, end user 
 commands can be implemented as such services, thus integrating with YARN's
 workflow and {@code YarnClient} client-side code.  

 
 It can just as easily be used for implementing long-lived services that
 parse the command line; it simply becomes the responsibility of the
 service to decide when to return from the {@code execute()} method.
 It doesn't even need to {@code stop()} itself; the launcher will handle
 that if necessary.
 

 The {@link org.apache.hadoop.service.launcher.LaunchableService} interface
 extends {@link org.apache.hadoop.service.Service} with two new methods.

 

 {@link org.apache.hadoop.service.launcher.LaunchableService#bindArgs(Configuration, List)}
 provides the {@code main(String args[])} arguments as a list, after any
 processing by the Service Launcher to extract configuration file references.
 This method is called before
 {@link org.apache.hadoop.service.Service#init(Configuration)}.
 This is by design: it allows the arguments to be parsed before the service is
 initialized, thus allowing services to tune their configuration data before
 passing it to any superclass in that {@code init()} method.
 To make this operation even simpler, the
 {@link org.apache.hadoop.conf.Configuration} that is to be passed to
 {@code init()} is provided as an argument.
 This reference is the initial configuration for this service:
 the one that will be passed to the init operation.

 In
 {@link org.apache.hadoop.service.launcher.LaunchableService#bindArgs(Configuration, List)},
 a Launchable Service may manipulate this configuration by setting or removing
 properties. It may also create a new {@code Configuration} instance
 which may be needed to trigger the injection of HDFS or YARN resources
 into the default resources of all Configurations.
 If the return value of the method call is a configuration
 reference (as opposed to a null value), the returned value becomes that
 passed in to the {@code init()} method.
 

 After the {@code bindArgs()} processing, the service's {@code init()}
 and {@code start()} methods are called, as usual.
 

 At this point, rather than block waiting for the service to terminate (as
 is done for a basic service), the method
 {@link org.apache.hadoop.service.launcher.LaunchableService#execute()}
 is called.
 This is a method expected to block until completed, returning the intended 
 application exit code of the process when it does so. 
 
 
 After this {@code execute()} operation completes, the
 service is stopped and exit codes generated. Any exception raised
 during the {@code execute()} method takes priority over any exit codes
 returned by the method. This allows services to signal failures simply
 by raising exceptions with exit codes.
 


 

 The workflow, in sequence, is:
 

 1. (prepare configuration files, covered later)
 2. instantiate the service via its empty or string constructor
 3. call {@link org.apache.hadoop.service.launcher.LaunchableService#bindArgs(Configuration, List)}
 4. call {@link org.apache.hadoop.service.Service#init(Configuration)} with the existing config,
    or any new one returned by
    {@link org.apache.hadoop.service.launcher.LaunchableService#bindArgs(Configuration, List)}
 5. call {@link org.apache.hadoop.service.Service#start()}
 6. call {@link org.apache.hadoop.service.launcher.LaunchableService#execute()}
 7. call {@link org.apache.hadoop.service.Service#stop()}
 8. The return code from
    {@link org.apache.hadoop.service.launcher.LaunchableService#execute()}
    becomes the exit code of the process, unless overridden by an exception.
 9. If an exception was raised in this workflow: propagate it.
 10. If an exception was recorded in
    {@link org.apache.hadoop.service.Service#getFailureCause()}
    while the service was running: propagate it.
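
 The ordering of these calls can be illustrated with a small, self-contained
 sketch. It does not use the Hadoop classes: {@code MiniLaunchable} is a
 hypothetical stand-in for the launchable service contract, a plain
 {@code String} stands in for the {@code Configuration}, and wrapping
 {@code execute()} in {@code try/finally} so that {@code stop()} is always
 attempted is a choice made for the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: stand-in types, not the Hadoop classes.
class LaunchSequenceSketch {

  // Minimal stand-in for the launchable service contract.
  interface MiniLaunchable {
    String bindArgs(String config, List<String> args);
    void init(String config);
    void start();
    int execute();
    void stop();
  }

  // Drive the lifecycle in the documented order and return the exit code.
  static int launch(MiniLaunchable service, String config, List<String> args) {
    String bound = service.bindArgs(config, args);
    // A non-null return value replaces the configuration passed to init().
    String effective = (bound != null) ? bound : config;
    service.init(effective);
    service.start();
    try {
      return service.execute();
    } finally {
      service.stop();
    }
  }

  // Records each lifecycle call so that the ordering can be inspected.
  static class Recorder implements MiniLaunchable {
    final List<String> calls = new ArrayList<>();

    public String bindArgs(String config, List<String> args) {
      calls.add("bindArgs");
      return config + "+cli"; // pretend the CLI tuned the configuration
    }
    public void init(String config) { calls.add("init(" + config + ")"); }
    public void start() { calls.add("start"); }
    public int execute() { calls.add("execute"); return 3; }
    public void stop() { calls.add("stop"); }
  }

  public static void main(String[] args) {
    Recorder recorder = new Recorder();
    int exitCode = launch(recorder, "base", Arrays.asList("run"));
    System.out.println(recorder.calls + " exit code " + exitCode);
  }
}
```

 An exception thrown from {@code execute()} propagates out of
 {@code launch()} after {@code stop()} has run, which is how it can take
 priority over any returned exit code.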
 


 Exit Codes and Exceptions

 
 For a basic service, the return code is 0 unless an exception
 was raised. 
 

 For a {@link org.apache.hadoop.service.launcher.LaunchableService}, the return
 code is the number returned from the
 {@link org.apache.hadoop.service.launcher.LaunchableService#execute()}
 operation, again, unless an exception was raised.

 

 Exceptions are converted into exit codes, but rather than simply
 have a "something went wrong" exit code, exceptions may
 provide exit codes which will be extracted and used as the return code.
 This enables Launchable Services to use exceptions as a way
 of returning error codes to signal failures, and
 normal Services to return any error code at all.

 

 Any exception which implements the
 {@link org.apache.hadoop.util.ExitCodeProvider}
 interface is considered to be a provider of the exit code: the method
 {@link org.apache.hadoop.util.ExitCodeProvider#getExitCode()}
 will be called on the caught exception to generate the return code.
 This return code and the message in the exception will be used to
 generate an instance of
 {@link org.apache.hadoop.util.ExitUtil.ExitException}
 which can be passed down to
 {@link org.apache.hadoop.util.ExitUtil#terminate(ExitUtil.ExitException)}
 to trigger a JVM exit. The initial exception will be used as the cause
 of the {@link org.apache.hadoop.util.ExitUtil.ExitException}.

 

 If the exception is already an instance or subclass of 
 {@link org.apache.hadoop.util.ExitUtil.ExitException}, it is passed
 directly to
 {@link org.apache.hadoop.util.ExitUtil#terminate(ExitUtil.ExitException)}
 without any conversion.
 One such subclass,
 {@link org.apache.hadoop.service.launcher.ServiceLaunchException},
 may be useful: it includes formatted exception message generation.
 It also implements the
 {@link org.apache.hadoop.service.launcher.LauncherExitCodes}
 interface, which lists common exit codes. These include the codes
 that can be raised by the {@link org.apache.hadoop.service.launcher.ServiceLauncher}
 itself to indicate problems when parsing the command line, creating
 the service instance and the like. There are also some common exit codes
 for Hadoop/YARN service failures, such as
 {@link org.apache.hadoop.service.launcher.LauncherExitCodes#EXIT_UNAUTHORIZED}.
 Note that {@link org.apache.hadoop.util.ExitUtil.ExitException} itself
 implements {@link org.apache.hadoop.util.ExitCodeProvider#getExitCode()}.

 

 If an exception does not implement
 {@link org.apache.hadoop.util.ExitCodeProvider#getExitCode()},
 it will be wrapped in an {@link org.apache.hadoop.util.ExitUtil.ExitException}
 with the exit code
 {@link org.apache.hadoop.service.launcher.LauncherExitCodes#EXIT_EXCEPTION_THROWN}.

 

 The exit code extraction, in sequence, is:
 

 If no exception was triggered by a basic service, a
 {@link org.apache.hadoop.service.launcher.ServiceLaunchException} with an
 exit code of 0 is created.

 For a LaunchableService, the exit code is the result of {@code execute()}.
 Again, a {@link org.apache.hadoop.service.launcher.ServiceLaunchException}
 is created, this time carrying that exit code.
 

 Otherwise, if the exception is an instance of {@code ExitException},
 it is returned as the service terminating exception.

 If the exception implements {@link org.apache.hadoop.util.ExitCodeProvider},
 its exit code and {@code getMessage()} value become the exit exception.

 Otherwise, it is wrapped as a
 {@link org.apache.hadoop.service.launcher.ServiceLaunchException}
 with the exit code
 {@link org.apache.hadoop.service.launcher.LauncherExitCodes#EXIT_EXCEPTION_THROWN}
 to indicate that an exception was thrown.

 This is finally passed to
 {@link org.apache.hadoop.util.ExitUtil#terminate(ExitUtil.ExitException)},
 by way of
 {@link org.apache.hadoop.service.launcher.ServiceLauncher#exit(ExitUtil.ExitException)};
 a method designed to allow subclasses to override for testing.

 The {@link org.apache.hadoop.util.ExitUtil} class then terminates the JVM
 with the specified exit code, printing the {@code toString()} value
 of the exception if the return code is non-zero.
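
 The conversion rules above can be sketched without any Hadoop
 dependencies. Everything here is a hypothetical stand-in:
 {@code CodeProvider} plays the role of the exit code provider interface,
 {@code ExitEx} that of the exit exception, and the value of
 {@code GENERIC_FAILURE_CODE} is a placeholder chosen for the sketch rather
 than a value taken from {@code LauncherExitCodes}.

```java
// Hypothetical sketch: stand-in types, not the Hadoop classes.
class ExitCodeSketch {

  // Stand-in for an interface that lets an exception supply an exit code.
  interface CodeProvider {
    int getExitCode();
  }

  // Stand-in for the exception that finally carries the exit code.
  static class ExitEx extends RuntimeException implements CodeProvider {
    private final int code;
    ExitEx(int code, String message, Throwable cause) {
      super(message, cause);
      this.code = code;
    }
    public int getExitCode() { return code; }
  }

  // An ordinary exception which happens to provide an exit code.
  static class CodedFailure extends Exception implements CodeProvider {
    private final int code;
    CodedFailure(int code, String message) {
      super(message);
      this.code = code;
    }
    public int getExitCode() { return code; }
  }

  // Placeholder value for "an exception was thrown"; not the real constant.
  static final int GENERIC_FAILURE_CODE = 50;

  static ExitEx convert(Throwable thrown) {
    if (thrown instanceof ExitEx) {
      // Already an exit exception: pass through without wrapping.
      return (ExitEx) thrown;
    }
    if (thrown instanceof CodeProvider) {
      // Reuse the provided code and message; keep the original as the cause.
      int code = ((CodeProvider) thrown).getExitCode();
      return new ExitEx(code, thrown.getMessage(), thrown);
    }
    // Anything else is wrapped with the generic failure code.
    return new ExitEx(GENERIC_FAILURE_CODE, String.valueOf(thrown), thrown);
  }

  public static void main(String[] args) {
    System.out.println(convert(new CodedFailure(41, "denied")).getExitCode());
    System.out.println(convert(new IllegalStateException("oops")).getExitCode());
  }
}
```

 Checking for the exit exception type first is what keeps wrapping to a
 minimum: an exception that already carries everything needed is reused
 as-is.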
 

 This process may seem convoluted, but it is designed to allow any exception
 in the Hadoop exception hierarchy to generate exit codes,
 and to minimize the amount of exception wrapping which takes place.

 Interrupt Handling

 The Service Launcher has a helper class,
 {@link org.apache.hadoop.service.launcher.InterruptEscalator},
 to handle the standard SIGTERM signal (the default sent by {@code kill};
 SIGKILL cannot be caught) and control-C (SIGINT) signals.
 This class registers for callbacks from these signals, and,
 when one is received, attempts to stop the service in a limited period of time.
 It then triggers a JVM shutdown by way of
 {@link org.apache.hadoop.util.ExitUtil#terminate(int, String)}.
 
 If a second signal is received, the
 {@link org.apache.hadoop.service.launcher.InterruptEscalator}
 reacts by triggering an immediate JVM halt, invoking 
 {@link org.apache.hadoop.util.ExitUtil#halt(int, String)}. 
 This escalation process is designed to address the situation in which
 a shutdown-hook can block, yet the caller (such as an init.d daemon)
 wishes to kill the process.
 The shutdown script should repeat the kill signal after a chosen time period,
 to trigger the more aggressive process halt. The exit code will always be
 {@link org.apache.hadoop.service.launcher.LauncherExitCodes#EXIT_INTERRUPTED}.
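
 The two-stage escalation can be illustrated as follows. This is only a
 sketch of the idea, with hypothetical names throughout
 ({@code EscalationSketch}, {@code onSignal()}, {@code stopWithin()});
 it decides what to do and bounds the stop time, but deliberately never
 exits or halts the JVM itself.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: these names are illustrative, not Hadoop APIs.
class EscalationSketch {

  enum Action { GRACEFUL_SHUTDOWN, IMMEDIATE_HALT }

  private final AtomicBoolean signalAlreadyReceived = new AtomicBoolean(false);

  // First signal: graceful shutdown. Any later signal: immediate halt.
  Action onSignal() {
    // compareAndSet succeeds for exactly one caller, even if signals race.
    return signalAlreadyReceived.compareAndSet(false, true)
        ? Action.GRACEFUL_SHUTDOWN
        : Action.IMMEDIATE_HALT;
  }

  // Run the stop operation on a daemon thread and wait a bounded time.
  // Returns true if the stop completed within the (positive) timeout.
  static boolean stopWithin(Runnable stopOperation, long timeoutMillis) {
    Thread stopper = new Thread(stopOperation, "service-stopper");
    stopper.setDaemon(true); // a blocked stop must not keep the JVM alive
    stopper.start();
    try {
      stopper.join(timeoutMillis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return !stopper.isAlive();
  }

  public static void main(String[] args) {
    EscalationSketch escalator = new EscalationSketch();
    System.out.println(escalator.onSignal());
    System.out.println(escalator.onSignal());
    System.out.println(stopWithin(() -> { }, 1000));
  }
}
```

 A caller would follow {@code GRACEFUL_SHUTDOWN} with a bounded stop and a
 normal exit, and {@code IMMEDIATE_HALT} with a halt that bypasses shutdown
 hooks.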
 

 The {@link org.apache.hadoop.service.launcher.ServiceLauncher} also registers
 a {@link org.apache.hadoop.service.launcher.ServiceShutdownHook} with the
 Hadoop shutdown hook manager, unregistering it afterwards. This hook will
 stop the service if a shutdown request is received, so ensuring that
 if the JVM is exited by any thread, an attempt to shut down the service
 will be made.
 

 
Configuration class creation

 The Configuration class used to initialize a service is a basic
 {@link org.apache.hadoop.conf.Configuration} instance. As the launcher is
 the entry point for an application, this implies that the HDFS, YARN or other
 default configurations will not have been forced in through the constructors
 of {@code HdfsConfiguration} or {@code YarnConfiguration}.
 
 What the launcher does do is use reflection to try to create instances of
 these classes simply to force in the common resources. If the classes are
 not on the classpath, this fact will be logged.
 

 Applications may consider it essential to either force load in the relevant
 configuration, or pass it down to the service being created, in which
 case further measures may be needed.
 
 
1: Creation in an extended {@code ServiceLauncher}
 
 

 Subclass the Service launcher and override its
 {@link org.apache.hadoop.service.launcher.ServiceLauncher#createConfiguration()}
 method with one that creates the right configuration.
 This is good if a single
 launcher can be created for all services launched by a module, such as
 HDFS or YARN. It does imply a dedicated script to invoke the custom
 {@code main()} method.

 
2: Creation in {@code bindArgs()}

 

 In
 {@link org.apache.hadoop.service.launcher.LaunchableService#bindArgs(Configuration, List)},
 a new configuration is created:

 
 public Configuration bindArgs(Configuration config, List<String> args)
    throws Exception {
   Configuration newConf = new YarnConfiguration(config);
   return newConf;
 }
 

 This guarantees a configuration of the right type is generated for all
 instances created via the service launcher. It does imply that this is
 expected to be the only way that services will be launched.

 3: Creation in {@code serviceInit()}

 
 protected void serviceInit(Configuration conf) throws Exception {
   super.serviceInit(new YarnConfiguration(conf));
 }
 

 
 This is a strategy used by many existing YARN services, and is ideal for
 services which do not implement the LaunchableService interface. Its one
 weakness is that the configuration is now private to that instance. Some
 YARN services use a single shared configuration instance as a way of
 propagating information between peer services in a
 {@link org.apache.hadoop.service.CompositeService}.
 While a dangerous practice, it does happen.


 Summary: the ServiceLauncher makes a best-effort attempt to load the
 standard Configuration subclasses, but does not fail if they are not present.
 Services which require a specific subclass should follow one of the
 strategies listed;
 creation in {@code serviceInit()} is the recommended policy.
 
 
Configuration file loading

 Before the service is bound to the CLI, the ServiceLauncher scans through
 all the arguments after the first one, looking for instances of the
 {@link org.apache.hadoop.service.launcher.ServiceLauncher#ARG_CONF}
 argument pair: {@code --conf <file>}. The file must exist
 in the local filesystem.
 
 It will be loaded into the Hadoop configuration
 class (the one created by the
 {@link org.apache.hadoop.service.launcher.ServiceLauncher#createConfiguration()}
 method).
 If this argument is repeated multiple times, all configuration
 files are merged with the latest file on the command line being the
 last one to be applied.
 

 All the {@code --conf <file>} argument pairs are stripped off
 the argument list provided to the instantiated service; they get the
 merged configuration, but not the commands used to create it.
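
 The stripping of {@code --conf <file>} pairs can be sketched as below.
 The names {@code ConfArgsSketch}, {@code Parsed} and {@code extract()} are
 hypothetical, the sketch only separates the arguments, and it neither
 checks that the files exist nor loads them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: not the launcher's own argument parser.
class ConfArgsSketch {

  // The flag as it appears on the command line.
  static final String CONF_FLAG = "--conf";

  // Result holder: conf files in command-line order, plus the other args.
  static class Parsed {
    final List<String> confFiles = new ArrayList<>();
    final List<String> remaining = new ArrayList<>();
  }

  static Parsed extract(List<String> args) {
    Parsed parsed = new Parsed();
    for (int i = 0; i < args.size(); i++) {
      String arg = args.get(i);
      // Only arguments after the first one are scanned for the flag.
      if (i > 0 && CONF_FLAG.equals(arg)) {
        if (i + 1 >= args.size()) {
          throw new IllegalArgumentException(CONF_FLAG + " requires a filename");
        }
        // Consume the filename too, so the pair is stripped as a unit.
        parsed.confFiles.add(args.get(++i));
      } else {
        parsed.remaining.add(arg);
      }
    }
    return parsed;
  }

  public static void main(String[] args) {
    Parsed parsed = extract(Arrays.asList(
        "Svc", "--conf", "a.xml", "run", "--conf", "b.xml"));
    System.out.println(parsed.confFiles + " " + parsed.remaining);
  }
}
```

 Keeping the files in command-line order matters because later files are
 applied last and so override earlier ones.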

 
Utility Classes

 

 
 {@link org.apache.hadoop.service.launcher.IrqHandler}: registers interrupt
 handlers using {@code sun.misc} APIs.
 

 
 {@link org.apache.hadoop.service.launcher.ServiceLaunchException}: a
 subclass of {@link org.apache.hadoop.util.ExitUtil.ExitException} which
 takes a {@code String.format()}-style format string and a list of arguments
 to create the exception text.
 

 
 */


package org.apache.hadoop.service.launcher;

import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ExitUtil;