
com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduce


The AWS Java SDK for Amazon EMR module holds the client classes that are used for communicating with the Amazon Elastic MapReduce service.

/*
 * Copyright 2010-2016 Amazon.com, Inc. or its affiliates. All Rights
 * Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License").
 * You may not use this file except in compliance with the License.
 * A copy of the License is located at
 *
 *  http://aws.amazon.com/apache2.0
 *
 * or in the "license" file accompanying this file. This file is distributed
 * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
 * express or implied. See the License for the specific language governing
 * permissions and limitations under the License.
 */
package com.amazonaws.services.elasticmapreduce;

import com.amazonaws.*;
import com.amazonaws.regions.*;

import com.amazonaws.services.elasticmapreduce.model.*;

/**
 * Interface for accessing Amazon EMR.
 * <p>
 * Amazon Elastic MapReduce (Amazon EMR) is a web service that makes it easy to
 * process large amounts of data efficiently. Amazon EMR uses Hadoop processing
 * combined with several AWS products to do tasks such as web indexing, data
 * mining, log file analysis, machine learning, scientific simulation, and data
 * warehousing.
 */
public interface AmazonElasticMapReduce {

    /**
     * Overrides the default endpoint for this client
     * ("https://elasticmapreduce.amazonaws.com"). Callers can use this method
     * to control which AWS region they want to work with.
     * <p>
     * Callers can pass in just the endpoint (ex:
     * "elasticmapreduce.amazonaws.com") or a full URL, including the protocol
     * (ex: "https://elasticmapreduce.amazonaws.com"). If the protocol is not
     * specified here, the default protocol from this client's
     * {@link ClientConfiguration} will be used, which by default is HTTPS.
     * <p>
     * For more information on using AWS regions with the AWS SDK for Java, and
     * a complete list of all available endpoints for all AWS services, see:
     * http://developer.amazonwebservices.com/connect/entry.jspa?externalID=3912
     * <p>
     * This method is not threadsafe. An endpoint should be configured when
     * the client is created and before any service requests are made. Changing
     * it afterwards creates inevitable race conditions for any service requests
     * in transit or retrying.
     *
     * @param endpoint
     *        The endpoint (ex: "elasticmapreduce.amazonaws.com") or a full URL,
     *        including the protocol (ex:
     *        "https://elasticmapreduce.amazonaws.com") of the region specific
     *        AWS endpoint this client will communicate with.
     */
    void setEndpoint(String endpoint);

    /**
     * An alternative to {@link AmazonElasticMapReduce#setEndpoint(String)},
     * sets the regional endpoint for this client's service calls. Callers can
     * use this method to control which AWS region they want to work with.
     * <p>
     * By default, all service endpoints in all regions use the https protocol.
     * To use http instead, specify it in the {@link ClientConfiguration}
     * supplied at construction.
     * <p>
     * This method is not threadsafe. A region should be configured when the
     * client is created and before any service requests are made. Changing it
     * afterwards creates inevitable race conditions for any service requests in
     * transit or retrying.
     *
     * @param region
     *        The region this client will communicate with. See
     *        {@link Region#getRegion(com.amazonaws.regions.Regions)} for
     *        accessing a given region. Must not be null and must be a region
     *        where the service is available.
     *
     * @see Region#getRegion(com.amazonaws.regions.Regions)
     * @see Region#createClient(Class,
     *      com.amazonaws.auth.AWSCredentialsProvider, ClientConfiguration)
     * @see Region#isServiceSupported(String)
     */
    void setRegion(Region region);

    /**
     * AddInstanceGroups adds an instance group to a running cluster.
     *
     * @param addInstanceGroupsRequest
     *        Input to an AddInstanceGroups call.
     * @return Result of the AddInstanceGroups operation returned by the
     *         service.
     * @throws InternalServerErrorException
     *         Indicates that an error occurred while processing the request and
     *         that the request was not completed.
     * @sample AmazonElasticMapReduce.AddInstanceGroups
     */
    AddInstanceGroupsResult addInstanceGroups(
            AddInstanceGroupsRequest addInstanceGroupsRequest);

    /**
     * AddJobFlowSteps adds new steps to a running job flow. A maximum of 256
     * steps are allowed in each job flow.
     * <p>
     * If your job flow is long-running (such as a Hive data warehouse) or
     * complex, you may require more than 256 steps to process your data. You
     * can bypass the 256-step limitation in various ways, including using the
     * SSH shell to connect to the master node and submitting queries directly
     * to the software running on the master node, such as Hive and Hadoop. For
     * more information on how to do this, go to Add More than 256 Steps to a
     * Job Flow in the Amazon Elastic MapReduce Developer's Guide.
     * <p>
     * A step specifies the location of a JAR file stored either on the master
     * node of the job flow or in Amazon S3. Each step is performed by the main
     * function of the main class of the JAR file. The main class can be
     * specified either in the manifest of the JAR or by using the MainFunction
     * parameter of the step.
     * <p>
     * Elastic MapReduce executes each step in the order listed. For a step to
     * be considered complete, the main function must exit with a zero exit code
     * and all Hadoop jobs started while the step was running must have
     * completed and run successfully.
     * <p>
     * You can only add steps to a job flow that is in one of the following
     * states: STARTING, BOOTSTRAPPING, RUNNING, or WAITING.
     *
     * @param addJobFlowStepsRequest
     *        The input argument to the AddJobFlowSteps operation.
     * @return Result of the AddJobFlowSteps operation returned by the service.
     * @throws InternalServerErrorException
     *         Indicates that an error occurred while processing the request and
     *         that the request was not completed.
     * @sample AmazonElasticMapReduce.AddJobFlowSteps
     */
    AddJobFlowStepsResult addJobFlowSteps(
            AddJobFlowStepsRequest addJobFlowStepsRequest);

    /**
     * Adds tags to an Amazon EMR resource. Tags make it easier to associate
     * clusters in various ways, such as grouping clusters to track your Amazon
     * EMR resource allocation costs. For more information, see Tagging Amazon
     * EMR Resources.
     *
     * @param addTagsRequest
     *        This input identifies a cluster and a list of tags to attach.
     * @return Result of the AddTags operation returned by the service.
     * @throws InternalServerException
     *         This exception occurs when there is an internal failure in the
     *         EMR service.
     * @throws InvalidRequestException
     *         This exception occurs when there is something wrong with user
     *         input.
     * @sample AmazonElasticMapReduce.AddTags
     */
    AddTagsResult addTags(AddTagsRequest addTagsRequest);

    /**
     * Provides cluster-level details including status, hardware and software
     * configuration, VPC settings, and so on. For information about the cluster
     * steps, see ListSteps.
     *
     * @param describeClusterRequest
     *        This input determines which cluster to describe.
     * @return Result of the DescribeCluster operation returned by the service.
     * @throws InternalServerException
     *         This exception occurs when there is an internal failure in the
     *         EMR service.
     * @throws InvalidRequestException
     *         This exception occurs when there is something wrong with user
     *         input.
     * @sample AmazonElasticMapReduce.DescribeCluster
     */
    DescribeClusterResult describeCluster(
            DescribeClusterRequest describeClusterRequest);

    /**
     * This API is deprecated and will eventually be removed. We recommend you
     * use ListClusters, DescribeCluster, ListSteps,
     * ListInstanceGroups and ListBootstrapActions instead.
     * <p>
     * DescribeJobFlows returns a list of job flows that match all of the
     * supplied parameters. The parameters can include a list of job flow IDs,
     * job flow states, and restrictions on job flow creation date and time.
     * <p>
     * Regardless of supplied parameters, only job flows created within the last
     * two months are returned.
     * <p>
     * If no parameters are supplied, then job flows matching either of the
     * following criteria are returned:
     * <ul>
     * <li>Job flows created and completed in the last two weeks</li>
     * <li>Job flows created within the last two months that are in one of the
     * following states: RUNNING, WAITING,
     * SHUTTING_DOWN, STARTING</li>
     * </ul>
     * <p>
     * Amazon Elastic MapReduce can return a maximum of 512 job flow
     * descriptions.
     *
     * @param describeJobFlowsRequest
     *        The input for the DescribeJobFlows operation.
     * @return Result of the DescribeJobFlows operation returned by the service.
     * @throws InternalServerErrorException
     *         Indicates that an error occurred while processing the request and
     *         that the request was not completed.
     * @sample AmazonElasticMapReduce.DescribeJobFlows
     */
    @Deprecated
    DescribeJobFlowsResult describeJobFlows(
            DescribeJobFlowsRequest describeJobFlowsRequest);

    /**
     * Simplified method form for invoking the DescribeJobFlows operation.
     *
     * @see #describeJobFlows(DescribeJobFlowsRequest)
     */
    @Deprecated
    DescribeJobFlowsResult describeJobFlows();

    /**
     * Provides more detail about the cluster step.
     *
     * @param describeStepRequest
     *        This input determines which step to describe.
     * @return Result of the DescribeStep operation returned by the service.
     * @throws InternalServerException
     *         This exception occurs when there is an internal failure in the
     *         EMR service.
     * @throws InvalidRequestException
     *         This exception occurs when there is something wrong with user
     *         input.
     * @sample AmazonElasticMapReduce.DescribeStep
     */
    DescribeStepResult describeStep(DescribeStepRequest describeStepRequest);

    /**
     * Provides information about the bootstrap actions associated with a
     * cluster.
     *
     * @param listBootstrapActionsRequest
     *        This input determines which bootstrap actions to retrieve.
     * @return Result of the ListBootstrapActions operation returned by the
     *         service.
     * @throws InternalServerException
     *         This exception occurs when there is an internal failure in the
     *         EMR service.
     * @throws InvalidRequestException
     *         This exception occurs when there is something wrong with user
     *         input.
     * @sample AmazonElasticMapReduce.ListBootstrapActions
     */
    ListBootstrapActionsResult listBootstrapActions(
            ListBootstrapActionsRequest listBootstrapActionsRequest);

    /**
     * Provides the status of all clusters visible to this AWS account. Allows
     * you to filter the list of clusters based on certain criteria; for
     * example, filtering by cluster creation date and time or by status. This
     * call returns a maximum of 50 clusters per call, but returns a marker to
     * track the paging of the cluster list across multiple ListClusters calls.
     *
     * @param listClustersRequest
     *        This input determines how the ListClusters action filters the list
     *        of clusters that it returns.
     * @return Result of the ListClusters operation returned by the service.
     * @throws InternalServerException
     *         This exception occurs when there is an internal failure in the
     *         EMR service.
     * @throws InvalidRequestException
     *         This exception occurs when there is something wrong with user
     *         input.
     * @sample AmazonElasticMapReduce.ListClusters
     */
    ListClustersResult listClusters(ListClustersRequest listClustersRequest);

    /**
     * Simplified method form for invoking the ListClusters operation.
     *
     * @see #listClusters(ListClustersRequest)
     */
    ListClustersResult listClusters();

    /**
     * Provides all available details about the instance groups in a cluster.
     *
     * @param listInstanceGroupsRequest
     *        This input determines which instance groups to retrieve.
     * @return Result of the ListInstanceGroups operation returned by the
     *         service.
     * @throws InternalServerException
     *         This exception occurs when there is an internal failure in the
     *         EMR service.
     * @throws InvalidRequestException
     *         This exception occurs when there is something wrong with user
     *         input.
     * @sample AmazonElasticMapReduce.ListInstanceGroups
     */
    ListInstanceGroupsResult listInstanceGroups(
            ListInstanceGroupsRequest listInstanceGroupsRequest);

    /**
     * Provides information about the cluster instances that Amazon EMR
     * provisions on behalf of a user when it creates the cluster. For example,
     * this operation indicates when the EC2 instances reach the Ready state,
     * when instances become available to Amazon EMR to use for jobs, and the IP
     * addresses for cluster instances, etc.
     *
     * @param listInstancesRequest
     *        This input determines which instances to list.
     * @return Result of the ListInstances operation returned by the service.
     * @throws InternalServerException
     *         This exception occurs when there is an internal failure in the
     *         EMR service.
     * @throws InvalidRequestException
     *         This exception occurs when there is something wrong with user
     *         input.
     * @sample AmazonElasticMapReduce.ListInstances
     */
    ListInstancesResult listInstances(ListInstancesRequest listInstancesRequest);

    /**
     * Provides a list of steps for the cluster.
     *
     * @param listStepsRequest
     *        This input determines which steps to list.
     * @return Result of the ListSteps operation returned by the service.
     * @throws InternalServerException
     *         This exception occurs when there is an internal failure in the
     *         EMR service.
     * @throws InvalidRequestException
     *         This exception occurs when there is something wrong with user
     *         input.
     * @sample AmazonElasticMapReduce.ListSteps
     */
    ListStepsResult listSteps(ListStepsRequest listStepsRequest);

    /**
     * ModifyInstanceGroups modifies the number of nodes and configuration
     * settings of an instance group. The input parameters include the new
     * target instance count for the group and the instance group ID. The call
     * will either succeed or fail atomically.
     *
     * @param modifyInstanceGroupsRequest
     *        Change the size of some instance groups.
     * @return Result of the ModifyInstanceGroups operation returned by the
     *         service.
     * @throws InternalServerErrorException
     *         Indicates that an error occurred while processing the request and
     *         that the request was not completed.
     * @sample AmazonElasticMapReduce.ModifyInstanceGroups
     */
    ModifyInstanceGroupsResult modifyInstanceGroups(
            ModifyInstanceGroupsRequest modifyInstanceGroupsRequest);

    /**
     * Simplified method form for invoking the ModifyInstanceGroups operation.
     *
     * @see #modifyInstanceGroups(ModifyInstanceGroupsRequest)
     */
    ModifyInstanceGroupsResult modifyInstanceGroups();

    /**
     * Removes tags from an Amazon EMR resource. Tags make it easier to
     * associate clusters in various ways, such as grouping clusters to track
     * your Amazon EMR resource allocation costs. For more information, see
     * Tagging Amazon EMR Resources.
     * <p>
     * The following example removes the stack tag with value Prod from a
     * cluster:
     *
     * @param removeTagsRequest
     *        This input identifies a cluster and a list of tags to remove.
     * @return Result of the RemoveTags operation returned by the service.
     * @throws InternalServerException
     *         This exception occurs when there is an internal failure in the
     *         EMR service.
     * @throws InvalidRequestException
     *         This exception occurs when there is something wrong with user
     *         input.
     * @sample AmazonElasticMapReduce.RemoveTags
     */
    RemoveTagsResult removeTags(RemoveTagsRequest removeTagsRequest);

    /**
     * RunJobFlow creates and starts running a new job flow. The job flow will
     * run the steps specified. Once the job flow completes, the cluster is
     * stopped and the HDFS partition is lost. To prevent loss of data,
     * configure the last step of the job flow to store results in Amazon S3. If
     * the JobFlowInstancesConfig
     * KeepJobFlowAliveWhenNoSteps parameter is set to
     * TRUE, the job flow will transition to the WAITING state
     * rather than shutting down once the steps have completed.
     * <p>
     * For additional protection, you can set the JobFlowInstancesConfig
     * TerminationProtected parameter to TRUE to lock
     * the job flow and prevent it from being terminated by API call, user
     * intervention, or in the event of a job flow error.
     * <p>
     * A maximum of 256 steps are allowed in each job flow.
     * <p>
     * If your job flow is long-running (such as a Hive data warehouse) or
     * complex, you may require more than 256 steps to process your data. You
     * can bypass the 256-step limitation in various ways, including using the
     * SSH shell to connect to the master node and submitting queries directly
     * to the software running on the master node, such as Hive and Hadoop. For
     * more information on how to do this, go to Add More than 256 Steps to a
     * Job Flow in the Amazon Elastic MapReduce Developer's Guide.
     * <p>
     * For long running job flows, we recommend that you periodically store your
     * results.
     *
     * @param runJobFlowRequest
     *        Input to the RunJobFlow operation.
     * @return Result of the RunJobFlow operation returned by the service.
     * @throws InternalServerErrorException
     *         Indicates that an error occurred while processing the request and
     *         that the request was not completed.
     * @sample AmazonElasticMapReduce.RunJobFlow
     */
    RunJobFlowResult runJobFlow(RunJobFlowRequest runJobFlowRequest);

    /**
     * SetTerminationProtection locks a job flow so the Amazon EC2 instances in
     * the cluster cannot be terminated by user intervention, an API call, or in
     * the event of a job-flow error. The cluster still terminates upon
     * successful completion of the job flow. Calling SetTerminationProtection
     * on a job flow is analogous to calling the Amazon EC2
     * DisableAPITermination API on all of the EC2 instances in a cluster.
     * <p>
     * SetTerminationProtection is used to prevent accidental termination of a
     * job flow and to ensure that in the event of an error, the instances will
     * persist so you can recover any data stored in their ephemeral instance
     * storage.
     * <p>
     * To terminate a job flow that has been locked by setting
     * SetTerminationProtection to true, you must first unlock the
     * job flow by a subsequent call to SetTerminationProtection in which you
     * set the value to false.
     * <p>
     * For more information, go to Protecting a Job Flow from Termination in
     * the Amazon Elastic MapReduce Developer's Guide.
     *
     * @param setTerminationProtectionRequest
     *        The input argument to the TerminationProtection operation.
     * @return Result of the SetTerminationProtection operation returned by the
     *         service.
     * @throws InternalServerErrorException
     *         Indicates that an error occurred while processing the request and
     *         that the request was not completed.
     * @sample AmazonElasticMapReduce.SetTerminationProtection
     */
    SetTerminationProtectionResult setTerminationProtection(
            SetTerminationProtectionRequest setTerminationProtectionRequest);

    /**
     * Sets whether all AWS Identity and Access Management (IAM) users under
     * your account can access the specified job flows. This action works on
     * running job flows. You can also set the visibility of a job flow when you
     * launch it using the VisibleToAllUsers parameter of
     * RunJobFlow. The SetVisibleToAllUsers action can be called only by
     * an IAM user who created the job flow or the AWS account that owns the job
     * flow.
     *
     * @param setVisibleToAllUsersRequest
     *        The input to the SetVisibleToAllUsers action.
     * @return Result of the SetVisibleToAllUsers operation returned by the
     *         service.
     * @throws InternalServerErrorException
     *         Indicates that an error occurred while processing the request and
     *         that the request was not completed.
     * @sample AmazonElasticMapReduce.SetVisibleToAllUsers
     */
    SetVisibleToAllUsersResult setVisibleToAllUsers(
            SetVisibleToAllUsersRequest setVisibleToAllUsersRequest);

    /**
     * TerminateJobFlows shuts a list of job flows down. When a job flow is shut
     * down, any step not yet completed is canceled and the EC2 instances on
     * which the job flow is running are stopped. Any log files not already
     * saved are uploaded to Amazon S3 if a LogUri was specified when the job
     * flow was created.
     * <p>
     * The maximum number of JobFlows allowed is 10. The call to
     * TerminateJobFlows is asynchronous. Depending on the configuration of the
     * job flow, it may take up to 5-20 minutes for the job flow to completely
     * terminate and release allocated resources, such as Amazon EC2 instances.
     *
     * @param terminateJobFlowsRequest
     *        Input to the TerminateJobFlows operation.
     * @return Result of the TerminateJobFlows operation returned by the
     *         service.
     * @throws InternalServerErrorException
     *         Indicates that an error occurred while processing the request and
     *         that the request was not completed.
     * @sample AmazonElasticMapReduce.TerminateJobFlows
     */
    TerminateJobFlowsResult terminateJobFlows(
            TerminateJobFlowsRequest terminateJobFlowsRequest);

    /**
     * Shuts down this client object, releasing any resources that might be held
     * open. This is an optional method, and callers are not expected to call
     * it, but can if they want to explicitly release any open resources. Once a
     * client has been shutdown, it should not be used to make any more
     * requests.
     */
    void shutdown();

    /**
     * Returns additional metadata for a previously executed successful request,
     * typically used for debugging issues where a service isn't acting as
     * expected. This data isn't considered part of the result data returned by
     * an operation, so it's available through this separate, diagnostic
     * interface.
     * <p>
     * Response metadata is only cached for a limited period of time, so if you
     * need to access this extra diagnostic information for an executed request,
     * you should use this method to retrieve it as soon as possible after
     * executing a request.
     *
     * @param request
     *        The originally executed request.
     *
     * @return The response metadata for the specified request, or null if none
     *         is available.
     */
    ResponseMetadata getCachedResponseMetadata(AmazonWebServiceRequest request);
}
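
The ListClusters documentation in the interface says each call returns at most 50 clusters plus a marker for paging. The following is a minimal sketch of that marker loop, not the SDK's own code. To stay runnable without the AWS SDK on the classpath it uses hypothetical stand-ins: `ClusterPage`, `MarkerPagination`, `fetchPage` and `fakeService` are illustrative names that do not exist in the SDK. With a real client, the fetch function would presumably wrap a `listClusters(...)` call and pass the previous marker in the request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for one page of a ListClusters-style response. */
final class ClusterPage {
    final List<String> clusterIds; // cluster IDs on this page
    final String marker;           // null when there are no further pages

    ClusterPage(List<String> clusterIds, String marker) {
        this.clusterIds = clusterIds;
        this.marker = marker;
    }
}

/** Hypothetical helper showing the marker-driven paging loop. */
final class MarkerPagination {

    /**
     * Collects every cluster ID by calling fetchPage repeatedly, handing back
     * the marker from the previous page until no marker is returned.
     */
    static List<String> listAll(Function<String, ClusterPage> fetchPage) {
        List<String> all = new ArrayList<>();
        String marker = null; // the first call carries no marker
        do {
            ClusterPage page = fetchPage.apply(marker);
            all.addAll(page.clusterIds);
            marker = page.marker;
        } while (marker != null);
        return all;
    }

    /** In-memory fake "service" that serves two fixed pages keyed by marker. */
    static Function<String, ClusterPage> fakeService() {
        Map<String, ClusterPage> pages = new HashMap<>();
        pages.put(null, new ClusterPage(Arrays.asList("j-1", "j-2"), "m1"));
        pages.put("m1", new ClusterPage(Arrays.asList("j-3"), null));
        return pages::get;
    }

    public static void main(String[] args) {
        // prints [j-1, j-2, j-3]
        System.out.println(listAll(fakeService()));
    }
}
```

Passing the page fetch in as a function keeps the loop independent of any particular client, so the same loop could be pointed at a real client or at a test double like the fake above.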




