io.cdap.wrangler.api.lineage.Lineage Maven / Gradle / Ivy
/*
* Copyright © 2019 Cask Data, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not
* use this file except in compliance with the License. You may obtain a copy of
* the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*/
package io.cdap.wrangler.api.lineage;
import io.cdap.cdap.api.annotation.Beta;
/**
* This class Lineage
defines a mutation defined by the directive that gets capture as lineage.
*
* Directives have to implement this class to inject their mutations for lineage to be constructed.
*
* The method lineage
is invoked separately in the prepareRun
phase of the pipeline
* execution. Before the method lineage
is invoked, the framework ensures that the receipe is
* parsed and initialize on each directive that is included is called. All the class variables of the directive
* are available to be used within the lineage
method.
*
*
*
* {@link Mutation} captures all the changes the directive is going to be applying of the data. It has
* two major methods:
*
*
*
* readable
- This method is defined to provide the post transformation description of the mutation
* the directive is applying on data. Care should be taken to use the right tense as the lineage would be consumed
* by users after the transformation has been applied on the data. As best practise, it's highly recommended to use
* past-tense for describing the transformations. Additionally, the language of the description is not trying
* to provide complete details and configuration, but actually focusing on operations that directive has
* performed on data.
*
* -
*
relation
- This methods defines the relations between various columns either being used
* as target or source for performing the data transformation.
*
*
*
* Following are few examples of how the method can be implemented:
*
*
* @Override
* public Mutation lineage() {
* return Mutation.builder()
* .readable("Looking up catalog using value in column '%s' and results written into column '%s', src, dest)
* .relation(src, Many.of(src, dest))
* .build();
* }
*
*
* Another example:
*
*
* @Override
* public Mutation lineage() {
* return Mutation.builder()
* .readable("Dropped columns %s", columns")
* .drop(Many.of(columns))
* .build();
* }
*
*
* @see Mutation
* @see Relation
* @see Many
*/
@Beta
public interface Lineage {
/**
* Returns a Mutation that can be used to generate lineage.
*
* @return a instance of {@link Mutation} object used to generate lineage.
*/
Mutation lineage();
}
© 2015 - 2024 Weber Informatics LLC | Privacy Policy