# GetAsanaObject

### Description

This processor collects various objects (e.g. tasks and comments) from Asana via the specified
`AsanaClientService`. When the processor is started for the first time with a given configuration, it collects every
object matching the user-specified criteria and emits a `FlowFile` for each on the `NEW` relationship. It then polls
Asana at the frequency set by the configured _Run Schedule_ and detects changes by comparing object fingerprints.
Updated objects are emitted on the `UPDATED` relationship and removed objects on the `REMOVED` relationship.
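
The comparison step can be illustrated with a minimal sketch. This is not the processor's actual implementation; the
function name `diff_fingerprints` and the snapshot layout (a mapping from object ID to fingerprint) are hypothetical.
It also assumes that objects first seen on a later poll are routed to `NEW`.

```python
def diff_fingerprints(previous, current):
    """Classify object IDs by comparing two {gid: fingerprint} snapshots.

    Hypothetical helper for illustration; not part of NiFi or the Asana client.
    """
    # Present now but not before: assumed to go to NEW.
    new = [gid for gid in current if gid not in previous]
    # Present in both snapshots with a different fingerprint: UPDATED.
    updated = [
        gid for gid in current
        if gid in previous and current[gid] != previous[gid]
    ]
    # Present before but missing now: REMOVED.
    removed = [gid for gid in previous if gid not in current]
    return {"NEW": new, "UPDATED": updated, "REMOVED": removed}


previous = {"1": "a", "2": "b", "3": "c"}
current = {"1": "a", "2": "B", "4": "d"}
result = diff_fingerprints(previous, current)
```

Here object `1` is unchanged and produces no `FlowFile`, while `2`, `3`, and `4` are classified as updated, removed,
and new, respectively.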

### FlowFile contents & attributes

Each emitted `FlowFile` contains the JSON representation of the fetched Asana object. These can be processed further by
any processor that accepts text data in this format. The `FlowFile`s emitted on the `REMOVED` relationship have no
content, because the actual data is not stored in the processor, so there is no way to retrieve the deleted content.

Each `FlowFile`, regardless of which relationship it was emitted on, has an `asana.gid` attribute set, which
contains the ID of the object in Asana. These IDs are globally unique within the Asana instance, regardless of what type
of object they were assigned to. In the case of _Events_, these IDs are generated by the client, because Asana does not
keep track of these objects.

### Object fingerprints

Fingerprints are used only for content change detection.

Fingerprints are generally calculated by applying the `SHA-512` hash algorithm to the retrieved object. For immutable
objects, like _Attachments_, the fingerprints are static, so updates (which are impossible anyway) are not detected.
For _Projects_ and _Tasks_, where the last modification time is available, the timestamp itself is stored as the
fingerprint.
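
As a rough sketch of the two strategies described above (not the processor's actual code), the hypothetical function
below uses the modification timestamp when one is present and otherwise hashes the object with SHA-512. The field name
`modified_at` and the canonical JSON serialization used before hashing are assumptions made for this example.

```python
import hashlib
import json


def fingerprint(obj):
    """Return a change-detection fingerprint for an object given as a dict.

    Hypothetical helper for illustration only.
    """
    # Assumption: objects such as projects and tasks carry a "modified_at"
    # timestamp, which is used directly as the fingerprint.
    modified_at = obj.get("modified_at")
    if modified_at is not None:
        return modified_at
    # Otherwise hash a canonical serialization, so that key order does not
    # affect the result (the serialization format is an assumption).
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha512(canonical.encode("utf-8")).hexdigest()
```

Using the timestamp avoids hashing the whole object, and a hash covers object types that expose no modification time.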

### Batch size

By default, this processor emits each object fetched from Asana in a separate `FlowFile`. This is usually fine for a
low-traffic workspace that generates data at a low rate. For workspaces with a high volume of traffic, it is
advisable to set the batch size to a reasonably high value for better performance. With this value set to something
other than the default (1), the processor emits `FlowFile`s that contain multiple items batched together in a JSON
array. In exchange, these `FlowFile`s do not have the `asana.gid` attribute set.
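
The effect of the batch size on the output can be sketched as follows. This is an illustrative model only: the function
`to_flowfiles` and the dictionary layout standing in for a `FlowFile` are hypothetical, not NiFi APIs.

```python
import json


def to_flowfiles(objects, batch_size=1):
    """Model how fetched objects map to FlowFiles for a given batch size.

    Each FlowFile is modeled as a dict with "content" and "attributes".
    """
    flowfiles = []
    if batch_size <= 1:
        # Default: one object per FlowFile, with the asana.gid attribute set.
        for obj in objects:
            flowfiles.append({
                "content": json.dumps(obj),
                "attributes": {"asana.gid": obj["gid"]},
            })
        return flowfiles
    # Batched: each FlowFile holds a JSON array and no asana.gid attribute.
    for start in range(0, len(objects), batch_size):
        chunk = objects[start:start + batch_size]
        flowfiles.append({"content": json.dumps(chunk), "attributes": {}})
    return flowfiles


objects = [{"gid": str(i), "name": f"task {i}"} for i in range(5)]
single = to_flowfiles(objects)
batched = to_flowfiles(objects, batch_size=2)
```

With five objects and a batch size of 2, the sketch yields three `FlowFile`s, the last holding a single-element array.
Downstream processors that rely on `asana.gid` would need to read the ID from the JSON content instead.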

### Configuring filters, filtering by name

When collecting some objects, like _Project Events_, _Tasks_, and _Team Members_, the processor requires or allows
defining filters. For example, if you would like to collect _Tasks_, you need to specify the project from which the
tasks should be collected.

In these cases, when a filter refers to some parent object, you need to provide its name in the configuration, and the
name is case-sensitive. Also keep in mind that Asana lets users create multiple objects with the same name. For
example, you can create two projects named 'My project'. When you refer to such a project by name, it is impossible
to figure out which 'My project' you intended, so these situations should be avoided. If they do occur, this processor
picks the first match returned by Asana when listing the objects. This choice is not random, but the ordering is not
guaranteed.
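
The lookup behavior described above amounts to a case-sensitive, first-match search. A minimal sketch, using a
hypothetical `find_by_name` helper and plain dictionaries in place of real Asana objects:

```python
def find_by_name(objects, name):
    """Return the first object whose name equals `name` exactly, else None.

    Hypothetical helper for illustration; the comparison is case-sensitive.
    """
    for obj in objects:
        if obj["name"] == name:
            return obj
    return None


# Two projects share a name; which one comes first depends on the listing order.
projects = [
    {"gid": "100", "name": "My project"},
    {"gid": "200", "name": "My project"},
    {"gid": "300", "name": "Other project"},
]
match = find_by_name(projects, "My project")
no_match = find_by_name(projects, "my project")
```

In this listing the project with ID `100` is selected, and the lowercase spelling matches nothing. If Asana returned
the projects in a different order, the other 'My project' would be picked, which is why duplicate names are best
avoided.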

### Further reading about Asana

* [Asana Academy](https://academy.asana.com)
* [Asana Guide](https://asana.com/guide)
* [Asana Developer Documentation](https://developers.asana.com/docs)
* [Java client library for the Asana API](https://github.com/Asana/java-asana/)