DataFire integration for AWS Data Pipeline
Client library for AWS Data Pipeline

```bash
npm install --save @datafire/amazonaws_datapipeline
```

```js
let amazonaws_datapipeline = require('@datafire/amazonaws_datapipeline').create({
  accessKeyId: "",
  secretAccessKey: "",
  region: ""
});

amazonaws_datapipeline.ActivatePipeline({
  "pipelineId": ""
}).then(data => {
  console.log(data);
});
```

## Description
AWS Data Pipeline configures and manages a data-driven workflow called a pipeline. AWS Data Pipeline handles the details of scheduling and ensuring that data dependencies are met so that your application can focus on processing the data.
AWS Data Pipeline provides a JAR implementation of a task runner called AWS Data Pipeline Task Runner. AWS Data Pipeline Task Runner provides logic for common data management scenarios, such as performing database queries and running data analysis using Amazon Elastic MapReduce (Amazon EMR). You can use AWS Data Pipeline Task Runner as your task runner, or you can write your own task runner to provide custom data management.
AWS Data Pipeline implements two main sets of functionality. Use the first set to create a pipeline and define data sources, schedules, dependencies, and the transforms to be performed on the data. Use the second set in your task runner application to receive the next task ready for processing. The logic for performing the task, such as querying the data, running data analysis, or converting the data from one format to another, is contained within the task runner. The task runner performs the task assigned to it by the web service, reporting progress to the web service as it does so. When the task is done, the task runner reports the final success or failure of the task to the web service.
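
The task runner cycle described above (receive a task, report progress, report the final result) can be sketched roughly as follows. This is a minimal sketch, not a reference implementation: `runOneTask` and `handleTask` are hypothetical names, `client` is assumed to be the object returned by `create(...)` in the usage example, and the `canceled` flag returned by `ReportTaskProgress` is ignored for brevity.

```js
// Hypothetical helper: performs one poll/work/report cycle.
// `client` is assumed to expose PollForTask, ReportTaskProgress and
// SetTaskStatus as promise-returning functions, as in the usage example.
async function runOneTask(client, workerGroup, handleTask) {
  // Ask the web service for the next task that is ready for processing.
  const polled = await client.PollForTask({ workerGroup });
  const task = polled && polled.taskObject;
  if (!task || !task.taskId) {
    return null; // no task was assigned on this poll
  }

  const status = { taskId: task.taskId, taskStatus: 'FINISHED' };
  try {
    // Tell the service the task has been picked up, then do the work.
    await client.ReportTaskProgress({ taskId: task.taskId });
    await handleTask(task); // hypothetical callback containing the task logic
  } catch (err) {
    status.taskStatus = 'FAILED';
    status.errorMessage = String((err && err.message) || err);
  }

  // Report the final success or failure of the task.
  await client.SetTaskStatus(status);
  return status.taskStatus;
}
```

A real runner would call this in a loop and also send `ReportTaskRunnerHeartbeat` periodically; both are left out to keep the sketch short.
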
## Actions

### ActivatePipeline

```js
amazonaws_datapipeline.ActivatePipeline({
  "pipelineId": ""
}, context)
```

#### Input
* input object
  * parameterValues ParameterValueList
  * pipelineId required id
  * startTimestamp timestamp

#### Output
* output ActivatePipelineOutput

### AddTags

```js
amazonaws_datapipeline.AddTags({
  "pipelineId": "",
  "tags": []
}, context)
```

#### Input
* input object
  * pipelineId required id
  * tags required tagList

#### Output
* output AddTagsOutput

### CreatePipeline

```js
amazonaws_datapipeline.CreatePipeline({
  "name": "",
  "uniqueId": ""
}, context)
```

#### Input
* input object
  * description string
  * name required id
  * tags tagList
  * uniqueId required id

#### Output
* output CreatePipelineOutput
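
Creating a pipeline is usually only the first step: the definition still has to be uploaded and the pipeline activated. The following is a rough sketch of that sequence using the three actions documented here. `createAndActivate` is a hypothetical helper name, and `client` is assumed to be the object returned by `create(...)`; the field names `pipelineId` and `errored` come from the output definitions in this document.

```js
// Hypothetical helper chaining CreatePipeline, PutPipelineDefinition
// and ActivatePipeline. Throws if the definition is rejected.
async function createAndActivate(client, name, uniqueId, pipelineObjects) {
  // 1. Create an empty pipeline and remember its identifier.
  const created = await client.CreatePipeline({ name, uniqueId });
  const pipelineId = created.pipelineId;

  // 2. Upload the definition; `errored` signals validation errors.
  const put = await client.PutPipelineDefinition({ pipelineId, pipelineObjects });
  if (put.errored) {
    const err = new Error('Pipeline definition was rejected');
    err.validationErrors = put.validationErrors || [];
    throw err;
  }

  // 3. Activate the pipeline so that it starts running.
  await client.ActivatePipeline({ pipelineId });
  return pipelineId;
}
```
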
### DeactivatePipeline

```js
amazonaws_datapipeline.DeactivatePipeline({
  "pipelineId": ""
}, context)
```

#### Input
* input object
  * cancelActive cancelActive
  * pipelineId required id

#### Output
* output DeactivatePipelineOutput

### DeletePipeline

```js
amazonaws_datapipeline.DeletePipeline({
  "pipelineId": ""
}, context)
```

#### Input
* input object
  * pipelineId required id

#### Output
Output schema unknown

### DescribeObjects

```js
amazonaws_datapipeline.DescribeObjects({
  "pipelineId": "",
  "objectIds": []
}, context)
```

#### Input
* input object
  * marker string
  * evaluateExpressions boolean
  * marker string
  * objectIds required idList
  * pipelineId required id

#### Output
* output DescribeObjectsOutput
### DescribePipelines

```js
amazonaws_datapipeline.DescribePipelines({
  "pipelineIds": []
}, context)
```

#### Input
* input object
  * pipelineIds required idList

#### Output
* output DescribePipelinesOutput

### EvaluateExpression

```js
amazonaws_datapipeline.EvaluateExpression({
  "pipelineId": "",
  "objectId": "",
  "expression": ""
}, context)
```

#### Input
* input object
  * expression required longString
  * objectId required id
  * pipelineId required id

#### Output
* output EvaluateExpressionOutput

### GetPipelineDefinition

```js
amazonaws_datapipeline.GetPipelineDefinition({
  "pipelineId": ""
}, context)
```

#### Input
* input object
  * pipelineId required id
  * version string

#### Output
* output GetPipelineDefinitionOutput

### ListPipelines

```js
amazonaws_datapipeline.ListPipelines({}, context)
```

#### Input
* input object
  * marker string
  * marker string

#### Output
* output ListPipelinesOutput
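
Because `ListPipelinesOutput` carries `hasMoreResults` and `marker`, a caller that wants every pipeline has to page through the results. The loop below is one possible way to do that; `listAllPipelines` is a hypothetical helper, and `client` is assumed to be the object returned by `create(...)`.

```js
// Hypothetical helper: follows `marker` until `hasMoreResults` is false
// and returns the concatenated `pipelineIdList` entries.
async function listAllPipelines(client) {
  const all = [];
  let marker;
  do {
    // Only send `marker` when a previous page supplied one.
    const input = marker ? { marker } : {};
    const page = await client.ListPipelines(input);
    all.push(...(page.pipelineIdList || []));
    marker = page.hasMoreResults ? page.marker : undefined;
  } while (marker);
  return all;
}
```

The same marker-based pattern should apply to `DescribeObjects` and `QueryObjects`, whose outputs expose the same two fields.
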
### PollForTask

```js
amazonaws_datapipeline.PollForTask({
  "workerGroup": ""
}, context)
```

#### Input
* input object
  * hostname id
  * instanceIdentity InstanceIdentity
  * workerGroup required string

#### Output
* output PollForTaskOutput

### PutPipelineDefinition

```js
amazonaws_datapipeline.PutPipelineDefinition({
  "pipelineId": "",
  "pipelineObjects": []
}, context)
```

#### Input
* input object
  * parameterObjects ParameterObjectList
  * parameterValues ParameterValueList
  * pipelineId required id
  * pipelineObjects required PipelineObjectList

#### Output
* output PutPipelineDefinitionOutput
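
The `pipelineObjects` array is made of `PipelineObject` entries, each with an `id`, a `name` and a list of `Field` items that carry either a `stringValue` or a `refValue`, but not both. The helpers below are a small sketch of how such objects could be assembled from plain JavaScript values. `makeField` and `makePipelineObject` are hypothetical names, and the `{ ref: ... }` convention for references is an assumption made for this example only.

```js
// Hypothetical helper: turns one property into a Field.
// A plain string becomes `stringValue`; { ref: 'id' } becomes `refValue`.
function makeField(key, value) {
  if (typeof value === 'string') {
    return { key, stringValue: value };
  }
  if (value && typeof value.ref === 'string') {
    return { key, refValue: value.ref };
  }
  throw new TypeError(`Field "${key}" must be a string or { ref: string }`);
}

// Hypothetical helper: builds a PipelineObject from an id, a name
// and a plain object of properties.
function makePipelineObject(id, name, props) {
  const fields = Object.entries(props).map(([key, value]) => makeField(key, value));
  return { id, name, fields };
}
```

For instance, `makePipelineObject('MyActivity', 'MyActivity', { schedule: { ref: 'MySchedule' } })` yields an object whose single field is `{ key: 'schedule', refValue: 'MySchedule' }`; the identifiers in that call are illustrative only.
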
### QueryObjects

```js
amazonaws_datapipeline.QueryObjects({
  "pipelineId": "",
  "sphere": ""
}, context)
```

#### Input
* input object
  * limit string
  * marker string
  * limit int
  * marker string
  * pipelineId required id
  * query Query
  * sphere required string

#### Output
* output QueryObjectsOutput
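
The optional `query` input is a `Query` object whose `selectors` each pair a `fieldName` with an `Operator` (`type` plus `values`). As a rough illustration of that nesting, the sketch below builds a `QueryObjects` input. `selector` and `buildQueryInput` are hypothetical helpers; the operator types are the `OperatorType` values listed under Definitions, while the sphere and field name shown in the usage note are assumptions, since this document does not list accepted values for them.

```js
// Operator types as listed in the OperatorType definition.
const OPERATOR_TYPES = ['EQ', 'REF_EQ', 'LE', 'GE', 'BETWEEN'];

// Hypothetical helper: builds one Selector.
function selector(fieldName, type, values) {
  if (!OPERATOR_TYPES.includes(type)) {
    throw new RangeError(`Unknown operator type: ${type}`);
  }
  return { fieldName, operator: { type, values: values.map(String) } };
}

// Hypothetical helper: builds the input object for QueryObjects.
function buildQueryInput(pipelineId, sphere, selectors, limit) {
  const input = { pipelineId, sphere, query: { selectors } };
  if (limit !== undefined) {
    input.limit = limit; // optional page size
  }
  return input;
}
```

A call such as `buildQueryInput('df-123', 'INSTANCE', [selector('@status', 'EQ', ['RUNNING'])], 10)` produces an object that can be passed to `amazonaws_datapipeline.QueryObjects`; the pipeline id, sphere and field name in that call are placeholders.
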
### RemoveTags

```js
amazonaws_datapipeline.RemoveTags({
  "pipelineId": "",
  "tagKeys": []
}, context)
```

#### Input
* input object
  * pipelineId required id
  * tagKeys required stringList

#### Output
* output RemoveTagsOutput

### ReportTaskProgress

```js
amazonaws_datapipeline.ReportTaskProgress({
  "taskId": ""
}, context)
```

#### Input
* input object
  * fields fieldList
  * taskId required taskId

#### Output
* output ReportTaskProgressOutput

### ReportTaskRunnerHeartbeat

```js
amazonaws_datapipeline.ReportTaskRunnerHeartbeat({
  "taskrunnerId": ""
}, context)
```

#### Input
* input object
  * hostname id
  * taskrunnerId required id
  * workerGroup string

#### Output
* output ReportTaskRunnerHeartbeatOutput
### SetStatus

```js
amazonaws_datapipeline.SetStatus({
  "pipelineId": "",
  "objectIds": [],
  "status": ""
}, context)
```

#### Input
* input object
  * objectIds required idList
  * pipelineId required id
  * status required string

#### Output
Output schema unknown

### SetTaskStatus

```js
amazonaws_datapipeline.SetTaskStatus({
  "taskId": "",
  "taskStatus": ""
}, context)
```

#### Input
* input object
  * errorId string
  * errorMessage errorMessage
  * errorStackTrace string
  * taskId required taskId
  * taskStatus required TaskStatus

#### Output
* output SetTaskStatusOutput

### ValidatePipelineDefinition

```js
amazonaws_datapipeline.ValidatePipelineDefinition({
  "pipelineId": "",
  "pipelineObjects": []
}, context)
```

#### Input
* input object
  * parameterObjects ParameterObjectList
  * parameterValues ParameterValueList
  * pipelineId required id
  * pipelineObjects required PipelineObjectList

#### Output
* output ValidatePipelineDefinitionOutput
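
The validation output groups messages per pipeline object: each `ValidationError` has an `id` and a list of `errors`, and each `ValidationWarning` an `id` and a list of `warnings`. A small, hypothetical helper such as `summarizeValidation` below could flatten that structure for logging. It is only a sketch and assumes the output shape given under Definitions.

```js
// Hypothetical helper: flattens a ValidatePipelineDefinitionOutput
// (or PutPipelineDefinitionOutput) into printable lines.
function summarizeValidation(output) {
  const lines = [];
  for (const entry of output.validationErrors || []) {
    for (const message of entry.errors || []) {
      lines.push(`ERROR ${entry.id}: ${message}`);
    }
  }
  for (const entry of output.validationWarnings || []) {
    for (const message of entry.warnings || []) {
      lines.push(`WARNING ${entry.id}: ${message}`);
    }
  }
  // Errors prevent activation; warnings do not.
  return { canActivate: !output.errored, lines };
}
```
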
## Definitions

### ActivatePipelineInput
* ActivatePipelineInput object: Contains the parameters for ActivatePipeline.
  * parameterValues ParameterValueList
  * pipelineId required id
  * startTimestamp timestamp

### ActivatePipelineOutput
* ActivatePipelineOutput object: Contains the output of ActivatePipeline.

### AddTagsInput
* AddTagsInput object: Contains the parameters for AddTags.
  * pipelineId required id
  * tags required tagList

### AddTagsOutput
* AddTagsOutput object: Contains the output of AddTags.

### CreatePipelineInput
* CreatePipelineInput object: Contains the parameters for CreatePipeline.
  * description string
  * name required id
  * tags tagList
  * uniqueId required id

### CreatePipelineOutput
* CreatePipelineOutput object: Contains the output of CreatePipeline.
  * pipelineId required id

### DeactivatePipelineInput
* DeactivatePipelineInput object: Contains the parameters for DeactivatePipeline.
  * cancelActive cancelActive
  * pipelineId required id

### DeactivatePipelineOutput
* DeactivatePipelineOutput object: Contains the output of DeactivatePipeline.

### DeletePipelineInput
* DeletePipelineInput object: Contains the parameters for DeletePipeline.
  * pipelineId required id
### DescribeObjectsInput
* DescribeObjectsInput object: Contains the parameters for DescribeObjects.
  * evaluateExpressions boolean
  * marker string
  * objectIds required idList
  * pipelineId required id

### DescribeObjectsOutput
* DescribeObjectsOutput object: Contains the output of DescribeObjects.
  * hasMoreResults boolean
  * marker string
  * pipelineObjects required PipelineObjectList

### DescribePipelinesInput
* DescribePipelinesInput object: Contains the parameters for DescribePipelines.
  * pipelineIds required idList

### DescribePipelinesOutput
* DescribePipelinesOutput object: Contains the output of DescribePipelines.
  * pipelineDescriptionList required PipelineDescriptionList

### EvaluateExpressionInput
* EvaluateExpressionInput object: Contains the parameters for EvaluateExpression.
  * expression required longString
  * objectId required id
  * pipelineId required id

### EvaluateExpressionOutput
* EvaluateExpressionOutput object: Contains the output of EvaluateExpression.
  * evaluatedExpression required longString

### Field
* Field object: A key-value pair that describes a property of a pipeline object. The value is specified as either a string value (StringValue) or a reference to another object (RefValue) but not as both.
  * key required fieldNameString
  * refValue fieldNameString
  * stringValue fieldStringValue
### GetPipelineDefinitionInput
* GetPipelineDefinitionInput object: Contains the parameters for GetPipelineDefinition.
  * pipelineId required id
  * version string

### GetPipelineDefinitionOutput
* GetPipelineDefinitionOutput object: Contains the output of GetPipelineDefinition.
  * parameterObjects ParameterObjectList
  * parameterValues ParameterValueList
  * pipelineObjects PipelineObjectList

### InstanceIdentity
* InstanceIdentity object: Identity information for the EC2 instance that is hosting the task runner. You can get this value by calling a metadata URI from the EC2 instance. For more information, see Instance Metadata in the Amazon Elastic Compute Cloud User Guide. Passing in this value proves that your task runner is running on an EC2 instance, and ensures the proper AWS Data Pipeline service charges are applied to your pipeline.
  * document string
  * signature string

### InternalServiceError
* InternalServiceError object: An internal service error occurred.
  * message errorMessage

### InvalidRequestException
* InvalidRequestException object: The request was not valid. Verify that your request was properly formatted, that the signature was generated with the correct credentials, and that you haven't exceeded any of the service limits for your account.
  * message errorMessage

### ListPipelinesInput
* ListPipelinesInput object: Contains the parameters for ListPipelines.
  * marker string

### ListPipelinesOutput
* ListPipelinesOutput object: Contains the output of ListPipelines.
  * hasMoreResults boolean
  * marker string
  * pipelineIdList required pipelineList

### Operator
* Operator object: Contains a logical operation for comparing the value of a field with a specified value.
  * type OperatorType
  * values stringList

### OperatorType
* OperatorType string (values: EQ, REF_EQ, LE, GE, BETWEEN)

### ParameterAttribute
* ParameterAttribute object: The attributes allowed or specified with a parameter object.
  * key required attributeNameString
  * stringValue required attributeValueString
### ParameterAttributeList
* ParameterAttributeList array
  * items ParameterAttribute

### ParameterObject
* ParameterObject object: Contains information about a parameter object.
  * attributes required ParameterAttributeList
  * id required fieldNameString

### ParameterObjectList
* ParameterObjectList array
  * items ParameterObject

### ParameterValue
* ParameterValue object: A value or list of parameter values.
  * id required fieldNameString
  * stringValue required fieldStringValue

### ParameterValueList
* ParameterValueList array
  * items ParameterValue

### PipelineDeletedException
* PipelineDeletedException object: The specified pipeline has been deleted.
  * message errorMessage

### PipelineDescription
* PipelineDescription object: Contains pipeline metadata.
  * description string
  * fields required fieldList
  * name required id
  * pipelineId required id
  * tags tagList

### PipelineDescriptionList
* PipelineDescriptionList array
  * items PipelineDescription

### PipelineIdName
* PipelineIdName object: Contains the name and identifier of a pipeline.
  * id id
  * name id

### PipelineNotFoundException
* PipelineNotFoundException object: The specified pipeline was not found. Verify that you used the correct user and account identifiers.
  * message errorMessage

### PipelineObject
* PipelineObject object: Contains information about a pipeline object. This can be a logical, physical, or physical attempt pipeline object. The complete set of components of a pipeline defines the pipeline.
  * fields required fieldList
  * id required id
  * name required id

### PipelineObjectList
* PipelineObjectList array
  * items PipelineObject

### PipelineObjectMap
* PipelineObjectMap array
  * items object
    * key id
    * value PipelineObject
### PollForTaskInput
* PollForTaskInput object: Contains the parameters for PollForTask.
  * hostname id
  * instanceIdentity InstanceIdentity
  * workerGroup required string

### PollForTaskOutput
* PollForTaskOutput object: Contains the output of PollForTask.
  * taskObject TaskObject

### PutPipelineDefinitionInput
* PutPipelineDefinitionInput object: Contains the parameters for PutPipelineDefinition.
  * parameterObjects ParameterObjectList
  * parameterValues ParameterValueList
  * pipelineId required id
  * pipelineObjects required PipelineObjectList

### PutPipelineDefinitionOutput
* PutPipelineDefinitionOutput object: Contains the output of PutPipelineDefinition.
  * errored required boolean
  * validationErrors ValidationErrors
  * validationWarnings ValidationWarnings

### Query
* Query object: Defines the query to run against an object.
  * selectors SelectorList

### QueryObjectsInput
* QueryObjectsInput object: Contains the parameters for QueryObjects.
  * limit int
  * marker string
  * pipelineId required id
  * query Query
  * sphere required string

### QueryObjectsOutput
* QueryObjectsOutput object: Contains the output of QueryObjects.
  * hasMoreResults boolean
  * ids idList
  * marker string

### RemoveTagsInput
* RemoveTagsInput object: Contains the parameters for RemoveTags.
  * pipelineId required id
  * tagKeys required stringList
### RemoveTagsOutput
* RemoveTagsOutput object: Contains the output of RemoveTags.

### ReportTaskProgressInput
* ReportTaskProgressInput object: Contains the parameters for ReportTaskProgress.
  * fields fieldList
  * taskId required taskId

### ReportTaskProgressOutput
* ReportTaskProgressOutput object: Contains the output of ReportTaskProgress.
  * canceled required boolean

### ReportTaskRunnerHeartbeatInput
* ReportTaskRunnerHeartbeatInput object: Contains the parameters for ReportTaskRunnerHeartbeat.
  * hostname id
  * taskrunnerId required id
  * workerGroup string

### ReportTaskRunnerHeartbeatOutput
* ReportTaskRunnerHeartbeatOutput object: Contains the output of ReportTaskRunnerHeartbeat.
  * terminate required boolean

### Selector
* Selector object: A comparison that is used to determine whether a query should return this object.
  * fieldName string
  * operator Operator

### SelectorList
* SelectorList array: The list of Selectors that define queries on individual fields.
  * items Selector

### SetStatusInput
* SetStatusInput object: Contains the parameters for SetStatus.
  * objectIds required idList
  * pipelineId required id
  * status required string

### SetTaskStatusInput
* SetTaskStatusInput object: Contains the parameters for SetTaskStatus.
  * errorId string
  * errorMessage errorMessage
  * errorStackTrace string
  * taskId required taskId
  * taskStatus required TaskStatus

### SetTaskStatusOutput
* SetTaskStatusOutput object: Contains the output of SetTaskStatus.

### Tag
* Tag object: Tags are key/value pairs defined by a user and associated with a pipeline to control access. AWS Data Pipeline allows you to associate ten tags per pipeline. For more information, see Controlling User Access to Pipelines in the AWS Data Pipeline Developer Guide.
  * key required tagKey
  * value required tagValue
### TaskNotFoundException
* TaskNotFoundException object: The specified task was not found.
  * message errorMessage

### TaskObject
* TaskObject object: Contains information about a pipeline task that is assigned to a task runner.
  * attemptId id
  * objects PipelineObjectMap
  * pipelineId id
  * taskId taskId

### TaskStatus
* TaskStatus string (values: FINISHED, FAILED, FALSE)

### ValidatePipelineDefinitionInput
* ValidatePipelineDefinitionInput object: Contains the parameters for ValidatePipelineDefinition.
  * parameterObjects ParameterObjectList
  * parameterValues ParameterValueList
  * pipelineId required id
  * pipelineObjects required PipelineObjectList

### ValidatePipelineDefinitionOutput
* ValidatePipelineDefinitionOutput object: Contains the output of ValidatePipelineDefinition.
  * errored required boolean
  * validationErrors ValidationErrors
  * validationWarnings ValidationWarnings

### ValidationError
* ValidationError object: Defines a validation error. Validation errors prevent pipeline activation. The set of validation errors that can be returned are defined by AWS Data Pipeline.
  * errors validationMessages
  * id id

### ValidationErrors
* ValidationErrors array
  * items ValidationError

### ValidationWarning
* ValidationWarning object: Defines a validation warning. Validation warnings do not prevent pipeline activation. The set of validation warnings that can be returned are defined by AWS Data Pipeline.
  * id id
  * warnings validationMessages

### ValidationWarnings
* ValidationWarnings array
  * items ValidationWarning
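### attributeNameString
* attributeNameString string

### attributeValueString
* attributeValueString string

### boolean
* boolean boolean

### cancelActive
* cancelActive boolean

### errorMessage
* errorMessage string

### fieldList
* fieldList array
  * items Field

### fieldNameString
* fieldNameString string

### fieldStringValue
* fieldStringValue string

### id
* id string

### idList
* idList array
  * items id

### int
* int integer

### longString
* longString string

### pipelineList
* pipelineList array
  * items PipelineIdName

### string
* string string

### stringList
* stringList array
  * items string

### tagKey
* tagKey string

### tagList
* tagList array
  * items Tag

### tagValue
* tagValue string

### taskId
* taskId string

### timestamp
* timestamp string

### validationMessage
* validationMessage string

### validationMessages
* validationMessages array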