Trigger in Azure Data Factory/Synapse

There are three types of triggers in ADF/Synapse:

a) Schedule  b) Tumbling Window  c) Event

a) Schedule: When we create a schedule trigger, we need to specify a schedule (start date, recurrence, end date) and associate it with a pipeline. Pipelines and triggers have a many-to-many relationship: one trigger can kick off multiple pipelines, and multiple triggers can kick off a single pipeline.


Create a schedule trigger using the Azure UI:

Create a schedule trigger using PowerShell:

Create a .json file (for example, MyTrigger.json) in a folder such as C:\ADFv2QuickStartPSH\

Content of the file:

{
    "properties": {
        "name": "MyTrigger",
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Minute",
                "interval": 15,
                "startTime": "2017-12-08T00:00:00Z",
                "endTime": "2017-12-08T01:00:00Z",
                "timeZone": "UTC"
            }
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "type": "PipelineReference",
                    "referenceName": "Adfv2QuickStartPipeline"
                },
                "parameters": {
                    "inputPath": "adftutorial/input",
                    "outputPath": "adftutorial/output"
                }
            }
        ]
    }
}

Create a Trigger:

 Set-AzDataFactoryV2Trigger -ResourceGroupName $ResourceGroupName -DataFactoryName $DataFactoryName -Name "MyTrigger" -DefinitionFile "C:\ADFv2QuickStartPSH\MyTrigger.json"

Confirm the status of the trigger:

Get-AzDataFactoryV2Trigger -ResourceGroupName $ResourceGroupName -DataFactoryName $DataFactoryName -Name "MyTrigger"

Start Trigger:

Start-AzDataFactoryV2Trigger -ResourceGroupName $ResourceGroupName -DataFactoryName $DataFactoryName -Name "MyTrigger"

Confirm whether the trigger has started:

Get-AzDataFactoryV2Trigger -ResourceGroupName $ResourceGroupName -DataFactoryName $DataFactoryName -Name "MyTrigger"

Get the trigger runs:

Get-AzDataFactoryV2TriggerRun -ResourceGroupName $ResourceGroupName -DataFactoryName $DataFactoryName -TriggerName "MyTrigger" -TriggerRunStartedAfter "2017-12-08T00:00:00" -TriggerRunStartedBefore "2017-12-08T01:00:00"


Tumbling Window Trigger: Tumbling window triggers are a type of trigger that fires at a periodic time interval from a specified start time, while retaining state. Tumbling windows are a series of fixed-sized, non-overlapping, and contiguous time intervals. A tumbling window trigger has a one-to-one relationship with a pipeline and can only reference a single pipeline.
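The "fixed-sized, non-overlapping, contiguous" property can be illustrated with a short sketch (plain Python used purely for illustration of the window arithmetic; this is not an ADF API):

```python
from datetime import datetime, timedelta

def tumbling_windows(start, window_size, count):
    """Generate `count` fixed-size, contiguous, non-overlapping windows
    as (window_start, window_end) pairs."""
    return [(start + i * window_size, start + (i + 1) * window_size)
            for i in range(count)]

# Three hourly windows from the sample start time.
windows = tumbling_windows(datetime(2017, 12, 8), timedelta(hours=1), 3)
# Each window ends exactly where the next one starts, with no gaps or overlaps.
```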





delay: The amount of time to delay the start of data processing for the window. The pipeline run starts after the expected execution time plus the amount of delay. The delay defines how long the trigger waits past the due time before firing a new run; it does not alter the window's startTime. For example, a delay value of 00:10:00 implies a delay of 10 minutes.
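A minimal sketch of how the delay shifts the firing time but not the window itself (illustrative Python, not an ADF API):

```python
from datetime import datetime, timedelta

def trigger_fire_time(window_start, window_size, delay):
    """The run fires at the window end (the expected execution time)
    plus the configured delay; the window's startTime is unchanged."""
    return window_start + window_size + delay

fire = trigger_fire_time(datetime(2017, 12, 8, 0, 0),
                         timedelta(hours=1),
                         timedelta(minutes=10))
# fire -> 2017-12-08 01:10, while the window itself is still 00:00-01:00
```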



maxConcurrency: The number of simultaneous trigger runs that are fired for windows that are ready. For example, backfilling hourly runs for yesterday results in 24 windows. With maxConcurrency = 10, trigger events are fired only for the first 10 windows (00:00-01:00 through 09:00-10:00). After the first 10 triggered pipeline runs complete, trigger runs are fired for the next 10 windows (10:00-11:00 through 19:00-20:00). Continuing with maxConcurrency = 10: if 10 windows are ready, there are 10 total pipeline runs; if only 1 window is ready, there is only 1 pipeline run.
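The batching behavior described above can be sketched as follows (illustrative Python; each wave of at most maxConcurrency runs must finish before the next wave fires):

```python
def concurrency_batches(num_ready_windows, max_concurrency):
    """Split ready windows into waves of at most max_concurrency runs.
    The next wave is fired only after the previous wave completes."""
    windows = list(range(num_ready_windows))
    return [windows[i:i + max_concurrency]
            for i in range(0, num_ready_windows, max_concurrency)]

batches = concurrency_batches(24, 10)
# 24 hourly backfill windows with maxConcurrency = 10 -> waves of 10, 10, 4
```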


  • dependsOn.type: The type of TumblingWindowTriggerReference. Required if a dependency is set.
  • dependsOn.size: The size of the dependency tumbling window.
  • dependsOn.offset: The offset of the dependency trigger. A timespan value that must be negative in a self-dependency. If no value is specified, the window is the same as the trigger itself.

Execution order of windows in a backfill scenario: if the startTime of the trigger is in the past, then based on the formula M = (CurrentTime - TriggerStartTime) / TumblingWindowSize, the trigger will generate M backfill (past) runs in parallel, honoring trigger concurrency, before executing the future runs.
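The formula above, as a quick worked example (illustrative Python; M is truncated to whole completed windows):

```python
from datetime import datetime, timedelta

def backfill_run_count(current_time, trigger_start_time, window_size):
    """M = (CurrentTime - TriggerStartTime) / TumblingWindowSize."""
    return int((current_time - trigger_start_time) / window_size)

m = backfill_run_count(datetime(2017, 12, 9),   # current time
                       datetime(2017, 12, 8),   # trigger startTime in the past
                       timedelta(hours=1))      # tumbling window size
# one day of hourly windows -> 24 backfill runs
```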

Create a tumbling window dependency: 

Offset of the dependency trigger. Provide a value in timespan format; both negative and positive offsets are allowed. This property is mandatory if the trigger depends on itself and optional in all other cases. A self-dependency must always use a negative offset. If no value is specified, the window is the same as the trigger itself.

Size of the dependency tumbling window. Provide a positive timespan value. This property is optional.

Tumbling window Self dependency properties:

In scenarios where the trigger shouldn't proceed to the next window until the preceding window is successfully completed, build a self-dependency. 
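As a sketch of how these properties fit into the trigger definition (the dependency-reference type names follow the tumbling window trigger schema, and the trigger name DemoHourlyTrigger is a made-up placeholder), a dependency on another tumbling window trigger and a self-dependency might look like:

```json
"dependsOn": [
    {
        "type": "TumblingWindowTriggerDependencyReference",
        "referenceTrigger": {
            "referenceName": "DemoHourlyTrigger",
            "type": "TriggerReference"
        },
        "offset": "-01:00:00",
        "size": "01:00:00"
    },
    {
        "type": "SelfDependencyTumblingWindowTriggerReference",
        "offset": "-01:00:00",
        "size": "01:00:00"
    }
]
```

Note the negative offset on the self-dependency, as required above.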

[Diagrams: dependency offset; dependency size; self-dependency; dependency on another tumbling window trigger; dependency on itself]

Difference between Schedule Trigger and Tumbling Window Trigger: A schedule trigger has a many-to-many relationship with pipelines and fires only for future scheduled times, while a tumbling window trigger has a one-to-one relationship with a single pipeline, retains state, and can backfill past windows from its startTime.

Create a trigger that runs a pipeline in response to a storage event: Data integration scenarios often require customers to trigger pipelines based on events happening in a storage account, such as the arrival or deletion of a file in an Azure Blob Storage account.


Data Factory and Synapse pipelines natively integrate with Azure Event Grid, which lets you trigger pipelines on such events.





Select your storage account from the Azure subscription dropdown or manually using its Storage account resource ID. Choose which container you wish the events to occur on. Container selection is required, but be mindful that selecting all containers can lead to a large number of events.

The Blob path begins with and Blob path ends with properties allow you to specify the containers, folders, and blob names for which you want to receive events.

  • Blob path begins with: The blob path must start with a folder path. Valid values include 2018/ and 2018/april/shoes.csv. This field can't be selected if a container isn't selected.
  • Blob path ends with: The blob path must end with a file name or extension. Valid values include shoes.csv and .csv. Container and folder names, when specified, must be separated by a /blobs/ segment. For example, a container named 'orders' can have a value of /orders/blobs/2018/april/shoes.csv. To specify a folder in any container, omit the leading '/' character. For example, april/shoes.csv.
Select whether your trigger will respond to a Blob created event, Blob deleted event, or both. 
Select whether or not your trigger ignores blobs with zero bytes.
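A rough sketch of the begins-with/ends-with filtering idea (simplified: the real matching also involves the container and the /blobs/ segment described above; this only shows the prefix/suffix logic):

```python
def blob_event_matches(blob_path, begins_with="", ends_with=""):
    """Simplified prefix/suffix filter over a blob path such as
    'orders/blobs/2018/april/shoes.csv'. Empty filters match everything."""
    return blob_path.startswith(begins_with) and blob_path.endswith(ends_with)

hit = blob_event_matches("orders/blobs/2018/april/shoes.csv",
                         begins_with="orders/blobs/2018/",
                         ends_with=".csv")
# hit -> True: the path starts and ends with the configured values
```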

To successfully create a new or update an existing Storage Event Trigger, the Azure account signed into the service needs to have appropriate access to the relevant storage account. Otherwise, the operation will fail with Access Denied.

Any of the following RBAC settings works for the storage event trigger:

  • Owner role to the storage account
  • Contributor role to the storage account
  • Microsoft.EventGrid/EventSubscriptions/Write permission on the storage account

Flow Diagram:


Custom Event Trigger:

Data integration scenarios often require Azure Data Factory customers to trigger pipelines when certain events occur. Data Factory's native integration with Azure Event Grid now covers custom topics.

To use the custom event trigger in Data Factory, you need to first set up a custom topic in Event Grid.


Trigger Metadata in pipeline:

A pipeline sometimes needs to understand and read metadata from the trigger that invokes it. For instance, with a Tumbling Window Trigger run, the pipeline will process different data slices or folders based on the window start and end time.

This pattern is especially useful for the Tumbling Window Trigger, where the trigger provides the window start and end time, and the Custom Event Trigger, where the trigger parses and processes values in a custom-defined data field.
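For example, in a tumbling window trigger definition, the pipeline's parameters can be bound to the window's start and end times via @trigger().outputs expressions (the parameter names windowStart/windowEnd and the pipeline name MyPipeline are illustrative placeholders):

```json
"pipelines": [
    {
        "pipelineReference": {
            "type": "PipelineReference",
            "referenceName": "MyPipeline"
        },
        "parameters": {
            "windowStart": "@trigger().outputs.windowStartTime",
            "windowEnd": "@trigger().outputs.windowEndTime"
        }
    }
]
```

Inside the pipeline, these parameters can then be used to build the path of the data slice or folder to process.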