Apache Airflow is a tool for orchestrating, scheduling and monitoring workflows (at least, that is how I like to use it). In this blog post, I will share some commonly used commands. The commands below use the Airflow 1.x CLI syntax.
To trigger a DAG so that it runs only once, use the same start date and end date:
airflow backfill -s 2020-10-28 -e 2020-10-28 DAGID
PS: If the -s and -e parameters are not the same, the DAG will run once for every schedule interval in that range. E.g. for a daily DAG, if -s and -e are 30 days apart, it will run about 30 times!
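To get a feel for how many runs a date range would produce, here is a minimal sketch. It assumes a daily schedule and that both the -s and the -e date are included in the range (which is what the single-day case suggests). The helper name daily_run_count is made up for this post, and it relies on `date -d`, which is available with GNU date but not everywhere.

```shell
# Hypothetical helper: number of daily runs a backfill range would cover,
# ASSUMING a daily schedule and that both the -s and -e dates are included.
# Uses `date -d`, which is GNU-specific (an assumption about the host).
daily_run_count() {
  start_s=$(date -u -d "$1" +%s)   # start date as seconds since the epoch (UTC)
  end_s=$(date -u -d "$2" +%s)     # end date as seconds since the epoch (UTC)
  echo $(( (end_s - start_s) / 86400 + 1 ))
}

daily_run_count 2020-10-28 2020-10-28   # same start and end date: prints 1
daily_run_count 2020-10-01 2020-10-31   # 30 days apart: prints 31
```

For a DAG with a different schedule (hourly, weekly, ...) the count would be different, so treat this only as a quick sanity check before starting a long backfill.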
To trigger a specific task of a DAG:
airflow backfill -t TASKID -s 2020-10-28 -e 2020-10-28 DAGID
To clear a specific task:
airflow clear -t TASKID -s 2020-10-28 -e 2020-10-28 DAGID
Running only the cleared tasks on the whole DAG, without backfilling each task one by one:
Clear the tasks you want to rerun, then backfill the whole DAG for the same run date. Example command:
airflow backfill -s 2020-10-28 -e 2020-10-28 DAGID
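The clear-then-backfill sequence above could be wrapped in a small script. This is only a sketch: the function name rerun_cleared_task and the RUN variable are made up for illustration, and it runs in dry-run mode by default, so it prints the commands instead of executing them.

```shell
# Dry-run by default: commands are printed, not executed.
# RUN is a hypothetical switch; set RUN="" to really invoke airflow.
RUN="${RUN-echo}"

# Hypothetical helper: clear one task, then backfill the DAG for the same date.
rerun_cleared_task() {
  dag_id="$1"
  task_id="$2"
  run_date="$3"
  # The backfill is only started if the clear step succeeded.
  $RUN airflow clear -t "$task_id" -s "$run_date" -e "$run_date" "$dag_id" &&
  $RUN airflow backfill -s "$run_date" -e "$run_date" "$dag_id"
}

rerun_cleared_task DAGID TASKID 2020-10-28
```

Keeping the dry run as the default makes it easy to check the dates and IDs before anything is actually cleared.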
Starting and stopping the Airflow webserver:
If there is an existing "Master Airflow Webserver" running, stop it with the commands below.
First, list the currently running webserver process(es):
ps aux | grep webserver
Second, find the process ID of your "gunicorn: master [airflow-webserver]" in that list and run the command below:
kill PROCESSID (e.g. kill 5036)
Then start the web server with your port number again:
airflow webserver -p PORTNO
Open a new terminal while the webserver is running and do your backfill in that terminal.
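If you do not want to pick the process ID out of the list by eye, the lookup step can be sketched with awk. The helper name master_pid and the sample line are made up for illustration; the sketch assumes the PID is in the second column, as in typical `ps aux` output.

```shell
# Hypothetical helper: read `ps aux` output on stdin and print the PID
# (second column) of the gunicorn master process of the Airflow webserver.
master_pid() {
  awk '/gunicorn: master \[airflow-webserver\]/ { print $2 }'
}

# Made-up sample line standing in for real `ps aux` output.
sample='airflow   5036  0.1  1.2 123456 7890 ?  S  10:00  0:01 gunicorn: master [airflow-webserver]'
printf '%s\n' "$sample" | master_pid   # prints 5036

# Real usage (commented out so that nothing is killed by accident):
# pid="$(ps aux | master_pid)"
# [ -n "$pid" ] && kill "$pid"
# airflow webserver -p PORTNO
```

The worker processes ("gunicorn: worker [airflow-webserver]") do not match the pattern, so only the master's PID is printed.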