Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
Installation (MySQL)
-
Install Apache Airflow.
wajig install \
    python3-dev python3-pip \
    mysql-server libmysqlclient-dev
sudo AIRFLOW_GPL_UNIDECODE=yes pip3 install 'apache-airflow[mysql]'
-
Add the following content to your my.cnf file (e.g., /etc/mysql/my.cnf).
[mysqld]
explicit_defaults_for_timestamp=1
Below is an example of my.cnf.
#
# The MySQL database server configuration file.
#
# You can copy this to one of:
# - "/etc/mysql/my.cnf" to set global options,
# - "~/.my.cnf" to set user-specific options.
#
# One can use all long options that the program supports.
# Run program with --help to get a list of available options and with
# --print-defaults to see which it would actually understand and use.
#
# For explanations see
# http://dev.mysql.com/doc/mysql/en/server-system-variables.html
#
# * IMPORTANT: Additional settings that can override those from this file!
# The files must end with '.cnf', otherwise they'll be ignored.
#
!includedir /etc/mysql/conf.d/
!includedir /etc/mysql/mysql.conf.d/
[mysqld]
explicit_defaults_for_timestamp=1
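To double check that the setting is really in place before initializing the database, a small script along these lines might help. It is only a rough sketch: the function name `has_explicit_defaults` and the `SAMPLE` text are made up for illustration, and it inspects a single file without following the `!includedir` directives, so a value set in an included file would not be seen.

```python
def has_explicit_defaults(cnf_text):
    """Return True if the [mysqld] section enables explicit_defaults_for_timestamp."""
    section = None
    enabled = False
    for raw in cnf_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and !include / !includedir directives.
        if not line or line[0] in "#;!":
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            continue
        if section != "mysqld":
            continue
        key, _, value = line.partition("=")
        # MySQL treats dashes and underscores in option names as equivalent.
        if key.strip().replace("-", "_") == "explicit_defaults_for_timestamp":
            # A bare boolean option (no value) also counts as enabled.
            enabled = value.strip().lower() in ("", "1", "on", "true")
    return enabled


# Hypothetical sample mirroring the example my.cnf above.
SAMPLE = """\
!includedir /etc/mysql/conf.d/
!includedir /etc/mysql/mysql.conf.d/
[mysqld]
explicit_defaults_for_timestamp=1
"""

print(has_explicit_defaults(SAMPLE))
```

For a real check, the text could be read from the actual file, e.g. `open("/etc/mysql/my.cnf").read()`.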
-
Initialize the database.
airflow initdb
-
Start the web server.
airflow webserver -D -p 8080
-
Start a scheduler.
airflow scheduler -D
Tips and Traps
-
Just place the Python script that defines a DAG into the directory AIRFLOW_HOME/dags/ and Airflow will pick it up automatically.
-
Avoid defining tasks using the BashOperator. Some bash commands (e.g., rsync) might return a non-zero exit code even if they essentially succeed. It is quite challenging to handle the exceptions/exit codes of shell commands so that non-critical errors are ignored.
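One possible way around the exit code problem is to run the command from Python and decide there which exit codes count as failures. The sketch below is an illustration and not a recommended policy: `run_command` and `NON_CRITICAL_EXIT_CODES` are made-up names, and treating rsync exit code 24 (partial transfer due to vanished source files) as harmless is an assumption that may not hold for your job.

```python
import subprocess
import sys

# Assumption: these exit codes are acceptable for the job at hand.
# 24 is rsync's "partial transfer due to vanished source files".
NON_CRITICAL_EXIT_CODES = frozenset({24})


def run_command(cmd, tolerated=NON_CRITICAL_EXIT_CODES):
    """Run cmd (a list of arguments) and return its exit code.

    Exit code 0 and the codes in `tolerated` are treated as success.
    Any other exit code raises subprocess.CalledProcessError.
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0 and result.returncode not in tolerated:
        raise subprocess.CalledProcessError(
            result.returncode, cmd, output=result.stdout, stderr=result.stderr
        )
    if result.returncode != 0:
        print("Ignoring non-critical exit code", result.returncode)
    return result.returncode


# Stand-in for a real command: a Python child process that exits with 24.
fake_rsync = [sys.executable, "-c", "import sys; sys.exit(24)"]
run_command(fake_rsync)
```

In a DAG file, a function like this could be passed as the `python_callable` of a `PythonOperator` instead of running the command through the BashOperator, for example wrapping a call such as `run_command(["rsync", "-a", "src/", "dest/"])` (hypothetical paths). An uncaught exception then marks the task as failed, while tolerated exit codes let it succeed.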
Delete DAGs
https://gist.github.com/villasv/8bb1492beb46162c28dbc242d4887533
References
https://airflow.apache.org/start.html
https://airflow.apache.org/installation.html
https://airflow.apache.org/tutorial.html
https://airflow.apache.org/cli.html
Airflow Lesser Known Tips Tricks and Best Practises
We Are All Using AirFlow Wrong and How to Fix It
AirFlow Why Is Nothing Working
Yet Another Scalable Apache AirFlow with Docker Example Setup
https://kubernetes.io/blog/2018/06/28/airflow-on-kubernetes-part-1-a-different-kind-of-operator/
https://towardsdatascience.com/kubernetesexecutor-for-airflow-e2155e0f909c
https://bostata.com/built-to-scale-running-highly-concurrent-etl-with-apache-airflow/