For this exercise we chose Celery, an asynchronous task queue/job queue based on distributed message passing. Tasks can execute asynchronously (in the background) or synchronously (wait until ready). Celery requires a message transport to send and receive messages; we chose RabbitMQ because it works well with Celery. Redis could also serve as the broker, but in this exercise we use it instead as a memcachedb-style persistent key-value store for managing locks across all the tasks. With this distributed lock, each task tries to acquire a non-blocking lock and exits if the lock isn't acquired.
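As a minimal sketch of how these pieces fit together (not the project's exact code), the snippet below wires Celery to RabbitMQ as the broker and uses a Redis client for a non-blocking lock; the connection URLs and the helper name `try_acquire_lock` are illustrative assumptions.

```python
# Sketch only: RabbitMQ as the Celery broker, Redis as the persistent
# key-value store used for locks. URLs and names are assumptions.
from celery import Celery
import redis

app = Celery('task_manager', broker='amqp://guest:guest@rabbitmq:5672//')

redis_client = redis.Redis(host='redis', port=6379, db=0)

def try_acquire_lock(name, timeout=300):
    """Try to take a non-blocking Redis lock; return it if acquired, else None."""
    lock = redis_client.lock(name, timeout=timeout)
    if lock.acquire(blocking=False):
        return lock
    return None  # lock is held elsewhere: the task can exit or retry later
```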
This project runs with Docker (you can use a traditional virtualenv, but it is prepared out of the box for Docker).
We chose Python 3.6.4 rather than Python 3.7 because Celery doesn't support 3.7 yet.
To manage Docker we use docker-compose. If you already have docker and docker-compose, just run:
make start
# only test and flake8
make test
The make start command will download the images and build the containers.
Right now the Celery worker is configured to satisfy requirement (1): execute no more than 3 tasks in parallel. We can change this by adjusting the worker's concurrency:
celery worker --app=task_manager.celeryapp:app --concurrency=3 --loglevel=info
Another option is to run several workers:
docker-compose scale worker=5
Celery tasks have a custom decorator in order to achieve requirement (2): each target (dave, for example) can only execute one task at a time. To do this we chose Redis instead of the Django cache, even though the Django cache is the option recommended in the recipe we followed: if memcached (or some other non-persistent cache) is used and (1) the cache daemon crashes or (2) the cache key is culled before the appropriate expiration time / lock release, then there is a race condition where two or more tasks could simultaneously acquire the task lock. We follow that recipe, but with some changes: if the target is busy (locked), we retry the task with a custom delay, and the task also has a rate limit configured (see the sketch below).
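The following is a hedged sketch of that per-target locking idea, not the repository's actual decorator; the key prefix, lock timeout, retry delay and rate-limit value are assumptions for illustration.

```python
# Sketch: one running task per target, with retry on a busy target and a
# Celery rate limit. Names and numbers below are illustrative assumptions.
import functools

import redis
from celery import shared_task

redis_client = redis.Redis(host='redis', port=6379, db=0)
LOCK_TIMEOUT = 300   # seconds before a stale lock expires on its own
RETRY_DELAY = 10     # custom delay (seconds) before retrying a busy target

def one_task_per_target(func):
    """Allow only one running task per target; retry later if the target is locked."""
    @functools.wraps(func)
    def wrapper(self, target, *args, **kwargs):
        lock = redis_client.lock('lock:{}'.format(target), timeout=LOCK_TIMEOUT)
        if not lock.acquire(blocking=False):
            # Target is busy: re-queue this task with a custom countdown.
            raise self.retry(countdown=RETRY_DELAY)
        try:
            return func(self, target, *args, **kwargs)
        finally:
            lock.release()
    return wrapper

@shared_task(bind=True, rate_limit='10/m')  # example rate limit
@one_task_per_target
def process_target(self, target):
    # ... actual work for `target` goes here ...
    pass
```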