4 Use Cases for When to Use Celery in a Flask Application
Celery helps you run code asynchronously or on a periodic schedule, both of which are very common things you’d want to do in most web projects.
Last week I was a guest on the Profitable Python podcast where we mostly talked about how to grow an audience from scratch and potentially generate income within 6 months as a new developer.
But we also talked about a few other things, one of which was when it might be a good idea to use Celery in a Flask project or really any Python driven web application. Since that was only a side topic of the podcast, I wanted to expand on that subject so here we are.
Here are a few use cases for when you might want to reach for Celery. Personally I find myself using it in nearly every Flask application I create.
# Use Case #1: Sending Emails Out
I would say this is one of the most textbook examples of why it’s a good idea to use Celery or reach for a solution that allows you to execute a task asynchronously.
For example, imagine someone visits your site’s contact page hoping to fill it out and send you an email. After they click the send email button, an email will be sent to your inbox.
## Sending Emails without Celery
The best way to explain why Celery is useful is by first demonstrating how it would work if you weren’t using Celery.
A typical workflow for this would be:
1. User visits `/contact` page and is presented with a contact form
2. User fills out contact form
3. User clicks send email button
4. *User’s mouse cursor turns into a busy icon and now their browser hangs*
5. Your Flask app handles the POST request
6. Your Flask app validates the form
7. Your Flask app likely compiles a template of the email
8. Your Flask app takes that email and sends it to your configured email provider
9. Your Flask app waits until your email provider (gmail, sendgrid, etc.) responds
10. Your Flask app returns an HTML response to the user by redirecting to a page
11. *User’s browser renders the new page and the busy mouse cursor is gone*
Notice how steps 4 and 11 are in italics. Those are very important steps because between steps 4 and 11 the user is sitting there with a busy mouse cursor icon and your site appears to be loading slowly for that user. They are waiting for a response.
The problem with the above workflow:
The real problem here is you have no control over how long steps 8 and 9 take. They might take 500ms, 2 seconds, 20 seconds or even time out after 120 seconds.
That’s because you’re contacting an external site. In this case gmail’s SMTP servers or some other transactional email service like sendgrid or mailgun. You have absolutely no control over how long it will take for them to process your request.
What’s really dangerous about this scenario is what happens when 10 visitors try to fill out your contact form at the same time and you have gunicorn or uWSGI running, which are popular Python application servers.
If you don’t have them configured with multiple workers and / or threads then your app server is going to get very bogged down and it won’t be able to handle all 10 of those requests until each one finishes sequentially.
Normally this isn’t a problem if your requests finish quickly, such as in less than 100ms, and it’s especially not too big of a deal if you have a couple of processes running. You could crank through dozens of concurrent requests in no time, but not if they take 2 or 3 seconds each. That changes everything.
It gets worse too because other requests are going to start to hang too. These requests might be another visitor trying to access your home page or any other page of your application.
Basically your app server is going to get overloaded by waiting. The longer your requests take to respond, the worse it’s going to get for every request, and before you know it a simple home page takes 8 seconds to load instead of 80 milliseconds.
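To put rough numbers on this, here’s a back-of-the-envelope model in plain Python. It assumes every request arrives at the same moment and each worker handles one request at a time. The `response_times` helper is made up for illustration and isn’t part of gunicorn or uWSGI.

```python
import heapq


def response_times(num_requests, seconds_per_request, workers=1):
    """Return when each request finishes, in seconds, if all arrive at t=0.

    Simplified model: every worker handles one request at a time and
    each request takes exactly the same amount of time.
    """
    # Each heap entry is the time at which a worker becomes free again.
    free_at = [0.0] * workers
    heapq.heapify(free_at)

    finished = []
    for _ in range(num_requests):
        start = heapq.heappop(free_at)  # earliest available worker
        done = start + seconds_per_request
        finished.append(done)
        heapq.heappush(free_at, done)
    return finished


# 10 visitors submit the contact form at once, 1 worker, 2 seconds each.
slow = response_times(10, 2.0)
print(max(slow))  # the last visitor gets a response after 20.0 seconds

# The same 10 requests if each one only takes 20ms.
fast = response_times(10, 0.02)
print(round(max(fast), 2))  # 0.2
```

Real traffic is messier than this, but it shows why a couple of slow external calls can drag down unrelated requests that happen to be waiting in line behind them.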
## Why Celery Is an Amazing Tool
Technically the secret sauce to solving the above problem is being able to do steps 8 and 9 in the background. That’s why Celery is often labeled as a “background worker”.
I say “technically” there because you could solve this problem with something like Python 3’s async / await functionality but that is a much less robust solution out of the box.
Celery will keep track of the work you send to it in a message broker such as Redis or RabbitMQ. This keeps the state out of your app server’s process, which means even if your app server crashes your job queue will still remain.
Celery also allows you to track tasks that fail. A “task” or job is really just some work you tell Celery to do, such as sending an email. It can be anything.
Celery also allows you to set up retry policies for tasks that fail. For example, if that email fails to send you can instruct Celery to try again, let’s say 5 times, and even use advanced retry strategies like exponential backoff, which means trying again after 4 seconds, then 8, 16, 32 and 64 seconds later. You can configure all of this in great detail.
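To make the exponential backoff idea concrete, here’s a small plain Python sketch of that retry schedule. With Celery you wouldn’t write this loop yourself since it has its own task options for retries (for example `autoretry_for`, `max_retries` and `retry_backoff`). The `backoff_delays` and `call_with_retries` helpers below are hypothetical names used only for illustration.

```python
import time


def backoff_delays(max_retries=5, first_delay=4, factor=2):
    """Seconds to wait before each retry: 4, 8, 16, 32, 64 by default."""
    return [first_delay * factor ** attempt for attempt in range(max_retries)]


def call_with_retries(func, max_retries=5, first_delay=4, sleep=time.sleep):
    """Call func(); on failure wait and retry, re-raising after the last retry."""
    for delay in backoff_delays(max_retries, first_delay):
        try:
            return func()
        except Exception:
            # Broad catch keeps the sketch short; real code should only
            # retry errors that are worth retrying, such as timeouts.
            sleep(delay)
    # Final attempt: let any exception propagate to the caller.
    return func()


print(backoff_delays())  # [4, 8, 16, 32, 64]
```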
Celery also allows you to rate limit tasks. The built-in `rate_limit` option caps how often a given type of task runs on each worker. If you wanted to protect your contact form to not allow more than 1 email per 10 seconds for each visitor, based on IP address or even per logged in user on your system, you can layer a custom rule like that on top without much code.
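Here’s a minimal sketch of the per-visitor rule described above, written in plain Python rather than with any Celery API. The `PerVisitorLimiter` class is hypothetical, and it keeps its state in an in-memory dict, which only works for a single process. With multiple app server processes you’d want to keep that state somewhere shared, such as Redis.

```python
import time


class PerVisitorLimiter:
    """Allow at most one action per `min_interval` seconds for each key."""

    def __init__(self, min_interval=10.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self._last_allowed = {}  # key (IP address or user id) -> timestamp

    def allow(self, key):
        """Return True and record the time if `key` may act right now."""
        now = self.clock()
        last = self._last_allowed.get(key)
        if last is not None and now - last < self.min_interval:
            return False
        self._last_allowed[key] = now
        return True


limiter = PerVisitorLimiter(min_interval=10.0)
print(limiter.allow("203.0.113.7"))  # True, first email from this visitor
print(limiter.allow("203.0.113.7"))  # False, still inside the 10 second window
```

In a view you would check `limiter.allow(...)` with the visitor’s IP address or user id before queueing the email task, and skip the task when it returns `False`.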
What I’m trying to say is Celery is a very powerful tool which lets you do production ready things with almost no boilerplate and very little configuration. That’s why I very much prefer using it over async / await or other asynchronous solutions.
It’s also why I introduced using Celery very early on in my Build a SAAS App with Flask course. One of the first things we do in that course is cover sending emails for a contact form and we use Celery right out of the gate because I’m all for providing production ready examples instead of toy examples.
The cool thing is we use Docker in that course so adding Celery and Redis into the project is no big deal at all. It’s just a few lines of YAML configuration and we’re done. You can take a look at that in the open source version of the app we build in the course.
## Sending Emails with Celery
With the above knowledge in hand, let’s adjust the workflow a bit:
1. User visits `/contact` page and is presented with a contact form
2. User fills out contact form
3. User clicks send email button
4. User’s mouse cursor turns into a busy icon and now their browser hangs
5. Your Flask app handles the POST request
6. Your Flask app validates the form
7. Your Flask app calls a Celery task that you created
8. Your Flask app returns an HTML response to the user by redirecting to a page
9. User’s browser renders the new page and the busy mouse cursor is gone
What’s very different about the above workflow vs the original one is that steps 4 through 9 will finish executing almost immediately. I wouldn’t be surprised if everything finishes within 20 milliseconds, meaning you could handle 50 of these requests in 1 second, and that’s with only 1 process / thread on your app server.
That’s a huge improvement and it’s also very consistent. We’re back to controlling how long it takes for the user to get a response and we’re not bogging down our app server. We can easily scale to hundreds of concurrent requests per second by just adding more app server processes (or CPU cores basically).
We no longer need to send the email during the request / response cycle and wait for a response from your email provider. We can just execute the Celery task in the background and immediately respond with a redirect.
The user really doesn’t need to know if the email was delivered or not. They are likely just going to see a simple flash message that thanks them for contacting you and says you’ll reply to them soon.
But if you did want to monitor the task and get notified when it finishes you can do that too with Celery. However in this case, it doesn’t really matter if the email gets delivered 500ms or 5 seconds after that point in time because it’s all the same from the user’s point of view.
By the way, in case you’re wondering, the Celery task in the new step 7 would be the original workflow’s steps 7-9 which were:
- Your Celery task likely compiles a template of the email
- Your Celery task takes that email and sends it to your configured email provider
- Your Celery task waits until your email provider (gmail, sendgrid, etc.) responds
So it’s the same things being done. It’s just that Celery handles it in the background.
# Use Case #2: Connecting to Third Party APIs
We just talked about sending emails but really this applies to doing any third party API calls to an external service that you don’t control. Really any external network call.
This also really ties into making API calls in your typical request / response cycle of an HTTP connection. For example, the user visits a page, you want to contact a third party API and now you want to respond back to the user.
I would reach for Celery pretty much always for the above use case and if I needed to update the UI of a page after getting the data back from the API then I would either use websockets or good old long polling.
Either one allows you to respond back immediately and then update your page after you get the data back.
Websockets are nice because as soon as you get the data back from the API in your Celery task you can broadcast it to the user, but if you already have long polling set up that works too. A lot of people dislike long polling but in some cases it can get you pretty far without needing to introduce the complexities of using websockets.
By the way in the Build a SAAS App with Flask course I recently added a free update that covers using websockets. You’ll see how seamlessly you can integrate it into a Celery task.
# Use Case #3: Performing Long Running Tasks
Another use case is doing something that takes a pretty long time. This could be generating a report that might take 2 minutes to generate or perhaps transcoding a video.
These are things you would expect to see a progress bar for.
Could you imagine how crazy it would be if you weren’t using Celery for this? Imagine loading up a page to generate a report and then having to keep your browser tab open for 2 full minutes otherwise the report would fail to generate.
That would be madness, but Celery makes this super easy to pull off without that limitation. You can use the same exact strategies as the second use case to update your UI as needed. With websockets it would be quite easy to push progress updates too.
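As a rough sketch of the progress reporting part, here’s a plain Python function that does its work in chunks and calls a callback with the percentage done. The function name, the chunk size and the callback are all assumptions for illustration. Inside a Celery task defined with `bind=True`, the callback could be something like `self.update_state(state="PROGRESS", meta={"percent": percent})`, which your UI then reads through websockets or long polling.

```python
def generate_report(rows, on_progress, chunk_size=100):
    """Process rows in chunks and report the percentage done after each chunk."""
    total = len(rows)
    processed = 0
    for start in range(0, total, chunk_size):
        chunk = rows[start:start + chunk_size]
        # ... the real work on `chunk` (queries, rendering, etc.) goes here ...
        processed += len(chunk)
        on_progress(int(processed * 100 / total))
    return processed


updates = []
generate_report(list(range(250)), updates.append)
print(updates)  # [40, 80, 100]
```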
# Use Case #4: Running Tasks on a Schedule
This last use case is different than the other 3 listed above but it’s a very important one.
Imagine if you wanted to perform a task every day at midnight. In the past you might have reached for using cron jobs right? It wouldn’t be too bad to configure a cron job to run that task.
But there are a couple of problems with using cron. Keep in mind, the same problems exist with systemd timers too.
For starters you would likely have to split that scheduled functionality out into its own file so you can call it independently.
Realistically that’s not too bad but it’s something you’ll want to do, and it can become annoying if you have to deal with loading in configuration settings or environment variables for that file (it’s 1 more thing to deal with for each scheduled task).
Also, in today’s world, we’re moving towards putting most things into containers and it’s considered a best practice to only run 1 process per container.
In other words you wouldn’t want to run both the cron daemon and your app server in the same container. If you do that, you’re going very much against the grain of community vetted best practices.
So you might think to just run cron on your Docker host and change your cron job to run a Docker command instead of just calling your Flask file straight up. That’s totally doable and would work but there’s a problem with that approach too.
What if you’ve scaled out to 3 web app servers? If each one had its own cron job then you would be running that task 3 times a day instead of once, potentially doing triple the work. That’s definitely not an intended result and could introduce race conditions if you’re not careful.
Now, I know, you could just decide to configure the cron jobs on 1 of the 3 servers but that’s going down a very iffy path because now suddenly you have these 3 servers but 1 of them is different. What happens if you’re doing a rolling restart and the 1 that’s assigned to do the cron job is unavailable when a task is supposed to run?
It’ll quickly become a configuration nightmare (I know because I tried this in the past)!
The above problems go away with Celery. It has a concept of a “beat” server that you can run where you can configure tasks that get run on whatever schedule you want. It even supports the cron style syntax, so you can do all sorts of wild schedules like every 2nd Tuesday of the month at 1am.
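For a rough idea of what that configuration looks like, here’s a minimal sketch of a beat schedule that runs a task every day at midnight. The Redis URL and the task name `tasks.mark_expiring_cards` are assumptions for illustration; the task name has to match a task that’s actually registered in your app.

```python
from celery import Celery
from celery.schedules import crontab

# Assumption: Redis runs locally and is used as the message broker.
celery = Celery("app", broker="redis://localhost:6379/0")

celery.conf.timezone = "UTC"
celery.conf.beat_schedule = {
    "mark-expiring-cards-every-midnight": {
        # Hypothetical task name, it must match a task registered in your app.
        "task": "tasks.mark_expiring_cards",
        # minute=0, hour=0 means 00:00 every day, like "0 0 * * *" in cron.
        "schedule": crontab(minute=0, hour=0),
    },
}
```

The scheduler runs as its own process next to your workers, started with something like `celery -A your_module beat`.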
It’s also very much integrated with the configuration of your application. Since it’s just another task, all of your app’s configuration and environment variables are available. That’s a big win not to have to deal with that on a per file basis.
Another win is that the schedule lives in 1 spot. A single beat process sends each scheduled task to your Celery broker such as Redis, and 1 worker picks it up. That means if you have 1 or 100 web app servers your tasks will only get executed once, as long as you only run 1 beat process.
In the rolling restart example, it won’t matter if 1 of the 3 app servers is unavailable. As long as at least 1 of them is available then your scheduled task will be able to run.
It’s a pretty sweet deal. In my opinion it’s even easier to set up than a cron job too.
We use scheduled tasks a fair bit in the Build a SAAS App with Flask course. For example, in one case we run a task every day at midnight which checks to see if a user’s credit card is going to expire soon, and if it is then we mark the card as `is_expiring` and now the web UI can show a banner saying to please update your card details.
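Here’s a sketch of the kind of check such a task could run. This isn’t the code from the course. The card dicts with `exp_month` and `exp_year` keys and the 60 day window are assumptions I picked for illustration; only the `is_expiring` flag comes from the description above.

```python
import calendar
from datetime import date, timedelta


def card_expiry_date(exp_month, exp_year):
    """A card stays valid through the last day of its expiration month."""
    last_day = calendar.monthrange(exp_year, exp_month)[1]
    return date(exp_year, exp_month, last_day)


def mark_expiring_cards(cards, today, window_days=60):
    """Set card["is_expiring"] for cards that expire within `window_days`.

    Cards that have already expired are flagged too.
    """
    cutoff = today + timedelta(days=window_days)
    for card in cards:
        expires_on = card_expiry_date(card["exp_month"], card["exp_year"])
        card["is_expiring"] = expires_on <= cutoff
    return cards


cards = [
    {"exp_month": 2, "exp_year": 2024},
    {"exp_month": 12, "exp_year": 2030},
]
mark_expiring_cards(cards, today=date(2024, 1, 15))
print([card["is_expiring"] for card in cards])  # [True, False]
```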
Little things like that help reduce churn rate in a SAAS app. There are a million examples of where you may want to have scheduled tasks. Perhaps you could look for user accounts that haven’t had activity in 6 months and then send out a reminder email or delete them from your database. You get the idea!
Celery is for sure one of my favorite Python libraries.
What are you using Celery for? Let me know below.