MWAA Airflow error adding celery.worker_autoscale

Trying to set this configuration:

celery.worker_autoscale to 40,20.

On save I get the error message below.

"Some of the provided configurations do not have the expected format: celery.worker_autoscale, e.g: core.log_format."

I'm not changing any other settings. This setting is being added to an existing environment.

5 Answers

I understand that you were not able to set celery.worker_autoscale=40,20, while your other configuration options seem fine.

This is expected MWAA behavior, because MWAA requires the same number for both parameters, such as 5,5.

This configuration sets the minimum and maximum task concurrency for workers, and the values specified for minimum and maximum must be the same.

Reference: https://docs.aws.amazon.com/mwaa/latest/userguide/best-practices-tuning.html#best-practices-tuning-tasks-params

Because your environment class is medium, the default value of this setting is 10,10, which means each worker can run 10 tasks in parallel, as long as it has sufficient resources to do so. You currently have minimum number of workers = maximum number of workers = 1, which allows 1*10 tasks to run in parallel across your whole environment.
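To make the arithmetic above concrete, here is a minimal sketch. The function name max_parallel_tasks is hypothetical and chosen only for illustration; it simply multiplies the worker count by the per-worker concurrency and does not query MWAA.

```python
def max_parallel_tasks(worker_count: int, tasks_per_worker: int) -> int:
    """Upper bound on concurrent tasks: workers times per-worker concurrency."""
    if worker_count < 1 or tasks_per_worker < 1:
        raise ValueError("both values must be positive integers")
    return worker_count * tasks_per_worker


# One medium worker at 10,10, as described in the paragraph above.
print(max_parallel_tasks(worker_count=1, tasks_per_worker=10))
```

This is only an upper bound; resource-heavy tasks may saturate a worker well before that number is reached.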

On the other hand, you can achieve auto-scaling by letting the number of workers scale up and down, if that suits your use case. To do this, go to your MWAA console -> Edit -> Environment class and set Maximum worker count and Minimum worker count.

Reference: https://docs.aws.amazon.com/mwaa/latest/userguide/mwaa-autoscaling.html#mwaa-autoscaling-console
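The same worker-count change can probably be scripted instead of made in the console. The sketch below assumes boto3 is installed and AWS credentials are configured; the environment name "my-mwaa-env" and both helper function names are placeholders invented for this example. It builds the request first so the values can be inspected, and only calls the MWAA update_environment API when apply_update is invoked explicitly.

```python
def build_worker_scaling_update(env_name: str, min_workers: int, max_workers: int) -> dict:
    """Build keyword arguments for the MWAA update_environment call."""
    if not 1 <= min_workers <= max_workers:
        raise ValueError("need 1 <= min_workers <= max_workers")
    return {"Name": env_name, "MinWorkers": min_workers, "MaxWorkers": max_workers}


def apply_update(params: dict) -> dict:
    """Send the update to MWAA. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so building params needs no AWS SDK

    client = boto3.client("mwaa")
    return client.update_environment(**params)


# "my-mwaa-env" is a placeholder environment name.
params = build_worker_scaling_update("my-mwaa-env", min_workers=3, max_workers=5)
print(params)
# apply_update(params)  # uncomment to actually update the environment
```

Updating an environment can take a while to apply, so it is worth checking the printed parameters before uncommenting the last line.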

AWS
answered a year ago

Thanks for the details. Those links are helpful for seeing additional options. Per AWS the two values need to be the same, so I tried setting the option with identical values.

With celery.worker_autoscale set to 40,40 I still got the same error.

The node class I have is large, so I'm not sure whether that has anything to do with a limit. My configuration is minimum worker count = 3 and maximum worker count = 5.

Note: per the Airflow documentation, the values do not need to be the same, but I tried identical values as you suggested. https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#worker-autoscale

answered a year ago

You can update celery.worker_autoscale per https://docs.aws.amazon.com/mwaa/latest/userguide/best-practices-tuning.html#best-practices-tuning-tasks

However, the maximum values described in https://docs.aws.amazon.com/mwaa/latest/userguide/environment-class.html#environment-class-sizes are the maximum number of Airflow tasks that each size of Fargate worker can handle. If you exceed those numbers, your Airflow tasks will fail with no logs or warnings due to insufficient Linux resources. As such, you may lower celery.worker_autoscale (for example, if you have resource-heavy tasks), but you should not increase it.

AWS
John_J
answered a year ago

Hi, can you explicitly state the celery.worker_autoscale value for a small environment, please?

Based on the link in the second paragraph of your answer ("However, the maximum values as described..."), it seems to be 50 ("Up to 50"). However, I am also finding that celery.worker_autoscale=40,40 throws the same error when trying to update the environment, despite being below 50. 20,20 seems to work, but this does not seem to match what you have said.

So, can you please explicitly state the 'maximum values' for each worker size available in MWAA?

Thanks

answered a year ago

As stated on https://docs.aws.amazon.com/mwaa/latest/userguide/best-practices-tuning.html#best-practices-tuning-tasks-params, for a small environment you shouldn't go beyond 5,5, for medium 10,10, and for large 20,20. As mentioned by others above, if you have resource-intensive tasks, you can lower these numbers (for example, 3,3 for a small environment). For a small environment with 5,5 and 10 workers, you should be able to run 50 concurrent tasks. If you need more, you can increase the number of workers up to 25, but beware that this can also require increasing the number of schedulers, which is capped at 5.
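As a rough pre-check before saving the option, the guidance in this thread can be sketched in a few lines. The limits below are the per-size numbers quoted in this answer, hard-coded rather than fetched from AWS, and the names QUOTED_LIMITS and check_worker_autoscale are hypothetical. This is not MWAA's own validation logic, which may accept or reject values differently.

```python
from typing import List

# Per-worker limits as quoted in the answer above (not fetched from AWS).
QUOTED_LIMITS = {"small": 5, "medium": 10, "large": 20}


def check_worker_autoscale(value: str, env_size: str) -> List[str]:
    """Return a list of problems; an empty list means the value fits the guidance."""
    parts = [p.strip() for p in value.split(",")]
    if len(parts) != 2 or not all(p.isdigit() for p in parts):
        return ["expected two integers in the form 'N,N', got %r" % value]

    first, second = int(parts[0]), int(parts[1])
    problems = []
    if first != second:
        problems.append("both numbers should be equal")
    limit = QUOTED_LIMITS[env_size]
    if max(first, second) > limit:
        problems.append("exceeds %d for a %s environment" % (limit, env_size))
    return problems


# The values tried earlier in this thread, checked against a large environment.
for candidate in ("40,20", "40,40", "20,20"):
    print(candidate, check_worker_autoscale(candidate, "large"))
```

An empty list for 20,20 on a large environment matches the recommendation above, while both 40,20 and 40,40 are flagged.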

rj
answered 6 months ago
