It's good that you've already started taking steps on the application side to manage this spiky demand. In my limited experience, two of the best ways to support a quota increase request are: 1) evidence of sustained, significant usage throughout the month (relative to your total quota availability), and 2) evidence that your application shares the responsibility of smoothing out the workload.
So to start with: yes, I'd say it's worth raising an increase request, but still expect some need for smoothing on your side.
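For the smoothing itself, a token bucket in front of your Textract submissions is a common approach: it lets small bursts through immediately but caps the sustained rate at your quota. This is a minimal in-memory sketch; the class name, rate, and capacity are hypothetical values you would tune against your actual account quota (and in a multi-process setup the state would need to live somewhere shared).

```python
import time


class TokenBucket:
    """Minimal token-bucket rate smoother for outbound API submissions.

    Sketch only: rate_per_sec and capacity are hypothetical numbers you
    would tune to your real Textract quota. State is in-memory, so this
    only works within a single process.
    """

    def __init__(self, rate_per_sec: float, capacity: int):
        self.rate = rate_per_sec          # steady-state submissions per second
        self.capacity = capacity          # maximum burst size
        self.tokens = float(capacity)     # start with a full bucket
        self.last = time.monotonic()

    def try_acquire(self) -> bool:
        """Return True if one submission may proceed now; otherwise the
        caller should leave the job queued and retry later."""
        now = time.monotonic()
        # Refill tokens based on elapsed time, never exceeding capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Jobs that fail `try_acquire` just stay in your queue; the bucket refills at the configured rate, so the burst drains at a pace the downstream quota can absorb.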
Demand-side measures
The ideal solution is to tackle this problem at the source where possible: the more you can steer your customers to submit their documents incrementally as soon as they have them, rather than falling into a "batch by default" mindset, the less time their jobs will spend waiting in the queue. Share this best practice with your customers and give them tools that make it simpler to follow: for example, pre-built connectors to source data stores if that's relevant for your app, or APIs on your side that encourage submitting documents incrementally.
Noisy-neighbour isolation
Beyond that, you may want to implement your own burst quota mechanism to prevent "noisy neighbours" from degrading other customers' experience. You could track which files each tenant has submitted and extend your counting system to impose a separate limit per tenant ID, so that one big file dump doesn't completely block other users who are following best practices.
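The per-tenant limit could look something like the sketch below. It keeps an in-flight job count per tenant ID and refuses to start new work for a tenant at its cap, leaving those jobs queued. The class name and limit are hypothetical, and the in-memory dictionary stands in for whatever shared store (e.g. DynamoDB atomic counters) you would use across workers in production.

```python
from collections import defaultdict


class TenantBurstQuota:
    """Per-tenant in-flight job cap so one tenant's bulk upload can't
    monopolise the shared processing quota.

    Sketch only: the limit is a hypothetical example, and the in-memory
    dict would need to be replaced by a shared store (e.g. DynamoDB
    atomic counters) in a multi-worker deployment.
    """

    def __init__(self, per_tenant_limit: int):
        self.limit = per_tenant_limit
        self.in_flight = defaultdict(int)  # tenant_id -> active job count

    def try_start(self, tenant_id: str) -> bool:
        """Return True and count the job if the tenant is under its cap;
        otherwise return False so the job stays queued."""
        if self.in_flight[tenant_id] >= self.limit:
            return False
        self.in_flight[tenant_id] += 1
        return True

    def finish(self, tenant_id: str) -> None:
        """Release one slot when a tenant's job completes."""
        self.in_flight[tenant_id] = max(0, self.in_flight[tenant_id] - 1)
```

With this in place, a tenant who dumps 10,000 files still only occupies their own slice of the shared quota, and everyone else's incremental submissions keep flowing.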
Distributed & scalable architectures
Regardless of whether you move ahead with per-tenant limits or a single queue, it's worth thinking about the scale you want to build for. There are multiple distributed architectures that can tackle this kind of requirement, with different pros & cons in terms of pricing.
In this "distributed semaphore" pattern, each work item gets a Step Functions execution which periodically retries to get a lock from DynamoDB. There's a CDK example here that applies the pattern to Textract in particular. It's still retry based, and the profile of retries will affect pricing of both Step Functions (since it charges by state transitions) and DynamoDB (since it charges by request throughput amongst other things) - but SFn can in principle handle millions of open executions.
You could explore other combinations, e.g. using SQS with some kind of event-driven callback to try to reduce costly retries, but there's usually a trade-off between distributing the system for scale, guaranteeing no deadlocks, and avoiding retries altogether.
Here are a few other samples you could explore for similar patterns: