How to connect a program to Amazon Polly?

Sorry for the vague question; I'm asking about the right way to access the API from a program.

I'm really confused with the options the docs provide, and the more I read the docs, the less I believe I will succeed.

I have two ideas in mind:

  1. Make a command line tool for generating audio from text using Amazon Polly
  2. Afterwards, make a mobile application that does the same

At the moment, I cannot even log in with the AWS CLI. For 5 hours I followed the "recommended" trail in the docs and ended up with a nonsensical scenario where I have to open the browser every hour. So I gave up and need your help, folks.

How do I imagine it should work? Well, I create a config file (one time), add my credentials to it (one time), and programmatically get/refresh the token. That would suffice. As far as I understand, I can use the AWS CLI to save credentials in a config file, and then the SDK can read that file. However, I'm not sure whether the SDK can refresh the token by itself.

What I have done so far: I've created an IAM Identity Center user and assigned it two predefined permission sets: AdministratorAccess and PowerUserAccess. Now I have to somehow use them to connect my tool to Amazon Polly, right? And it shouldn't involve any browser interaction. What would you recommend?

Mobile App

Even when I get the above working, I don't understand how it can work in the mobile app world.

I really doubt a mobile app should ask for any tokens (especially considering how difficult it is for me, an experienced IT developer, to even get connected to the service LOL). So how can a mobile app access Amazon Polly? Building in my own credentials/service keys doesn't seem to be a good idea, because I'm not gonna pay for the users' usage, right? So what options do I have?

Regards, Thanks in advance!

1 Answer

What's missing from your question is what you're trying to do. What I think you're saying is "I have a mobile app that needs to do something with Polly". I'm going to assume that's what you're looking for.

First, you can do what you describe but that will involve putting some sort of long-term credentials into your application - either in the form of IAM credentials or something else that will allow you to get those IAM credentials. As you've figured out, the credentials are necessary for calling Polly directly. But this is not necessarily the best way (even taking out the long-lived credential problem) - there are likely to be other services that you want to call on the AWS side.

Also: It would be very unwise to assign the permissions you mention to those credentials - if an external (malicious) user were to gain access they can do pretty much anything in your AWS account. Please use permissions that are least-privileged; in this case - access to Polly only.
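As a rough illustration of "access to Polly only", here is a minimal sketch that builds such a policy document in Python. The function name and the `Sid` are made up for this example; `polly:SynthesizeSpeech` is the IAM action for speech synthesis. Add further actions (for example, listing voices) only if your tool actually needs them.

```python
import json


def polly_only_policy() -> dict:
    """Return an IAM policy document that allows speech synthesis and nothing else.

    Hypothetical helper for illustration; adjust the action list to what
    your tool really calls.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowPollySynthesisOnly",  # illustrative name
                "Effect": "Allow",
                "Action": ["polly:SynthesizeSpeech"],
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON so it can be pasted into the IAM console or a template.
    print(json.dumps(polly_only_policy(), indent=2))
```

The printed JSON can be attached to the user, role, or permission set that your tool uses in place of AdministratorAccess or PowerUserAccess.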

What I'd recommend is that you use API Gateway. This will allow your application to call an API which then triggers specific actions. One action could be to trigger a Lambda function which calls Polly on your behalf. Another action might be to call a second AWS service to do some other work - again, a Lambda function.

The advantage here is that you don't need long-term credentials on the client; and you can create very specific permissions for each Lambda function so that they can only access the services they are supposed to. For example, the Polly Lambda function can only access Polly. I wouldn't be assigning Administrator or PowerUser access here either as there is no need to do that.
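To make the API Gateway plus Lambda idea more concrete, here is a minimal sketch of what the Polly Lambda function might look like. It assumes a Lambda proxy integration, a JSON request body with hypothetical `text` and `voice` fields, and that API Gateway is configured to pass binary (base64-encoded) responses through; none of that comes from your setup, so treat it as a starting point only. The optional `polly` parameter exists purely so the function can be exercised without AWS access.

```python
import base64
import json


def handler(event, context, polly=None):
    """Lambda entry point: read text from the request body, return MP3 audio.

    `polly` is injectable for local testing; in Lambda it is created from
    the function's execution role.
    """
    if polly is None:
        import boto3  # bundled with the Lambda Python runtime

        polly = boto3.client("polly")

    # With proxy integration the body arrives as a JSON string (or None).
    body = json.loads(event.get("body") or "{}")
    text = body.get("text", "")
    if not text:
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "text is required"}),
        }

    response = polly.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=body.get("voice", "Joanna"),
    )
    audio = response["AudioStream"].read()

    # Binary payloads go back to API Gateway base64-encoded.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "audio/mpeg"},
        "isBase64Encoded": True,
        "body": base64.b64encode(audio).decode("ascii"),
    }
```

The execution role for this function would only need permission to call Polly, which is the point of splitting the work up this way.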

You can also use a service such as Cognito to authenticate your users and from there only allow those authenticated users to access the API (which you can link to Cognito).

This sounds complex and it kind of is; but it gives you a base to build on that is very flexible and extremely powerful.

You can start here: https://aws.amazon.com/getting-started/hands-on/build-serverless-web-app-lambda-apigateway-s3-dynamodb-cognito/ - you can substitute S3 and DynamoDB with Polly. There are many other guides too.

Edit based on comment below: The challenge here is that you need IAM credentials to communicate with Polly (or any AWS service for that matter). If you were running this from a computer that you control and is always in your possession you could install long-lived credentials on that computer. That isn't (strictly) best practice but the risk there is reasonably low. But for mobile applications that are in the hands of other people you don't necessarily want long-lived credentials out there.
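For the computer-you-control case, a minimal sketch with boto3 (the AWS SDK for Python) could look like the following. It assumes credentials and a region have already been saved in the shared AWS config files (for example with `aws configure`) under a profile; the profile name `polly-cli` and the helper names are hypothetical. boto3 reads those files itself, so nothing secret appears in the code.

```python
def build_request(text: str, voice: str = "Joanna", fmt: str = "mp3") -> dict:
    """Assemble the keyword arguments for Polly's SynthesizeSpeech call."""
    return {"Text": text, "VoiceId": voice, "OutputFormat": fmt}


def synthesize(text: str, profile=None, voice: str = "Joanna", polly=None) -> bytes:
    """Return the synthesized audio as bytes.

    `profile` names a profile in the shared AWS config/credentials files;
    None falls back to the default credential chain. `polly` lets a caller
    (or a test) supply a ready-made client instead.
    """
    if polly is None:
        import boto3  # third-party: pip install boto3

        session = boto3.Session(profile_name=profile)
        polly = session.client("polly")

    response = polly.synthesize_speech(**build_request(text, voice))
    return response["AudioStream"].read()
```

A tool could then call `synthesize("Hello there", profile="polly-cli")` and hand the returned bytes to whatever processes the audio next, such as FFmpeg.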

You could put long-lived credentials in the app and only give them permission to call Polly. But the risk there is that someone could use those credentials to call the service and use it without your permission - extra charges could be an issue there.

So temporary credentials are a way to fix this issue. Cognito allows you to authenticate users and then issue temporary IAM credentials. That works if your users have to log in to use your app. If not, another method would be to use API Gateway (as mentioned above) and have an API key which is in your app. This is (again, unfortunately) another form of long-lived credential but you can do rate limiting and some other work on API Gateway to block access if you need to.
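To show the shape of the Cognito flow, here is a minimal sketch in Python (a real mobile app would use the SDK for its own platform, but the steps are the same). It assumes a Cognito identity pool already exists and that the user has signed in and holds an ID token; the function name, the default region, and the placeholder values are hypothetical. The `cognito` parameter is injectable so the function can be tried without AWS access.

```python
def temporary_credentials(identity_pool_id, logins, cognito=None, region="us-east-1"):
    """Exchange a signed-in user's token for temporary IAM credentials.

    `logins` maps an identity provider name to the user's token, e.g.
    {"cognito-idp.<region>.amazonaws.com/<user_pool_id>": id_token}.
    Returns keyword arguments that boto3 clients accept.
    """
    if cognito is None:
        import boto3  # third-party: pip install boto3

        cognito = boto3.client("cognito-identity", region_name=region)

    # Step 1: resolve (or create) the identity for this user in the pool.
    identity = cognito.get_id(IdentityPoolId=identity_pool_id, Logins=logins)

    # Step 2: ask for short-lived credentials tied to that identity.
    creds = cognito.get_credentials_for_identity(
        IdentityId=identity["IdentityId"], Logins=logins
    )["Credentials"]

    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

The result can be passed straight to a client, for example `boto3.client("polly", region_name="us-east-1", **temporary_credentials(pool_id, logins))`. The credentials expire, so the app repeats the exchange when needed, and what they can do is limited by the IAM role attached to the identity pool.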

Another way to do this is using IAM Roles Anywhere - here you would put a certificate in the app and use that to authenticate to get IAM credentials. The intent behind this is to use this with servers on premises but it can also work for an application. If you were going to do this (but it's quite a bit of work and Cognito would be better) then I'd recommend a certificate for each instance of the application - that way you can deny access on a per-user basis. Again, much work for you but much better control and security.

We can't stop you putting IAM credentials in your application. But it is generally not a good idea hence the suggestions above which have better security and more control for you.

AWS EXPERT, answered 9 months ago
  • Hello, Brettski. Thank you very much for pointing me to such powerful tools; I think they're worth considering when developing for the web. However, at the moment I'm trying to solve a slightly different task. My tool calls the Amazon Polly API directly to get audio streams. But it also works with other TTS engines, like Google's for example. Then I process the streams with FFmpeg etc. So I think I really need to find a way to authenticate with Amazon Polly. What is the way to do this? How can I authenticate?
