How do I access Amazon S3 Requester Pays buckets from AWS Glue, Amazon EMR, or Athena?


I want to access an Amazon Simple Storage Service (Amazon S3) Requester Pays bucket from AWS Glue, Amazon EMR, or Amazon Athena.

Resolution

To access S3 buckets that have Requester Pays turned on, all requests to the bucket must have the Requester Pays header.
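For a direct S3 API call, the header in question is x-amz-request-payer with the value requester. The following is a minimal sketch of how a client can send it, assuming boto3 (which this article does not otherwise use) and its RequestPayer parameter. The function names are hypothetical, and the S3 client is passed in so that the helper does not depend on how you create it.

```python
def requester_pays_get_object_args(bucket, key):
    """Build GetObject arguments that acknowledge the requester is charged."""
    # RequestPayer="requester" makes the SDK send "x-amz-request-payer: requester".
    return {"Bucket": bucket, "Key": key, "RequestPayer": "requester"}


def fetch_object(s3_client, bucket, key):
    """Download one object from a Requester Pays bucket and return its bytes."""
    response = s3_client.get_object(**requester_pays_get_object_args(bucket, key))
    return response["Body"].read()


# Usage (requires boto3 and AWS credentials; the bucket and key are placeholders):
#   import boto3
#   data = fetch_object(boto3.client("s3"), "awsdoc-example-bucket", "path/to/object.csv")
```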

AWS Glue

AWS Glue requests to Amazon S3 don't include the Requester Pays header by default. Without the Requester Pays header, an API call to a Requester Pays bucket fails with an AccessDenied error.

To add the Requester Pays header to an ETL script, use hadoopConfiguration().set() to set fs.s3.useRequesterPaysHeader to true on the GlueContext variable or the Apache Spark session variable.

GlueContext:

glueContext._jsc.hadoopConfiguration().set("fs.s3.useRequesterPaysHeader","true")

Spark session:

spark._jsc.hadoopConfiguration().set("fs.s3.useRequesterPaysHeader","true")
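If you want to confirm that the setting took effect, a small helper can set the property and read it back. This is only a sketch, not part of the AWS Glue API: the name enable_requester_pays is hypothetical, and it assumes that the object you pass exposes _jsc in the same way as the GlueContext and Spark session variables above.

```python
REQUESTER_PAYS_KEY = "fs.s3.useRequesterPaysHeader"


def enable_requester_pays(context):
    """Set the Requester Pays flag and return the value read back.

    `context` is assumed to be a GlueContext or SparkSession,
    that is, any object that exposes `_jsc.hadoopConfiguration()`.
    """
    hadoop_conf = context._jsc.hadoopConfiguration()
    hadoop_conf.set(REQUESTER_PAYS_KEY, "true")
    return hadoop_conf.get(REQUESTER_PAYS_KEY)


# Usage inside a Glue job, after glueContext and spark are created:
#   assert enable_requester_pays(spark) == "true"
```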

The following is an example ETL script that includes the header:


import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.dynamicframe import DynamicFrame

## @params: [JOB_NAME]
args = getResolvedOptions(sys.argv, ['JOB_NAME'])

sc = SparkContext()  
glueContext = GlueContext(sc)
spark = glueContext.spark_session  
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

spark._jsc.hadoopConfiguration().set("fs.s3.useRequesterPaysHeader","true")
# glueContext._jsc.hadoopConfiguration().set("fs.s3.useRequesterPaysHeader","true")

##AWS Glue DynamicFrame read and write
datasource0 = glueContext.create_dynamic_frame.from_catalog(database = "your_database_name", table_name = "your_table_name", transformation_ctx = "datasource0")
datasource0.show()
datasink = glueContext.write_dynamic_frame.from_options(frame = datasource0, connection_type = "s3", connection_options = {"path":"s3://awsdoc-example-bucket/path-to-target-location/"}, format = "csv")

##Spark DataFrame read and write
df = spark.read.csv("s3://awsdoc-example-bucket/path-to-source-location/")
df.show()
df.write.csv("s3://awsdoc-example-bucket/path-to-target-location/")

job.commit()

Note: In the preceding script, replace the following values with your values:

  • your_database_name with the name of your database
  • your_table_name with the name of your table
  • s3://awsdoc-example-bucket/path-to-source-location/ with the path to the source bucket
  • s3://awsdoc-example-bucket/path-to-target-location/ with the path to the destination bucket

Amazon EMR

To add fs.s3.useRequesterPaysHeader for Amazon EMR, set the following property in /usr/share/aws/emr/emrfs/conf/emrfs-site.xml:

<property>
   <name>fs.s3.useRequesterPaysHeader</name>
   <value>true</value>
</property>
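If you prefer to apply the property when the cluster launches instead of editing the file on a running cluster, the same setting can be expressed through the emrfs-site configuration classification. The following sketch only builds and prints that configuration object. The function name is hypothetical, and how you submit the configuration is an assumption about your workflow.

```python
import json


def emrfs_requester_pays_configuration():
    """Return an EMR configuration list that turns on the Requester Pays header."""
    return [
        {
            "Classification": "emrfs-site",
            "Properties": {"fs.s3.useRequesterPaysHeader": "true"},
        }
    ]


# Print the JSON form, which can be saved to a file for later use.
print(json.dumps(emrfs_requester_pays_configuration(), indent=2))
```

You can pass the returned list as the Configurations argument of the boto3 run_job_flow call, or save the printed JSON and supply it with the --configurations option of aws emr create-cluster.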

Athena

To allow workgroup members to query Requester Pays buckets, complete the following steps:

  1. Open the Athena console.
  2. In the navigation pane, choose Workgroups.
  3. Select your workgroup, and then choose Edit.
  4. In Settings, choose Turn on queries on requester pays buckets in Amazon S3. For more information, see Edit a workgroup.
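The console steps above can also be scripted. The following is a minimal sketch, assuming boto3 and its Athena update_work_group operation with the RequesterPaysEnabled setting. The function name and the workgroup name in the usage comment are hypothetical.

```python
def enable_requester_pays_for_workgroup(athena_client, workgroup_name):
    """Allow members of an Athena workgroup to query Requester Pays buckets."""
    return athena_client.update_work_group(
        WorkGroup=workgroup_name,
        ConfigurationUpdates={"RequesterPaysEnabled": True},
    )


# Usage (requires boto3 and AWS credentials; "primary" is a placeholder):
#   import boto3
#   enable_requester_pays_for_workgroup(boto3.client("athena"), "primary")
```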

Related information

Configuring Requester Pays on a bucket

Downloading objects from Requester Pays buckets

How do I troubleshoot 403 Access Denied errors from Amazon S3?

AWS OFFICIAL
Updated a month ago
6 Comments

Which version of EMR supports this hadoop configuration?

Will this work in EMR 6.7?

spark._jsc.hadoopConfiguration().set(
    "fs.s3.useRequesterPaysHeader", "true"
)
replied a year ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied a year ago

Hello, is this "useRequesterPaysHeader" configuration supported in SageMaker Studio or Data Wrangler settings?

AWS
replied a year ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied a year ago

Hello, is it possible to set up the Requester Pays header for users who use only the AWS Console (web UI), for specific accounts or roles?

Nataly
replied 2 months ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied 2 months ago