Query exhausted resources at this scale factor

0

Got an Query exhausted resources at this scale factor error when running Query Id: f0263908-31e5-48d7-a8d7-e1ec2492ed6b.

I have some gzipped parquet files in S3 in the following layout -

/data
  /day=0
    data_0.parquet
  /day=1
    data_0.parquet
:::

I created the table using the following queries -

create external table if not exists mytablename (
    i1 int,
    i2 int,
    i3 int,
    i4 int,
    i5 int,
    i6 int,
    i7 int,
    i8 int,
    i9 int,
    i10 int,
    i11 int,
    i12 int,
    i13 int,
    s1 string,
    s2 string,
    s3 string,
    s4 string,
    s5 string,
    s6 string,
    s7 string,
    s8 string,
    s9 string,
    s10 string,
    s11 string,
    s12 string,
    s13 string,
    s14 string,
    s15 string,
    s16 string,
    s17 string,
    s18 string,
    s19 string,
    s21 string,
    s22 string,
    s23 string,
    s24 string,
    s25 string,
    s26 string,
    label int
)
PARTITIONED BY (day STRING)
STORED AS PARQUET
LOCATION 's3://my/s3/location/data/'
tblproperties ("parquet.compression"="GZIP");

MSCK REPAIR TABLE mytablename;

And then I ran the following test query where I hit the error:

select i1, s1 from mytablename limit 10;
질문됨 2년 전832회 조회
1개 답변
0

“Query exhausted resources at this scale factor” indicates that you hit memory limits on the Athena cluster that runs your query. Most commonly, this can be probably as the query doesn’t follow the best practices. New Athena query engine version 3 is now available in most regions - try upgrading to that or to Version 2 (You should see a notification at the top of Athena interface with that information). This issue may sometime be transient and it may go away if you retry. If there are joins the query in the you ran or if you have not partitioned the data, I could try to optimizing it - but this does not seem to be the case here. Check the below items as well:

  • How big is your data after compression in the parquet/gzip format? and how is your data accessed? (What the typical WHERE clause elements?).
  • Please check if your partition strategy is right for your data, as you have already partitioned by day
  • Try bucketing your data by specifying CLUSTERED BY (<bucketed columns>) INTO <number of buckets> BUCKETS
  • Use the new query stats visualizer (click Query stats next to Query results, then Execution details) to figure out if it shows any errors.

Please refer to this blog for some tuning tips for your query to perform better: https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-tips-for-amazon-athena/

profile pictureAWS
답변함 2년 전

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠