Query exhausted resources at this scale factor

Question

Got an `Query exhausted resources at this scale factor` error when running Query Id: f0263908-31e5-48d7-a8d7-e1ec2492ed6b.

I have some gzipped parquet files in S3 in the following layout -
```
/data
  /day=0
    data_0.parquet
  /day=1
    data_0.parquet
:::
```
I created the table using the following queries -
```
create external table if not exists mytablename (
    i1 int,
    i2 int,
    i3 int,
    i4 int,
    i5 int,
    i6 int,
    i7 int,
    i8 int,
    i9 int,
    i10 int,
    i11 int,
    i12 int,
    i13 int,
    s1 string,
    s2 string,
    s3 string,
    s4 string,
    s5 string,
    s6 string,
    s7 string,
    s8 string,
    s9 string,
    s10 string,
    s11 string,
    s12 string,
    s13 string,
    s14 string,
    s15 string,
    s16 string,
    s17 string,
    s18 string,
    s19 string,
    s21 string,
    s22 string,
    s23 string,
    s24 string,
    s25 string,
    s26 string,
    label int
)
PARTITIONED BY (day STRING)
STORED AS PARQUET
LOCATION 's3://my/s3/location/data/'
tblproperties ("parquet.compression"="GZIP");

MSCK REPAIR TABLE mytablename;
```

And then I ran the following test query where I hit the error:
```
select i1, s1 from mytablename limit 10;
```

Answer

“Query exhausted resources at this scale factor” indicates that you hit memory limits on the Athena cluster that runs your query. Most commonly, this can be probably as the query doesn’t follow the best practices. New Athena query engine version 3 is now available in most regions - try upgrading to that or to Version 2 (You should see a notification at the top of Athena interface with that information). This issue may sometime be transient and it may go away if you retry. If there are joins the query in the you ran or if you have not partitioned the data, I could try to optimizing it - but this does not seem to be the case here. Check the below items as well:

* How big is your data after compression in the parquet/gzip format? and how is your data accessed? (What the typical WHERE clause elements?). 
* Please check if your partition strategy is right for your data, as you have already partitioned by `day`
* Try bucketing your data by specifying `CLUSTERED BY () INTO  BUCKETS`
* Use the new query stats visualizer (click Query stats next to Query results, then Execution details) to figure out if it shows any errors.

Please refer to this blog for some tuning tips for your query to perform better:   https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-tips-for-amazon-athena/

Query exhausted resources at this scale factor

관련 콘텐츠