Athena CloudWatch Connector Failing with Multiple Errors (INTERNAL_ERROR_QUERY_ENGINE, FUNCTION_NOT_FOUND, RequestEntityTooLarge)

We are experiencing critical issues with our CloudWatch Athena connector that are preventing us from querying our CloudWatch Logs. We have conducted extensive troubleshooting and believe the issue is internal to Athena or the connector's registration process.

Summary of the Problem:

We are unable to query a specific CloudWatch log group using the CloudWatch Athena connector. All attempts have failed with different errors, pointing to multiple underlying issues.

Environment:

  • Athena Engine Version: Athena engine version 3
  • CloudWatch Connector Version: We started with version 2024.6.1 and, as part of our troubleshooting, have updated to the latest available version, 2025.43.1, by redeploying our CDK stack. The issues persist.

Failure Scenarios:

We have identified two distinct failure modes:

1. Direct Table Access Failure (RequestEntityTooLarge)

  • Action: Attempting to query the special all_log_streams table for our log group.
  • Finding: The query runs for ~2 minutes and fails with a TABLE_NOT_FOUND error in Athena. We discovered this is due to our log group containing a very large number of log streams (~1 million). In the connector's Lambda logs (/aws/lambda/cloudwatch-logs), we found the root cause: [ERROR] LAMBDA_RUNTIME Failed to post handler success response. Http response code: 413. {"errorMessage":"Exceeded maximum allowed payload size (6291556 bytes).","errorType":"RequestEntityTooLarge"}. The connector is failing because the metadata response exceeds the 6MB Lambda payload limit.
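To put the 413 error in perspective, here is a rough back-of-the-envelope sketch of how quickly a listing of stream names outgrows the limit quoted in the log line above. This is not how the connector serializes its metadata response (the real format and per-entry overhead are unknown to me), and the stream names are hypothetical; it only estimates the size of a plain JSON array of names.

```python
import json

# Limit quoted in the connector's Lambda log message above, in bytes.
LAMBDA_RESPONSE_LIMIT_BYTES = 6_291_556


def estimated_listing_bytes(stream_names):
    """Return the UTF-8 size of a plain JSON array holding the given names.

    Assumption: a bare JSON array is used as a stand-in for the real
    metadata response, whose actual encoding is not known here.
    """
    return len(json.dumps(list(stream_names)).encode("utf-8"))


def fits_in_lambda_response(stream_names, limit=LAMBDA_RESPONSE_LIMIT_BYTES):
    """True if the estimated listing size is within the payload limit."""
    return estimated_listing_bytes(stream_names) <= limit


# Hypothetical names, loosely shaped like Lambda-generated log stream names.
sample_streams = [f"2025/11/24/[$LATEST]{i:032x}" for i in range(1_000_000)]

if __name__ == "__main__":
    size = estimated_listing_bytes(sample_streams)
    print(f"estimated listing size: {size} bytes")
    print(f"fits in one response: {fits_in_lambda_response(sample_streams)}")
```

Even under this optimistic estimate, a listing of about one million names of that length is several times over the limit, which is consistent with the failure described above.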

2. Passthrough Query Failure (FUNCTION_NOT_FOUND & INTERNAL_ERROR_QUERY_ENGINE)

  • Action: To work around the large number of log streams, we are trying to use the documented passthrough query feature with system.query.
  • Finding 1 (FUNCTION_NOT_FOUND): With the old connector version (2024.6.1), our queries failed immediately with FUNCTION_NOT_FOUND: line 1:21: Table function 'system.query' not registered.
  • Finding 2 (Confirmation): We confirmed this by running SHOW FUNCTIONS against the connector's data catalog, and system.query was not in the list of registered functions.
  • Finding 3 (INTERNAL_ERROR_QUERY_ENGINE): After updating the connector to the latest version (2025.43.1) and redeploying the stack (with a new catalog name cloudwatch-logs-v2 to avoid a CloudFormation deployment error), we are now getting a different, more severe error when using system.query: [ErrorCode: INTERNAL_ERROR_QUERY_ENGINE] Amazon Athena experienced an internal error while executing this query.
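For reference, this is a minimal sketch of how the passthrough statement might be assembled. STARTTIME and ENDTIME are the parameters mentioned in this question; QUERYSTRING, LOGGROUPNAMES and LIMIT are from my recollection of the connector's passthrough documentation and should be checked against the version you deployed. Passing the times as epoch milliseconds in string form is likewise an assumption, and the log group name and Logs Insights query are placeholders.

```python
from datetime import datetime, timezone


def to_epoch_millis(dt):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    if dt.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    return int(dt.timestamp() * 1000)


def sql_literal(value):
    """Render a value as a single-quoted SQL string literal."""
    return "'" + str(value).replace("'", "''") + "'"


def build_passthrough_query(log_group, insights_query, start, end, limit=100):
    """Build a SELECT over TABLE(system.query(...)).

    Assumption: parameter names other than STARTTIME/ENDTIME follow the
    connector documentation as I remember it; verify before relying on them.
    """
    return (
        "SELECT * FROM TABLE(\n"
        "    system.query(\n"
        f"        STARTTIME => {sql_literal(to_epoch_millis(start))},\n"
        f"        ENDTIME => {sql_literal(to_epoch_millis(end))},\n"
        f"        QUERYSTRING => {sql_literal(insights_query)},\n"
        f"        LOGGROUPNAMES => {sql_literal(log_group)},\n"
        f"        LIMIT => {sql_literal(limit)}\n"
        "    )\n"
        ")"
    )


if __name__ == "__main__":
    print(
        build_passthrough_query(
            log_group="/aws/lambda/example-function",  # placeholder name
            insights_query="fields @timestamp, @message | limit 10",
            start=datetime(2025, 11, 24, 14, 0, tzinfo=timezone.utc),
            end=datetime(2025, 11, 24, 15, 0, tzinfo=timezone.utc),
            limit=10,
        )
    )
```

The statement has to run with the connector's data catalog selected as the query context; otherwise system.query cannot be resolved against the connector.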

Last Failed Query Execution ID (Internal Error):

  • 0fed79e3-6dec-4f17-ae5f-d5f377691e8a
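In case it helps anyone reproduce this, the failure details for an execution ID can be pulled with the Athena GetQueryExecution API. The sketch below assumes boto3 is installed and credentials are configured; the sample response is made up purely to show the shape of the fields and is not the output for the execution ID above.

```python
def summarize_failure(response):
    """Extract the status and AthenaError fields from a GetQueryExecution response."""
    status = response["QueryExecution"]["Status"]
    error = status.get("AthenaError", {})
    return {
        "state": status.get("State"),
        "reason": status.get("StateChangeReason"),
        "error_category": error.get("ErrorCategory"),
        "error_type": error.get("ErrorType"),
        "retryable": error.get("Retryable"),
        "error_message": error.get("ErrorMessage"),
    }


def fetch_failure(query_execution_id):
    """Call Athena for one execution ID (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the rest of the sketch runs without it

    client = boto3.client("athena")
    response = client.get_query_execution(QueryExecutionId=query_execution_id)
    return summarize_failure(response)


# Made-up response, only to illustrate the structure being read.
sample_response = {
    "QueryExecution": {
        "Status": {
            "State": "FAILED",
            "StateChangeReason": "example reason text",
            "AthenaError": {
                "ErrorCategory": 1,
                "ErrorType": 0,
                "Retryable": False,
                "ErrorMessage": "example error message",
            },
        }
    }
}

if __name__ == "__main__":
    print(summarize_failure(sample_response))
```

The Retryable flag in particular indicates whether Athena itself considers a retry worthwhile.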

We have exhausted all troubleshooting steps, including:

  • Verifying IAM permissions (they are broad).
  • Updating the connector to the latest version.
  • Trying different date formats for STARTTIME/ENDTIME in system.query.
  • Checking the Lambda logs for both the old and new connectors, which show no errors related to the INTERNAL_ERROR_QUERY_ENGINE failure.

Our CloudWatch Athena connector is non-functional. We believe there is a bug or configuration issue either in how the connector registers its system.query function with Athena Engine V3, or an internal issue within Athena itself.

Thank you.

  • When calling system.query, this is what appears in the connector's Lambda logs:

    Picked up JAVA_TOOL_OPTIONS: -Djavax.net.ssl.trustStore=/var/lang/lib/security/cacerts
    Transforming org/apache/logging/log4j/core/lookup/JndiLookup (lambdainternal.CustomerClassLoader@179d3b25)
    START RequestId: ec0aca55-369e-40e0-aca4-2680d1c04756 Version: $LATEST
    2025-11-24 14:53:27 ec0aca55-369e-40e0-aca4-2680d1c04756 INFO  BaseAllocator:65 - Debug mode disabled. Enable with the VM option -Darrow.memory.debug.allocator=true.
    2025-11-24 14:53:27 ec0aca55-369e-40e0-aca4-2680d1c04756 INFO  DefaultAllocationManagerOption:91 - allocation manager type not specified, using netty as the default type
    2025-11-24 14:53:27 ec0aca55-369e-40e0-aca4-2680d1c04756 WARN  CheckAllocator:104 - More than one DefaultAllocationManager on classpath. Choosing first found
    2025-11-24 14:53:27 ec0aca55-369e-40e0-aca4-2680d1c04756 INFO  CheckAllocator:99 - Using DefaultAllocationManager at memory/netty/DefaultAllocationManagerFactory.class
    
asked 6 months ago, 136 views
1 Answer

The INTERNAL_ERROR_QUERY_ENGINE error you're experiencing is an internal Athena error that occurs on the query engine side. This type of error is not typically caused by user input but rather by an issue within Athena's system itself.

Based on your detailed description, you're facing multiple issues with the CloudWatch Athena connector:

  1. RequestEntityTooLarge Error: This is occurring because your log group contains approximately 1 million log streams, causing the metadata response to exceed the 6MB Lambda payload limit. This is a known limitation when dealing with log groups that have an extremely large number of log streams.

  2. FUNCTION_NOT_FOUND Error: With the older connector version, the system.query function wasn't properly registered, which explains why it wasn't showing up in your SHOW FUNCTIONS results.

  3. INTERNAL_ERROR_QUERY_ENGINE Error: After updating to the latest connector version, you're now encountering this internal Athena error, which indicates an issue within Athena's query engine itself.

Here are some potential approaches to resolve these issues:

  1. For the RequestEntityTooLarge issue:
     • Consider using time-based filtering to reduce the number of log streams being processed at once
     • Use the passthrough query functionality (system.query) once it's working properly
  2. For the FUNCTION_NOT_FOUND and INTERNAL_ERROR_QUERY_ENGINE issues:
     • Retry your query, as sometimes these internal errors are transient
     • Simplify your query if possible to identify if a specific part is causing the issue
     • Verify that all tables referenced in your query exist and have the expected structure
     • Check Athena service status for any reported issues in your region
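As an illustration of the time-based filtering suggestion, here is a minimal sketch that splits a long time range into smaller windows, so that each query covers only one window. The window size is an arbitrary choice for the example. Whether narrowing the time range actually avoids the oversized metadata response depends on how the connector resolves the table, which is not established in this thread.

```python
from datetime import datetime, timedelta, timezone


def split_time_range(start, end, window):
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    if window <= timedelta(0):
        raise ValueError("window must be positive")
    if end <= start:
        raise ValueError("end must be after start")
    cursor = start
    while cursor < end:
        # The final window is clipped so it never runs past `end`.
        window_end = min(cursor + window, end)
        yield cursor, window_end
        cursor = window_end


if __name__ == "__main__":
    day_start = datetime(2025, 11, 24, 0, 0, tzinfo=timezone.utc)
    day_end = datetime(2025, 11, 25, 0, 0, tzinfo=timezone.utc)
    for window_start, window_end in split_time_range(
        day_start, day_end, timedelta(hours=6)
    ):
        print(window_start.isoformat(), "->", window_end.isoformat())
```

Each pair can then be used as the start and end of one query, with results combined afterwards.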

If these steps don't resolve the issue, you should contact AWS Support for further assistance. They have access to more detailed logs and can investigate the root cause of these internal errors. Since INTERNAL_ERROR_QUERY_ENGINE is an internal system error (not caused by your input), AWS Support would be best positioned to help resolve this.

Remember that you won't be charged for queries that result in the INTERNAL_ERROR_QUERY_ENGINE error.

answered 6 months ago
