Does Athena support a function similar to STRING_AGG for querying tables?


I want to concatenate string values from different rows in Athena using the STRING_AGG function from SQL. Athena complains that STRING_AGG is not registered:

SYNTAX_ERROR: line 1:8: Function string_agg not registered

Is there a way to register STRING_AGG function? Or, is there an alternative way to get what I want?

Below is my SQL statement:

select STRING_AGG(item, ','), STRING_AGG(event_subtype, ','), STRING_AGG(event_dt_utc, ',')
from (
    (
        select user_id as user, advertiser_id as advertiser, tracked_item as item, event_subtype, event_dt_utc
        from fact_mouse_conversions_glue
        where total_product_sales > 0 and user_id IS NOT NULL and tracked_item IS NOT NULL
    )
    union all
    (
        select user_id as user, advertiser_id as advertiser, tracked_item as item, event_subtype, event_dt_utc
        from fact_clickstream_conversions_glue
        where user_id IS NOT NULL and tracked_item IS NOT NULL
    )
)
group by user

asked a year ago · 3029 views
1 Answer
Accepted Answer

STRING_AGG is not supported in Athena. To check which functions are supported, see https://docs.aws.amazon.com/athena/latest/ug/functions.html for the engine version (v2 or v3) you are using. LISTAGG, available in engine version 3, may solve your problem; see https://trino.io/docs/current/functions/aggregate.html#listagg


SELECT id, LISTAGG(value, ',') WITHIN GROUP (ORDER BY o) csv_value
FROM (VALUES
    (100, 1, 'a'),
    (200, 3, 'c'),
    (200, 2, 'b')
) t(id, o, value)
GROUP BY id
ORDER BY id;
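
Applied to your statement, a minimal sketch with LISTAGG could look like the one below. It assumes engine version 3 and reuses the column and table names from your query; ordering each list by event_dt_utc and the output aliases (items, event_subtypes, event_dts) are just illustrative choices, and the CAST is there because LISTAGG expects varchar input, so adjust it to your actual column types.

-- one aggregated row per user, list values ordered by event time
SELECT user_id,
       LISTAGG(tracked_item, ',') WITHIN GROUP (ORDER BY event_dt_utc) items,
       LISTAGG(event_subtype, ',') WITHIN GROUP (ORDER BY event_dt_utc) event_subtypes,
       LISTAGG(CAST(event_dt_utc AS varchar), ',') WITHIN GROUP (ORDER BY event_dt_utc) event_dts
FROM (
    SELECT user_id, tracked_item, event_subtype, event_dt_utc
    FROM fact_mouse_conversions_glue
    WHERE total_product_sales > 0 AND user_id IS NOT NULL AND tracked_item IS NOT NULL
    UNION ALL
    SELECT user_id, tracked_item, event_subtype, event_dt_utc
    FROM fact_clickstream_conversions_glue
    WHERE user_id IS NOT NULL AND tracked_item IS NOT NULL
) t
GROUP BY user_id;

If you are still on engine version 2, array_join(array_agg(tracked_item), ',') gives an equivalent result.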

Let me get to your other question. Yes, it is possible to use custom functions in Amazon Athena. Athena supports user-defined functions (UDFs): scalar functions that you implement in Java and deploy as an AWS Lambda function using the Athena Query Federation SDK, and that you can then invoke in the SELECT and FILTER clauses of your queries. See https://docs.aws.amazon.com/athena/latest/ug/querying-udf.html for details.

To create a custom function, you will need to follow these steps:

  1. Write the function code in Java using the Athena Query Federation SDK. Create a handler class that extends UserDefinedFunctionHandler and add a public method for each UDF you want to expose.
  2. Package the code and deploy it as an AWS Lambda function in the same Region where you run your Athena queries.
  3. Declare the UDF in your query with a USING EXTERNAL FUNCTION clause. The syntax for this clause is as follows:
USING EXTERNAL FUNCTION function_name(variable data_type [, ...])
RETURNS data_type
LAMBDA 'lambda_function_name'

Replace function_name with the name you want to give to the function, variable and data_type with the function's parameters and return type, and lambda_function_name with the name of the Lambda function that implements it.

Once the Lambda function has been deployed, you can call the UDF in your Athena queries by adding this declaration in front of the SELECT. For example:

USING EXTERNAL FUNCTION my_function(value VARCHAR)
RETURNS VARCHAR
LAMBDA 'my-lambda-function'
SELECT my_function(column) FROM table;

I hope this helps! Let me know if you have any questions. If the answer helps, please accept/upvote it so that others in this forum can benefit from it.

AWS
answered a year ago
