Why does the SELECT COUNT query in Amazon Athena return only one record even though the input JSON file has multiple records?

1 minute read
0

When I run SELECT COUNT(*) FROM TABLE, the output is "1," but the input file has multiple JSON records. I used org.openx.data.jsonserde.JsonSerDe to create the table.

Resolution

When records aren't separated by a newline character (\n), the SELECT COUNT(*) FROM TABLE query returns "1."

Example:

{"ID":"1","MYTIMESTAMP":"2020-06-09 17:50:46.267","Name":"John"}{"ID":"2","MYTIMESTAMP":"2020-06-10 17:50:46.267","Name":"Joel"}{"ID":"3","MYTIMESTAMP":"2020-06-10 17:50:46.267","Name":"Patrick"}

To correct this problem, be sure that the records are delimited by \n, as shown in the following example:

{"ID":"1","MYTIMESTAMP":"2020-06-09 17:50:46.267","Name":"John"}

{"ID":"2","MYTIMESTAMP":"2020-06-10 17:50:46.267","Name":"Joel"}

{"ID":"3","MYTIMESTAMP":"2020-06-10 17:50:46.267","Name":"Patrick"}

In this case, SELECT COUNT(*) FROM TABLE returns "3."

Note: The SerDE expects each JSON document to be on a single line of text with no line termination characters separating the fields in the record. For more information, see Best practices for reading JSON data.


Related information

SELECT

CREATE TABLE

Running SQL queries using Amazon Athena

AWS OFFICIAL
AWS OFFICIALUpdated 2 years ago