Hey everyone I am trying to query a set of JSON files in S3 with Athena and I am getting the Hive cursor error for invalid JSON even though the files in question are valid single-line JSON. Is there a line-length limitation for Athena, or must a JSON record end with a newline character even if there is only one record in the file?
It looked like the Glue crawler got the schema right I have a bunch of these files all same structure all same keys. JSON validates correctly in my JSON editor but Athena thinks it is invalid JSON.
This is an example of the contents of one of my JSON files:
{"all_keys": ["frequently_bought_together.products[0].asin", "frequently_bought_together.products[0].image", "frequently_bought_together.products[0].link", "frequently_bought_together.products[0].price.currency", "frequently_bought_together.products[0].price.raw", "frequently_bought_together.products[0].price.symbol", "frequently_bought_together.products[0].price.value", "frequently_bought_together.products[0].title", "frequently_bought_together.products[1].asin", "frequently_bought_together.products[1].image", "frequently_bought_together.products[1].link", "frequently_bought_together.products[1].price.currency", "frequently_bought_together.products[1].price.raw", "frequently_bought_together.products[1].price.symbol", "frequently_bought_together.products[1].price.value", "frequently_bought_together.products[1].title", "frequently_bought_together.products[2].asin", "frequently_bought_together.products[2].image", "frequently_bought_together.products[2].link", "frequently_bought_together.products[2].price.currency", "frequently_bought_together.products[2].price.raw", "frequently_bought_together.products[2].price.symbol", "frequently_bought_together.products[2].price.value", "frequently_bought_together.products[2].title", "frequently_bought_together.total_price.currency", "frequently_bought_together.total_price.raw", "frequently_bought_together.total_price.symbol", "frequently_bought_together.total_price.value", "product.asin", "product.attributes[0].name", "product.attributes[0].value", "product.attributes[1].name", "product.attributes[1].value", "product.attributes[2].name", "product.attributes[2].value", "product.attributes[3].name", "product.attributes[3].value", "product.attributes[4].name", "product.attributes[4].value", "product.bestsellers_rank[0].category", "product.bestsellers_rank[0].link", "product.bestsellers_rank[0].rank", "product.bestsellers_rank[1].category", "product.bestsellers_rank[1].link", "product.bestsellers_rank[1].rank", "product.bestsellers_rank[2].category", "product.bestsellers_rank[2].link", "product.bestsellers_rank[2].rank", "product.bestsellers_rank_flat", "product.brand", "product.buybox_winner.availability.dispatch_days", "product.buybox_winner.availability.raw", "product.buybox_winner.availability.stock_level", "product.buybox_winner.availability.type", "product.buybox_winner.condition.is_new", "product.buybox_winner.fulfillment.is_fulfilled_by_amazon", "product.buybox_winner.fulfillment.is_fulfilled_by_third_party", "product.buybox_winner.fulfillment.is_sold_by_amazon", "product.buybox_winner.fulfillment.is_sold_by_third_party", "product.buybox_winner.fulfillment.standard_delivery.date", "product.buybox_winner.fulfillment.standard_delivery.name", "product.buybox_winner.fulfillment.third_party_seller.id", "product.buybox_winner.fulfillment.third_party_seller.link", "product.buybox_winner.fulfillment.third_party_seller.name", "product.buybox_winner.fulfillment.type", "product.buybox_winner.is_amazon_fresh", "product.buybox_winner.is_prime", "product.buybox_winner.is_prime_exclusive_deal", "product.buybox_winner.price.currency", "product.buybox_winner.price.raw", "product.buybox_winner.price.symbol", "product.buybox_winner.price.value", "product.buybox_winner.shipping.currency", "product.buybox_winner.shipping.raw", "product.buybox_winner.shipping.symbol", "product.buybox_winner.shipping.value", "product.categories[0].category_id", "product.categories[0].link", "product.categories[0].name", "product.categories[1].category_id", "product.categories[1].link", "product.categories[1].name", "product.categories[2].category_id", "product.categories[2].link", "product.categories[2].name", "product.categories[3].category_id", "product.categories[3].link", "product.categories[3].name", "product.categories[4].category_id", "product.categories[4].link", "product.categories[4].name", "product.categories_flat", "product.color", "product.description", "product.dimensions", "product.feature_bullets", "product.feature_bullets_count", "product.feature_bullets_flat", "product.first_available.raw", "product.first_available.utc", "product.has_size_guide", "product.images[0].link", "product.images[0].variant", "product.images[1].link", "product.images[1].variant", "product.images[2].link", "product.images[2].variant", "product.images[3].link", "product.images[3].variant", "product.images[4].link", "product.images[4].variant", "product.images[5].link", "product.images[5].variant", "product.images[6].link", "product.images[6].variant", "product.images[7].link", "product.images[7].variant", "product.images[8].link", "product.images[8].variant", "product.images_count", "product.images_flat", "product.is_bundle", "product.keywords", "product.keywords_list", "product.link", "product.main_image.link", "product.material", "product.rating", "product.rating_breakdown.five_star.count", "product.rating_breakdown.five_star.percentage", "product.rating_breakdown.four_star.count", "product.rating_breakdown.four_star.percentage", "product.rating_breakdown.one_star.count", "product.rating_breakdown.one_star.percentage", "product.rating_breakdown.three_star.count", "product.rating_breakdown.three_star.percentage", "product.rating_breakdown.two_star.count", "product.rating_breakdown.two_star.percentage", "product.ratings_total", "product.search_alias.title", "product.search_alias.value", "product.specifications[0].name", "product.specifications[0].value", "product.specifications[10].name", "product.specifications[10].value", "product.specifications[11].name", "product.specifications[11].value", "product.specifications[12].name", "product.specifications[12].value", "product.specifications[13].name", "product.specifications[13].value", "product.specifications[14].name", "product.specifications[14].value", "product.specifications[15].name", "product.specifications[15].value", "product.specifications[16].name", "product.specifications[16].value", "product.specifications[17].name", "product.specifications[17].value", "product.specifications[1].name", "product.specifications[1].value", "product.specifications[2].name", "product.specifications[2].value", "product.specifications[3].name", "product.specifications[3].value", "product.specifications[4].name", "product.specifications[4].value", "product.specifications[5].name", "product.specifications[5].value", "product.specifications[6].name", "product.specifications[6].value", "product.specifications[7].name", "product.specifications[7].value", "product.specifications[8].name", "product.specifications[8].value", "product.specifications[9].name", "product.specifications[9].value", "product.specifications_flat", "product.sub_title.link", "product.sub_title.text", "product.title", "product.videos_additional[0].closed_captions", "product.videos_additional[0].creator_type", "product.videos_additional[0].duration", "product.videos_additional[0].id", "product.videos_additional[0].parent_asin", "product.videos_additional[0].product_asin", "product.videos_additional[0].profile_image_url", "product.videos_additional[0].profile_link", "product.videos_additional[0].public_name", "product.videos_additional[0].related_products", "product.videos_additional[0].title", "product.videos_additional[0].type", "product.videos_additional[0].vendor_code", "product.videos_additional[0].vendor_name", "product.videos_additional[0].vendor_tracking_id", "product.videos_additional[0].video_image_extension", "product.videos_additional[0].video_image_height", "product.videos_additional[0].video_image_id", "product.videos_additional[0].video_image_url", "product.videos_additional[0].video_image_url_unchanged", "product.videos_additional[0].video_image_width", "product.videos_additional[0].video_mime_type", "product.videos_additional[0].video_previews", "product.videos_additional[0].video_url", "product.weight", "request_info.credits_remaining", "request_info.credits_reset_at", "request_info.credits_used", "request_info.credits_used_this_request", "request_info.success", "request_metadata.amazon_url", "request_metadata.created_at", "request_metadata.processed_at", "request_metadata.total_time_taken", "request_parameters.amazon_domain", "request_parameters.asin", "request_parameters.type", "sponsored_products[0].asin", "sponsored_products[0].image", "sponsored_products[0].is_prime", "sponsored_products[0].link", "sponsored_products[0].price.currency", "sponsored_products[0].price.raw", "sponsored_products[0].price.symbol", "sponsored_products[0].price.value", "sponsored_products[0].rating", "sponsored_products[0].title", "sponsored_products[1].asin", "sponsored_products[1].image", "sponsored_products[1].is_prime", "sponsored_products[1].link", "sponsored_products[1].price.currency", "sponsored_products[1].price.raw", "sponsored_products[1].price.symbol", "sponsored_products[1].price.value", "sponsored_products[1].rating", "sponsored_products[1].title", "sponsored_products[2].asin", "sponsored_products[2].image", "sponsored_products[2].is_prime", "sponsored_products[2].link", "sponsored_products[2].price.currency", "sponsored_products[2].price.raw", "sponsored_products[2].price.symbol", "sponsored_products[2].price.value", "sponsored_products[2].rating", "sponsored_products[2].title", "sponsored_products[3].asin", "sponsored_products[3].image", "sponsored_products[3].is_prime", "sponsored_products[3].link", "sponsored_products[3].price.currency", "sponsored_products[3].price.raw", "sponsored_products[3].price.symbol", "sponsored_products[3].price.value", "sponsored_products[3].rating", "sponsored_products[3].title", "sponsored_products[4].asin", "sponsored_products[4].image", "sponsored_products[4].is_prime", "sponsored_products[4].link", "sponsored_products[4].price.currency", "sponsored_products[4].price.raw", "sponsored_products[4].price.symbol", "sponsored_products[4].price.value", "sponsored_products[4].rating", "sponsored_products[4].title", "sponsored_products[5].asin", "sponsored_products[5].image", "sponsored_products[5].link", "sponsored_products[5].price.currency", "sponsored_products[5].price.raw", "sponsored_products[5].price.symbol", "sponsored_products[5].price.value", "sponsored_products[5].rating", "sponsored_products[5].title", "sponsored_products[6].asin", "sponsored_products[6].image", "sponsored_products[6].is_prime", "sponsored_products[6].link", "sponsored_products[6].price.currency", "sponsored_products[6].price.raw", "sponsored_products[6].price.symbol", "sponsored_products[6].price.value", "sponsored_products[6].rating", "sponsored_products[6].title", "sponsored_products[7].asin", "sponsored_products[7].image", "sponsored_products[7].is_prime", "sponsored_products[7].link", "sponsored_products[7].price.currency", "sponsored_products[7].price.raw", "sponsored_products[7].price.symbol", "sponsored_products[7].price.value", "sponsored_products[7].rating", "sponsored_products[7].title", "sponsored_products[8].asin", "sponsored_products[8].image", "sponsored_products[8].is_prime", "sponsored_products[8].link", "sponsored_products[8].price.currency", "sponsored_products[8].price.raw", "sponsored_products[8].price.symbol", "sponsored_products[8].price.value", "sponsored_products[8].rating", "sponsored_products[8].title", "sponsored_products[9].asin", "sponsored_products[9].image", "sponsored_products[9].is_prime", "sponsored_products[9].link", "sponsored_products[9].price.currency", "sponsored_products[9].price.raw", "sponsored_products[9].price.symbol", "sponsored_products[9].price.value", "sponsored_products[9].rating", "sponsored_products[9].title"], "elpased_time": "14.403965", "missing_keys": "[]", "product_id": "B0C2QPPVM6", "retailer": "Amazon", "scraper": "rainforest", "status_code": 200, "time_of_api_call": "2024-06-01T00:53:51"}
hey Osvaldo thanks for your answer, this turned out to be more complex than I thought so I will explain in a subsequent answer