How do I resolve "unable to create input format" errors in Athena?


When I run a query in Amazon Athena, I get the error "unable to create input format."

Short description

An "unable to create input format" error occurs for one of the following reasons:

  • The data source in your Athena query isn't supported
  • Athena doesn't support the data format
  • The AWS Glue crawler can't classify the data format
  • One or more of the AWS Glue table definition properties are empty

Resolution

The data source in your Athena query isn't supported

By default, Athena queries only data that's stored in Amazon Simple Storage Service (Amazon S3). If you query a data source that isn't stored in Amazon S3, then you get an "unable to create input format" error.

To resolve this error, use the Athena Query Federation SDK. With this SDK, you can extend Athena with your own code to integrate with other data sources and proprietary data formats. You can also build new user-defined functions. For more information, see Query any data source with Amazon Athena's new federated query.

Athena doesn't support the data format

You can run an AWS Glue crawler to create tables in Athena from files in Amazon S3, but Athena doesn't support every file type that a crawler can catalog. For example, Athena doesn't support .xml files.

If you query a table in Athena from a file type that isn't supported, then you get a "HIVE_UNKNOWN_ERROR: Unable to create input format" error. To resolve this error, use a data format that Athena supports.

The AWS Glue crawler can't classify the data format

If your AWS Glue crawler doesn't recognize a column data type from the table schema, then the crawler classifies the column as UNKNOWN. You get the error "HIVE_UNKNOWN_ERROR: Unable to create input format" when you query an Athena table that has UNKNOWN data type columns. This classification error occurs when the built-in classifier that your AWS Glue crawler uses doesn't recognize a data type in your schema.

To resolve this error, use data types that are supported by a built-in classifier. If the data format can't be classified by a built-in classifier, then consider a custom classifier.
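To find the affected columns before you rerun the crawler, you can inspect the table definition that AWS Glue returns. The following Python code is a minimal sketch, not an official procedure. It assumes that the table dictionary has the shape returned by the AWS Glue GetTable API and that an unclassified column carries the type string "unknown" in any letter case. The function names find_unknown_columns and fetch_table are hypothetical.

```python
def find_unknown_columns(table):
    """Return the names of columns whose type is 'unknown' (any letter case).

    `table` is assumed to look like the "Table" dictionary in an
    AWS Glue GetTable response.
    """
    storage = table.get("StorageDescriptor") or {}
    # Check regular columns and partition keys alike.
    columns = list(storage.get("Columns") or []) + list(table.get("PartitionKeys") or [])
    return [
        column["Name"]
        for column in columns
        if (column.get("Type") or "").strip().lower() == "unknown"
    ]


def fetch_table(database, name):
    """Load a table definition with boto3 (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the check above also works offline

    return boto3.client("glue").get_table(DatabaseName=database, Name=name)["Table"]
```

For example, find_unknown_columns(fetch_table("my_database", "my_table")) lists the columns to review, where my_database and my_table are placeholder names.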

One or more of the AWS Glue table definition properties are empty

You might also get input format errors when you query tables in Athena that an AWS Glue crawler didn't create. For example, an error might occur if you create the table manually on the AWS Glue console. If one of the following properties in the AWS Glue table definition is empty, then you get a "HIVE_UNKNOWN_ERROR: Unable to create input format" error:

  • Input format
  • Output format
  • Serde name

Confirm that these properties are set correctly for the SerDe and data format. Keep in mind that the SerDe that you specify defines the table schema. The SerDe can override the DDL configuration that you specify in Athena when you create your table.
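The check for empty properties can also be run against a table definition in code. The following Python code is a minimal sketch under stated assumptions: the table dictionary has the shape returned by the AWS Glue GetTable API, and the console's Serde name setting is treated as corresponding to SerdeInfo.SerializationLibrary. Verify that mapping for your table. The function name find_empty_properties and the sample table are hypothetical.

```python
def find_empty_properties(table):
    """Return the storage properties that are missing or blank.

    `table` is assumed to look like the "Table" dictionary in an
    AWS Glue GetTable response.
    """
    storage = table.get("StorageDescriptor") or {}
    serde = storage.get("SerdeInfo") or {}
    values = {
        "InputFormat": storage.get("InputFormat"),
        "OutputFormat": storage.get("OutputFormat"),
        # Assumption: this field backs the SerDe setting that Athena needs.
        "SerdeInfo.SerializationLibrary": serde.get("SerializationLibrary"),
    }
    return [name for name, value in values.items() if not (value or "").strip()]


# Hypothetical table definition with two of the three properties unset.
sample_table = {
    "Name": "example_table",
    "StorageDescriptor": {
        "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
        "OutputFormat": "",
        "SerdeInfo": {},
    },
}

print(find_empty_properties(sample_table))
# prints ['OutputFormat', 'SerdeInfo.SerializationLibrary']
```

An empty list means that all three properties have a value. It doesn't confirm that the values match the data format.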

To update the table definition properties, complete the following steps:

  1. Open the AWS Glue console.
  2. Select the table that you want to update.
  3. Choose Action, and then choose View details.
  4. Choose Edit table.
  5. Update the settings for Input format, Output format, or Serde name.
  6. Choose Apply.
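If you prefer to script the change instead of using the console, the steps above can be approximated with the AWS Glue UpdateTable API. The following Python code is a sketch, not a definitive implementation. It assumes that UpdateTable accepts only a subset of the fields that GetTable returns, so the sketch copies a hypothetical allow list of common fields (TABLE_INPUT_KEYS) that you might need to extend. It also assumes that the Serde name setting corresponds to SerdeInfo.SerializationLibrary. The function names build_table_input and apply_update are hypothetical.

```python
import copy

# Assumption: a partial allow list of GetTable fields that UpdateTable accepts.
TABLE_INPUT_KEYS = {
    "Name", "Description", "Owner", "Retention", "StorageDescriptor",
    "PartitionKeys", "TableType", "Parameters",
}


def build_table_input(table, input_format, output_format, serde_library):
    """Build a TableInput dictionary with the three storage properties set.

    `table` is assumed to look like the "Table" dictionary in a GetTable
    response. The original dictionary isn't modified.
    """
    table_input = {
        key: copy.deepcopy(value)
        for key, value in table.items()
        if key in TABLE_INPUT_KEYS
    }
    storage = table_input.get("StorageDescriptor") or {}
    serde = storage.get("SerdeInfo") or {}
    serde["SerializationLibrary"] = serde_library
    storage["SerdeInfo"] = serde
    storage["InputFormat"] = input_format
    storage["OutputFormat"] = output_format
    table_input["StorageDescriptor"] = storage
    return table_input


def apply_update(database, table_input):
    """Send the update with boto3 (requires boto3 and AWS credentials)."""
    import boto3  # imported here so build_table_input also works offline

    boto3.client("glue").update_table(DatabaseName=database, TableInput=table_input)
```

For a Parquet table, for example, the three values are typically org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat, org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat, and org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe. Review the resulting dictionary before you call apply_update, because the update replaces the existing table definition.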

Related information

Defining and managing classifiers

Use SerDes

Connect to data sources

Use Amazon Athena Federated Query

AWS OFFICIAL
Updated 3 months ago