How do I resolve the "FAILED: NullPointerException Name is null" error when I query a table in Athena?

When I query my Amazon Athena table, I get the "FAILED: NullPointerException Name is null" error.

Short description

The following error messages are types of FAILED: NullPointerException errors that you can receive.

NullPointerException Name is null

You get this error when the TableType attribute isn't defined for the queried table in the AWS Glue Data Catalog. The TableType attribute defines whether the table is an external table or a view. You can set the attribute to values such as EXTERNAL_TABLE and VIRTUAL_VIEW.

To run DDL queries, such as SHOW CREATE TABLE or MSCK REPAIR TABLE, you must define the TableType attribute.

You might also get this error when you use an AWS CloudFormation template or the AWS Glue API and don't specify the TableType property.

java.lang.NullPointerException: Cannot invoke "java.util.Map.entrySet()" because the return value of "org.apache.hadoop.hive.metastore.api.SerDeInfo.getParameters()" is null

You get this error when the SerDeInfo parameters aren't defined for the queried table in the Data Catalog. The SerDeInfo parameters are the key-value pairs that define initialization parameters for the SerDe. For example, you can set the parameter "serialization.format": "1". To run DDL queries, such as SHOW CREATE TABLE, you must define the SerDeInfo parameters.

You might also get this error when you use a CloudFormation template or the AWS Glue API and don't specify the SerDeInfo attribute.
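
To tell which of the two errors applies to a table, you can inspect its get-table output for the attributes described above. The following is a minimal Python sketch, not an AWS tool. The function name find_catalog_gaps is hypothetical, and the sketch assumes that the table dictionary has the same shape as the "Table" object that get-table returns.

```python
def find_catalog_gaps(table):
    """Return the names of the attributes discussed above that are undefined.

    `table` is assumed to be the dict under the "Table" key of get-table output.
    """
    gaps = []
    # A missing or empty TableType is the condition described for "Name is null".
    if not table.get("TableType"):
        gaps.append("TableType")
    # Undefined SerDe parameters are the condition described for getParameters().
    serde = (table.get("StorageDescriptor") or {}).get("SerdeInfo") or {}
    if serde.get("Parameters") is None:
        gaps.append("SerdeInfo.Parameters")
    return gaps


# Illustrative table definition with neither attribute defined.
table = {
    "Name": "doc_example_table",
    "StorageDescriptor": {
        "SerdeInfo": {
            "SerializationLibrary": "org.apache.hadoop.hive.serde2.OpenCSVSerde"
        }
    },
}
print(find_catalog_gaps(table))  # ['TableType', 'SerdeInfo.Parameters']
```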

Resolution

Complete the troubleshooting steps that best fit the error message that you received.

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshoot AWS CLI errors. Also, make sure that you're using the most recent AWS CLI version.

NullPointerException Name is null

To resolve this error, complete one or more of the following actions based on your use case.

Add the attribute during table creation

When you create the table, add the TableType attribute.

Note: If you use a DDL statement or an AWS Glue crawler to create the table, then the TableType property is defined automatically.

Update the CloudFormation template

When you use a CloudFormation template to create an AWS Glue Table, specify the TableType attribute in the TableInput properties of the AWS Glue Table resource. Set the TableType to 'EXTERNAL_TABLE'. For more information, see AWS::Glue::Table TableInput.
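
As an illustration, the following Python sketch builds a JSON CloudFormation template that sets TableType in TableInput. The logical ID DocExampleTable, the helper name glue_table_resource, and the column, SerDe, and location values are assumptions that mirror the example table used elsewhere in this article. Adjust them for your table.

```python
import json


def glue_table_resource(database, table, location):
    """Build an AWS::Glue::Table resource that sets TableType explicitly."""
    return {
        "Type": "AWS::Glue::Table",
        "Properties": {
            # Assumes that the table is created in the account that deploys the stack.
            "CatalogId": {"Ref": "AWS::AccountId"},
            "DatabaseName": database,
            "TableInput": {
                "Name": table,
                "TableType": "EXTERNAL_TABLE",
                "StorageDescriptor": {
                    "Location": location,
                    "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
                    "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
                    "SerdeInfo": {
                        "SerializationLibrary": "org.apache.hadoop.hive.serde2.OpenCSVSerde",
                        "Parameters": {"serialization.format": "1"},
                    },
                    "Columns": [
                        {"Name": "id", "Type": "int"},
                        {"Name": "name", "Type": "string"},
                    ],
                },
            },
        },
    }


template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "DocExampleTable": glue_table_resource(
            "doc_example_database",
            "doc_example_table",
            "s3://doc_example_bucket/doc_example_prefix/",
        )
    },
}

# Print the template as JSON so that it can be saved and deployed.
print(json.dumps(template, indent=2))
```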

Update the AWS Glue API call

When you use the AWS Glue API call to create or update tables, make sure to include the TableType parameter in your TableInput properties. When you call the CreateTable or UpdateTable operations, set the TableType parameter to 'EXTERNAL_TABLE'.
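
The following Python sketch shows one way to make sure that TableType is always present before a CreateTable call. The wrapper create_external_table and the RecordingGlue stand-in class are hypothetical and exist only so that the example runs without AWS credentials. With the AWS SDK for Python (Boto3) installed, you would pass boto3.client("glue") instead, whose create_table operation takes the same DatabaseName and TableInput arguments.

```python
class RecordingGlue:
    """Offline stand-in for a Glue client; it only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def create_table(self, **kwargs):
        self.calls.append(kwargs)


def create_external_table(glue, database, table_input):
    """Call CreateTable, adding TableType when the caller left it out."""
    payload = dict(table_input)
    # Keep an existing value (for example VIRTUAL_VIEW); otherwise default it.
    payload.setdefault("TableType", "EXTERNAL_TABLE")
    glue.create_table(DatabaseName=database, TableInput=payload)
    return payload


glue = RecordingGlue()
sent = create_external_table(
    glue,
    "doc_example_database",
    {
        "Name": "doc_example_table",
        "StorageDescriptor": {
            "Location": "s3://doc_example_bucket/doc_example_prefix/",
            "Columns": [{"Name": "id", "Type": "int"}],
        },
    },
)
print(sent["TableType"])  # EXTERNAL_TABLE
```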

Use the AWS CLI to update the table

To update the TableType attribute for your table, run the update-table AWS CLI command. To run this command, you must have the TableInput object that defines the entire table architecture.

To get the TableInput object for your table, run the get-table AWS CLI command. Then, complete the following steps to update the output of this command:

  1. On your table, run a command that's similar to the following example:

    aws glue get-table --catalog-id 1111222233334444 --database-name doc_example_database --name doc_example_table

    Example output:

    {
        "Table": {
            "StorageDescriptor": {
                "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
                "SortColumns": [],
                "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
                "SerdeInfo": {
                    "SerializationLibrary": "org.apache.hadoop.hive.serde2.OpenCSVSerde",
                    "Parameters": {
                        "serialization.format": "1"
                    }
                },
                "Parameters": {
                    "separatorChar": ","
                },
                "Location": "s3://doc_example_bucket/doc_example_prefix/",
                "NumberOfBuckets": 0,
                "StoredAsSubDirectories": false,
                "Columns": [
                    {
                        "Type": "int",
                        "Name": "id"
                    },
                    {
                        "Type": "string",
                        "Name": "name"
                    }
                ],
                "Compressed": false
            },
            "UpdateTime": 1620508098.0,
            "IsRegisteredWithLakeFormation": false,
            "Name": "doc_example_table",
            "CreatedBy": "arn:aws:iam::1111222233334444:user/Administrator",
            "DatabaseName": "doc_example_database",
            "Owner": "1111222233334444",
            "Retention": 0,
            "CreateTime": 1619909955.0,
            "Description": "tb description"
        }
    }
  2. In the output, remove the UpdateTime, IsRegisteredWithLakeFormation, CreatedBy, DatabaseName, and CreateTime parameters, and keep only the contents of the Table object. The TableInput structure doesn't accept these parameters.
    If you include these parameters in the TableInput attribute when you run the update-table command, then you might get the following errors:

    "Parameter validation failed:Unknown parameter in TableInput: "UpdateTime", must be one of: Name, Description, Owner, LastAccessTime, LastAnalyzedTime, Retention, StorageDescriptor, PartitionKeys, ViewOriginalText, ViewExpandedText, TableType, Parameters"
    "Unknown parameter in TableInput: "IsRegisteredWithLakeFormation", must be one of: Name, Description, Owner, LastAccessTime, LastAnalyzedTime, Retention, StorageDescriptor, PartitionKeys, ViewOriginalText, ViewExpandedText, TableType, Parameters"
    "Unknown parameter in TableInput: "CreatedBy", must be one of: Name, Description, Owner, LastAccessTime, LastAnalyzedTime, Retention, StorageDescriptor, PartitionKeys, ViewOriginalText, ViewExpandedText, TableType, Parameters"
    "Unknown parameter in TableInput: "DatabaseName", must be one of: Name, Description, Owner, LastAccessTime, LastAnalyzedTime, Retention, StorageDescriptor, PartitionKeys, ViewOriginalText, ViewExpandedText, TableType, Parameters"
    "Unknown parameter in TableInput: "CreateTime", must be one of: Name, Description, Owner, LastAccessTime, LastAnalyzedTime, Retention, StorageDescriptor, PartitionKeys, ViewOriginalText, ViewExpandedText, TableType, Parameters"

  3. Add the "TableType": "EXTERNAL_TABLE" parameter to the output.

  4. Use the output as the TableInput parameter to run the following command:

    aws glue update-table --catalog-id 1111222233334444 --database-name doc_example_database --table-input '{
            "StorageDescriptor": {
                "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
                "SortColumns": [],
                "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
                "SerdeInfo": {
                    "SerializationLibrary": "org.apache.hadoop.hive.serde2.OpenCSVSerde",
                    "Parameters": {
                        "serialization.format":"1"
                    }
                },
                "Parameters": {
                    "separatorChar":","
                },
                "Location": "s3://doc_example_bucket/doc_example_prefix/",
                "NumberOfBuckets": 0,
                "StoredAsSubDirectories": false,
                "Columns": [
                    {
                        "Type": "int",
                        "Name": "id"
                    },
                    {
                        "Type": "string",
                        "Name": "name"
                    }
                ],
                "Compressed": false
            },
            "Name": "doc_example_table",
            "TableType": "EXTERNAL_TABLE",
            "Owner": "1111222233334444",
            "Retention": 0,
            "Description": "tb description"
        }'

    Note: Replace the following variables:
    doc_example_database with the name of your database
    doc_example_table with the name of your table
    1111222233334444 with your AWS account ID
    s3://doc_example_bucket/doc_example_prefix/ with the Amazon Simple Storage Service (Amazon S3) location where your table is stored

    After you run the preceding command, the TableType attribute is set, and DDL queries on the table run successfully.
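
Steps 2 and 3 can also be scripted instead of edited by hand. The following Python sketch keeps only the keys that the error messages quoted above list as valid for TableInput, and then adds TableType. The function name to_table_input is hypothetical, and the key list is copied from those error messages, so it might not match the current AWS Glue API exactly.

```python
import json

# Keys that the error messages above list as valid for TableInput.
TABLE_INPUT_KEYS = {
    "Name", "Description", "Owner", "LastAccessTime", "LastAnalyzedTime",
    "Retention", "StorageDescriptor", "PartitionKeys", "ViewOriginalText",
    "ViewExpandedText", "TableType", "Parameters",
}


def to_table_input(get_table_output, table_type="EXTERNAL_TABLE"):
    """Convert get-table output into a TableInput object that has a TableType."""
    table = get_table_output["Table"]
    # Drop read-only fields such as UpdateTime, CreatedBy, and DatabaseName.
    table_input = {k: v for k, v in table.items() if k in TABLE_INPUT_KEYS}
    # Add TableType only when the table doesn't already define one.
    table_input.setdefault("TableType", table_type)
    return table_input


# Abbreviated get-table output, based on the example in step 1.
get_table_output = {
    "Table": {
        "StorageDescriptor": {"Location": "s3://doc_example_bucket/doc_example_prefix/"},
        "UpdateTime": 1620508098.0,
        "IsRegisteredWithLakeFormation": False,
        "Name": "doc_example_table",
        "CreatedBy": "arn:aws:iam::1111222233334444:user/Administrator",
        "DatabaseName": "doc_example_database",
        "Owner": "1111222233334444",
        "Retention": 0,
        "CreateTime": 1619909955.0,
        "Description": "tb description",
    }
}

table_input = to_table_input(get_table_output)
print(json.dumps(table_input, indent=2))
```

If you save the printed JSON to a file, such as table-input.json, then you can pass it to the update-table command with --table-input file://table-input.json instead of an inline string.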

java.lang.NullPointerException: Cannot invoke "java.util.Map.entrySet()" because the return value of "org.apache.hadoop.hive.metastore.api.SerDeInfo.getParameters()" is null

To resolve this error, complete one or more of the following actions based on your use case.

Add the SerDeInfo parameters during table creation

When you create the table, add SerDeInfo parameters, such as "serialization.format": "1", "field.delim":",".

Update the CloudFormation template

When you use a CloudFormation template to create an AWS Glue Table, specify the SerDeInfo parameters. In the TableInput property of your AWS Glue Table resource, under StorageDescriptor, add {"serialization.format": "1"} to the Parameters property of SerdeInfo.

For more information, see AWS::Glue::Table SerdeInfo.

Update the AWS Glue API call

When you use the AWS Glue API call to create or update tables, include the SerDeInfo parameters in the StorageDescriptor of your TableInput. Set the Parameters field for SerDeInfo to {"serialization.format": "1"}.

This parameter is used when you call operations like CreateTable or UpdateTable. For more information, see StorageDescriptor structure.

Use the AWS Glue console to update the table

To update the properties of the table in the Data Catalog, complete the following steps:

  1. Open the AWS Glue console.
  2. In the navigation pane, under Data Catalog, choose Tables.
  3. Select the table that you want to update.
  4. Choose Action, and then choose Edit table.
  5. In the SerDe parameters section, choose Add.
  6. For Key, enter serialization.format. For Value, enter 1.
  7. Choose Save.

Use the AWS CLI to update the table

To update the SerDeInfo parameters for your table, run the update-table AWS CLI command. To run this command, you must have the TableInput object that defines the entire table architecture.

To get the TableInput object for your table, run the get-table AWS CLI command. Then, update the output of this command to include the SerDeInfo parameters.
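
The same get-table and update-table round trip can be sketched in Python. The function add_serde_parameters and the FakeGlue stand-in class are hypothetical and let the example run offline. With the AWS SDK for Python (Boto3) installed, you would pass boto3.client("glue"), whose get_table and update_table operations take the arguments used here. The list of accepted TableInput keys is copied from the update-table error messages quoted earlier in this article and might be incomplete.

```python
import copy

# Keys that the update-table error messages list as valid for TableInput.
TABLE_INPUT_KEYS = {
    "Name", "Description", "Owner", "LastAccessTime", "LastAnalyzedTime",
    "Retention", "StorageDescriptor", "PartitionKeys", "ViewOriginalText",
    "ViewExpandedText", "TableType", "Parameters",
}


class FakeGlue:
    """Offline stand-in for a Glue client that holds a single table."""

    def __init__(self, table):
        self.table = table
        self.updates = []

    def get_table(self, DatabaseName, Name):
        return {"Table": copy.deepcopy(self.table)}

    def update_table(self, DatabaseName, TableInput):
        self.updates.append({"DatabaseName": DatabaseName, "TableInput": TableInput})


def add_serde_parameters(glue, database, name):
    """Read the table, add the missing SerDe parameters, and write it back."""
    table = glue.get_table(DatabaseName=database, Name=name)["Table"]
    table_input = {k: v for k, v in table.items() if k in TABLE_INPUT_KEYS}
    serde = table_input.setdefault("StorageDescriptor", {}).setdefault("SerdeInfo", {})
    if serde.get("Parameters") is None:
        serde["Parameters"] = {}
    # Leave an existing serialization.format value untouched.
    serde["Parameters"].setdefault("serialization.format", "1")
    glue.update_table(DatabaseName=database, TableInput=table_input)
    return table_input


glue = FakeGlue({
    "Name": "doc_example_table",
    "DatabaseName": "doc_example_database",
    "CreateTime": 1619909955.0,
    "TableType": "EXTERNAL_TABLE",
    "StorageDescriptor": {
        "SerdeInfo": {"SerializationLibrary": "org.apache.hadoop.hive.serde2.OpenCSVSerde"}
    },
})
updated = add_serde_parameters(glue, "doc_example_database", "doc_example_table")
print(updated["StorageDescriptor"]["SerdeInfo"]["Parameters"])  # {'serialization.format': '1'}
```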

Related information

Troubleshoot issues in Athena

AWS OFFICIAL
Updated 13 days ago