How do I resolve the "FAILED: NullPointerException Name is null" error when I query a table in Athena?


I get the "FAILED: NullPointerException Name is null" error when I query my Amazon Athena table.

Short description

The following error messages are types of FAILED: NullPointerException errors that you can receive.

NullPointerException Name is null

You get this error when the TableType attribute isn't defined for the queried table in the AWS Glue Data Catalog. The TableType attribute defines whether the table is an external table or a view. You can set the attribute to values such as EXTERNAL_TABLE or VIRTUAL_VIEW.

To run DDL queries, such as SHOW CREATE TABLE or MSCK REPAIR TABLE, you must define the TableType attribute.

You might also get this error when you use an AWS CloudFormation template or the AWS Glue API and don't specify the TableType property.

java.lang.NullPointerException: Cannot invoke "java.util.Map.entrySet()" because the return value of "org.apache.hadoop.hive.metastore.api.SerDeInfo.getParameters()" is null

You get this error when the SerdeInfo parameters aren't defined for the queried table in the Data Catalog. The SerdeInfo parameters are the key-value pairs that define initialization parameters for the SerDe. You can define parameters such as "serialization.format": "1". To run DDL queries, such as SHOW CREATE TABLE, you must define the SerdeInfo parameters.

You might also get this error when you use a CloudFormation template or the AWS Glue API and don't specify the SerdeInfo parameters.
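To check which of the two conditions applies to a table, you can inspect the output of the get-table command. The following Python sketch assumes that you already loaded that output into a dictionary. The function name diagnose_table and the message strings are hypothetical and aren't part of any AWS tool.

```python
def diagnose_table(response):
    """Return a list of likely causes found in a Glue get-table response dict."""
    table = response.get("Table", {})
    problems = []

    # A missing TableType is the condition described for "Name is null".
    if not table.get("TableType"):
        problems.append("TableType is not defined")

    # A missing Parameters map under SerdeInfo is the condition described
    # for the SerDeInfo.getParameters() error.
    serde_info = table.get("StorageDescriptor", {}).get("SerdeInfo", {})
    if serde_info.get("Parameters") is None:
        problems.append("SerdeInfo parameters are not defined")

    return problems
```

An empty list means that neither condition was found in the table definition that you passed in.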

Resolution

Follow the troubleshooting steps for the error message that you received.

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshoot AWS CLI errors. Also, make sure that you're using the most recent AWS CLI version.

NullPointerException Name is null

To resolve this error, complete one or more of the following tasks based on your use case.

Add the attribute during table creation

Add the TableType attribute when you create the table.

Note: If you use a DDL statement or an AWS Glue crawler to create the table, then the TableType property is defined automatically.

Update the CloudFormation template or the AWS Glue API call

If you used a CloudFormation template or the AWS Glue API to define the table and didn't specify the TableType, then add the TableType attribute.
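If you define the table through the AWS Glue API, then the table definition that you pass in might look like the following Python sketch. The helper name build_table_input and its arguments are hypothetical. The format and SerDe class names are copied from the example table in this article, so replace them with the values for your data format.

```python
def build_table_input(name, location, columns):
    """Build a Glue TableInput dict that sets TableType explicitly.

    columns is a list of (column_name, column_type) tuples.
    """
    return {
        "Name": name,
        # The attribute that is missing when you get "Name is null".
        "TableType": "EXTERNAL_TABLE",
        "StorageDescriptor": {
            "Columns": [{"Name": col, "Type": typ} for col, typ in columns],
            "Location": location,
            "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.serde2.OpenCSVSerde",
                "Parameters": {"serialization.format": "1"},
            },
        },
    }


table_input = build_table_input(
    "doc_example_table",
    "s3://doc_example_bucket/doc_example_prefix/",
    [("id", "int"), ("name", "string")],
)
# With boto3, you could pass the result to the Glue CreateTable operation:
# glue.create_table(DatabaseName="doc_example_database", TableInput=table_input)
```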

Use the AWS CLI to update the table

To update the TableType attribute for your table, run the update-table AWS CLI command. To run this command, you must have the TableInput object that defines the entire table architecture.

To get the TableInput object for your table, run the get-table command. Then, complete the following steps to update the output of this command:

  1. On your table, run a command that's similar to the following one:

    aws glue get-table --catalog-id 1111222233334444 --database-name doc_example_database --name doc_example_table
    

    Example output:

    {
        "Table": {
            "StorageDescriptor": {
                "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
                "SortColumns": [],
                "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
                "SerdeInfo": {
                    "SerializationLibrary": "org.apache.hadoop.hive.serde2.OpenCSVSerde",
                    "Parameters": {
                        "serialization.format": "1"
                    }
                },
                "Parameters": {
                    "separatorChar": ","
                },
                "Location": "s3://doc_example_bucket/doc_example_prefix/",
                "NumberOfBuckets": 0,
                "StoredAsSubDirectories": false,
                "Columns": [
                    {
                        "Type": "int",
                        "Name": "id"
                    },
                    {
                        "Type": "string",
                        "Name": "name"
                    }
                ],
                "Compressed": false
            },
            "UpdateTime": 1620508098.0,
            "IsRegisteredWithLakeFormation": false,
            "Name": "doc_example_table",
            "CreatedBy": "arn:aws:iam::1111222233334444:user/Administrator",
            "DatabaseName": "doc_example_database",
            "Owner": "1111222233334444",
            "Retention": 0,
            "CreateTime": 1619909955.0,
            "Description": "tb description"
        }
    }
    
  2. In the output, remove the UpdateTime, IsRegisteredWithLakeFormation, CreatedBy, DatabaseName, and CreateTime parameters. The TableInput structure doesn't accept these parameters. Also remove the outer "Table" wrapper so that only the table definition remains.

    If you include these parameters in the TableInput attribute when you run the update-table command, then you might get the following errors:

    Parameter validation failed:
    Unknown parameter in TableInput: "UpdateTime", must be one of: Name, Description, Owner, LastAccessTime, LastAnalyzedTime, Retention, StorageDescriptor, PartitionKeys, ViewOriginalText, ViewExpandedText, TableType, Parameters
    Unknown parameter in TableInput: "IsRegisteredWithLakeFormation", must be one of: Name, Description, Owner, LastAccessTime, LastAnalyzedTime, Retention, StorageDescriptor, PartitionKeys, ViewOriginalText, ViewExpandedText, TableType, Parameters
    Unknown parameter in TableInput: "CreatedBy", must be one of: Name, Description, Owner, LastAccessTime, LastAnalyzedTime, Retention, StorageDescriptor, PartitionKeys, ViewOriginalText, ViewExpandedText, TableType, Parameters
    Unknown parameter in TableInput: "DatabaseName", must be one of: Name, Description, Owner, LastAccessTime, LastAnalyzedTime, Retention, StorageDescriptor, PartitionKeys, ViewOriginalText, ViewExpandedText, TableType, Parameters
    Unknown parameter in TableInput: "CreateTime", must be one of: Name, Description, Owner, LastAccessTime, LastAnalyzedTime, Retention, StorageDescriptor, PartitionKeys, ViewOriginalText, ViewExpandedText, TableType, Parameters
    
  3. Add the "TableType": "EXTERNAL_TABLE" parameter to the output.

  4. Use the output as the TableInput parameter to run the update-table command:

    aws glue update-table --catalog-id 1111222233334444 --database-name doc_example_database --table-input '{
        "StorageDescriptor": {
            "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
            "SortColumns": [],
            "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.serde2.OpenCSVSerde",
                "Parameters": {
                    "serialization.format": "1"
                }
            },
            "Parameters": {
                "separatorChar": ","
            },
            "Location": "s3://doc_example_bucket/doc_example_prefix/",
            "NumberOfBuckets": 0,
            "StoredAsSubDirectories": false,
            "Columns": [
                {
                    "Type": "int",
                    "Name": "id"
                },
                {
                    "Type": "string",
                    "Name": "name"
                }
            ],
            "Compressed": false
        },
        "Name": "doc_example_table",
        "TableType": "EXTERNAL_TABLE",
        "Owner": "1111222233334444",
        "Retention": 0,
        "Description": "tb description"
    }'

Note: Replace the following example values with your values:

  • doc_example_database with the name of your database
  • doc_example_table with the name of your table
  • 1111222233334444 with your AWS account ID
  • s3://doc_example_bucket/doc_example_prefix/ with the Amazon Simple Storage Service (Amazon S3) location where you stored the table

After you run the preceding command, the TableType parameter gets updated, and the DDL queries are successful.
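If you prefer to script the preceding steps instead of editing the JSON by hand, then the following Python sketch shows one possible approach. It keeps only the keys that the error messages in step 2 list as valid for TableInput, and then adds TableType when it's missing. The names ALLOWED_KEYS, to_table_input, and update_table_type are hypothetical. The sketch also assumes that newer AWS Glue API versions might accept more keys than the ones listed here, and that you installed boto3 if you call update_table_type.

```python
import copy

# Keys that the error messages in step 2 list as valid for TableInput.
# Assumption: your AWS Glue API version might accept additional keys.
ALLOWED_KEYS = {
    "Name", "Description", "Owner", "LastAccessTime", "LastAnalyzedTime",
    "Retention", "StorageDescriptor", "PartitionKeys", "ViewOriginalText",
    "ViewExpandedText", "TableType", "Parameters",
}


def to_table_input(get_table_response, table_type="EXTERNAL_TABLE"):
    """Convert a get-table response dict into a TableInput dict."""
    table = get_table_response["Table"]  # drop the outer "Table" wrapper
    table_input = {
        key: copy.deepcopy(value)
        for key, value in table.items()
        if key in ALLOWED_KEYS  # removes UpdateTime, CreatedBy, and so on
    }
    # Add TableType only when the table doesn't already define one.
    table_input.setdefault("TableType", table_type)
    return table_input


def update_table_type(catalog_id, database_name, table_name):
    """Fetch the table, add TableType, and write the definition back."""
    import boto3  # third-party; imported here so to_table_input works without it

    glue = boto3.client("glue")
    response = glue.get_table(
        CatalogId=catalog_id, DatabaseName=database_name, Name=table_name
    )
    glue.update_table(
        CatalogId=catalog_id,
        DatabaseName=database_name,
        TableInput=to_table_input(response),
    )
```

The allow list is used instead of a list of keys to delete so that any other read-only key in the response is also dropped. Review the resulting TableInput before you write it back, because update-table replaces the whole table definition.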

java.lang.NullPointerException: Cannot invoke "java.util.Map.entrySet()" because the return value of "org.apache.hadoop.hive.metastore.api.SerDeInfo.getParameters()" is null

To resolve this error, complete one or more of the following tasks based on your use case.

Add the SerdeInfo parameters during table creation

When you create the table, add SerdeInfo parameters, such as "serialization.format": "1" and "field.delim": ",".

Update the CloudFormation template or the AWS Glue API call

If you used a CloudFormation template or the AWS Glue API to define the table and didn't specify the SerdeInfo parameters, then add the SerdeInfo parameters.

Use the AWS Glue console to update the table

To update the properties of the table in the Data Catalog, complete the following steps:

  1. Open the AWS Glue console.
  2. In the navigation pane, choose Tables.
  3. Select the table that you want to update.
  4. Choose Action, and then choose Edit table.
  5. In the Serde parameters section, choose Add.
  6. For Key, enter "serialization.format", and for Value enter "1".
  7. Choose Save.

Use the AWS CLI to update the table

To update the SerdeInfo parameters for your table, run the update-table AWS CLI command. To run this command, you must have the TableInput object that defines the entire table architecture.

To get the TableInput object for your table, run the get-table command. Then, update the output of this command to include the SerdeInfo parameters.
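The update to the output can also be scripted. The following Python sketch adds default SerdeInfo parameters to a TableInput dictionary without overwriting parameters that already exist. The function name ensure_serde_parameters and its defaults argument are hypothetical, and the sketch assumes that you already removed the keys that TableInput doesn't accept.

```python
import copy


def ensure_serde_parameters(table_input, defaults=None):
    """Return a copy of table_input whose SerdeInfo has a Parameters map."""
    if defaults is None:
        # The example value that this article uses.
        defaults = {"serialization.format": "1"}

    updated = copy.deepcopy(table_input)  # leave the caller's dict untouched
    storage = updated.setdefault("StorageDescriptor", {})
    serde_info = storage.setdefault("SerdeInfo", {})

    # Treat a missing or null Parameters entry as an empty map.
    parameters = serde_info.get("Parameters") or {}
    for key, value in defaults.items():
        parameters.setdefault(key, value)  # keep values that already exist
    serde_info["Parameters"] = parameters
    return updated
```

You can then pass the returned dictionary as the TableInput value for the update-table command or the equivalent AWS Glue API call.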

Related information

Troubleshooting in Athena

AWS OFFICIAL
Updated 2 months ago