Glue job failing with Null Pointer Exception when writing df


Running a job to fetch data from S3 and write it to GCP BigQuery using the Glue BigQuery connector by AWS. Everything else is fine, but for one table the second run always seems to fail with the error below. The first run works fine; I have bookmarks enabled to fetch new data added to S3 and write it to BigQuery. The job fails with the error below on the write function.

I am unable to understand why this null pointer exception is thrown.

Caused by: java.lang.NullPointerException
	at com.google.cloud.bigquery.connector.common.BigQueryClient.loadDataIntoTable(BigQueryClient.java:532)
	at com.google.cloud.spark.bigquery.BigQueryWriteHelper.loadDataToBigQuery(BigQueryWriteHelper.scala:87)
	at com.google.cloud.spark.bigquery.BigQueryWriteHelper.writeDataFrameToBigQuery(BigQueryWriteHelper.scala:66)
	... 42 more

asked 2 years ago · 276 views
1 Answer

Hi,

if you have bookmarks enabled, are you sure you have new data in S3 for the second run?

If not, the read step will create an empty DataFrame, which might cause the write to BigQuery to fail.

You might want to implement a try/catch or conditional logic that tests whether the DataFrame you read has data, and write to BigQuery only if it does; otherwise, just log a message that there is no available input at the moment (see the sketch below).
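
A minimal sketch of that guard in a PySpark Glue script (the database, table, and connector option names are illustrative assumptions, not taken from your job):

    import sys
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    # Read with a transformation_ctx so job bookmarks track what was consumed.
    # Database/table names here are placeholders.
    df = glue_context.create_dynamic_frame.from_catalog(
        database="my_database",
        table_name="my_s3_table",
        transformation_ctx="read_s3",
    ).toDF()

    # head(1) avoids a full count just to test for emptiness.
    if df.head(1):
        # Write path matching the spark-bigquery-connector seen in the stack
        # trace; the exact connector options may differ in your Glue setup.
        (df.write.format("bigquery")
            .option("table", "my_dataset.my_table")
            .mode("append")
            .save())
    else:
        print("No new data picked up by the bookmark; skipping BigQuery write.")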

Hope this helps,

AWS
EXPERT
answered 2 years ago
  • Yes, more data is present in S3. I have printed the data and checked it just before writing, but it still throws this error. I thought it might be something around the nullability of the columns, but I have fixed that too, by setting the nullable property of the source to True, the same as the target, but I still get the same error. I am clueless now!
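
A minimal sketch of the nullability fix described in the comment above, which rebuilds the DataFrame's schema with every column marked nullable (`spark` is assumed to be the active SparkSession; the helper name is illustrative):

    from pyspark.sql.types import StructField, StructType

    def make_all_nullable(df, spark):
        """Rebuild df with every column marked nullable=True."""
        schema = StructType(
            [StructField(f.name, f.dataType, nullable=True) for f in df.schema.fields]
        )
        return spark.createDataFrame(df.rdd, schema)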
