Hi,
I am experimenting with loading openCypher data from S3 (2 MB of node data and about 12 MB of edge data) into a Neptune instance we have set up. I am using the %load line magic in a Neptune Notebook to perform the load. Initial loads succeed, but the freeable memory of our Writer instance (db.t3.medium) does not recover afterward, which eventually leads to failed loads with out-of-memory errors.
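For reference, the %load magic submits a request to the cluster's loader endpoint over HTTP. A minimal sketch of an equivalent request body is below; the endpoint, bucket name, region, and IAM role ARN are placeholders, not values from this post, and the parallelism setting is worth noting since lowering it reduces peak memory use during a load:

```python
import json

# Placeholder -- substitute your own cluster endpoint (port 8182 is the default).
loader_endpoint = "https://<cluster-endpoint>:8182/loader"

# Request body roughly equivalent to what the %load magic submits.
load_request = {
    "source": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
    "format": "opencypher",  # openCypher CSV load format
    "iamRoleArn": "arn:aws:iam::<account-id>:role/<neptune-load-role>",  # placeholder
    "region": "us-east-1",   # assumption: replace with your cluster's region
    "failOnError": "TRUE",
    "parallelism": "MEDIUM", # lower values (LOW) reduce peak memory during the load
    "queueRequest": "TRUE",  # queue the job if another load is already running
}

print(json.dumps(load_request, indent=2))
# To actually submit (requires network access to the cluster's VPC):
#   requests.post(loader_endpoint, json=load_request)
```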
These are the out-of-memory errors I get when loading additional data using the %load line magic (output from %load_status <load-id> --details --errors):
{
  "status": "200 OK",
  "payload": {
    "feedCount": [
      {
        "LOAD_FAILED": 1
      }
    ],
    "overallStatus": {
      "fullUri": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
      "runNumber": 1,
      "retryNumber": 0,
      "status": "LOAD_FAILED",
      "totalTimeSpent": 7,
      "startTime": 1663677230,
      "totalRecords": 153958,
      "totalDuplicates": 0,
      "parsingErrors": 0,
      "datatypeMismatchErrors": 0,
      "insertErrors": 153958
    },
    "failedFeeds": [
      {
        "fullUri": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
        "runNumber": 1,
        "retryNumber": 0,
        "status": "LOAD_FAILED",
        "totalTimeSpent": 4,
        "startTime": 1663677233,
        "totalRecords": 153958,
        "totalDuplicates": 0,
        "parsingErrors": 0,
        "datatypeMismatchErrors": 0,
        "insertErrors": 153958
      }
    ],
    "errors": {
      "startIndex": 1,
      "endIndex": 5,
      "loadId": "<load-id>",
      "errorLogs": [
        {
          "errorCode": "OUT_OF_MEMORY_ERROR",
          "errorMessage": "Out of memory error. Resume load and try again.",
          "fileName": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
          "recordNum": 0
        },
        {
          "errorCode": "OUT_OF_MEMORY_ERROR",
          "errorMessage": "Out of memory error. Resume load and try again.",
          "fileName": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
          "recordNum": 0
        },
        {
          "errorCode": "OUT_OF_MEMORY_ERROR",
          "errorMessage": "Out of memory error. Resume load and try again.",
          "fileName": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
          "recordNum": 0
        },
        {
          "errorCode": "OUT_OF_MEMORY_ERROR",
          "errorMessage": "Out of memory error. Resume load and try again.",
          "fileName": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
          "recordNum": 0
        },
        {
          "errorCode": "OUT_OF_MEMORY_ERROR",
          "errorMessage": "Out of memory error. Resume load and try again.",
          "fileName": "s3://<bucket_name>/neptune_ingest_test_data/node_df_neptune_2022_04_02.csv",
          "recordNum": 0
        }
      ]
    }
  }
}
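The errors object above is paginated (startIndex/endIndex), and the same output can be fetched directly from the loader's Get-Status API rather than through the magic. A sketch of building that request, with placeholder endpoint and load id; details, errors, page, and errorsPerPage are the documented query parameters:

```python
def build_status_request(endpoint, load_id, page=1, errors_per_page=5):
    """Build the URL and query params for the Neptune loader Get-Status API."""
    url = f"{endpoint}/loader/{load_id}"
    params = {
        "details": "true",             # per-feed details (like --details)
        "errors": "true",              # include errorLogs (like --errors)
        "page": page,                  # which page of errorLogs to return
        "errorsPerPage": errors_per_page,
    }
    return url, params

# Placeholder endpoint and load id -- substitute your own values.
url, params = build_status_request("https://<cluster-endpoint>:8182", "<load-id>")
print(url, params)
# To fetch (requires network access to the cluster):
#   requests.get(url, params=params)
```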
This is the freeable memory metric of the Writer instance at the time I received the out-of-memory errors above:
After restarting the Writer instance and loading some data from S3, the same pattern starts again.
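The FreeableMemory trend can also be pulled programmatically from CloudWatch rather than read off the console graph, which makes it easier to correlate the troughs with load start times. A sketch of the get_metric_statistics parameters, assuming a placeholder instance identifier:

```python
from datetime import datetime, timedelta, timezone

# Placeholder -- substitute your Writer instance's DB instance identifier.
instance_id = "<writer-instance-id>"

now = datetime.now(timezone.utc)
metric_query = {
    "Namespace": "AWS/Neptune",
    "MetricName": "FreeableMemory",
    "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
    "StartTime": now - timedelta(hours=24),
    "EndTime": now,
    "Period": 300,              # 5-minute datapoints
    "Statistics": ["Minimum"],  # the troughs are what precede the OOM failures
}

print(metric_query["MetricName"], metric_query["Period"])
# To fetch (requires AWS credentials):
#   boto3.client("cloudwatch").get_metric_statistics(**metric_query)
```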
Can you please provide further details on the exact error(s) you're receiving on the failed bulk load jobs? If you're using the notebooks, you can run %load_status <load_id> --details --errors to see the full error output. Is this where you are seeing the out-of-memory errors, or are you seeing them when running other queries? If the latter, can you provide examples of the types of queries you're attempting to execute when you receive the out-of-memory errors? Thank you!
Thanks Taylor! I have added the errors on the loads that I received when the freeable memory metric was at its lowest point yesterday. Queries on the data that was successfully loaded into Neptune always worked without issues.
Likely would need some further info on your account and your cluster/instance(s) to look into this further. Are you able to open a support case? If so, please do and I'll be on the lookout for that.