How to work with MD5checksum for a manually edited manifest JSON file


This is a case where I am generating a manifest JSON file via S3 Inventory Management on my AWS S3 bucket; it has generated manifest.json and manifest.checksum files.

{
  "sourceBucket" : "dev-djool-xyubd4",
  "destinationBucket" : "arn:aws:s3:::dev-djool-s3-reports",
  "version" : "2016-11-30",
  "creationTimestamp" : "1713056400000",
  "fileFormat" : "CSV",
  "fileSchema" : "Bucket, Key, VersionId, IsLatest, IsDeleteMarker, LastModifiedDate, ETag",
  "files" : [ {
    "key" : "dev-djool-xyubd4/dev-djool-xyubd4-copy-job/data/da32c56a-714a-4d14-8123-fd2a09466ccc.csv.gz",
    "size" : 32981,
    "MD5checksum" : "a50e27f226b1a68a7496466c9303bd6a"
  } ]
}
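
For context, the MD5checksum listed for each entry in files should be the MD5 digest of the referenced .csv.gz data file. A quick way to check it locally (assuming the object has already been downloaded into the current directory) is:

    openssl md5 da32c56a-714a-4d14-8123-fd2a09466ccc.csv.gz
    # the printed digest should match the MD5checksum value above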

Now I am trying to edit this CSV file to include an additional column, treated as a metadata object. Here is a snippet of the edited manifest:

{
  "sourceBucket" : "dev-djool-xyubd4",
  "destinationBucket" : "arn:aws:s3:::dev-djool-s3-reports",
  "version" : "2016-11-30",
  "creationTimestamp" : "1713056400000",
  "fileFormat" : "CSV",
  "fileSchema" : "Bucket, Key, VersionId, IsLatest, IsDeleteMarker, LastModifiedDate, ETag, x-amz-meta-x-version",
  "files" : [ {
    "key" : "dev-djool-xyubd4/dev-djool-xyubd4-copy-job/data/da32c56a-714a-4d14-8123-fd2a09466NEW.csv.gz",
    "size" : 32981,
    "MD5checksum" : "a50e27f226b1a68a7496466c9303bd6a"
  } ]
}

Note: the CSV file was also edited and uploaded in .csv.gz format. The CSV was edited with the required changes and then compressed back into `.csv.gz` format with the 7-Zip tool.
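
For anyone doing the recompression without 7-Zip, gzip on the command line produces the same kind of archive. The file name below is simply the key from the edited snippet and is assumed to be the edited CSV sitting in the current directory:

    # Recompress the edited CSV; -k keeps the original .csv next to the .gz
    gzip -k da32c56a-714a-4d14-8123-fd2a09466NEW.csv
    # Result: da32c56a-714a-4d14-8123-fd2a09466NEW.csv.gz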

With this, when I try to create an AWS S3 Batch Operations job, it always fails on the checksum value. I am not sure how exactly to deal with the checksum value when I manually edit the files, or how I should overwrite it or tell AWS that this is my configuration file.

Attached is a screenshot of the failed S3 Batch Operations job.

asked 17 days ago · 219 views
1 Answer
Accepted Answer

This is a bit tricky and doable, but not advisable. I have done this in the past for many of my use cases where I couldn't wait for the manifest file to be created automatically. First create your own manifest file, or make changes to an already created one, and then update the checksum values in the manifest.json and manifest.checksum file contents. I'll walk you through how to achieve this:

  • You'd have your manifest file in .csv format. Assuming you are using the gz format, gzip it; let's say the manifest file name is abc90a13-5e5p-4746-bfc7-a772e931d438.csv.gz

  • Now get the checksum of this .gz file:

         openssl md5 abc90a13-5e5p-4746-bfc7-a772e931d438.csv.gz
    
  • Update this checksum value in the manifest.json file, and also update the size value with the size of the .gz file. Save this file.

  • Now get the checksum value of the updated manifest.json file as below:

       openssl md5 manifest.json
    
  • Update this checksum value in the manifest.checksum file, along with the size of the manifest.json file. Save this file.

  • Now upload all three files (.gz, .json, and .checksum) to their respective locations in the S3 bucket, based on your S3 batch job specification.

  • Create the batch job again with these files and you should see the batch job succeed without any complaints about the checksum. (A scripted sketch of these steps follows below.)
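
For reference, here is a minimal shell sketch of the steps above. The file names are the hypothetical ones used in the steps, the bucket and prefix in the upload commands are placeholders, and the use of jq, stat, and the AWS CLI is my own choice rather than anything this answer prescribes; it also writes only the MD5 digest into manifest.checksum.

    # 1. Checksum and size of the (re)compressed manifest data file
    GZ=abc90a13-5e5p-4746-bfc7-a772e931d438.csv.gz
    GZ_MD5=$(openssl md5 "$GZ" | awk '{print $NF}')
    GZ_SIZE=$(stat -c%s "$GZ")          # macOS: stat -f%z "$GZ"

    # 2. Put the new checksum and size into manifest.json
    jq --arg md5 "$GZ_MD5" --argjson size "$GZ_SIZE" \
       '.files[0].MD5checksum = $md5 | .files[0].size = $size' \
       manifest.json > manifest.json.tmp && mv manifest.json.tmp manifest.json

    # 3. manifest.checksum gets the MD5 of the updated manifest.json
    openssl md5 manifest.json | awk '{print $NF}' > manifest.checksum

    # 4. Upload all three files to wherever the batch job expects them
    #    (bucket and prefix are placeholders -- adjust to your layout)
    aws s3 cp "$GZ"             s3://dev-djool-xyubd4/manifests/data/
    aws s3 cp manifest.json     s3://dev-djool-xyubd4/manifests/
    aws s3 cp manifest.checksum s3://dev-djool-xyubd4/manifests/

The point of steps 1-3 is simply that every checksum recorded in one file matches the object it describes, so the batch job's validation finds nothing out of sync.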

Hope it helps, comment here if you have additional questions.

Happy to help.

Abhishek

AWS EXPERT
answered 17 days ago
  • Thank you sir, this works for me; AWS accepted those checksums. Nice workaround.
