How to work with the MD5checksum for a manually edited manifest JSON file


This is a case where I am generating a manifest JSON file via S3 Inventory on my AWS S3 bucket; it has generated manifest.json and manifest.checksum files:

{
  "sourceBucket" : "dev-djool-xyubd4",
  "destinationBucket" : "arn:aws:s3:::dev-djool-s3-reports",
  "version" : "2016-11-30",
  "creationTimestamp" : "1713056400000",
  "fileFormat" : "CSV",
  "fileSchema" : "Bucket, Key, VersionId, IsLatest, IsDeleteMarker, LastModifiedDate, ETag",
  "files" : [ {
    "key" : "dev-djool-xyubd4/dev-djool-xyubd4-copy-job/data/da32c56a-714a-4d14-8123-fd2a09466ccc.csv.gz",
    "size" : 32981,
    "MD5checksum" : "a50e27f226b1a68a7496466c9303bd6a"
  } ]
}

Now I am trying to edit this CSV file to include an additional column, treated as a metadata object. Here is a snippet of the edited manifest:

{
  "sourceBucket" : "dev-djool-xyubd4",
  "destinationBucket" : "arn:aws:s3:::dev-djool-s3-reports",
  "version" : "2016-11-30",
  "creationTimestamp" : "1713056400000",
  "fileFormat" : "CSV",
  "fileSchema" : "Bucket, Key, VersionId, IsLatest, IsDeleteMarker, LastModifiedDate, ETag, x-amz-meta-x-version",
  "files" : [ {
    "key" : "dev-djool-xyubd4/dev-djool-xyubd4-copy-job/data/da32c56a-714a-4d14-8123-fd2a09466NEW.csv.gz",
    "size" : 32981,
    "MD5checksum" : "a50e27f226b1a68a7496466c9303bd6a"
  } ]
}

Note: the CSV file was also edited and uploaded in .csv.gz format. The CSV was edited with the required changes and then archived back into the .csv.gz format with the 7-Zip tool.
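For reference, the same re-compression can be done from the command line instead of 7-Zip; a minimal sketch, assuming the edited CSV carries the file name used in the snippet above:

    # Compress the edited CSV back into .csv.gz (equivalent to the 7-Zip step);
    # -k keeps the original .csv next to the new archive
    gzip -k da32c56a-714a-4d14-8123-fd2a09466NEW.csv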

With this, when I try to create an AWS S3 Batch Operations job, it always fails with a checksum error. I am not sure how exactly to deal with the checksum value when I am manually editing the files, or how I should overwrite it or tell AWS that this is my configuration file.
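A quick local check illustrates the mismatch (a sketch, reusing the file name from the edited snippet): the MD5 of the re-compressed object no longer matches the MD5checksum value carried over in the manifest.

    # The digest printed here differs from the stale value
    # "a50e27f226b1a68a7496466c9303bd6a" still listed in the edited manifest
    openssl md5 da32c56a-714a-4d14-8123-fd2a09466NEW.csv.gz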

Attached is a screenshot of the failed S3 Batch Operations job.

Asked a month ago · 232 views
1 Answer

Accepted Answer

This is a bit tricky and doable, but not advisable. I have done this in the past for many of my use cases where I couldn't wait for the manifest file to be created automatically. First create your own manifest file, or make changes to the already created manifest file, and then update the checksum in the manifest.json and manifest.checksum file contents. I'll walk you through how to achieve this (the full sequence is also sketched as a small script after the steps below):

  • You'd have your manifest file in .csv format; assuming you are using the gz format, you will gzip it. Let's say the manifest file name is abc90a13-5e5p-4746-bfc7-a772e931d438.csv.gz

  • Now get the checksum of this .gz file:

         openssl md5 abc90a13-5e5p-4746-bfc7-a772e931d438.csv.gz
    
  • Update this checksum value in the manifest.json file, and also update the size value in it with the size of the .gz file. Save this file.

  • Now get the checksum value of the updated manifest.json file as below:

       openssl md5 manifest.json
    
  • Update this checksum value in the manifest.checksum file, along with the size of the manifest.json file. Save this file.

  • Now upload all three of these files (.gz, .json and .checksum) to their respective locations in the S3 bucket, based on your S3 batch job specification.

  • Create the batch job again with these files, and you should see the batch job succeed without any complaints about the checksum.
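Putting the steps together, here is a minimal shell sketch of the whole sequence. The manifest file name is the example one from above; the bucket name and prefix in the upload commands are placeholders that must be replaced with whatever your batch job specification points at, and the snippet assumes the AWS CLI is installed and configured.

    # 1. Checksum and size of the gzipped manifest CSV
    GZ=abc90a13-5e5p-4746-bfc7-a772e931d438.csv.gz
    openssl md5 "$GZ"           # hex digest goes into "MD5checksum" in manifest.json
    wc -c < "$GZ"               # byte count goes into "size" in manifest.json

    # 2. After editing manifest.json by hand, checksum the manifest itself
    openssl md5 manifest.json   # hex digest goes into manifest.checksum

    # 3. Upload all three files to the locations your batch job expects
    #    (bucket name and prefix below are placeholders)
    aws s3 cp "$GZ"             s3://your-manifest-bucket/manifests/data/
    aws s3 cp manifest.json     s3://your-manifest-bucket/manifests/
    aws s3 cp manifest.checksum s3://your-manifest-bucket/manifests/

Note that openssl md5 prints its output as MD5(filename)= <digest>; only the hex digest itself goes into manifest.json and manifest.checksum.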

Hope it helps, comment here if you have additional questions.

Happy to help.

Abhishek

AWS · Expert · Answered a month ago
  • Thank you sir, this works for me; AWS accepted those checksums. Nice workaround.
