I have started using the Macie service to do some data classification for a project I am building. I first tried it on a text file containing JSON data with US Social Security numbers (SSNs). I ran a Macie job to find this data, and the scan did not return any findings.

Next, I tried the Macie scan on an Excel file with three columns. The first column had 5 first names, the second had 5 last names, and the third had 5 SSNs. Nothing else was in the Excel file. I ran the Macie scan again and it still failed to find any sensitive data. I tried both the all-managed-identifiers scan and the individual SSN scan, and neither returned any findings.

Does anyone know what I might be doing wrong and why Macie can't find simple SSNs? I am happy to provide more context and share the files if that would be helpful (all the SSNs are fake numbers for testing).

Hi - I used a simple .xlsx file with some fake first/last names and fake SSNs. I created a job and it was able to identify them, with the finding below:

The object contains personal information such as first or last names, addresses, or identification numbers.

Also review some of the requirements mentioned here

You can also monitor and analyze specific events that occur as a job progresses.

  • Hi Nitin, thanks for your response. Could you share the Excel doc you used and what the data looked like? Thank you for those links; I have seen them while investigating this problem. The CloudWatch logs just say that the scan was running and then completed, but they do not give me any insight into why Macie can't pick up on these sensitive data types.


Could you provide the sample file in the form of comma separated values in a comment?

It's worth noting that Macie does have some validation built in to filter out fake numbers. For example, if you entered 123-45-6789 or 000-00-0000 as the SSN, it wouldn't trigger a finding.
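
If it helps, here is a rough Python sketch for pre-checking test SSNs before uploading a file. It is not Macie's actual validation logic, which I can't see; it only applies the commonly published SSA numbering rules (no area number of 000, 666, or 900-999, no group number of 00, no serial number of 0000) plus a small set of well-known placeholder numbers. The function name `looks_structurally_valid` and the placeholder set are my own assumptions for illustration.

```python
import re

# Assumption: SSNs in the test file use the dashed AAA-GG-SSSS format.
_SSN_PATTERN = re.compile(r"(\d{3})-(\d{2})-(\d{4})")

# Assumption: a few widely known placeholder numbers; Macie's real list may differ.
_KNOWN_PLACEHOLDERS = {"123-45-6789", "078-05-1120"}


def looks_structurally_valid(ssn: str) -> bool:
    """Return True if the value passes basic SSA structural rules.

    This is a hypothetical helper that approximates the kind of filtering
    described above; it does not reproduce Macie's internal checks.
    """
    match = _SSN_PATTERN.fullmatch(ssn)
    if match is None:
        return False
    area, group, serial = match.groups()
    # Area numbers 000, 666, and 900-999 are never assigned.
    if area in ("000", "666") or area.startswith("9"):
        return False
    # Group 00 and serial 0000 are never assigned.
    if group == "00" or serial == "0000":
        return False
    return ssn not in _KNOWN_PLACEHOLDERS


if __name__ == "__main__":
    for candidate in ["123-45-6789", "000-00-0000", "432-19-8765"]:
        print(candidate, looks_structurally_valid(candidate))
```

If every SSN in your sample fails a check like this, that could explain the empty results, though there may be other causes as well.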

