Quickest way to locate an S3 object

Hi, we are building a Lambda function that searches for an S3 object and adds a tag to it.

  • First of all, the Lambda knows the bucket and the "directory" from a certain data source (we understand S3 does not really have a directory structure; the directory is simulated by the key). So the Lambda can construct bucket_name=my_bucket and part of the key as /company_name/department_name. In some cases there are about 170 million objects under /company_name/department_name. The object to be located is named unique_id.Certain_Format_DateTime.json under /company_name/department_name. The unique_id is also known from a certain data source. Hence, we wrote our code based on boto3 paging as:

      import boto3

      s3 = boto3.client('s3')
      bucket_name = 'my-bucket'
      directory_prefix = 'company_name/department_name/'  # Include a trailing slash
      file_pattern = 'Unique_id_123'  # Example unique_id; we might use a regex here instead

      def find_object():
          paginator = s3.get_paginator('list_objects_v2')
          page_iterator = paginator.paginate(Bucket=bucket_name, Prefix=directory_prefix)

          for page in page_iterator:
              for obj in page.get('Contents', []):
                  # Keys include the directory prefix, so compare only the file-name part
                  file_name = obj['Key'][len(directory_prefix):]
                  if file_name.startswith(file_pattern):
                      print(obj['Key'])  # Print the object key (full path)
                      return obj
          return None
    

We might replace the startswith(file_pattern) check with Python regex pattern matching. The code above feeds bucket_name and directory_prefix to a paginator and pages through all S3 objects under /company_name/department_name. Is there a faster way to locate an S3 object? In the AWS web console, when we click into a bucket's subdirectory, there is a search box where we can type a partial object name and search. Does it use the same or a similar paging algorithm? Again, we have around 170 million objects under certain directories, so we want to search for the object in the most efficient way. Thank you.

asked 3 months ago, 139 views
1 Answer
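If searching for objects is something you do regularly, you might see if S3 Inventory works for you. It delivers a scheduled (daily or weekly) listing of the objects in a bucket or under a prefix, which you can query instead of paging through the live listing.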
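
For a one-off lookup, it may also help that the unique_id is the leading part of the file name: the Prefix parameter of list_objects_v2 matches any key prefix, not only a "directory", so the unique_id can be appended to it and S3 returns only the matching keys. Below is a minimal sketch of that idea, not a definitive implementation. The function names find_object_by_id and tag_object are hypothetical, the trailing "." assumes keys look like unique_id.Certain_Format_DateTime.json as described in the question, and s3_client is assumed to be a boto3 S3 client such as boto3.client("s3").

```python
def find_object_by_id(s3_client, bucket_name, directory_prefix, unique_id):
    """Return the first object whose key starts with
    directory_prefix + unique_id + '.', or None if nothing matches.

    Assumption: s3_client is a boto3 S3 client, e.g. boto3.client("s3").
    """
    # The trailing "." keeps "Unique_id_123" from also matching "Unique_id_1234...".
    key_prefix = f"{directory_prefix}{unique_id}."
    paginator = s3_client.get_paginator("list_objects_v2")
    # S3 filters on the prefix server-side, so only matching keys are listed.
    for page in paginator.paginate(Bucket=bucket_name, Prefix=key_prefix):
        for obj in page.get("Contents", []):
            return obj
    return None


def tag_object(s3_client, bucket_name, key, tags):
    """Set the object's tags from a dict such as {"status": "processed"}.

    Note: put_object_tagging replaces the object's existing tag set.
    """
    tag_set = [{"Key": k, "Value": v} for k, v in tags.items()]
    s3_client.put_object_tagging(
        Bucket=bucket_name,
        Key=key,
        Tagging={"TagSet": tag_set},
    )
```

With a real client this would be called as find_object_by_id(boto3.client("s3"), "my-bucket", "company_name/department_name/", "Unique_id_123"), and the returned object's "Key" passed to tag_object. This only works because the known part of the name comes first; a pattern in the middle of the key (or a general regex) still requires listing everything under the directory prefix or querying an inventory report.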

AWS EXPERT
answered 3 months ago
