Delayed cold start report in AWS Lambda function

Consider the following AWS Lambda function:

import json
import time
import pickle  # used by the model-loading code elided below

def ms_now():
    # Current wall-clock time in milliseconds.
    return time.time_ns() // 1_000_000

class Timer:
    def __init__(self):
        self.start = ms_now()

    def stop(self):
        return ms_now() - self.start

# Module-level code runs once per execution environment, i.e. on a cold start.
timer = Timer()
started_init = ms_now()
cold_start = True

... # Some heavy machine learning models are loaded here

init_time = timer.stop()  # how long the module-level initialization took

def compute(event):
    ... # The actual computation is here

def lambda_handler(event, context):
    global cold_start
    global init_time
    global started_init
    
    # Report the init stats once, then reset them so warm invocations report zeros.
    stats = {'cold_start': cold_start,
             'started_init': started_init,
             'init_time': init_time}
    cold_start = False
    init_time = 0
    started_init = 0

    stats['started'] = ms_now()
    result = compute(event)
    stats['finished'] = ms_now()
    return {
        'statusCode': 200,
        'headers': {
            'Content-Type': 'application/json'
        },
        'result': result, 
        'stats': stats
    }

The stats object returned as part of the response has fields that show whether a cold start occurred and, if so, when the initialization of the global variables began and how long it took.
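
For reference, here is a minimal local sketch of what those fields look like; calling the handler twice in one process mimics a cold first call followed by a warm second call (the empty event and None context are placeholders, and compute() is assumed to be filled in):

# Hypothetical local check, not part of the deployed function.
if __name__ == '__main__':
    first = lambda_handler({}, None)
    print(first['stats'])   # cold_start True; started_init and init_time populated
    second = lambda_handler({}, None)
    print(second['stats'])  # cold_start False; started_init and init_time are 0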

Another Lambda function calls the one above, processes the stats in the response, and appends a line to the output string:

started = stats['started'] - exec_begin    # exec_begin is the timestamp taken just before the above Lambda is invoked
init = stats['init_time']
if init > 0:
    started_init_str = f"started_init: {stats['started_init'] - exec_begin}   "
else:
    started_init_str = ""
time = stats['finished'] - stats['started']    # duration of compute() as measured inside the worker
now = ms_now() - exec_begin                    # total elapsed time as seen by the caller
globals.worker_trace += (
    f"{job['description']}-{job['id']}   cold: {stats['cold_start']}   "
    f"{started_init_str}init: {init}ms   started: {started}   time: {time}ms   now: {now}\n")

I tested the latter function manually from the web console about every 3 minutes. For a warm start, the output looks like this:

cold: False   init: 0ms   started: 239   time: 96ms   now: 408

The weird thing happens when a cold start occurs:

cold: True   started_init: -184189   init: 6852ms   started: 101   time: 169ms   now: 408

Question 1. How does started_init end up negative, as if the cold start occurred during the previous run? CloudWatch shows that the cold start occurs during the current one.

Question 2. Both the invocation that reports the cold start with an initialization time of over six seconds and the previous invocation return a response in under half a second. So, when does this cold-start initialization actually take place?
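
One way to observe when the init code actually runs would be to log a timestamp from module scope, since anything printed there lands in the function's CloudWatch log stream at the moment the initialization executes. A minimal sketch, with a log format of my own rather than the original code:

import time

# Runs once per execution environment, at whatever moment Lambda performs
# the initialization; compare this timestamp with the START/REPORT entries
# of the first invocation in the CloudWatch log stream.
print(f"module init at {time.time_ns() // 1_000_000} ms")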

P.S. The question at SO
