DMS task running out of memory

Hi, I am trying to migrate my PostgreSQL database to S3 in Parquet format. With the task settings below, the task fails with the error "Last Error Replication task out of memory. Stop Reason FATAL_ERROR Error Level FATAL". The replication instance type is t3.medium. I have tried reducing the commit rate but got the same error. Task settings JSON:

{
  "StreamBufferSettings": {
    "StreamBufferCount": 3,
    "CtrlStreamBufferSizeInMB": 5,
    "StreamBufferSizeInMB": 8
  },
  "ErrorBehavior": {
    "FailOnNoTablesCaptured": true,
    "ApplyErrorUpdatePolicy": "LOG_ERROR",
    "FailOnTransactionConsistencyBreached": false,
    "RecoverableErrorThrottlingMax": 1800,
    "DataErrorEscalationPolicy": "SUSPEND_TABLE",
    "ApplyErrorEscalationCount": 0,
    "RecoverableErrorStopRetryAfterThrottlingMax": true,
    "RecoverableErrorThrottling": true,
    "ApplyErrorFailOnTruncationDdl": false,
    "DataTruncationErrorPolicy": "LOG_ERROR",
    "ApplyErrorInsertPolicy": "LOG_ERROR",
    "EventErrorPolicy": "IGNORE",
    "ApplyErrorEscalationPolicy": "LOG_ERROR",
    "RecoverableErrorCount": -1,
    "DataErrorEscalationCount": 0,
    "TableErrorEscalationPolicy": "STOP_TASK",
    "RecoverableErrorInterval": 5,
    "ApplyErrorDeletePolicy": "IGNORE_RECORD",
    "TableErrorEscalationCount": 0,
    "FullLoadIgnoreConflicts": true,
    "DataErrorPolicy": "LOG_ERROR",
    "TableErrorPolicy": "SUSPEND_TABLE"
  },
  "TTSettings": {
    "TTS3Settings": null,
    "TTRecordSettings": null,
    "EnableTT": false
  },
  "FullLoadSettings": {
    "CommitRate": 500,
    "StopTaskCachedChangesApplied": false,
    "StopTaskCachedChangesNotApplied": false,
    "MaxFullLoadSubTasks": 4,
    "TransactionConsistencyTimeout": 600,
    "CreatePkAfterFullLoad": false,
    "TargetTablePrepMode": "DO_NOTHING"
  },
  "TargetMetadata": {
    "ParallelApplyBufferSize": 0,
    "ParallelApplyQueuesPerThread": 0,
    "ParallelApplyThreads": 0,
    "TargetSchema": "",
    "InlineLobMaxSize": 0,
    "ParallelLoadQueuesPerThread": 0,
    "SupportLobs": true,
    "LobChunkSize": 0,
    "TaskRecoveryTableEnabled": false,
    "ParallelLoadThreads": 0,
    "LobMaxSize": 6400,
    "BatchApplyEnabled": false,
    "FullLobMode": false,
    "LimitedSizeLobMode": true,
    "LoadMaxFileSize": 0,
    "ParallelLoadBufferSize": 0
  },
  "BeforeImageSettings": null,
  "ControlTablesSettings": {
    "historyTimeslotInMinutes": 5,
    "HistoryTimeslotInMinutes": 5,
    "StatusTableEnabled": false,
    "SuspendedTablesTableEnabled": false,
    "HistoryTableEnabled": false,
    "ControlSchema": "",
    "FullLoadExceptionTableEnabled": false
  },
  "LoopbackPreventionSettings": null,
  "CharacterSetSettings": null,
  "FailTaskWhenCleanTaskResourceFailed": false,
  "ChangeProcessingTuning": {
    "StatementCacheSize": 50,
    "CommitTimeout": 1,
    "BatchApplyPreserveTransaction": true,
    "BatchApplyTimeoutMin": 1,
    "BatchSplitSize": 0,
    "BatchApplyTimeoutMax": 30,
    "MinTransactionSize": 500,
    "MemoryKeepTime": 600,
    "BatchApplyMemoryLimit": 2048,
    "MemoryLimitTotal": 3500
  },
  "ChangeProcessingDdlHandlingPolicy": {
    "HandleSourceTableDropped": true,
    "HandleSourceTableTruncated": true,
    "HandleSourceTableAltered": true
  },
  "PostProcessingRules": null
} 
asked 3 months ago, 237 views
2 Answers
Accepted Answer

Hi,

If the replication instance running the task is a t3.medium, try a bigger instance class (.large, .xlarge, etc.). Larger classes have more memory, and your task will succeed once you find a class that is big enough.

If it's a one-shot exercise, go directly to a very big instance: it will save you time, and hence money.

If this replication instance will be long-lived, you may want to size it down to the smallest class that works, to remain as frugal as possible.
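
As a rough illustration of stepping up to a class with more memory, here is a minimal Python sketch. The memory table covers only a few instance classes and is meant to be extended, the helper names `smallest_class_with` and `resize_instance` are hypothetical, and the resize call assumes boto3 is installed with credentials configured. Treat it as a starting point rather than a tested procedure.

```python
# Memory per DMS instance class in GiB (same as the matching EC2 sizes).
# Only a handful of classes are listed here; extend the table as needed.
INSTANCE_MEMORY_GIB = {
    "dms.t3.medium": 4,
    "dms.t3.large": 8,
    "dms.r5.large": 16,
    "dms.r5.xlarge": 32,
}


def smallest_class_with(min_memory_gib):
    """Return the smallest listed class offering at least min_memory_gib."""
    candidates = [
        (mem, name)
        for name, mem in INSTANCE_MEMORY_GIB.items()
        if mem >= min_memory_gib
    ]
    if not candidates:
        raise ValueError("no listed class has %s GiB or more" % min_memory_gib)
    return min(candidates)[1]


def resize_instance(instance_arn, instance_class):
    """Ask DMS to move the replication instance to another class.

    Assumption: boto3 is installed and AWS credentials are configured.
    The import is local so the rest of the sketch runs without boto3.
    """
    import boto3

    dms = boto3.client("dms")
    return dms.modify_replication_instance(
        ReplicationInstanceArn=instance_arn,
        ReplicationInstanceClass=instance_class,
        ApplyImmediately=True,
    )


# Example: pick a class with at least twice the memory of a t3.medium.
target = smallest_class_with(2 * INSTANCE_MEMORY_GIB["dms.t3.medium"])
print(target)
```

Calling `resize_instance` with the instance ARN and `target` would then request the change. Resizing a replication instance typically interrupts the tasks running on it, so it is safer to stop the task first.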

Best,

Didier

AWS EXPERT
answered 3 months ago
Reviewed by an EXPERT 3 months ago
  • I am running full load + ongoing replication here. Once the full load completes on a bigger instance type, can I switch to a smaller instance type for ongoing replication?

  • Hi, thanks for accepting my answer. Yes, if the ongoing replication activity after the initial load is lower, you can move to a smaller instance type to remain frugal. In that case, you may want to work directly with DMS Serverless, which takes care of auto-scaling (up and down) for you so that capacity stays optimal all the time.

The t3.medium instance type has 4 GB of memory, and MemoryLimitTotal is set to 3500 (MB), which is close to 4 GB. You can lower this value. Please check the CloudWatch metrics and logs of the DMS instance for further information on how resources are being used. If required, you may need to scale up the DMS instance. See the troubleshooting guide: https://repost.aws/knowledge-center/dms-troubleshoot-errors
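
To make the adjustment concrete, here is a small Python sketch that caps the memory limits in the ChangeProcessingTuning section relative to the instance memory. The helper name `cap_memory_settings` and the 50% fraction are illustrative assumptions, not an AWS recommendation; the right value depends on what else the task needs memory for.

```python
import json


def cap_memory_settings(settings, instance_memory_mb, fraction=0.5):
    """Return a copy of the task settings with memory limits capped.

    Assumption: `fraction` of the instance memory is an arbitrary
    illustrative budget, not a value taken from AWS documentation.
    """
    budget = int(instance_memory_mb * fraction)
    tuned = json.loads(json.dumps(settings))  # cheap deep copy of JSON data
    cpt = tuned.setdefault("ChangeProcessingTuning", {})
    # Never raise an existing limit, only lower it to the budget.
    cpt["MemoryLimitTotal"] = min(cpt.get("MemoryLimitTotal", budget), budget)
    # Keep the batch-apply limit no larger than the total limit.
    cpt["BatchApplyMemoryLimit"] = min(
        cpt.get("BatchApplyMemoryLimit", budget), cpt["MemoryLimitTotal"]
    )
    return tuned


# Values taken from the task settings in the question.
current = {
    "ChangeProcessingTuning": {
        "MemoryLimitTotal": 3500,
        "BatchApplyMemoryLimit": 2048,
    }
}

# t3.medium has 4 GB of memory, i.e. 4096 MB.
tuned = cap_memory_settings(current, instance_memory_mb=4096)
print(json.dumps(tuned["ChangeProcessingTuning"], indent=2))
```

The resulting JSON can be pasted back into the task settings while the task is stopped. The original dictionary is left unchanged so the two versions can be compared.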

Joseph
answered 3 months ago
