DMS task running out of memory


Hi, I am trying to migrate my PostgreSQL DB data to S3 in Parquet format. I am using the JSON below for my task, but it fails with the error "Last Error Replication task out of memory. Stop Reason FATAL_ERROR Error Level FATAL". The replication instance type is t3.medium. I have tried reducing the commit rate but got the same error. Task JSON:

{
  "StreamBufferSettings": {
    "StreamBufferCount": 3,
    "CtrlStreamBufferSizeInMB": 5,
    "StreamBufferSizeInMB": 8
  },
  "ErrorBehavior": {
    "FailOnNoTablesCaptured": true,
    "ApplyErrorUpdatePolicy": "LOG_ERROR",
    "FailOnTransactionConsistencyBreached": false,
    "RecoverableErrorThrottlingMax": 1800,
    "DataErrorEscalationPolicy": "SUSPEND_TABLE",
    "ApplyErrorEscalationCount": 0,
    "RecoverableErrorStopRetryAfterThrottlingMax": true,
    "RecoverableErrorThrottling": true,
    "ApplyErrorFailOnTruncationDdl": false,
    "DataTruncationErrorPolicy": "LOG_ERROR",
    "ApplyErrorInsertPolicy": "LOG_ERROR",
    "EventErrorPolicy": "IGNORE",
    "ApplyErrorEscalationPolicy": "LOG_ERROR",
    "RecoverableErrorCount": -1,
    "DataErrorEscalationCount": 0,
    "TableErrorEscalationPolicy": "STOP_TASK",
    "RecoverableErrorInterval": 5,
    "ApplyErrorDeletePolicy": "IGNORE_RECORD",
    "TableErrorEscalationCount": 0,
    "FullLoadIgnoreConflicts": true,
    "DataErrorPolicy": "LOG_ERROR",
    "TableErrorPolicy": "SUSPEND_TABLE"
  },
  "TTSettings": {
    "TTS3Settings": null,
    "TTRecordSettings": null,
    "EnableTT": false
  },
  "FullLoadSettings": {
    "CommitRate": 500,
    "StopTaskCachedChangesApplied": false,
    "StopTaskCachedChangesNotApplied": false,
    "MaxFullLoadSubTasks": 4,
    "TransactionConsistencyTimeout": 600,
    "CreatePkAfterFullLoad": false,
    "TargetTablePrepMode": "DO_NOTHING"
  },
  "TargetMetadata": {
    "ParallelApplyBufferSize": 0,
    "ParallelApplyQueuesPerThread": 0,
    "ParallelApplyThreads": 0,
    "TargetSchema": "",
    "InlineLobMaxSize": 0,
    "ParallelLoadQueuesPerThread": 0,
    "SupportLobs": true,
    "LobChunkSize": 0,
    "TaskRecoveryTableEnabled": false,
    "ParallelLoadThreads": 0,
    "LobMaxSize": 6400,
    "BatchApplyEnabled": false,
    "FullLobMode": false,
    "LimitedSizeLobMode": true,
    "LoadMaxFileSize": 0,
    "ParallelLoadBufferSize": 0
  },
  "BeforeImageSettings": null,
  "ControlTablesSettings": {
    "historyTimeslotInMinutes": 5,
    "HistoryTimeslotInMinutes": 5,
    "StatusTableEnabled": false,
    "SuspendedTablesTableEnabled": false,
    "HistoryTableEnabled": false,
    "ControlSchema": "",
    "FullLoadExceptionTableEnabled": false
  },
  "LoopbackPreventionSettings": null,
  "CharacterSetSettings": null,
  "FailTaskWhenCleanTaskResourceFailed": false,
  "ChangeProcessingTuning": {
    "StatementCacheSize": 50,
    "CommitTimeout": 1,
    "BatchApplyPreserveTransaction": true,
    "BatchApplyTimeoutMin": 1,
    "BatchSplitSize": 0,
    "BatchApplyTimeoutMax": 30,
    "MinTransactionSize": 500,
    "MemoryKeepTime": 600,
    "BatchApplyMemoryLimit": 2048,
    "MemoryLimitTotal": 3500
  },
  "ChangeProcessingDdlHandlingPolicy": {
    "HandleSourceTableDropped": true,
    "HandleSourceTableTruncated": true,
    "HandleSourceTableAltered": true
  },
  "PostProcessingRules": null
} 
Asked 4 months ago · Viewed 309 times
2 Answers
Accepted Answer

Hi,

If the replication instance that runs your task is of type t3.medium, try a bigger instance class (.large, .xlarge, etc.). Those have more memory, and your task will succeed once you find a type that is big enough.

If it's a 1-shot exercise, go directly to a very big instance: it will save you time, hence money.

If this task executor is here to stay, you may want to optimize its size down to the smallest one that works, to remain as frugal as possible.
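For reference, here is a minimal sketch of resizing a replication instance with boto3; the ARN and the target class dms.t3.large are placeholders to adapt to your setup:

import boto3

dms = boto3.client("dms")

# Move the replication instance to a larger class. ApplyImmediately triggers
# the resize right away instead of waiting for the next maintenance window
# (the instance is briefly unavailable while it resizes).
dms.modify_replication_instance(
    ReplicationInstanceArn="arn:aws:dms:eu-west-1:123456789012:rep:EXAMPLE",  # placeholder ARN
    ReplicationInstanceClass="dms.t3.large",
    ApplyImmediately=True,
)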

Best,

Didier

AWS EXPERT
Answered 4 months ago
Reviewed by an EXPERT 4 months ago
  • Here I am doing full load + ongoing replication, so once the full load is complete on the bigger instance type, can I switch to a smaller instance type for the ongoing replication?

  • Hi, thanks for accepting my answer. Yes, if your ongoing replication activity after the initial load is lighter, you can reduce the instance type to a smaller one to remain frugal. In that case, you may want to work directly with DMS Serverless, which takes care of auto-scaling (up and down) for you so that you stay optimally sized all the time (see the sketch after this comment).
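As a rough sketch of what that looks like with boto3 (the identifiers, endpoint ARNs, and capacity values are placeholders, and the table mapping is trimmed to a single include-everything rule):

import json
import boto3

dms = boto3.client("dms")

# DMS Serverless: instead of picking an instance class, you define a replication
# config with min/max capacity in DMS Capacity Units (DCUs) and let DMS scale
# within that range. All ARNs and identifiers below are placeholders.
dms.create_replication_config(
    ReplicationConfigIdentifier="pg-to-s3-parquet",
    SourceEndpointArn="arn:aws:dms:eu-west-1:123456789012:endpoint:SRC-EXAMPLE",
    TargetEndpointArn="arn:aws:dms:eu-west-1:123456789012:endpoint:TGT-EXAMPLE",
    ReplicationType="full-load-and-cdc",
    ComputeConfig={
        "MinCapacityUnits": 1,  # scale down during quiet ongoing replication
        "MaxCapacityUnits": 8,  # scale up for the initial full load
    },
    TableMappings=json.dumps({
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {"schema-name": "public", "table-name": "%"},
            "rule-action": "include",
        }]
    }),
)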


The t3.medium instance type has 4 GB of memory, and MemoryLimitTotal is set to 3500 (MB), which is close to that 4 GB. You can adjust it to a lower value. Please check the CloudWatch metrics and logs of the DMS instance for further information on how its resources are being used; if required, you may need to upscale the DMS instance. See the troubleshooting guide at https://repost.aws/knowledge-center/dms-troubleshoot-errors.
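As a minimal sketch of pulling those memory metrics with boto3 (the instance identifier my-dms-instance is a placeholder):

import datetime
import boto3

cloudwatch = boto3.client("cloudwatch")

# FreeableMemory and SwapUsage (both in bytes) for the replication instance.
# Steadily falling FreeableMemory together with rising SwapUsage points at
# memory pressure on the task.
for metric in ("FreeableMemory", "SwapUsage"):
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/DMS",
        MetricName=metric,
        Dimensions=[{
            "Name": "ReplicationInstanceIdentifier",
            "Value": "my-dms-instance",  # placeholder identifier
        }],
        StartTime=datetime.datetime.utcnow() - datetime.timedelta(hours=6),
        EndTime=datetime.datetime.utcnow(),
        Period=300,
        Statistics=["Average"],
    )
    for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
        print(metric, point["Timestamp"], point["Average"])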

Joseph
Answered 4 months ago
