1 Answer
Hi,
For such a large dataset, Amazon SageMaker Data Wrangler seems well suited for preparing it. The blog post https://aws.amazon.com/blogs/machine-learning/process-larger-and-wider-datasets-with-amazon-sagemaker-data-wrangler/ benchmarks it on a dataset of around 100 GB with 80 million rows and 300 columns.
For training large models with Amazon SageMaker, see this video: https://www.youtube.com/watch?v=XKLIhIeDSCY
Also, regarding the training of your model, this post helps you choose the best data source: https://aws.amazon.com/blogs/machine-learning/choose-the-best-data-source-for-your-amazon-sagemaker-training-job/
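To make the data-source trade-offs concrete, here is a minimal, illustrative sketch of the kind of decision the linked blog post walks through (File mode vs. FastFile mode vs. FSx for Lustre). The `choose_input_mode` helper and its size thresholds are my own hypothetical heuristic, not an official AWS rule; the mode names match those accepted by the SageMaker training API.

```python
def choose_input_mode(dataset_gb: float, random_access: bool) -> str:
    """Illustrative heuristic (assumed, not an official AWS recommendation)
    for picking a SageMaker training input mode, loosely following the
    linked "choose the best data source" blog post:
      - small datasets: File mode (download once to instance storage)
      - large, sequentially read datasets: FastFile mode (stream from S3)
      - random access over a very large dataset: a shared file system
        such as FSx for Lustre
    """
    if dataset_gb <= 50:
        return "File"        # fits comfortably on local instance storage
    if not random_access:
        return "FastFile"    # stream from S3 without an upfront download
    return "FSxLustre"       # high-throughput shared file system

print(choose_input_mode(10, random_access=False))   # File
print(choose_input_mode(500, random_access=False))  # FastFile
print(choose_input_mode(500, random_access=True))   # FSxLustre
```

In the SageMaker Python SDK, the chosen mode would typically be passed via the `input_mode` argument of `sagemaker.inputs.TrainingInput` or the estimator itself.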
Best,
Didier