Member, August 12, 2021 at 7:44 am
Hi, are you sure Direct Connect (DC) is worth establishing to transfer just 5 TB of data? It doesn't affect this question directly, but after taking the DB specialty, such a configuration seems odd :)
A company has established an AWS Direct Connect connection to support the migration of its 5 TB on-premises data warehouse to an Amazon S3 bucket. The Data Analyst needs to perform data curation against the raw data in Amazon S3 to prepare it for model training in Amazon SageMaker. The datasets will undergo multiple data cleaning steps, such as dropping null fields, resolving choice, and splitting fields. The cleaned data is stored in a separate S3 bucket for data curation and ML processing.
Administrator, August 13, 2021 at 6:36 am
Why wouldn't it be worth establishing? You're billed on-demand (hourly), which is cheaper than upgrading your internet connection plan.
Carlo @ Tutorials Dojo
Member, August 13, 2021 at 7:07 pm
For a one-time 5 TB migration out of a data center, a Snowball (or even several Snowcones) would be a much faster and CHEAPER option. Again, that's not a critical element in a DA question, but in a DB question set it would look very weird.
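For a rough sense of the network transfer times being debated here, a minimal back-of-the-envelope sketch. The link speeds below are illustrative assumptions (nobody in this thread stated a bandwidth), `transfer_hours` is a hypothetical helper, and it assumes decimal terabytes and an ideal, fully utilized link. It says nothing about cost or device shipping time.

```python
def transfer_hours(size_tb: float, link_mbps: float, utilization: float = 1.0) -> float:
    """Hours needed to move size_tb terabytes (10**12 bytes each) over a link.

    link_mbps is the nominal link speed in megabits per second;
    utilization is the fraction of that speed actually achieved (assumed 1.0).
    """
    bits = size_tb * 1e12 * 8                      # terabytes -> bits
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 3600


# Assumed example link speeds, not taken from the question or the thread.
for mbps in (100, 1_000, 10_000):
    print(f"{mbps:>6} Mbps: {transfer_hours(5, mbps):.1f} h")
```

Real throughput is usually lower than the nominal link speed, so passing something like `utilization=0.8` gives a more conservative estimate.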