  • Klimok

    Member
    August 29, 2021 at 9:21 pm

    Hi, here is another ambiguous question on data formats:

    A Machine Learning Specialist has been using Amazon EC2 for quite some time to train classification and regression models. The Specialist wants to simplify the training job by leveraging Amazon SageMaker’s built-in algorithms. However, he is unsure if SageMaker can support the format of his training data.

    Classification and regression models don't work with JPEG files. At the same time, as of late 2019, ‘Amazon SageMaker Batch Transform now supports TFRecord format as a supported SplitType, enabling datasets to be split by TFRecord boundaries. This adds to the list of supported formats including RecordIO, CSV, and Text.’

    So either the question needs to be tuned or the answers adjusted to remove the ambiguity. Thanks
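
    For reference, the quoted announcement concerns the `SplitType` field of a Batch Transform job's input, which is an inference-time setting rather than a training data format. Below is a minimal sketch of how such an input could be assembled for boto3's `create_transform_job`. The helper name `build_transform_input` and the bucket path are hypothetical, and nothing here calls AWS.

```python
# Values the SageMaker API accepts for TransformInput.SplitType.
VALID_SPLIT_TYPES = {"None", "Line", "RecordIO", "TFRecord"}


def build_transform_input(s3_uri, content_type, split_type="None"):
    """Build the TransformInput dict for a Batch Transform job (hypothetical helper)."""
    if split_type not in VALID_SPLIT_TYPES:
        raise ValueError(f"Unsupported SplitType: {split_type!r}")
    return {
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,  # assumed location of the input data
            }
        },
        "ContentType": content_type,
        "SplitType": split_type,
    }


# Example: split a CSV dataset line by line.
transform_input = build_transform_input(
    "s3://example-bucket/batch-input/", "text/csv", split_type="Line"
)
```

    The resulting dict would be passed as the `TransformInput` argument; `"TFRecord"` is accepted as a split type in the same way.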

  • Carlo-TutorialsDojo

    Administrator
    August 31, 2021 at 5:27 am

    Hello Klimok,

    Thanks for sharing your insights.

    You can use the TFRecord data format to train models with custom algorithms in SageMaker. However, it is not in the list of training data formats supported by the built-in algorithms, which can be found here:

    https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-training.html#cdf-common-content-types
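
    To illustrate where the training data format is declared, here is a minimal sketch of one input channel for boto3's `create_training_job`. The helper name `build_training_channel`, the channel name, and the bucket path are hypothetical; which content types a given built-in algorithm accepts should be checked against the page linked above.

```python
def build_training_channel(channel_name, s3_uri, content_type):
    """Build one entry of InputDataConfig for a training job (hypothetical helper)."""
    return {
        "ChannelName": channel_name,
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,  # assumed location of the training data
                "S3DataDistributionType": "FullyReplicated",
            }
        },
        # The content type tells the algorithm how to parse the channel's data.
        "ContentType": content_type,
        "InputMode": "File",
    }


# Example: a CSV training channel.
train_channel = build_training_channel(
    "train", "s3://example-bucket/train/", "text/csv"
)
```

    The list `[train_channel]` would be passed as `InputDataConfig`. This setting is separate from the Batch Transform `SplitType`, which applies only at inference time.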

    Classification was meant in a general sense which could include image classification, binary classification, and so on.

    Let me know if this helps.

    Regards,

    Carlo @ Tutorials Dojo
