Skip to content

fix: Upgrade to SageMaker v2 SDK #76

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Sep 23, 2020
Merged

fix: Upgrade to SageMaker v2 SDK #76

merged 9 commits into from
Sep 23, 2020

Conversation

brightsparc
Copy link
Contributor

@brightsparc brightsparc commented Aug 10, 2020

Issue #, if available: #69 Feature request: support SageMaker SDK 2.0

Related to #73

Description of changes:

Manually upgrade the step functions SDK to support the breaking changes relevant to this library specifically.

  • image -> image_uri
  • train_instance_count -> instance_count
  • train_instance_type -> instance_type
  • train_max_run -> max_run
  • train_max_run_wait -> max_run_wait
  • train_volume_size -> volume_size
  • sagemaker.session.s3_input -> sagemaker.inputs.TrainingInput

I upgrade dependencies to:

sagemaker>=2.1.0`
boto3>=1.14.38

I upgrade package version to: 2.0.0

I have an integration tests that validates this code works with - but have a few issues running the complete test suite.

  • TrainingStep
  • ModelStep
  • ProcessingStep

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

* image -> image_uri
* train_instance_count -> instance_count
* train_instance_type -> instance_type
* train_max_run -> max_run
* train_max_run_wait  -> max_run_wait
* train_volume_size  ->  volume_size
* sagemaker.session.s3_input -> sagemaker.inputs.TrainingInput
@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: StepFunctionsPythonSDK-integtests
  • Commit ID: 15f37b0
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

get_image_uri -> sagemaker.image_uris.retrieve()
@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: StepFunctionsPythonSDK-integtests
  • Commit ID: 9a88915
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

* Removed sagemaker_session for SKLearn
* Moved checkpoint_path into hyper parameters (https://sagemaker.readthedocs.io/en/v2.0.0.rc0/frameworks/tensorflow/upgrade_from_legacy.html)
* Added framework_version and py_version
* Update entry_point and renamed image_name to image_uri for TensorFlow
@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@brightsparc
Copy link
Contributor Author

brightsparc commented Aug 11, 2020

I have validated that there are no more upgrades required by the tool after performing upgrade-v2 to all files and using black to format consistency given the upgrade tool messes with formatting. (although I haven't re-formatted the code using black as part of this branch).

upgrade_dir () {    
    shopt -s nullglob dotglob

    for pathname in "$1"/*; do
        if [ -d "$pathname" ]; then
            upgrade_dir "$pathname"
        else
            case "$pathname" in
                *.py)
                    sagemaker-upgrade-v2 --in-file "$pathname" --out-file "$pathname"
            esac
        fi
    done
}

# install black for formatting
pip install black

# apply upgrades
upgrade_dir ./src
upgrade_dir ./tests

# re-apply black to src and test
black -S ./src
black -S ./tests

git diff -w

@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: StepFunctionsPythonSDK-integtests
  • Commit ID: 8b97898
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@vaib-amz vaib-amz self-requested a review September 4, 2020 19:36
@vaib-amz
Copy link
Contributor

vaib-amz commented Sep 4, 2020

Thanks a lot for this PR @brightsparc ! I don't have any feedback besides the docstring. I will ask another team member to look at this PR, and will also have to get a confirmation regarding the version bump.

@pouyanhoss
Copy link

pouyanhoss commented Sep 11, 2020

@brightsparc
Hi Julian,

I am using DeepAR in SageMaker and recently upgraded it to SDK v2. I was able to modify everything except the following part:

predictor = DeepARPredictor(endpoint=best_tuning_job_name, sagemaker_session=sess, content_type='application/json')

Based on the documentation, I modified it as following:

predictor = DeepARPredictor(endpoint_name=best_tuning_job_name, sagemaker_session=sess,serializer=sagemaker.serializers.JSONSerializer())

However, I get a TypeError when I call the predictor in the next cell:

predictor.set_prediction_parameters(freq, prediction_length)
list_of_df = predictor.predict(violation_list_training[:2])
TypeError: Object of type bytes is not JSON serializable

Any help would be much appreciated.

yoodan93
yoodan93 previously approved these changes Sep 17, 2020
Copy link
Contributor

@yoodan93 yoodan93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@wong-a
Copy link
Contributor

wong-a commented Sep 17, 2020

Thanks for the contribution @brightsparc

A few things to take care of before merging:

  1. Resolve merge conflicts in VERSION, requiremenets.txt, and setup.py
  2. Make sure acceptance and unit tests pass

For the Step Functions team, we need to figure out the plan for maintaining v1 of the stepfunctions SDK and user guidance for migration from v1 to v2.

@brightsparc
Copy link
Contributor Author

For the Step Functions team, we need to figure out the plan for maintaining v1 of the stepfunctions SDK and user guidance for migration from v1 to v2.

I have resolved the conflict to stick with the v2 target version. @wong-a I can revise this as required.

@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: StepFunctionsPythonSDK-integtests
  • Commit ID: 24f1b61
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

metrizable
metrizable previously approved these changes Sep 23, 2020
Copy link

@metrizable metrizable left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good! is there interest in using an auto-formatter like black?

@@ -36,12 +36,12 @@ def __init__(self, state_id, estimator, job_name, data=None, hyperparameters=Non
data: Information about the training data. Please refer to the ``fit()`` method of the associated estimator, as this can take any of the following forms:

* (str) - The S3 location where training data is saved.
* (dict[str, str] or dict[str, sagemaker.session.s3_input]) - If using multiple
* (dict[str, str] or dict[str, sagemaker.inputs.TrainingInput]) - If using multiple

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggested edit: is it worth considering for the future is to specify the generic Dict from typing (or even Mapping)? I know these are just docstring comments, but in the future, you may want to include type hinting and the change will be minor in the future at that point:

Suggested change
* (dict[str, str] or dict[str, sagemaker.inputs.TrainingInput]) - If using multiple
* (Dict[str, str] or Dict[str, sagemaker.inputs.TrainingInput]) - If using multiple

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an auto-formatter is a good idea. I used black when I was ensuring I had applied the correct v2 SM upgrades, but the diff made it hard to compare changes so left it as is.

Those doc comments could be updated with a seperate PR as I suspect there are a few.

…teps, fix Rule evaluator image, and instance count/type
@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: StepFunctionsPythonSDK-integtests
  • Commit ID: 50fada4
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@vaib-amz
Copy link
Contributor

Updating version to a pre-release candidate. The tests were all okay, only the version has changed. Since the changed were approved already, merging in.

@vaib-amz vaib-amz merged commit dbcf358 into aws:master Sep 23, 2020
@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: StepFunctionsPythonSDK-integtests
  • Commit ID: c5cabdc
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants