Merge Documentation changes to main for Launch #196
… main (#190)
* Fix training test (#184)
* Fix SDK training test: Add wait time before refresh
* Fix training tests in canaries
* Update logging information for submitting and deleting training job (#189)
Co-authored-by: pintaoz <pintaoz@amazon.com>
---------
Co-authored-by: Zhaoqi <zhaoqiwang.baruch@gmail.com>
Co-authored-by: pintaoz-aws <167920275+pintaoz-aws@users.noreply.github.com>
Co-authored-by: pintaoz <pintaoz@amazon.com>
Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com>
* Fix training test (#184)
* Fix SDK training test: Add wait time before refresh
* Fix training tests in canaries
* Update logging information for submitting and deleting training job (#189)
Co-authored-by: pintaoz <pintaoz@amazon.com>
---------
Co-authored-by: Zhaoqi <zhaoqiwang.baruch@gmail.com>
Co-authored-by: pintaoz-aws <167920275+pintaoz-aws@users.noreply.github.com>
Co-authored-by: pintaoz <pintaoz@amazon.com>
* Documentation Fixes * Documentation Fixes --------- Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com>
| /.mypy_cache | ||
|
|
||
| /doc/_apidoc/ | ||
| doc/_build/ |
Does this need to be `/doc/_build/` here?
This is mainly to make sure `_build` is ignored by Git.
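For context on the leading-slash question: per the gitignore documentation, a pattern with a separator at the beginning or in the middle is matched relative to the directory holding the `.gitignore`, so `doc/_build/` and `/doc/_build/` should behave the same in a root-level `.gitignore`. A minimal Python sketch of just that anchoring rule follows. The helper name `is_anchored` is hypothetical, and this is not a full gitignore matcher.

```python
def is_anchored(pattern: str) -> bool:
    """Return True if a gitignore pattern is anchored to the .gitignore's directory.

    Simplified rule: a separator at the beginning or in the middle of the
    pattern anchors it; a trailing separator alone only restricts the match
    to directories.
    """
    # Drop a single trailing slash: it means "directories only", not anchoring.
    body = pattern[:-1] if pattern.endswith("/") else pattern
    return "/" in body


for pattern in ["doc/_build/", "/doc/_build/", "_build/"]:
    print(f"{pattern!r}: anchored={is_anchored(pattern)}")
```

Under this rule only a bare `_build/` would match at any depth; both spellings discussed here are anchored to the repository root.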
| source {venv-name}/bin/activate | ||
| ``` | ||
| ```{note} | ||
| Remember to activate your virtual environment (source {venv-name}/bin/activate) each time you want to use the HyperPod CLI and SDK if you chose the virtual environment installation method. |
Add code quotes around `source {venv-name}/bin/activate`.
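As a side note to the reminder in this hunk, a script can check whether a virtual environment is active before using the CLI or SDK. This is a minimal sketch, assuming the environment was created with the standard `venv` module; the helper name `in_virtualenv` is hypothetical and not part of the HyperPod SDK.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if not in_virtualenv():
    print("No virtual environment active; run: source {venv-name}/bin/activate")
```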
| --image pytorch/pytorch:latest \ | ||
| ``` | ||
| ```` | ||
| ````{tab-item} SDK |
Is SDK code keeping parity with CLI here?
This will be a fast-follow item
| ``` | ||
| ```` | ||
|
|
||
| ````{tab-item} SDK |
Seems like SDK code here is still using some optional variables
| ```` | ||
|
|
||
| ````{tab-item} SDK | ||
| ```python |
Need to update SDK code here too
| # Custom endpoint | ||
| hyp list-pods hyp-custom-endpoint | ||
| ``` | ||
| ```` |
Missing SDK code here
| # Custom endpoint | ||
| hyp get-logs hyp-custom-endpoint --pod-name <pod-name> | ||
| ``` | ||
| ```` |
Missing SDK code here
|
|
||
| List all HyperPod PyTorch jobs in a namespace. | ||
|
|
||
| #### Syntax |
Seems like "Syntax" renders even bigger than `hyp list hyp-pytorch-job`; not sure why it renders like that.
Yup, this mainly requires CSS changes.
It will be a fast follow as well.
| ::: | ||
|
|
||
| :::{grid-item-card} HyperPod Developer Guide | ||
| :link: https://catalog.workshops.aws/sagemaker-hyperpod-eks/en-US |
Link seems to be the same as the workshop. Maybe needs an update?
yes, checking with Shweta on this.
* Documentation Fixes * Documentation Fixes * Documentation Fixes * Documentation Fixes --------- Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com>
doc/inference.md
Outdated
| When creating an inference endpoint, you'll need to specify: | ||
|
|
||
| - **endpoint-name**: Unique identifier for your endpoint | ||
| - **instance-type**: The EC2 instance type to use | ||
| - **model-id** (JumpStart): ID of the pre-trained JumpStart model | ||
| - **image-uri** (Custom): Docker image containing your inference code | ||
| - **model-name** (Custom): Name of model to create on SageMaker | ||
| - **model-source-type** (Custom): Source type: fsx or s3 | ||
| - **model-volume-mount-name** (Custom): Name of the model volume mount | ||
| - **container-port** (Custom): Port on which the model server listens |
Can we separate this into two groups?
- Parameters required for JumpStart
- Parameters required for Custom
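One way to picture the suggested split is a small lookup keyed by endpoint type. This is only a sketch: the parameter names come from the list under review, while treating `endpoint-name` and `instance-type` as common to both types, and the helper `missing_params`, are assumptions for illustration rather than the SDK's actual validation.

```python
# Parameter names are taken from the list under review; the grouping into
# "common" vs. per-type sets is an assumption for illustration only.
COMMON_PARAMS = {"endpoint-name", "instance-type"}
TYPE_PARAMS = {
    "jumpstart": {"model-id"},
    "custom": {
        "image-uri",
        "model-name",
        "model-source-type",
        "model-volume-mount-name",
        "container-port",
    },
}


def missing_params(endpoint_type: str, provided: set) -> set:
    """Return the documented parameters that are absent for this endpoint type."""
    required = COMMON_PARAMS | TYPE_PARAMS[endpoint_type]
    return required - provided


# A JumpStart request that forgot the instance type.
print(sorted(missing_params("jumpstart", {"endpoint-name", "model-id"})))
```

Grouping the parameters this way would also make it easy to render two separate lists in the documentation from one source of truth.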
doc/installation.md
Outdated
| ### Supported ML Frameworks | ||
| - PyTorch (version ≥ 1.10) |
Nit: maybe "Supported ML Frameworks for Training"
| def test_set_cluster_context(self, cluster_name): | ||
| """Test setting cluster context.""" | ||
| result = execute_command([ | ||
| "hyp", "set-cluster-context", | ||
| "--cluster-name", cluster_name | ||
| ]) | ||
| assert result.returncode == 0 | ||
| context_line = result.stdout.strip().splitlines()[-1] | ||
| assert any(text in context_line for text in ["Updated context", "Added new context"]) | ||
|
|
Looks like this change is from other commits. Can you rebase to main to clean it up?
I merged in the latest changes from main, and this change shows up in the diff. The change is from this PR: https://github.com/aws/sagemaker-hyperpod-cli/pull/184/files
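For readers skimming the hunk in question, the assertion in `test_set_cluster_context` accepts either of two messages on the last line of stdout. The sketch below isolates that check; the helper name `context_was_set` and the sample outputs are hypothetical and were not captured from the real CLI.

```python
EXPECTED_MESSAGES = ("Updated context", "Added new context")


def context_was_set(stdout: str) -> bool:
    """Mirror the test's check: the last stdout line contains an expected message."""
    lines = stdout.strip().splitlines()
    if not lines:
        # The test in the diff would raise IndexError on empty output;
        # this sketch reports failure instead.
        return False
    return any(text in lines[-1] for text in EXPECTED_MESSAGES)


# Hypothetical outputs for illustration only.
print(context_was_set("Connecting...\nUpdated context for cluster demo"))
print(context_was_set("Error: cluster not found"))
```

Checking only the last line keeps the test tolerant of extra log lines printed before the final status message.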
* Documentation Fixes * Documentation Fixes * Documentation Fixes * Documentation Fixes * Documentation Fixes --------- Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com>
* Documentation Fixes * Documentation Fixes * Documentation Fixes * Documentation Fixes * Documentation Fixes * Documentation Fixes --------- Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com>
PR to merge all the documentation changes to the main branch for public launch.
PR Approval Steps
For Requester
For Reviewer
`For Requester` section to double check each item.