Conversation

@TetyanaYahodska
Contributor

TetyanaYahodska commented Oct 15, 2024

Description

Python sample: "feat: TPU *Create / *get / *create_with_script / *delete Samples" by Thoughtseize1, Pull Request #12690 in GoogleCloudPlatform/python-docs-samples
Documentation - https://cloud.google.com/tpu/docs/managing-tpus-tpu-vm
Fixes #

Note: Before submitting a pull request, please open an issue for discussion if you are not associated with Google.

Checklist

  • I have followed Sample Format Guide
  • pom.xml parent set to latest shared-configuration
  • Appropriate changes to README are included in PR
  • These samples need a new API enabled in testing projects to pass (let us know which ones)
  • These samples need a new/updated env vars in testing projects set to pass (let us know which ones)
  • Tests pass: mvn clean verify required
  • Lint passes: mvn -P lint checkstyle:check required
  • Static Analysis: mvn -P lint clean compile pmd:cpd-check spotbugs:check advisory only
  • This sample adds a new sample directory, and I updated the CODEOWNERS file with the codeowners for this sample
  • This sample adds a new Product API, and I updated the Blunderbuss issue/PR auto-assigner with the codeowners for this sample
  • Please merge this PR for me once it is approved

@product-auto-label bot added the api: tpu and samples labels Oct 15, 2024
@TetyanaYahodska added the kokoro:run and kokoro:force-run labels Oct 15, 2024
@kokoro-team removed the kokoro:run and kokoro:force-run labels Oct 16, 2024
@TetyanaYahodska added the kokoro:run label Oct 16, 2024
@kokoro-team removed the kokoro:run label Oct 16, 2024
@TetyanaYahodska added the kokoro:run label Oct 16, 2024
@TetyanaYahodska marked this pull request as ready for review October 16, 2024 13:16
@TetyanaYahodska assigned Sita04 and unassigned minherz Oct 16, 2024
@snippet-bot
snippet-bot bot commented Oct 16, 2024

Here is the summary of changes.

You are about to add 3 region tags.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

  • Refresh this comment

@AfterAll
public static void cleanup() throws Exception {
  DeleteTpuVm.deleteTpuVm(PROJECT_ID, ZONE, TPU_VM_NAME);
  TimeUnit.MINUTES.sleep(5);
Contributor

Does it really take 5 minutes to delete a TPU VM?

Contributor Author

It depends :) This is a long-running operation. I got an error, and it went away after I added the timeout.

Contributor Author

Implemented OperationTimedPollAlgorithm with RetrySettings for TpuClient, which lets the test call the delete method without adding any manual timeouts.
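
For readers unfamiliar with the idea behind a timed poll algorithm, here is a minimal sketch of it in plain Java: check whether the operation is done, wait with a growing delay, and give up once a total timeout is exceeded. It deliberately does not use the gax classes named above; `TimedPollSketch`, `pollUntilDone`, and all delay values are hypothetical and chosen only for illustration, not taken from this PR.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

public class TimedPollSketch {

  // Polls isDone until it returns true or the total timeout would be exceeded.
  // The delay between polls starts at initialDelay and is multiplied by
  // multiplier after each wait, capped at maxDelay.
  // Returns the number of polls performed.
  public static int pollUntilDone(
      BooleanSupplier isDone,
      Duration initialDelay,
      double multiplier,
      Duration maxDelay,
      Duration totalTimeout) throws InterruptedException {
    long deadline = System.nanoTime() + totalTimeout.toNanos();
    long delayMillis = initialDelay.toMillis();
    int polls = 0;
    while (true) {
      polls++;
      if (isDone.getAsBoolean()) {
        return polls;
      }
      long remainingNanos = deadline - System.nanoTime();
      if (remainingNanos < delayMillis * 1_000_000L) {
        // The next wait would run past the deadline, so stop here.
        throw new IllegalStateException(
            "Operation did not finish within " + totalTimeout);
      }
      Thread.sleep(delayMillis);
      delayMillis = Math.min((long) (delayMillis * multiplier), maxDelay.toMillis());
    }
  }

  public static void main(String[] args) throws InterruptedException {
    // Hypothetical stand-in for "has the delete operation finished?":
    // reports done on the third poll.
    int[] calls = {0};
    int polls = pollUntilDone(
        () -> ++calls[0] >= 3,
        Duration.ofMillis(10), 2.0, Duration.ofMillis(100), Duration.ofSeconds(5));
    System.out.println("Finished after " + polls + " polls");
  }
}
```

Compared with a fixed `TimeUnit.MINUTES.sleep(5)`, this style of polling returns as soon as the operation completes and still fails loudly if it never does.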

static String javaVersion = System.getProperty("java.version").substring(0, 2);
private static final String TPU_VM_NAME = "test-tpu-" + javaVersion + "-"
    + UUID.randomUUID().toString().substring(0, 8);
private static final String ACCELERATOR_TYPE = "v5litepod-1";
Contributor

Do we have access to v5lite? Could we use an older, cheaper TPU model for testing instead?

Contributor Author

Changed the accelerator type and version.

String creationTime = formatTimestamp(node.getCreateTime());
String name = node.getName().substring(node.getName().lastIndexOf("/") + 1);
if (containPrefixToDeleteAndZone(node, prefixToDelete, zone)
    && isCreatedBeforeThresholdTime(creationTime)) {
Contributor

If the test case takes less than 5 minutes, do we just not delete the VM we created? Can we guarantee that no TPU VM is left running after a single test run?

Contributor Author

Thank you! I changed the threshold to 30 minutes. Once all the TPU samples are implemented, we can raise it to, for example, 24 hours.
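
To make the threshold discussion concrete, the sketch below shows one way such a stale-resource filter could work: keep only nodes in the given zone whose short name starts with the test prefix and whose creation time is older than the threshold. It uses plain Java types rather than the TPU client; `NodeInfo`, `selectStale`, the project `p`, and the zone `us-central1-a` are hypothetical, and the 30-minute value simply mirrors the comment above.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StaleNodeFilterSketch {

  // Hypothetical minimal view of a node: full resource name and creation time.
  public static final class NodeInfo {
    final String name;
    final Instant createTime;

    public NodeInfo(String name, Instant createTime) {
      this.name = name;
      this.createTime = createTime;
    }
  }

  // Returns the short names of nodes that are in the given zone, start with
  // the prefix, and were created earlier than (now - maxAge).
  public static List<String> selectStale(
      List<NodeInfo> nodes, String prefix, String zone, Duration maxAge, Instant now) {
    Instant threshold = now.minus(maxAge);
    List<String> stale = new ArrayList<>();
    for (NodeInfo node : nodes) {
      String shortName = node.name.substring(node.name.lastIndexOf('/') + 1);
      boolean inZone = node.name.contains("/locations/" + zone + "/");
      if (inZone && shortName.startsWith(prefix) && node.createTime.isBefore(threshold)) {
        stale.add(shortName);
      }
    }
    return stale;
  }

  public static void main(String[] args) {
    Instant now = Instant.parse("2024-10-16T12:00:00Z");
    String base = "projects/p/locations/us-central1-a/nodes/";
    List<NodeInfo> nodes = Arrays.asList(
        new NodeInfo(base + "test-tpu-17-abc", now.minus(Duration.ofMinutes(60))),
        new NodeInfo(base + "test-tpu-17-def", now.minus(Duration.ofMinutes(10))),
        new NodeInfo(base + "prod-tpu", now.minus(Duration.ofDays(1))));
    // Only the 60-minute-old test node is both prefixed and older than 30 minutes.
    System.out.println(
        selectStale(nodes, "test-tpu-", "us-central1-a", Duration.ofMinutes(30), now));
  }
}
```

A filter like this only sweeps up leftovers from earlier runs. A node created by the current run is younger than the threshold, so it still has to be deleted by the test's own cleanup step.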

@kokoro-team removed the kokoro:run label Oct 16, 2024
@TetyanaYahodska added the kokoro:run label Oct 16, 2024
@kokoro-team removed the kokoro:run label Oct 17, 2024
@kokoro-team removed the kokoro:run label Oct 30, 2024
@TetyanaYahodska added the kokoro:run and kokoro:force-run labels Oct 30, 2024
@kokoro-team removed the kokoro:run and kokoro:force-run labels Oct 30, 2024
@TetyanaYahodska added the kokoro:run label Oct 30, 2024
@Sita04 assigned Sita04 and unassigned minherz Oct 30, 2024
@Sita04
Contributor

Sita04 commented Oct 30, 2024

@TetyanaYahodska Please ping me for review after @m-strzelczyk's LGTM.

@kokoro-team removed the kokoro:run label Oct 31, 2024
@TetyanaYahodska added the kokoro:run and kokoro:force-run labels Oct 31, 2024
@kokoro-team removed the kokoro:run and kokoro:force-run labels Nov 1, 2024
@TetyanaYahodska requested review from Sita04 and removed request for Sita04 November 1, 2024 08:53
@TetyanaYahodska added the kokoro:run label Nov 1, 2024
@kokoro-team removed the kokoro:run label Nov 1, 2024
@TetyanaYahodska added the kokoro:run label Nov 1, 2024
@kokoro-team removed the kokoro:run label Nov 7, 2024
@TetyanaYahodska added the kokoro:run label Nov 7, 2024
@TetyanaYahodska merged commit 282ec81 into main Nov 7, 2024
@TetyanaYahodska deleted the tpu-vm-crud-operations branch November 7, 2024 13:32
Labels

api: tpu: Issues related to the Cloud TPU API.
kokoro:run: Add this label to force Kokoro to re-run the tests.
samples: Issues that are directly related to samples.

5 participants