-
Notifications
You must be signed in to change notification settings - Fork 466
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vertex AI Experiment Tracker Integration #3260
base: develop
Are you sure you want to change the base?
Vertex AI Experiment Tracker Integration #3260
Conversation
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great already @nkhusainov!
src/zenml/integrations/gcp/flavors/vertex_experiment_tracker_flavor.py
Outdated
Show resolved
Hide resolved
src/zenml/integrations/gcp/flavors/vertex_experiment_tracker_flavor.py
Outdated
Show resolved
Hide resolved
src/zenml/integrations/gcp/flavors/vertex_experiment_tracker_flavor.py
Outdated
Show resolved
Hide resolved
…n VertexExperimentTracker
8f4b2ee
to
a8e3d84
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The style guide flagged several spelling errors that seemed like false positives. We skipped posting inline suggestions for the following words:
- MLOps
Note: We resolved prior Hyperlint review comments because:
We updated our inline suggestion AI.
We do this to avoid keeping outdated or irrelevant comments around. We'll leave a new review with current comments below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed the docs only. Looks so great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great - but just as a small wish I'd love to see a screenshot of how it looks like on Vertex just to show it to users.
Also there is nothing about Tensorboard, and I know that that is a parameter in the implementation. Should that maybe stand out a bit as a special case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback, good suggestions! I'll add Tensorboard examples and several screenshots
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added 2 examples:
- Logging Metrics Using Built-in Methods This example works well.
- Uploading TensorBoard Logs
This example is an interesting use case as it allows uploading TensorBoard logs. However, there are several technical challenges to address, and I’d like your advice on how best to resolve them:
- Accessing Experiment and Run Names
The
start_upload_tb_log
method requires experiment and run names, which are set at runtime.- Code Reference 1
- Code Reference 2
Question: What is the best way to access these variables? One option is to set them as object attributes during prepare_step_run, but this approach feels suboptimal. Are there cleaner alternatives?
- Handling Credentials Currently, there is no way to pass credentials explicitly to the start_upload_tb_log method. It relies on GCP default credentials, which are environment-dependent. Question: What would be a better approach to manage credentials for this method?
- Accessing Experiment and Run Names
The
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm good questions. I'm also not sure - probably the best way right now is to set them in the experiment tracker class and then fetch them using client.active_stack.experiment_tracker.XYZ...
In case of credentials, one could leverage service connectors, but I think its a safe assumption that the vertex execution role has the required credentials .. so i'd leave that be for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Co-authored-by: Hamza Tahir <[email protected]>
Co-authored-by: Hamza Tahir <[email protected]>
…xamples and UI visuals
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The style guide flagged several spelling errors that seemed like false positives. We skipped posting inline suggestions for the following words:
- MLOps
Note: We resolved prior Hyperlint review comments because:
We updated our inline suggestion AI.
We do this to avoid keeping outdated or irrelevant comments around. We'll leave a new review with current comments below.
@nkhusainov i think you can check for it maybe in code and throw an error rather than having it as part of the gcp integration |
At which point in time do you see this check happening?
Do you have a utility I can reuse (maybe this one ) Or should I create my own? |
Probably I’d do during the initialization of the run and check if they
passed in tensorboard dir and if so throw an error if the module isn’t
installed ?
I think we have some utilities in source_utils that you can leverage, e.g., [this one](https://github.com/zenml-io/zenml/blob/c68275beec2a5571117a4c47500890fbc9c675b1/src/zenml/utils/source_utils.py#L567)
|
Apologies for the extended discussions, but I feel it’s important to provide more clarity on this topic:
Points 2 and 3 work without needing extra flavor configurations or settings. This means there’s no way to determine at runtime whether the user will use functionality from autologging or TensorBoard within the step. Given this, we face a choice:
|
@nkhusainov I think requiring them to install all dependencies (mlflow, tb) is out of the question, so that only leaves us with the other solution. |
So, to confirm: your suggestion is that there will be no runtime checks in the code to ensure the necessary dependencies are installed for specific features. Instead, the focus will be entirely on documentation, clearly specifying the dependency requirements for each feature. Is that correct? |
@nkhusainov Correct. I don't see an easy way we could do runtime checks for the dependencies if they're only needed in some cases when a user calls a specific function. Ideally those runtime checks would happen in the |
…Vertex AI Experiment Tracker guide
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 files reviewed, 1 total issue(s) found.
The style guide flagged several spelling errors that seemed like false positives. We skipped posting inline suggestions for the following words:
- MLOps
Done. Is there anything else that needs to be fixed in this PR? |
Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com>
Describe changes
This PR contains implementation of Vertex AI Experiment Tracker
TO DO
Pre-requisites
Please ensure you have done the following:
develop
and the open PR is targetingdevelop
. If your branch wasn't based on develop read Contribution guide on rebasing branch to develop.Types of changes