[SC-15520] Expose qualitative text agent to vm library#493

Open
juanmleng wants to merge 7 commits into main from juan/sc-15520/expose-qualitative-text-agent-to-vm-library

Conversation


juanmleng (Contributor) commented Apr 2, 2026

Pull Request Description

What and why?

Implemented programmatic qualitative text generation in the ValidMind library so documentation text blocks can now be generated from code instead of only through the UI. This adds support for generating text for a single content_id, customizing generation with a prompt, and narrowing the generation context to selected sections.

Before this change, users had to manually write text or trigger AI generation section by section in the UI; after this change, they can generate and log qualitative documentation directly from Python in the same workflow used to run quantitative tests.

How to test

  • Run pytest tests/test_api_client.py tests/test_client.py tests/test_results.py.
  • Open notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb and run the notebook end to end against a model with the Customer Churn template applied.
  • Verify that vm.run_text_generation() works for a single text block with default behavior, with a custom prompt, and with section-specific context.
  • Verify that generated text can be logged back to the document and that looping over all configured text blocks populates the qualitative sections of the document.
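The three call patterns listed above can be sketched as follows. This is an illustrative sketch, not a runnable integration test: the `validmind` client object is replaced with a mock so the snippet executes offline, and the content IDs and prompt text are example values.

```python
# Sketch of the call patterns exercised in the notebook. The real
# `validmind` client is replaced with a mock so this runs offline;
# content IDs and the prompt below are illustrative placeholders.
from unittest.mock import MagicMock

vm = MagicMock()
vm.get_content_ids.return_value = ["model_overview", "dataset_summary_text"]

# 1. Default generation for a single text block
vm.run_text_generation(content_id="model_overview").log()

# 2. Generation steered by a custom prompt
vm.run_text_generation(
    content_id="model_overview",
    prompt="Summarize the model's business purpose in two paragraphs.",
).log()

# 3. Generation with section-specific context
vm.run_text_generation(
    content_id="dataset_summary_text",
    context={"content_ids": vm.get_content_ids("data_description")},
).log()
```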

customer_churn_template.yaml

What needs special review?

Dependencies, breaking changes, and deployment notes

https://github.com/validmind/frontend/pull/2390
https://github.com/validmind/backend/pull/2925

Release notes

Added support for programmatic AI generation of qualitative documentation text through the ValidMind library. Users can now generate and log text for individual documentation blocks, customize the output with prompts, control generation context with selected sections, and populate qualitative sections directly from notebooks.

Checklist

  • What and why
  • Screenshots or videos (Frontend)
  • How to test
  • What needs special review
  • Dependencies, breaking changes, and deployment notes
  • Labels applied
  • PR linked to Shortcut
  • Unit tests added (Backend)
  • Tested locally
  • Documentation updated (if required)
  • Environment variable additions/changes documented (if required)

juanmleng self-assigned this Apr 2, 2026

github-actions bot commented Apr 2, 2026

Pull requests must include at least one of the required labels: internal (no release notes required), highlight, enhancement, bug, deprecation, documentation. Except for internal, pull requests must also include a description in the release notes section.

juanmleng requested a review from cachafla, April 2, 2026 20:26

juanmleng added the enhancement (New feature or request) label, Apr 3, 2026
github-actions bot commented Apr 3, 2026

PR Summary

This PR introduces comprehensive support for programmatically generating and logging qualitative documentation text using the ValidMind library. Key functional changes include:

  • A new Jupyter notebook that demonstrates how to generate qualitative content for model documentation using the AI-assisted function vm.run_text_generation(). The notebook explains how to customize prompts, control context via content IDs, and automate the documentation of text blocks.

  • Enhancements to the API client: New helper functions are added for validating input context (e.g., ensuring that context['content_ids'] is a non-empty list of non-empty strings) and for building the request payload for text generation. The text is normalized by converting Markdown to HTML with MathML support when necessary.

  • Introduction of asynchronous logging support through the alog_text function in the API client, facilitating non-blocking operations. This function works in tandem with synchronous wrappers to generate and log text, ensuring consistency between manual logging and generated output.

  • Modifications in the client module to expose new utility functions such as get_content_ids for retrieving content IDs from the documentation template, and a new run_text_generation function that wraps text generation functionality into a result object with metadata (including timing information).

  • Expanded test coverage: Several new tests have been added in the API client, client, and results modules to validate the text generation functionality. These tests check for correct logging behavior, proper argument validation (e.g., rejecting use of both text and prompt together), and correct handling of error conditions.

Overall, these changes streamline the integration of AI-generated qualitative content with documentation tests and enhance the end-to-end automated documentation workflow in the ValidMind Library.
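The context validation described in the summary (a non-empty list of non-empty strings under `context['content_ids']`) can be sketched as a standalone helper. The function name `validate_generation_context` is an assumption for illustration, not the library's actual helper.

```python
# Standalone sketch of the input-context validation described above.
# The name `validate_generation_context` is hypothetical.
def validate_generation_context(context):
    if not isinstance(context, dict):
        raise ValueError("context must be a dictionary")
    content_ids = context.get("content_ids")
    if not isinstance(content_ids, list) or not content_ids:
        raise ValueError("context['content_ids'] must be a non-empty list")
    for cid in content_ids:
        if not isinstance(cid, str) or not cid.strip():
            raise ValueError(
                "context['content_ids'] must contain only non-empty strings"
            )
    return context
```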

Test Suggestions

  • Add integration tests that simulate end-to-end flow of run_text_generation including generation, logging, and metadata verification.
  • Test failure cases by providing invalid context (e.g., an empty string in the content_ids list) and ensuring that appropriate exceptions are raised.
  • Verify that providing both text and prompt raises a ValueError, and that asynchronous logging via alog_text behaves as expected when the content_id is missing.
  • Check that Markdown text is correctly transformed to HTML when text is provided in a non-HTML format.
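The mutual-exclusion check referenced in the third suggestion (providing both text and prompt should raise a ValueError) can be sketched standalone. The function name `resolve_text_arguments` is hypothetical; it only mirrors the argument-validation behavior the tests are meant to cover.

```python
# Standalone sketch of the argument check the test suggestions refer to:
# `text` and `prompt` are mutually exclusive. Name is hypothetical.
def resolve_text_arguments(text=None, prompt=None):
    if text is not None and prompt is not None:
        raise ValueError("Provide either `text` or `prompt`, not both.")
    return text if text is not None else prompt
```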

Review comment (Contributor):
The content_id for model_overview_text should be model_overview, otherwise you would not be reusing the pre-existing block:

Image

Similarly, I get an error for:

vm.run_text_generation(
    content_id="dataset_description_text",
    context={"content_ids": vm.get_content_ids("data_description")},
).log()

Error:

SectionNotFoundError: Section for content dataset_description_text not found
[NOTE] During task with name 'qualitative_text_generation' and id '028072b6-3431-e084-5f94-a0f6ec353894'

The correct content_id should be dataset_summary_text:

Image

Similarly, the last cell should be fixed so it can run on the current version of the churn template without modifications. This will require two changes:

  • Using the correct content_ids i.e. model_overview instead of model_overview_text
  • Adding a small improvement to the current feature (I know it adds a bit of scope) so that we can pre-assign a content block if the given content_id does not exist in the template. For example, the block intended_use_text does not exist in the template, so we should be able to specify the section_id where it should be appended. Note that this is already supported by test results blocks, e.g.:
perf_comparison_result.log(section_id="model_evaluation")
roc_curve_result.log(section_id="model_evaluation")

For the case of intended_use_text we could call .log(section_id="intended_use").

With these changes in place we should be able to populate 100% of the churn document with this notebook.

johnwalz97 (Contributor) left a comment:

this is sick!


Labels

enhancement New feature or request


3 participants