Skip to content

test(schema): 3-way spec drift checker (Go ↔ Schema ↔ Docs)#194

Open
pradhyum6144 wants to merge 1 commit intomodelpack:mainfrom
pradhyum6144:feat/three-way-spec-drift-checker
Open

test(schema): 3-way spec drift checker (Go ↔ Schema ↔ Docs)#194
pradhyum6144 wants to merge 1 commit intomodelpack:mainfrom
pradhyum6144:feat/three-way-spec-drift-checker

Conversation

@pradhyum6144
Copy link
Contributor

Summary

  • Extends the 2-way sync checker from PR add Go struct and JSON schema sync test #173 (Go structs ↔ JSON Schema) with two additional drift detection layers
  • Layer 1: Go structs ↔ JSON Schema — reflection-based comparison of all 5 struct types (Model, ModelDescriptor, ModelConfig, ModelCapabilities, ModelFS) against their schema definitions, catching missing fields, extra properties, or type mismatches
  • Layer 2: JSON Schema ↔ Doc example — verifies the JSON example in docs/config.md demonstrates every schema field and vice versa (catches undocumented fields)
  • Layer 3: Doc field descriptions ↔ JSON Schema — parses the markdown property list (- **fieldName** _type_, REQUIRED) and cross-checks against schema properties (catches documented fields missing from schema or schema fields missing from docs)

What this catches

Drift scenario Caught by
Field added to Go struct but missing from JSON Schema Layer 1
Field added to JSON Schema but missing from Go struct Layer 1
Type mismatch between Go and Schema (e.g. string vs array) Layer 1
Schema field not shown in docs/config.md example Layer 2
Example field not defined in schema Layer 2
Field documented in markdown but missing from schema Layer 3
Schema field not documented in markdown Layer 3

Test plan

  • 15 subtests across 3 test functions all pass
  • Full go test ./... passes with no regressions
  • go vet clean
  • Import ordering follows gci standard/default convention

Builds on #173
Relates to #91

Extends the 2-way sync checker from PR modelpack#173 to catch drift in three
directions at once:

1. Go structs ↔ JSON Schema: reflection-based field/type comparison
2. JSON Schema ↔ Doc example: verifies docs/config.md JSON example
   covers every schema field (and vice versa)
3. Doc descriptions ↔ JSON Schema: parses markdown property
   definitions and cross-checks against schema properties

15 subtests across 3 test functions. Prevents silent spec mismatches
when fields are added, removed, or renamed in any of the three sources.

Signed-off-by: pradhyum6144 <pradhyum314@gmail.com>
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the project's schema integrity by introducing a robust 3-way drift detection system. It ensures that Go struct definitions, their corresponding JSON Schema, and the user-facing documentation remain perfectly synchronized. This prevents inconsistencies that could lead to bugs or confusion, thereby improving the overall reliability and maintainability of the configuration schema.

Highlights

  • Go Structs ↔ JSON Schema Sync: Implemented reflection-based comparison for five core struct types (Model, ModelDescriptor, ModelConfig, ModelCapabilities, ModelFS) against their JSON schema definitions, identifying missing fields, extra properties, or type mismatches.
  • JSON Schema ↔ Documentation Example Sync: Introduced checks to verify that the JSON example in docs/config.md accurately demonstrates every field defined in the JSON schema, and conversely, that every field in the example is defined in the schema, preventing undocumented fields.
  • Documentation Field Descriptions ↔ JSON Schema Sync: Added parsing of markdown property lists in docs/config.md to cross-check against schema properties, ensuring that documented fields exist in the schema and all schema fields are documented.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new test file, schema/sync_test.go, which adds comprehensive synchronization tests between Go structs, JSON schema definitions, and documentation examples/field descriptions. The tests aim to prevent drift by verifying that Go struct fields match schema properties, schema properties are covered in documentation examples, and documented fields exist in the schema. Review comments highlight potential brittleness in the isRefField's $ref matching logic, the heuristic used in extractJSONExample to find JSON blocks, and the special handling for 'capabilities' within the parseDocFields function.

Comment on lines +326 to +328
for _, p := range props {
if p.Ref != "" && strings.HasSuffix(p.Ref, "/"+strings.ToUpper(name[:1])+name[1:]) {
return true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic within isRefField for matching the $ref target appears to be incorrect. The strings.ToUpper(name[:1])+name[1:] transformation of the field name (e.g., "capabilities" to "Capabilities") will not correctly match the full definition name in the $ref path (e.g., "ModelCapabilities").

While this function is currently only called for fields not directly present in schemaProps (i.e., unexpected fields), and thus effectively returns false for all current cases, the flawed matching logic could lead to issues if the schema structure or naming conventions change, or if the function's intended purpose evolves.

Comment on lines +375 to +377
if strings.Contains(block, "\"descriptor\"") {
return block
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The heuristic strings.Contains(block, "\"descriptor\"") to identify the correct JSON example block is brittle. If the example JSON in docs/config.md changes and no longer contains the string "descriptor", or if another JSON block happens to contain it, this function might fail or return the wrong block.

Consider using a more robust method, perhaps by leveraging the title or mediatype attributes from the markdown fence (e.g., ````json,title=Model%20Config%20JSON...`) if possible, or by making the search criteria more specific to the expected structure of the main configuration example.

Comment on lines +477 to +481
if fieldName == "capabilities" {
// capabilities is both a field of config AND a sub-section
result[currentSection] = append(result[currentSection], fieldName)
currentSection = "capabilities"
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The special handling for the "capabilities" field, where it's treated as both a field of config and a new currentSection, makes the parseDocFields function somewhat brittle. This logic is highly dependent on the exact structure and nesting of the docs/config.md file.

While it correctly parses the current document, any future changes to the markdown's hierarchy or the introduction of other fields with similar dual roles could easily break this parsing logic. Consider if a more generalized or declarative approach could be used to define these section-field relationships.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant