test(schema): 3-way spec drift checker (Go ↔ Schema ↔ Docs)#194
test(schema): 3-way spec drift checker (Go ↔ Schema ↔ Docs)#194pradhyum6144 wants to merge 1 commit intomodelpack:mainfrom
Conversation
Extends the 2-way sync checker from PR modelpack#173 to catch drift in three directions at once: 1. Go structs ↔ JSON Schema: reflection-based field/type comparison 2. JSON Schema ↔ Doc example: verifies docs/config.md JSON example covers every schema field (and vice versa) 3. Doc descriptions ↔ JSON Schema: parses markdown property definitions and cross-checks against schema properties 15 subtests across 3 test functions. Prevents silent spec mismatches when fields are added, removed, or renamed in any of the three sources. Signed-off-by: pradhyum6144 <pradhyum314@gmail.com>
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the project's schema integrity by introducing a robust 3-way drift detection system. It ensures that Go struct definitions, their corresponding JSON Schema, and the user-facing documentation remain perfectly synchronized. This prevents inconsistencies that could lead to bugs or confusion, thereby improving the overall reliability and maintainability of the configuration schema. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a new test file, schema/sync_test.go, which adds comprehensive synchronization tests between Go structs, JSON schema definitions, and documentation examples/field descriptions. The tests aim to prevent drift by verifying that Go struct fields match schema properties, schema properties are covered in documentation examples, and documented fields exist in the schema. Review comments highlight potential brittleness in the isRefField's $ref matching logic, the heuristic used in extractJSONExample to find JSON blocks, and the special handling for 'capabilities' within the parseDocFields function.
| for _, p := range props { | ||
| if p.Ref != "" && strings.HasSuffix(p.Ref, "/"+strings.ToUpper(name[:1])+name[1:]) { | ||
| return true |
There was a problem hiding this comment.
The logic within isRefField for matching the $ref target appears to be incorrect. The strings.ToUpper(name[:1])+name[1:] transformation of the field name (e.g., "capabilities" to "Capabilities") will not correctly match the full definition name in the $ref path (e.g., "ModelCapabilities").
While this function is currently only called for fields not directly present in schemaProps (i.e., unexpected fields), and thus effectively returns false for all current cases, the flawed matching logic could lead to issues if the schema structure or naming conventions change, or if the function's intended purpose evolves.
| if strings.Contains(block, "\"descriptor\"") { | ||
| return block | ||
| } |
There was a problem hiding this comment.
The heuristic strings.Contains(block, "\"descriptor\"") to identify the correct JSON example block is brittle. If the example JSON in docs/config.md changes and no longer contains the string "descriptor", or if another JSON block happens to contain it, this function might fail or return the wrong block.
Consider using a more robust method, perhaps by leveraging the title or mediatype attributes from the markdown fence (e.g., ````json,title=Model%20Config%20JSON...`) if possible, or by making the search criteria more specific to the expected structure of the main configuration example.
| if fieldName == "capabilities" { | ||
| // capabilities is both a field of config AND a sub-section | ||
| result[currentSection] = append(result[currentSection], fieldName) | ||
| currentSection = "capabilities" | ||
| } else { |
There was a problem hiding this comment.
The special handling for the "capabilities" field, where it's treated as both a field of config and a new currentSection, makes the parseDocFields function somewhat brittle. This logic is highly dependent on the exact structure and nesting of the docs/config.md file.
While it correctly parses the current document, any future changes to the markdown's hierarchy or the introduction of other fields with similar dual roles could easily break this parsing logic. Consider if a more generalized or declarative approach could be used to define these section-field relationships.
Summary
docs/config.mddemonstrates every schema field and vice versa (catches undocumented fields)- **fieldName** _type_, REQUIRED) and cross-checks against schema properties (catches documented fields missing from schema or schema fields missing from docs)What this catches
Test plan
go test ./...passes with no regressionsgo vetcleanBuilds on #173
Relates to #91