Skip to content

Add LAD framework review of highdicom#396

Draft
yarikoptic wants to merge 3 commits intoImagingDataCommons:masterfrom
yarikoptic:docs/lad-review
Draft

Add LAD framework review of highdicom#396
yarikoptic wants to merge 3 commits intoImagingDataCommons:masterfrom
yarikoptic:docs/lad-review

Conversation

@yarikoptic
Copy link
Copy Markdown
Contributor

Review evaluating highdicom for adoption in heudiconv using the LAD (LLM-Assisted Development) framework by @chrisfoulon along dimensions: code quality, testing, documentation, risk management, and integration feasibility.

Intended as a discussion document for the community, feel free to close... I just did it anyways so decided to share as a PR since easier to comment on specific aspects

Review evaluating highdicom for adoption in heudiconv using the LAD
(LLM-Assisted Development) framework dimensions: code quality,
testing, documentation, risk management, and integration feasibility.

Intended as a discussion document for the community.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@CPBridge
Copy link
Copy Markdown
Collaborator

Hi @yarikoptic

Thanks for this interesting review. It is heartening to see that the framework gives generally very positive comments about the highdicom! Certainly we are aiming for a tool that is stable and better engineered than typical research code, and it seems to think we have achieved that

For the sake of engaging in the discussion, there is probably one point where I think the review is a little generous. It describes our test data as "1.3GB of real DICOM test files covering modalities and transfer syntaxes" and "real-world DICOM, not synthetic". I'm not sure where the 1.3GB came from, we definitely have less than that within highdicom, though we also draw heavily on pydicom's test data as well as our own. I would note though that many of the test files for derived objects (segmentations, SRs) was generated with the library itself, which is certainly a limitation. This is an area where I have been wanting to improve things for a while by bringing in data created with other tools, e.g. from the Imaging Data Commons, during testing to ensure compatibility with other frameworks. Nevertheless, I do think our test suite remains a fairly comprehensive set of tests that covers a very large range of the likely variation that could be seen.

Verified all factual claims against codebase. Corrections:
- Test data: 1.3GB -> ~9 MB (30 files)
- Test modules: 17 -> 20
- Test lines: 26,428 -> ~26,300
- .rst files: 29 -> 31
- Release count: ~28 -> 46
- Commits: 1686 -> 1132; CPBridge 685 (65%) -> ~779 (69%)
- Contributors: 19 -> 28
- hackermd: 20% -> 14%
- mypy: "checked in CI" -> "in test deps, not run in CI"
- Jupyter notebooks (plural) -> notebook (singular)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@yarikoptic
Copy link
Copy Markdown
Contributor Author

yarikoptic commented Mar 23, 2026

Thank you @CPBridge for giving it a read! I made claude check where he got that info and answered

● Fixed. The "1.3GB" was hallucinated by the exploration subagent — the actual test data is ~9 MB (30 .dcm
files). Updated in both the local highdicom-lad-review.md and the worktree's docs/lad_review.md (committed as
b97c06e).

so I asked to check for other hallucinations -- here is what it found

  Confirmed hallucinations / errors (now fixed):

  ┌──────────────────────┬──────────────────────────────────┬──────────────────────────────────────────────┐
  │        Claim         │             Reality              │                    Source                    │
  ├──────────────────────┼──────────────────────────────────┼──────────────────────────────────────────────┤
  │ 1.3GB test data      │ ~9 MB, 30 files                  │ Already fixed earlier                        │
  ├──────────────────────┼──────────────────────────────────┼──────────────────────────────────────────────┤
  │ 26,428 lines / 17    │ ~26,300 lines / 20 modules       │ wc -l tests/test_*.py                        │
  │ modules              │                                  │                                              │
  ├──────────────────────┼──────────────────────────────────┼──────────────────────────────────────────────┤
  │ 29 .rst files        │ 31                               │ ls docs/*.rst | wc -l                        │
  ├──────────────────────┼──────────────────────────────────┼──────────────────────────────────────────────┤
  │ ~28 releases         │ 46 tags                          │ git tag | wc -l                              │
  ├──────────────────────┼──────────────────────────────────┼──────────────────────────────────────────────┤
  │ CPBridge: 685/1686   │ Chris Bridge: ~779/1132 (69%)    │ git shortlog -sn (two name variants)         │
  │ (65%)                │                                  │                                              │
  ├──────────────────────┼──────────────────────────────────┼──────────────────────────────────────────────┤
  │ hackermd 20%         │ 159/1132 = 14%                   │ same                                         │
  ├──────────────────────┼──────────────────────────────────┼──────────────────────────────────────────────┤
  │ 19 contributors      │ 28 unique authors                │ git log --format='%aN' | sort -u             │
  ├──────────────────────┼──────────────────────────────────┼──────────────────────────────────────────────┤
  │ mypy "checked in CI" │ Listed in [test] deps but never  │ .github/workflows/run_unit_tests.yml has no  │
  │                      │ invoked in the workflow          │ mypy step                                    │
  ├──────────────────────┼──────────────────────────────────┼──────────────────────────────────────────────┤
  │ "Jupyter notebooks"  │ 1 notebook                       │ find examples/ -name "*.ipynb"               │
  │ (plural)             │                                  │                                              │
  └──────────────────────┴──────────────────────────────────┴──────────────────────────────────────────────┘

  Verified as correct: flake8 line length 80, pydicom >=3.0.1, Python 3.10-3.14 (3.12 skipped),
  --cov-fail-under=80, seg 7k/sr 6k lines, peer-reviewed paper in J Digit Imaging (Springer s10278-022-00683-y),
   6 releases in 2025, Dockerized examples, get_volume_from_series() exists, NumPy-style docstrings via
  napoleon, MIT license, _modules.py ~27MB.

some of them are still not fully kosher -- like among others there are duplicates.. someone should produce .mailmap (I thought I already crafted a skill.... seems not!)

But do those look 'kosher'? (I did push them here)

edit: found - it was a command not skill... PR might be coming

791/1134 (70%) for Christopher P. Bridge, 20 unique contributors
(down from inflated 28 due to duplicate author identities).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@CPBridge
Copy link
Copy Markdown
Collaborator

Yes this looks better now I think

A note on the mypi thing: we used to have mypi enabled in the CI but then some shenanigans in a certain pydicom version led to an unworkable number of unhelpful errors and we had to disable it. This is overdue a revisit, so this serves as a helpful reminder!

@yarikoptic
Copy link
Copy Markdown
Contributor Author

A note on the mypi thing: we used to have mypi enabled in the CI but then some shenanigans in a certain pydicom version led to an unworkable number of unhelpful errors and we had to disable it. This is overdue a revisit, so this serves as a helpful reminder!

I think that's where smth like claude code with good instructions could be an indispensible tool to just get it done to concentrate on interesting/cool stuff instead ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants