Support for Score Ranges with an Unspecified Classification#414

Merged
bencap merged 3 commits into release-2025.1.2 from feature/bencap/413/not-specified-score-range
Mar 28, 2025

@bencap bencap commented Mar 26, 2025

To support all pillar project data sets, it is necessary to support score ranges without an explicit normal/abnormal classification. This implies some changes to existing validation logic:

  • The wild type score is no longer a required field.
  • If you provide a wild type score, you must also provide at least one range with a normal classification that contains it.
  • If you provide at least one score range with a normal classification, a wild type score is required, with the same containment requirement as above.
  • Users may provide the new not_specified classification, along with any combination of ranges with the existing classifications.
  • All other validation restrictions remain in place and also apply to the new not_specified classification.
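
The rules above can be sketched roughly as follows. This is only an illustration, not the validator in this PR: `ScoreRange`, `validate_score_ranges`, the `lower`/`upper` fields, and the choice of inclusive bounds with `None` meaning unbounded are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

ALLOWED_CLASSIFICATIONS = {"normal", "abnormal", "not_specified"}


@dataclass
class ScoreRange:
    # Hypothetical stand-in for the real model; None means an unbounded end.
    classification: str
    lower: Optional[float]
    upper: Optional[float]

    def contains(self, value: float) -> bool:
        # Assumption: both bounds are inclusive.
        above = self.lower is None or value >= self.lower
        below = self.upper is None or value <= self.upper
        return above and below


def validate_score_ranges(wt_score: Optional[float], ranges: Sequence[ScoreRange]) -> None:
    for r in ranges:
        if r.classification not in ALLOWED_CLASSIFICATIONS:
            raise ValueError(f"unknown classification: {r.classification!r}")

    normal_ranges = [r for r in ranges if r.classification == "normal"]

    if wt_score is None:
        # No wild type score is fine unless a normal range was provided.
        if normal_ranges:
            raise ValueError("a wild type score is required when a normal range is provided")
        return

    # A wild type score needs a normal range, and one of them must contain it.
    if not normal_ranges:
        raise ValueError("a normal range is required when a wild type score is provided")
    if not any(r.contains(wt_score) for r in normal_ranges):
        raise ValueError("the wild type score must fall within a normal range")
```

Under these assumptions, a submission made up only of `not_specified` or `abnormal` ranges with no wild type score passes, while a `normal` range without a wild type score is rejected.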

Elimination of the wild type score requirement might have some impact on work for VariantEffect/mavedb-ui#346.

As part of these changes, a new file utils.py has also been added to MaveDB lib code. This file currently contains only one new function, which helps with string sanitization for score ranges, but it is intended to hold other shared library utilities as well. At some point, we should make an effort to refactor shared utilities into it.

To support all pillar project data sets, it is necessary to support score ranges without an explicit
classification. This requires some changes to existing validation logic:
- The wild type score is no longer always required. If you provide a score range with `normal` classification,
the wild type score is required and must fall within that range.
- If you do provide a wild type score, you must provide at least one range with `normal` classification.
- Users may provide a new `Not Specified` classification, which carries neither normal nor abnormal connotations.
- All other validation restrictions remain in place and also apply to the new classification.

As part of these changes, a new file `utils.py` has been added to mavedb lib code. This file currently contains only one
new function, which helps with string sanitization for score ranges, but it is intended to hold other shared library
utilities as well. At some point, we should make an effort to refactor shared utilities into it.
import re


def sanitize_string(s: str):

Wish I knew what to call this other than "sanitize." It's something like formatting a string for use as a symbol (in Ruby).

And I guess it allows the client to submit a value like "not specified" instead of the actual value, "not_specified." I'm not sure we need to allow this; I would typically treat options in the UI as a set of objects with values and titles like {value: 'not_specified', title: 'Not specified'}. But I don't really object to this flexibility in the API, either.
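
For context, the kind of normalization being discussed might look like the sketch below. The body of `sanitize_string` is not shown in this thread, so this implementation is a guess at the behavior (trim, lowercase, collapse whitespace into underscores), not the code under review.

```python
import re


def sanitize_string(s: str) -> str:
    # Hypothetical body: trim, lowercase, and collapse runs of whitespace into "_".
    return re.sub(r"\s+", "_", s.strip().lower())
```

With this guess, both "Not Specified" and "not specified" would normalize to "not_specified", which is what would let the API accept the display title in place of the stored value.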

But it looks like maybe we're storing the actual value submitted by the client, rather than the result of sanitize_string. Is that right?

@bencap bencap Mar 27, 2025

I suppose we are storing the actual value, but when you put it like that I don't really love the behavior either.
I think this is a good point, and I'm inclined to just do it right. Making the classification a literal would also eliminate some custom validation code, so it would be more maintainable too.
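
A minimal sketch of what the literal approach could look like, using `typing.Literal` from the standard library. The names `Classification` and `parse_classification` are hypothetical, and the final shape in the codebase may differ.

```python
from typing import Literal, cast, get_args

# Hypothetical alias: the only accepted classification values.
Classification = Literal["normal", "abnormal", "not_specified"]


def parse_classification(value: str) -> Classification:
    # Reject anything that is not exactly one of the literal values,
    # so the stored value always matches what was validated.
    if value not in get_args(Classification):
        raise ValueError(f"invalid classification: {value!r}")
    return cast(Classification, value)
```

With a literal, a value such as "not specified" is rejected outright instead of being normalized for comparison and then stored as submitted.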

@bencap bencap force-pushed the feature/bencap/413/not-specified-score-range branch from 25523f0 to 254c338 Compare March 27, 2025 20:10
@bencap bencap merged commit 096b851 into release-2025.1.2 Mar 28, 2025
5 checks passed
@bencap bencap deleted the feature/bencap/413/not-specified-score-range branch March 28, 2025 19:03
@bencap bencap mentioned this pull request Mar 28, 2025
Development

Successfully merging this pull request may close these issues.

Add a Not Specified Score Range Classification