Support for Score Ranges with an Unspecified Classification #414
To support all pillar project data sets, it is necessary to support score ranges without an explicit classification. This requires some changes to existing validation logic:
- The wild type score is no longer always required. If you provide a score range with a `normal` classification, the wild type score is required and must fall within that range.
- If you do provide a wild type score, you must also provide at least one `normal` classification.
- Users may provide a new `Not Specified` classification, which comes free of normal and abnormal connotations.
- All other validation restrictions remain in place and also apply to the new classification.

As part of these changes, a new file `utils.py` has been added to mavedb lib code. This file at present contains only one new function to help with string sanitization for score ranges, but should be used for other shared library utilities. At some point, we should make an effort to refactor shared utilities into it.
src/mavedb/lib/utils.py
    import re


    def sanitize_string(s: str):
Wish I knew what to call this other than "sanitize." It's something like formatting a string for use as a symbol (in Ruby).
And I guess it allows the client to submit a value like "not specified" instead of the actual value, "not_specified". I'm not sure we need to allow this; I would typically treat options in the UI as a set of objects with values and titles, like {value: 'not_specified', title: 'Not specified'}. But I don't really object to this flexibility in the API, either.
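For readers following the thread, here is a rough sketch of the kind of normalization being discussed. The body of `sanitize_string` is not shown in this diff, so the implementation below is an assumption that only illustrates how "not specified" could be mapped onto "not_specified"; it is not the code under review.

```python
import re


def sanitize_string(s: str) -> str:
    """Hypothetical body: turn a free-form label into a symbol-like token."""
    # Trim, lowercase, and collapse each run of whitespace into one underscore.
    return re.sub(r"\s+", "_", s.strip().lower())


print(sanitize_string("Not Specified"))  # not_specified
print(sanitize_string("not specified"))  # not_specified
```

With a helper like this, validation can accept several spellings of the same option, which is the flexibility in question.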
But it looks like maybe we're storing the actual value submitted by the client, rather than the result of sanitize_string. Is that right?
I suppose we are storing the actual value, but when you put it like that I don't really love the behavior either.
I think this is a good point, and I'm inclined to just do it right. It would eliminate some custom validation code to just make it a literal so it's more maintainable too.
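A minimal sketch of the literal-based alternative, using only the standard library. The alias `Classification`, the helper `parse_classification`, and the exact set of allowed values are assumptions drawn from the classifications named in this PR, not the code that was eventually committed.

```python
from typing import Literal, get_args

# Assumed set of allowed values, based on the classifications named in this PR.
Classification = Literal["normal", "abnormal", "not_specified"]


def parse_classification(value: str) -> str:
    """Accept only an exact literal value; no normalization is applied."""
    allowed = get_args(Classification)
    if value not in allowed:
        raise ValueError(f"classification must be one of {allowed}, got {value!r}")
    return value
```

Because only exact values pass, the stored value and the validated value are always the same string, which addresses the concern about storing whatever the client submitted.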
25523f0 to 254c338
To support all pillar project data sets, it is necessary to support score ranges without an explicit `normal`/`abnormal` classification. This implies some changes to existing validation logic:
- If a wild type score is provided, there must be a range with a `normal` classification which contains the wild type score within it.
- If a range has a `normal` classification, a wild type score is required, with the same containment requirements as above.
- Users may provide ranges with a `not_specified` classification, along with any combination of ranges with the existing classifications.
- All other validation restrictions remain in place and also apply to the `not_specified` classification.

Elimination of the wild type score requirement might have some impact on work for VariantEffect/mavedb-ui#346.

As part of these changes, a new file `utils.py` has also been added to MaveDB lib code. This file at present contains only one new function to help with string sanitization for score ranges, but should be used for other shared library utilities. At some point, we should make an effort to refactor shared utilities into it.
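The wild type rules described in this PR can be sketched roughly as follows. `ScoreRange` and `validate_wild_type` are hypothetical names, ranges are assumed to have finite bounds, containment is assumed to be inclusive, and a wild type score inside any one `normal` range is assumed to be sufficient; the actual MaveDB validators may differ on each of these points.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScoreRange:
    classification: str  # assumed values: "normal", "abnormal", "not_specified"
    lower: float
    upper: float


def validate_wild_type(ranges: list[ScoreRange], wt_score: Optional[float]) -> None:
    """Raise ValueError if the wild type score and the ranges are inconsistent."""
    normal = [r for r in ranges if r.classification == "normal"]

    if wt_score is None:
        # No wild type score is fine unless a normal range was given.
        if normal:
            raise ValueError("a wild type score is required when a normal range is provided")
        return

    # A wild type score needs a normal range to live in.
    if not normal:
        raise ValueError("a normal range is required when a wild type score is provided")

    # Inclusive bounds are an assumption of this sketch.
    if not any(r.lower <= wt_score <= r.upper for r in normal):
        raise ValueError("the wild type score must fall within a normal range")
```

Under these assumptions, a data set containing only `not_specified` ranges validates without a wild type score, which is the case this PR is meant to enable.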