Adding a new metric BRAS(Batch Removal Adapted Silhouette)#62
mumichae merged 3 commits into openproblems-bio:main
Hi there,

This is Pia, the first author of the publication introducing BRAS. What a coincidence! I was just reviewing the contribution guidelines for OpenProblems and came across your (@seohyonkim) pull request implementing the metric. It's wonderful to see its adoption!

I wanted to let you know that, in addition to the basic BRAS implementation in the reproducibility notebook you are using, a scalable implementation is now available in the latest release of the scib-metrics package: https://github.com/YosefLab/scib-metrics. I don't know how large the benchmark datasets in the batch integration task are, but the scib-metrics implementation might be worth considering. Let me know if you have any questions!

Best,

P.S.: Thanks to all maintainers and contributors for this valuable community effort!
Hello @prauten! Thanks for checking out this PR and leaving a comment, and also for linking the newer version of BRAS! I'd love to implement the newer one. Do the original metric and the newer version produce the same results?
Hi @seohyonkim, my pleasure! And yes, I verified the equivalence of the scalable implementation of BRAS with the naive implementation presented in the preprint (see https://github.com/YosefLab/scib-metrics/blob/12b6f354ac0305c8c81859696ed57f7e3c75927c/tests/test_BRAS_metric.py).
mumichae left a comment
LGTM! Removing the boilerplate code would be preferable
Could you remove the boilerplate comments for better readability?
The BRAS (Batch Removal Adapted Silhouette) metric modifies the standard silhouette score to account for batch effects in single-cell data integration benchmarking.
Instead of measuring how well a cell matches its biological label cluster compared to other clusters (as in the regular silhouette), BRAS compares how well it matches its biological cluster in its own batch versus the same biological cluster in other batches.
For each cell, BRAS computes ai = the average distance to cells with the same label in the same batch, and bi = the average distance to cells with the same label in different batches.
It then plugs ai and bi into the standard silhouette formula.
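The description above can be illustrated with a naive sketch. This is not the code from this PR or from scib-metrics: the function name `bras_per_cell` is hypothetical, Euclidean distance is an assumption, and pooling all other batches into one group when computing bi is only one reading of the description. Any final rescaling or aggregation of the per-cell values is not described here and is left out.

```python
import numpy as np

def bras_per_cell(X, labels, batches):
    """Naive per-cell sketch of the ai/bi silhouette described above (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    batches = np.asarray(batches)
    n = X.shape[0]

    # Full pairwise Euclidean distance matrix (assumption: Euclidean; O(n^2) memory).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    scores = np.full(n, np.nan)
    for i in range(n):
        same_label = labels == labels[i]
        same_batch = batches == batches[i]

        own = same_label & same_batch
        own[i] = False                      # exclude the cell itself from ai
        other = same_label & ~same_batch    # same label, any other batch (pooled: an assumption)

        # Leave NaN when either group is empty; the score is undefined there.
        if not own.any() or not other.any():
            continue

        a_i = dist[i, own].mean()
        b_i = dist[i, other].mean()
        denom = max(a_i, b_i)
        # Standard silhouette formula: (b - a) / max(a, b).
        scores[i] = 0.0 if denom == 0 else (b_i - a_i) / denom
    return scores
```

Values near 1 mean a cell sits much closer to same-label cells from its own batch than to same-label cells from other batches, which indicates a remaining batch effect under this reading. The quadratic distance matrix makes this sketch unsuitable for large datasets.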
Describe your changes
This PR implements a new metric, BRAS.
Checklist before requesting a review
I have performed a self-review of my code
Check the correct box. Does this PR contain:
Proposed changes are described in the CHANGELOG.md
CI Tests succeed and look good!