-
Notifications
You must be signed in to change notification settings - Fork 340
[WIP] Add Tutorial and Derivations Notebooks for VALMOD #585 #586
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
|
@seanlaw |
|
I had a miscalculation. Although there is a typo in the paper, it seems the eq(2) of paper is correct. I fixed the typo of paper when I was doing calculation. However, I had a miscalculation somewhere else...so I corrected it and I got the eq(2)... for |
Codecov ReportPatch and project coverage have no change.
📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more Additional details and impacted files@@ Coverage Diff @@
## main #586 +/- ##
=======================================
Coverage 99.24% 99.24%
=======================================
Files 82 82
Lines 12956 12956
=======================================
Hits 12858 12858
Misses 98 98 Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
|
Notebook is ready. The notebook covers the first 12 pages of VALMOD_2020 paper. I FIXed my miscalculation and things are good! I also implemented the Low-Bound distance profile function for now to see how it performs (we may use it later in VALMOD algorithm). |
|
Just wanted to let you know that you can ignore the function '_calc_LB_dist_profile' at the end of notebook (it is working..but I think it is not clean. I may probably remove it as VALMOD algorithm does not use such function. I just created it to get Lower-Bound of distance profile for now to show the result) |
|
I will first need to go over the initial 12 pages myself and then I will review the notebook :) |
|
@NimaSarajpoor I've gone over your notebook quickly but haven't verified the derivation. Usually, with derivations, I like to write things out fully without skipping any steps (see https://github.com/TDAmeritrade/stumpy/blob/main/docs/Matrix_Profile_Derivation.ipynb). Some of your equations don't seem to be rendering for me and it's a bit hard for me to follow. I can try to find some time to work through the derivation to verify your work if that's helpful? |
|
I see. Please let me re-write it. I will try to follow the same approach/style you used in the link you provided. I will check ReviewNB and if it is rendered well, I will let you know. Sounds good? |
|
Yes, that would be great! Personally, I think writing out the derivation clearly will help (me) and others reduce any doubt in understanding. Also, I find that it provides an opportunity to help maintain consistency in the code regarding variable names. |
|
Weird...still not rendering well.... please let me do some investigation on my end to see what's going on... |
|
@seanlaw I guess you wrote the notebooks on your end and pushed them to the repo...and things were good when rendered locally in .ipynb. Did you, by any chance, try to check your notebooks via ReviewerNB of Github? It seems the problem is related to the ReviewerNB of Github. I enclosed the math equations with |
|
@seanlaw |
|
Sounds good |
|
Apologies, these comments are for an older commit. I forgot to hit "Finish Review" along with my last comment. |
|
@NimaSarajpoor I provided some comments and stopped at the "Expanding (3)" line |
|
@seanlaw
I found two of those comments in the ReviewNB. Maybe they got mixed together (?). I will address those two comments and then I ignore/resolve the rest. Please let me know if I miss anything. |
|
I think we are all set. I can push commits after revising the notebook. |
|
@seanlaw
|
- Fix typos - replace variable name n with k
So, should I now go and study |
|
@NimaSarajpoor Yes, I also added a new issue #592 where we can discuss it in more detail |
|
A couple of notes that are confirmed by the main author of VALMOD:
(This is to make sure that we do not lose this information later) |
We've come along way @NimaSarajpoor! I wonder how easy/hard it would be to implement VALMOD now that we have top-k nearest neighbors? |
We have indeed!
I took a quick look at the paper. I don't remember the details but I think the first four algorithms are the core ones. The first two algorithms are easy. The third one is already done (the top-k feature added to STUMPY). In my opinion, the main remaining task is algorithm 4. I think its implementation should be straightforward. |
|
I think the algorithm presented in a paper has a flaw. (I sent an email to the main author and am waiting for his response.) In the page14 of the paper, the following can be read:
Let's assume It is also possible that the author is aware of this, and considered it in other algorithms provided in the paper, like the ones presented for discovering motifs / discords. In other words, the aforementioned paragraph might just be written badly. |
The author,
As a side: |
| @@ -0,0 +1,1056 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the authors meant to avoid considering two subsequences (of different length) that start from the same index.
Add reference please
Reply via ReviewNB
| @@ -0,0 +1,1056 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| @@ -0,0 +1,1056 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| @@ -0,0 +1,1056 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add the missing "Z-"
"What about Z-normalized euclidean distance?"
===
===
If we follow the notation in the paper, this lower bound can be...
Instead, simply say: "The paper uses the notation .... However, here, we prefer to add subscript m to show the subsequence length based on which the LB is calculated.
Reply via ReviewNB
| @@ -0,0 +1,1056 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to mention this formula as I think we are not going to use it later (and when we do, we can mention it then). It is better to replace it with a formula that shows the relationship between rho and z-normalized distance.
Reply via ReviewNB
| @@ -0,0 +1,1056 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "THEN" part is huge. Better to provide the "WHAT" part, and then the "HOW"/"WHY" part (the proof)
Reply via ReviewNB
| @@ -0,0 +1,1056 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a note that the VALMOD's output is NOT the same as naive_VALMOD because, IIRC, VALMOD provides approximate matrix profile that has exact value for discord(s)
Reply via ReviewNB
| @@ -0,0 +1,1056 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| @@ -0,0 +1,1056 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line #45. is_mp_valid[:] = True
Do we need to include this output inside function? This is always all True. Maybe we should use it outside of this function, maybe in the caller function.
Reply via ReviewNB
| @@ -0,0 +1,1056 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line #50. def _VALMOD_stump_partial(T, m, k, LB_σr, LB_I):
This function is very hard to follow. Start with the original algorithm (see Algorithm 4), and see if it can be broken down to smaller pieces.
Reply via ReviewNB
This notebook addresses issue #585. In this notebook, we would like to implement the VALMOD method proposed in VALMOD_2018 and VALMOD_2020.
What I have done so far:
np.random.uniformtime series data.For now, I calculated LB given
q>0(see eq(2) in paper.) However, we still need to find LB whenq <= 0.