[WIP] Discovering Discords of arbitrary length using MERLIN #417 #505

Merged 213 commits on Apr 9, 2022
4bee517
add introduction part to better understand the idea behind MERLIN
NimaSarajpoor Dec 28, 2021
7d2b794
Implement first phase of DRAG (DRAG is first part of MERLIN)
NimaSarajpoor Dec 28, 2021
a0e8d97
fix some typos and small enhancement in explanation/comments
NimaSarajpoor Dec 28, 2021
efffc23
fix exclusion zone to make it consistent with STUMPY
NimaSarajpoor Dec 31, 2021
0a67162
change MatrixProfile to matrix profile
NimaSarajpoor Dec 31, 2021
402983a
use a more understandable function for select_candidates. Also, correc…
NimaSarajpoor Dec 31, 2021
1ff88f0
fix a typo and delete some incorrect discussion about min_dist
NimaSarajpoor Dec 31, 2021
c859bf6
add calculation behind the min_dist in normalized and non-normalized …
NimaSarajpoor Dec 31, 2021
0e5b716
do slight improvement on the explanation of calculating non-normalize…
NimaSarajpoor Jan 1, 2022
e81b122
explain the 'how' of updating min_dist in both normalized and non-nor…
NimaSarajpoor Jan 1, 2022
ecc253e
fix small things and improve explanation to make sure the main issues…
NimaSarajpoor Jan 1, 2022
603e4e8
implement second phase of DRAG (DRAG is first part of MERLIN).
NimaSarajpoor Jan 1, 2022
eef6bdd
review and fix small errors
NimaSarajpoor Jan 2, 2022
39d4a80
change the name cand_indices to cand_index to make it consistent with…
NimaSarajpoor Jan 2, 2022
eb7bc7c
small enhancement in the _select_candidates and _prune_candidates fun…
NimaSarajpoor Jan 3, 2022
83ed308
implement third phase to find discord from chosen candidates
NimaSarajpoor Jan 3, 2022
62d90e3
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Jan 3, 2022
013f770
fix and improve Docstrings
NimaSarajpoor Jan 3, 2022
28c9ad2
remove the redundant argument min_dist to make the function cleaner
NimaSarajpoor Jan 3, 2022
f75c577
find discord by using matrix profile to show the output of MERLIN-par…
NimaSarajpoor Jan 3, 2022
6d32762
change argument from boolean array is_cands to cands that contains on…
NimaSarajpoor Jan 3, 2022
c90d15b
change dtype from uint32 to int64 to make it the same as the dtype of…
NimaSarajpoor Jan 3, 2022
1a46b9e
review and small enhancement
NimaSarajpoor Jan 3, 2022
7efb8e4
revise _find_discord function to find discord on the fly instead of s…
NimaSarajpoor Jan 6, 2022
a130ec6
refactor the code for finding discord via matrix profile
NimaSarajpoor Jan 6, 2022
993994a
add discussion on top-k discord
NimaSarajpoor Jan 7, 2022
51d74b3
implement top-k discord function
NimaSarajpoor Jan 7, 2022
19cf26c
review and small enhancement in the notebook
NimaSarajpoor Jan 7, 2022
c228056
add function to retrieve top-k discords using STUMPY matrix profile
NimaSarajpoor Jan 7, 2022
22a691f
correct the update of eligible_cands in for-loop for finding top-k di…
NimaSarajpoor Jan 7, 2022
61962e1
fix a small error in updating min_dist in _find_top_k_discords and ad…
NimaSarajpoor Jan 7, 2022
6909c2f
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Jan 7, 2022
4377067
review and improve discussion of top-k discords
NimaSarajpoor Jan 7, 2022
e060cda
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Jan 8, 2022
466c023
change name of function to murlin
NimaSarajpoor Jan 8, 2022
7e3d346
use STUMPY core.apply_exclusion_zone instead of manually applying exc…
NimaSarajpoor Jan 8, 2022
7cde10c
use np.NINF as initial value for distance of a discord to its Nearest…
NimaSarajpoor Jan 8, 2022
1abe872
replace the rate of change 0.5 with 0.99. The value 0.5 will be used …
NimaSarajpoor Jan 8, 2022
f893e56
fix a small typo
NimaSarajpoor Jan 8, 2022
769b34b
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Jan 8, 2022
cb3fe82
change dtype of P to np.float64 so it can be used in apply_excl_zone
NimaSarajpoor Jan 9, 2022
41d8390
use exclude instead of eligible_cands to improve readability. Also: c…
NimaSarajpoor Jan 9, 2022
e9888fb
change the update of min_dist in murlin function by adding two new ar…
NimaSarajpoor Jan 10, 2022
d721c2f
Improve explanation and revise docstring
NimaSarajpoor Jan 10, 2022
06f2391
add a note to explain the reason behind updating min_dist before chec…
NimaSarajpoor Jan 11, 2022
0727ac7
minor improvements in notebook
NimaSarajpoor Jan 14, 2022
4a12f5e
change while- to for- loop to increase stability of the code
NimaSarajpoor Jan 14, 2022
be7df2f
add break to for-loop to stop iteration when all indices are excluded
NimaSarajpoor Jan 14, 2022
3d08b2e
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Jan 14, 2022
24b81e7
implement merlin (with notes) that finds discords of different lengths
NimaSarajpoor Jan 14, 2022
591ff57
change the format of output of merlin function. Also, provide an exam…
NimaSarajpoor Jan 14, 2022
dbf629e
create a function to discover top-k discords of arbitrary length using…
NimaSarajpoor Jan 14, 2022
a11cb62
use np.testing module to compare the results of merlin and stumpy-base…
NimaSarajpoor Jan 14, 2022
e4bfbc2
review notebook and minor changes
NimaSarajpoor Jan 14, 2022
dc7d024
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Jan 19, 2022
8ca3b67
change variable exclude to include to improve readability
NimaSarajpoor Jan 22, 2022
e0a8ab5
use arguments T (timeseries) and m (window size) as the input instead…
NimaSarajpoor Jan 22, 2022
50dd9a9
use stumpy.mass to calculate distance between subsequences on the fly…
NimaSarajpoor Jan 23, 2022
9ad0fc7
add function to get ranges (chunks) of continuous segments of array
NimaSarajpoor Jan 23, 2022
eb3ae25
use stumpy.mass to calculate distance between subsequences on the fly
NimaSarajpoor Jan 23, 2022
9f74ab5
use stumpy.mass instead of dot product to calculate distance between …
NimaSarajpoor Jan 23, 2022
2088230
minor refactoring. Also, use _get_chunks_ranges instead of get_mask_s…
NimaSarajpoor Jan 23, 2022
bd27dba
minor refactoring. Also, add 1e-6 tolerance for finding discord
NimaSarajpoor Jan 23, 2022
770e500
minor revision to improve readability
NimaSarajpoor Jan 23, 2022
8a69264
use variable decay as the rate of change of min_dist
NimaSarajpoor Jan 23, 2022
0bc384b
remove some if/else to make the code cleaner
NimaSarajpoor Jan 23, 2022
fa2de76
clean some codes and explanation
NimaSarajpoor Jan 24, 2022
db86742
use core._mass instead of stumpy.mass to reduce computation time
NimaSarajpoor Jan 24, 2022
65765cb
remove 1e-6 as it was no longer needed after changing stumpy.mass to co…
NimaSarajpoor Jan 24, 2022
9ea3049
use flatnonzero() to convert boolean mask to array of indices
NimaSarajpoor Jan 24, 2022
ee7ca0c
minor changes
NimaSarajpoor Jan 24, 2022
61bd947
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Jan 26, 2022
efc9c37
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Jan 30, 2022
01b3aa6
use name M_T and Σ_T to be consistent with STUMPY, and make them requ…
NimaSarajpoor Jan 30, 2022
4c36528
update docstring and remove redundant branching logic
NimaSarajpoor Jan 30, 2022
cb5ec84
add note about finding Tmax/Tmin in the presence of infinite values i…
NimaSarajpoor Jan 30, 2022
05a795a
rename some variables to be consistent with STUMPY and revise some ot…
NimaSarajpoor Jan 30, 2022
0ea67c0
remove temporary, intermediate variable to increase readability
NimaSarajpoor Jan 30, 2022
35b5eda
change k to n_discords to make it consistent with the name of paramet…
NimaSarajpoor Jan 31, 2022
5a89a83
rename murlin to _murlin (make it private function), and rename merli…
NimaSarajpoor Jan 31, 2022
47b85d2
add pre-processing on T
NimaSarajpoor Jan 31, 2022
ff1a5b6
fix bug on checking values of decay
NimaSarajpoor Jan 31, 2022
b88b29f
make a required parameter and add an optional parameter include in _…
NimaSarajpoor Jan 31, 2022
96a76e3
consider np.nan/np.inf values in T when discovering discords
NimaSarajpoor Jan 31, 2022
fbf5a6b
consider np.nan/np.inf values in T when discovering discords
NimaSarajpoor Jan 31, 2022
6dc9307
revise docstrings and some minor changes
NimaSarajpoor Jan 31, 2022
1341005
add an example of discords discovery in the presence of np.inf values…
NimaSarajpoor Jan 31, 2022
db6729b
fix bug when checking values of decay
NimaSarajpoor Jan 31, 2022
756ffe5
change min_/max_L to min_/max_m to be consistent with STUMPY
NimaSarajpoor Jan 31, 2022
c15aed3
change min_L to min_m and max_L to max_m to be consistent with STUMPY
NimaSarajpoor Jan 31, 2022
2366a14
add check on min_dist so if it is negative or zero, no need to move…
NimaSarajpoor Jan 31, 2022
1820c9c
apply _murlin on real data set (taxi data) and compare its outcome wi…
NimaSarajpoor Jan 31, 2022
a4b371b
minor changes throughout notebook
NimaSarajpoor Feb 1, 2022
4c999e8
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Feb 1, 2022
b0b8981
fix docstring
NimaSarajpoor Feb 1, 2022
4b9b85d
delete an unrelated file
NimaSarajpoor Feb 6, 2022
855d7b0
return only finite-values discords with _murlin
NimaSarajpoor Feb 6, 2022
de67233
return finite-values discords with STUMPY
NimaSarajpoor Feb 6, 2022
b1a642d
recalculate T_subseq_isfinite to make code cleaner. Also some minor c…
NimaSarajpoor Feb 6, 2022
4c999c5
add comments to avoid making mistake about subsequence index space
NimaSarajpoor Feb 6, 2022
f1af668
minor changes
NimaSarajpoor Feb 11, 2022
3b6d3a3
compare STUMPY / MERLIN on noisy data next to each other
NimaSarajpoor Feb 11, 2022
ced9c5f
use prescrump to refine min_dist and include
NimaSarajpoor Feb 11, 2022
2958933
add new case: randomly generated time series
NimaSarajpoor Feb 11, 2022
081f0b5
minor changes
NimaSarajpoor Feb 11, 2022
f6ae11e
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Feb 11, 2022
6aed062
correct markdown
NimaSarajpoor Feb 11, 2022
6b31c92
refine min_dist after each iteration of finding discord
NimaSarajpoor Feb 11, 2022
f1bc390
add verbose to _murlin to track the progress
NimaSarajpoor Feb 12, 2022
642b029
minor refactoring
NimaSarajpoor Feb 12, 2022
d5a6bfb
use higher initial value of min_dist in case study to show benefit of…
NimaSarajpoor Feb 12, 2022
0cfc89a
revise docstrings
NimaSarajpoor Feb 13, 2022
f6acaba
revise docstrings
NimaSarajpoor Feb 13, 2022
c275f08
use core.preprocess to increase safety of the code
NimaSarajpoor Feb 13, 2022
c8b3edb
minor change to increase readability
NimaSarajpoor Feb 14, 2022
fe33c9b
minor changes throughout the notebook
NimaSarajpoor Feb 14, 2022
ec22503
fix bugs to get same result from both STUMPY and _murlin
NimaSarajpoor Feb 14, 2022
1b30e35
refactor _select_candidates and _prune_candidates
NimaSarajpoor Feb 14, 2022
70b6bdc
Set all elements of decay to 0.01
NimaSarajpoor Feb 14, 2022
485a75e
set all elements of decay to 0.01
NimaSarajpoor Feb 14, 2022
6ef87f6
change verbose to better track results and see further details
NimaSarajpoor Feb 14, 2022
061049d
use (approx) matrix profile to skip some of candidates
NimaSarajpoor Feb 14, 2022
e7673c4
minor changes
NimaSarajpoor Feb 15, 2022
c90d4a1
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Feb 16, 2022
14b2048
revised refactoring of DRAG phase I
NimaSarajpoor Feb 18, 2022
26b6edf
add _get_approx_P module to calculate approx matrix profile
NimaSarajpoor Feb 18, 2022
e3fab91
revise _murlin
NimaSarajpoor Feb 18, 2022
8cf333e
removed unnecessary intermediate variable
NimaSarajpoor Feb 18, 2022
07d86ca
fix small bug in _murlin
NimaSarajpoor Feb 18, 2022
4245724
test new changes on random generated data
NimaSarajpoor Feb 18, 2022
78af44b
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Feb 18, 2022
4e6e625
small changes throughout notebook
NimaSarajpoor Feb 18, 2022
95f6bdf
change type of argument to bool to improve readability
NimaSarajpoor Feb 24, 2022
4f9d832
add parameter finite to allow user control to consider infinite discor…
NimaSarajpoor Feb 24, 2022
9a7ed1f
add parameter shift to control the stop index of the slices
NimaSarajpoor Feb 24, 2022
27f1448
minor changes throughout notebook
NimaSarajpoor Feb 24, 2022
4926009
revise markdown and docstring
NimaSarajpoor Feb 25, 2022
d3aba75
Fix Docstrings of private functions
NimaSarajpoor Feb 25, 2022
8f00b4c
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Feb 25, 2022
7342048
change design of _get_chunks_ranges and fix docstring
NimaSarajpoor Feb 25, 2022
cefc65d
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Mar 4, 2022
05fd8a2
slight change to increase readability
NimaSarajpoor Mar 4, 2022
4f6ad7f
update functions to use their parallelized version
NimaSarajpoor Mar 4, 2022
5b793dd
change variable names
NimaSarajpoor Mar 4, 2022
7d27253
merge two if-block
NimaSarajpoor Mar 4, 2022
0f5cf9e
slight Changes
NimaSarajpoor Mar 4, 2022
d652389
Refactored _discord
NimaSarajpoor Mar 4, 2022
d248229
slight changes
NimaSarajpoor Mar 4, 2022
61838cf
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Mar 4, 2022
c1208fe
Several minor Changes
NimaSarajpoor Mar 4, 2022
32361fb
improve docstring
NimaSarajpoor Mar 4, 2022
ab20830
improve markdown and docstring
NimaSarajpoor Mar 4, 2022
e3375ed
rearrange lines to improve readability
NimaSarajpoor Mar 6, 2022
6ee2808
Fixed error message
NimaSarajpoor Mar 13, 2022
890e713
write top-level function discords (with MERLIN / with STUMPY)
NimaSarajpoor Mar 14, 2022
7187519
Undo some changes, and ignore inf values when getting max
NimaSarajpoor Mar 14, 2022
638287a
replace np.concatenate with np.r_
NimaSarajpoor Mar 14, 2022
3c819f0
revise discords function
NimaSarajpoor Mar 14, 2022
ff348c3
Added case where data has inf values, And some minor changes
NimaSarajpoor Mar 14, 2022
23893e6
small changes to codes, docstrings, markdown
NimaSarajpoor Mar 14, 2022
7a6b209
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Mar 14, 2022
5dc3e83
Added missing inputs, improve docstring and markdown
NimaSarajpoor Mar 14, 2022
b3e7604
remove the term noisy from name of variable for np.inf case
NimaSarajpoor Mar 14, 2022
a7701ae
make input min_dist optional in _discords
NimaSarajpoor Mar 16, 2022
a0ade64
remove trivial function _get_max_dist
NimaSarajpoor Mar 16, 2022
ec12be1
remove error message
NimaSarajpoor Mar 16, 2022
37538ad
Removed verbose
NimaSarajpoor Mar 16, 2022
b85dfa6
use default value None for min_dist
NimaSarajpoor Mar 16, 2022
0d2bb70
use list to collect outputs
NimaSarajpoor Mar 16, 2022
dc9d7a9
clean function discords_alternative
NimaSarajpoor Mar 16, 2022
6d5fd58
Removed unnecessary mask
NimaSarajpoor Mar 16, 2022
1b16966
minor changes
NimaSarajpoor Mar 16, 2022
b8fba55
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Mar 16, 2022
4315014
rename variable top_1_finite_discord_dist
NimaSarajpoor Mar 16, 2022
4f0d0cb
apply changes to matrix profile-based discords
NimaSarajpoor Mar 16, 2022
e9fa027
change variable name for consistency
NimaSarajpoor Mar 16, 2022
07256e7
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Mar 16, 2022
04bf914
Revise function _discords
NimaSarajpoor Mar 23, 2022
f8a6ce6
Correct some coding styles
NimaSarajpoor Mar 23, 2022
1dc767b
ADD condition for parameter min_dist
NimaSarajpoor Mar 23, 2022
6948346
Retrieve decay format, ADD else-break in for-loop
NimaSarajpoor Mar 23, 2022
3357959
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Mar 23, 2022
0936def
revise function _discords by removing unnecessary for-loop
NimaSarajpoor Mar 28, 2022
343d0f9
revise comments in function _discords
NimaSarajpoor Mar 28, 2022
0a7d149
slight changes in _discords
NimaSarajpoor Mar 28, 2022
595e0e8
revise function _find_discords
NimaSarajpoor Mar 30, 2022
bf995cf
revise function _discords
NimaSarajpoor Mar 30, 2022
cdf45b1
change variable name min_dist to r
NimaSarajpoor Mar 30, 2022
d6c6d83
use variable name P and I to be consistent with STUMPY
NimaSarajpoor Mar 31, 2022
7a40846
change min_dist to r/ r_init, depending on function
NimaSarajpoor Mar 31, 2022
adb4901
slight changes in code and docstrings
NimaSarajpoor Mar 31, 2022
409f6b2
minor changes
NimaSarajpoor Mar 31, 2022
34be459
Use max instead of if-block
NimaSarajpoor Apr 3, 2022
c3136ef
change variable name r_init to r, and r to r_copy
NimaSarajpoor Apr 3, 2022
1587d7c
retrieve if-block for checking r_copy
NimaSarajpoor Apr 3, 2022
016c5bf
change variable name r_copy to r_updated
NimaSarajpoor Apr 3, 2022
1ceef88
use max instead of if-block
NimaSarajpoor Apr 3, 2022
33ecadb
return numpy ndarray instead of three lists
NimaSarajpoor Apr 3, 2022
ab2cf6d
Revise function _discords
NimaSarajpoor Apr 3, 2022
463b680
minor changes throughout notebook
NimaSarajpoor Apr 3, 2022
c0c66d3
Major change in top-level function discords
NimaSarajpoor Apr 3, 2022
6636050
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Apr 3, 2022
8becba7
minor changes in the notebook
NimaSarajpoor Apr 3, 2022
ba97dff
retrieve if-check and remove indexer
NimaSarajpoor Apr 4, 2022
ba081a2
remove indexer and use append instead
NimaSarajpoor Apr 4, 2022
52c972e
Merge branch 'main' into Discord_MERLIN
NimaSarajpoor Apr 5, 2022
831f64b
minor Changes
NimaSarajpoor Apr 7, 2022
59fa1db
Revise while-loop structure to increase readability
NimaSarajpoor Apr 7, 2022
16b28bd
minor changes
NimaSarajpoor Apr 7, 2022
053f38f
minor changes throughout notebook
NimaSarajpoor Apr 7, 2022
20593b8
minor changes
NimaSarajpoor Apr 9, 2022
a25e307
improve the whole notebook
NimaSarajpoor Apr 9, 2022
169a07e
minor change in a comment
NimaSarajpoor Apr 9, 2022
410 changes: 410 additions & 0 deletions docs/Tutorial_DiscordMERLIN.ipynb

Large diffs are not rendered by default.