Speed up DAMP #904
-
Maybe I'm not getting your point, but I'm wondering if you are conflating two things here:
The best place to use multiple threads is when you have processes (or slices) that are completely independent of each other and do not rely on the (final or intermediate) outputs of the other threads. In your case, it appears that there is a global/shared …
-
Item (2) occurs after each thread finishes its local DAMP pass, each on its own, without needing the best-so-far distance from the other threads. Let's say I have two threads and my time series has length 100M. According to the original DAMP, we start from the beginning and update the best-so-far discord distance as we move toward the end of the time series. This computation happens in one thread. However, while we are going through the first half (i.e. the first 50M data points), we can, in parallel, compute a best-so-far distance (bsf) on the second half of the time series [this procedure does not depend on the best-so-far distance of the first half]. The bsf computed in the second thread is just an approximation: it is an upper bound on the discord distance of any potential discord located in that slice. At the end, when the computation in both threads finishes, I can compare the bsf of the first half [which is exact] with the bsf of the second half [which is approximate, an upper bound]. If the former is still larger, I can skip the next 50M data points.
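If I understand the two-thread scheme correctly, it could be sketched roughly as follows. All names here are mine: `left_nn_dist` and `best_so_far` are brute-force, non-z-normalized stand-ins for DAMP's backward processing, there is no forward pruning, the first `m` subsequences of the second half are skipped for simplicity (they have no in-slice left neighbor and would need the exact pass anyway), and the pure-Python loops hold the GIL, so this only illustrates the logic rather than a real speedup:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def left_nn_dist(T, i, m, lo):
    """Distance from T[i:i+m] to its nearest non-overlapping left
    neighbor whose start index is >= lo (brute force, for clarity)."""
    q = T[i : i + m]
    return min(np.linalg.norm(q - T[j : j + m]) for j in range(lo, i - m + 1))

def best_so_far(T, m, start, stop, lo):
    """Max left-NN distance over subsequences starting in [start, stop),
    restricting left neighbors to start indices >= lo."""
    return max(left_nn_dist(T, i, m, lo) for i in range(start, stop))

rng = np.random.default_rng(0)
T = rng.standard_normal(400)
n, m = len(T), 10
half = n // 2

with ThreadPoolExecutor(max_workers=2) as ex:
    # Thread 1: exact pass over the first half, all left neighbors allowed.
    f1 = ex.submit(best_so_far, T, m, m, half, 0)
    # Thread 2: approximate pass over the second half, with left neighbors
    # limited to the second half itself -> an upper bound on the true
    # (full-neighborhood) discord distance of that slice.
    f2 = ex.submit(best_so_far, T, m, half + m, n - m + 1, half)
    bsf_exact, bsf_ub = f1.result(), f2.result()

if bsf_exact >= bsf_ub:
    print("second half can be skipped entirely")
else:
    print("second half must be re-examined with its full left neighborhood")
```

The upper-bound property holds because restricting the left-neighbor set can only increase a subsequence's nearest-neighbor distance; adding the neighbors back can only shrink it.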
-
[If you are not familiar with DAMP, first see #606]
According to the paper, the DAMP algorithm is designed to perform early abandoning, which can speed up discord discovery. Recently, I have been thinking about whether it is possible to take advantage of parallelism as well. I have an idea but have not tested it yet. To avoid premature optimization, I prefer to just explain the idea here for our future reference.
Let's say I have a time series `T` with length 1250, and the first 250 data points are considered the training part. Also, let's say I am looking for the top discord with length `m` in `T[250:]`. The existing DAMP algorithm starts at the subsequence `T[250:250+m]` and does backward processing to see whether its distance to its left 1-nearest-neighbor is greater than the best-so-far discord distance or not. It also performs forward processing to prune some of the forthcoming subsequences. It then jumps to the next not-yet-pruned subsequence, repeats the backward/forward processing, and so on.

Let's see if we can use more than one thread to expedite the process. Let's say I have four threads. Therefore:
For each thread, we perform DAMP locally: thread 1 on `T[250:500]`, thread 2 on `T[500:750]`, thread 3 on `T[750:1000]`, and thread 4 on `T[1000:1250]`, where each thread only considers left neighbors that lie within its own slice.
Let's consider the second thread, which explores subsequences in `T[500:750]`. Note that for each subsequence in this slice, all subsequences in `T[250:500]` are also its left neighbors, but they were not considered in the discord-discovery process of the second thread. There is an interesting point here: the local best-so-far distance `d_j` found by the second thread is an upper bound on the true discord distance of the subsequences in the slice `T[500:750]`, because adding more left neighbors can only shrink a subsequence's nearest-neighbor distance. So, we can divide `T`, perform DAMP locally on one slice per thread, and then go from thread 1 to thread 4 to see whether we can skip a whole slice. Let's say the best-so-far distance according to thread 1 is 10, i.e. `d_i = 10`. If `d_j < d_i` (e.g. `d_j == 9`), then there is no need to explore `T[500:750]` at all. If `d_j > d_i`, then we should explore `T[500:750]`, this time also considering the left neighbors in `T[250:500]`.