Add more columns for uppervalid alerts #785

JulienPeloton · 2024-01-16T14:56:59Z

IMPORTANT: Please create an issue first before opening a Pull Request.
Linked to issue(s): Closes #778

What changes were proposed in this pull request?

In this PR, we add the columns needed to reconstruct the apparent magnitude for uppervalid alerts.

We also optimise the way we push data to HBase by reducing the number of elements to push (requires fink-utils>=0.13.11). Previously, we pushed the entire history every night, whether or not it had changed from the point of view of uppervalid or upper. This led to redundant writes night after night, and it put a heavy load on the database by writing millions of elements unnecessarily.

The strategy is now different. We first check whether the last element of the history is uppervalid or upper, and only in that case do we push the history to the database. We could optimise further by inspecting the corresponding history and selecting only the latest elements (as opposed to all elements), but this would complicate the analysis too much for a very moderate gain.
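The selection rule described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from this PR: the alert layout (an `objectId` plus a time-ordered `history` list), the `tag` field with values `valid`, `uppervalid` and `upper`, and the helper names `needs_push` and `rows_to_push` are all hypothetical. It also assumes that only history entries of the requested kind become rows.

```python
from typing import Any, Dict, List

# Hypothetical alert layout (an assumption, not the real schema):
# {"objectId": str, "history": [{"jd": float, "tag": str}, ...]}
# with the history sorted by increasing time.
Alert = Dict[str, Any]


def needs_push(alert: Alert, kind: str) -> bool:
    """Return True if the latest history entry has the requested kind."""
    history = alert["history"]
    return bool(history) and history[-1]["tag"] == kind


def rows_to_push(alerts: List[Alert], kind: str) -> List[Dict[str, Any]]:
    """Explode the history of the selected alerts into one row per entry."""
    rows = []
    for alert in alerts:
        # New strategy: skip alerts whose latest entry is not of this kind,
        # on the assumption that their history was already pushed earlier.
        if not needs_push(alert, kind):
            continue
        for entry in alert["history"]:
            if entry["tag"] == kind:
                rows.append({"objectId": alert["objectId"], **entry})
    return rows


alerts = [
    # Latest entry is a valid measurement: nothing new to push.
    {"objectId": "A", "history": [
        {"jd": 1.0, "tag": "uppervalid"},
        {"jd": 2.0, "tag": "valid"},
    ]},
    # Latest entry is uppervalid: its uppervalid history is pushed.
    {"objectId": "B", "history": [
        {"jd": 1.0, "tag": "uppervalid"},
        {"jd": 1.5, "tag": "upper"},
        {"jd": 2.0, "tag": "uppervalid"},
    ]},
]

uppervalid_rows = rows_to_push(alerts, "uppervalid")
upper_rows = rows_to_push(alerts, "upper")
```

With this toy input, only alert `B` contributes rows for uppervalid, and no alert contributes rows for upper, since neither history ends with an upper limit.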

Here are some numbers for one night (2024/01/04, 237,104 alerts):

uppervalid

	old strategy	new strategy
# alert history to explode	237,104	7,980
# rows to push to HBase	1,196,247	20,965

upper

	old strategy	new strategy
# alert history to explode	237,104	165,810
# rows to push to HBase	2,804,673	2,168,893

Conclusion

The number of elements to push for uppervalid is drastically smaller! This is why we can now afford more columns. For upper, the number does not decrease much, because (1) the stream contains a lot of new objects each night (history full of upper limits), and (2) there are a lot of upper limits in between two valid measurements (this is probably where a more complex analysis would reduce the number of writes, but that's for later, maybe).
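To put the tables above in relative terms, the short script below computes the fractional reduction for each quantity. The counts are copied from the tables for the night of 2024/01/04; the `reduction` helper and the dictionary layout are introduced here for illustration only. The rows pushed to HBase drop by about 98% for uppervalid and by about 23% for upper.

```python
# Counts copied from the tables above, as (old strategy, new strategy).
counts = {
    "uppervalid": {
        "histories": (237_104, 7_980),
        "rows": (1_196_247, 20_965),
    },
    "upper": {
        "histories": (237_104, 165_810),
        "rows": (2_804_673, 2_168_893),
    },
}


def reduction(old: int, new: int) -> float:
    """Fraction of elements no longer processed with the new strategy."""
    return 1.0 - new / old


for kind, metrics in counts.items():
    for name, (old, new) in metrics.items():
        print(f"{kind:10s} {name:9s} {reduction(old, new):6.1%}")
```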

How was this patch tested?

Cloud & CI

karpov-sv · 2024-01-16T15:36:37Z

It seems you forgot to add distnr to the fields to keep for bad quality measurements - it is also important for reconstructing the lightcurve (to decide whether to apply magnr or not).

Also - do we need magpsf and sigmapsf for pure upper limits? I don't remember whether they actually contain anything or not - but they should be empty if I do not miss anything obvious

JulienPeloton · 2024-01-16T15:49:39Z

> It seems you forgot to add distnr to the fields to keep for bad quality measurements - it is also important for reconstructing the lightcurve (to decide whether to apply magnr or not).

Right, thanks! Added.

> Also - do we need magpsf and sigmapsf for pure upper limits? I don't remember whether they actually contain anything or not - but they should be empty if I do not miss anything obvious

No, we don't need them since they are NaN, and I do not include them:

# line 240
df_index = df_index.drop(*['magpsf', 'sigmapsf'])

sonarqubecloud · 2024-01-16T17:19:54Z

Quality Gate passed

The SonarCloud Quality Gate passed, but some issues were introduced.

1 New issue
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

JulienPeloton added 5 commits January 16, 2024 15:37

Update dependencies

4376a0f

Update HBase deps in the config file

14c7cc6

Keep updating the conf file

39106a1

Limit concurrency

2f81833

Fix typo

25e233e

Add distnr for uppervalid

63fe456

Fix typo

3a609da

JulienPeloton merged commit 5392714 into master Jan 30, 2024
19 of 22 checks passed

JulienPeloton added the index table label Jan 30, 2024

JulienPeloton added this to the 3.2 milestone Jan 30, 2024

JulienPeloton mentioned this pull request Apr 16, 2024

Data is not correctly uploaded in the database? #841

Open
