Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Change partitioning strategy for online processing (#793)
* Remove dropduplicates, which is no longer needed
* Remove partitioning for online data
* Add timings in the alert packets
* Fix path for distribution
* Modify database script with new paths
* Enable DB integration tests
* Rename paths for test data
* Fix missing import
* Update configuration file with dependencies
* Update paths
* Check and lint
* Fix typo when importing module
* Remove unused code
* [Fink-MM x Fink-Broker] Add the fink_mm pipeline into raw2science (#811)
* add the fink_mm pipeline into raw2science
* pep8 requirements and add documentation and comments
* fix bugs and problems with fink CI, restore the stream test, preparation for fink-mm test
* pep8
* unit test fixed
* add stream_integration argument
* add echo path in test
* fixed pythonpath
* fixed pythonpath
* install fink-mm dev version, to remove after the test dev phase
* add datatest for all topics
* add datasim for join with gcn
* add gcn data test
* pep8
* integrate fink-mm distribution to the broker
* update fink-mm commit in workflow, pep8
* add mechanism to avoid bad schema inference of spark dataframe with fink-mm
* raw2science too short in CI to generate MM join data
* add tests for fink-mm offline
* review modification, fix the fink_mm offline test conf
* fix distribution CI, drop new timestamp column for fink_mm, convert new timestamp column into string for fink-broker
* pep8
* fix parser default
* Format files
* Remove NIGHT declaration duplicate
* Style
* Fix headers
* Ruff formatting
* Fix module path
* Merge mm utils into a single module mm_utils.py
* Refactor the fink-mm section in raw2science
* Refactor distribute
* Cleaning files (#849)
* Remove the need for SCRAM
* Update fink bin
* Increase the number of shuffle partitions for SSO
* Push all alerts at once
* Update science elasticc
* Use subscribePattern instead of subscribe
* Update scheduler
* Format code
* Discard alerts with i band measurements (#839)
* Add new argument in configuration file
* Update conf files
* Add missing argument
* Format raw2science
* Better path management
* Fix bug in path
* Check if files exist -- not just the folder (#851)
* Check if files exist -- not just the folder
* Bump fink-filters to 3.29, and test it on CI
* Bump fink-filters to 3.30
* Improve verbosity when trying to launch fink-mm
* Apply ruff
* Add missing parameter in the conf file
* Check HDFS folder is not empty before launching services
* Wait for one batch to complete before launching
* Switch Docker image
* Use the streaming DF to infer schema (#853)

---------

Co-authored-by: JulienPeloton

* Do not filter alerts containing i-band only in the history (#855)
* Update ZTF schedule (#857)
* Update the rowkey construction (#859)
* Update the rowkey construction
* Update tester to enable capability to test one file
* Improve CD process
  - increase argoCD usage
  - use Spark operator
  - use Minio operator
  - add Helm chart for fink-broker
  - Improve logging management
  - Use finkctl to create kafka secret
  - Bump ciux to v0.0.3-rc4
  - Bump ktbx to v1.1.3-rc1
  - Increase sync checks in CI
* Wait for input topic to exist
* Wait for fink-producer secret to appear
* Add temporary hack to CI
* Improve code format and linting
* add delta time (#860)
* Fix typo
* Fix column name when constructing the rowkey (#864)
* Fix column name when constructing the rowkey
* Reformat
* Remove unused (and wrong) row key addition
* PEP8
* Fix bug in column names
* Improve logging message
* Trigger GHA build via cron
* Remove tmate session in ci
* Remove sudo for docker prune in ci
* Improve pip dependencies management
  Add Dockerfile to ciux source paths
  Increase parameters management
  Add separate log level for spark
  Improve build script configuration
* Improve fink-broker configuration
* Fix ciux init in CI
* Improve fink startup script
* Use finkctl new release
* Document release management
* Ruff
* clean CI yaml
* Ruff
* Force ipv4 for Kafka
* Change the path to the fink alert simulator
* Restore the path. We have a problem because the schema cannot be read.
* Update the configuration for the schema
* Change get_fink_logger into init_logger

---------

Co-authored-by: Fabrice Jammes
Co-authored-by: Anais Möller
Co-authored-by: Fabrice Jammes

* Update the topic value for helm
* Add night as argument for stream2raw in the CI
* Update conf
* Increase the default Kafka delivery.timeout.ms
* Change topic name for Sentinel
* Use the night argument to make the partitioning instead of the alert timestamp
* Relaunch CI
* Add a new argument --noscience to bin/fink
* Cast only if fields exist
* Update distribute with old Kafka setup
* Put on hold e2e tests on VD until a solution is found. Only relying on Sentinel for the science and e2e-gha for the noscience
* Check the output topic in Kafka for Sentinel
* Add script to detect hostless candidates (#870)
* Add script to detect hostless candidates
* notes
* updated model (#869)
* Bump fink-science and fink-filters
* Switch to Telegram bot for hostless detection
* Ruff
* Bump requirements for fink-utils
* Fix missing args
* Split database operations

---------

Co-authored-by: Anais Möller

* Ignore unnecessary rule

---------

Co-authored-by: FusRoman
Co-authored-by: Fabrice Jammes
Co-authored-by: Anais Möller
Co-authored-by: Fabrice Jammes