-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implemented label in cluster estimation #231
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed
c43b88f
to
3134b2a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I skimmed the code and it looks very complect. My suggestion is to use composition: Leave the original class alone, and create a new class (e.g. ClusterEstimationByLabel
or something). This new class contains a dictionary of label to cluster estimation object, and is responsible for sorting the incoming points into the correct object in the dictionary and collecting the individual return values, which it then returns to the worker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed
@@ -277,7 +288,7 @@ def __sort_by_weights( | |||
@staticmethod | |||
def __convert_detections_to_point( | |||
detections: "list[detection_in_world.DetectionInWorld]", | |||
) -> "list[tuple[float, float]]": | |||
) -> "list[tuple[float, float, int]]": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we change this function signature? You never changed the function so it still returns a (float, float)
if min_activation_threshold < 1: | ||
return False, None | ||
|
||
return True, ClusterEstimationByLabel( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check ClusterEstimation
's restrictions. Either apply them again here or by invoking creating cluster estimation
labels_to_object_clusters: dict[int, list[object_in_world.ObjectInWorld] or None. | ||
Dictionary where the key is a label and the value is a list of all cluster detections with that label | ||
""" | ||
label_to_detections: dict[int, list[detection_in_world.DetectionInWorld]] = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comment saying sorting detections by label
tests/unit/test_cluster_detection.py
Outdated
self, cluster_model_by_label: cluster_estimation_by_label.ClusterEstimationByLabel | ||
) -> None: | ||
""" | ||
Five clusters with small standard devition that have different labels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edit doc string
tests/unit/test_cluster_detection.py
Outdated
1: [100, 100, 100], | ||
2: [100, 100, 100], | ||
3: [100, 100, 100], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make each point different weights and different number of clusters, so it's a little more random?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do weight and confidence mean the same thing?
tests/unit/test_cluster_detection.py
Outdated
Five clusters with small standard devition that all have the same label | ||
""" | ||
# Setup | ||
labels_to_n_samples_per_cluster = {1: [100, 100, 100, 100, 100]} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you test this with different number of points per cluster?
tests/unit/test_cluster_detection.py
Outdated
@@ -488,3 +566,97 @@ def test_position_regular_data( | |||
break | |||
|
|||
assert is_match | |||
|
|||
|
|||
class TestCorrectClusterEstimationByLabel: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe test some error cases, like what happens if things violate your input conditions? For example, what happens when only 1 new point is passed to your Cluster Estimation By Label, it still runs. However, Cluster Estimation has a higher min_new_points, so nothing ever gets run and that point is just lost?
d8b63f7
to
2106bbe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed.
""" | ||
Checks if a valid cluster estimation object can be constructed. | ||
|
||
See `ClusterEstimation` for parameter descriptions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change this to:
See `create()` for parameter descriptions.
Return: Whether the arguments are valid.
|
||
ATTRIBUTES | ||
---------- | ||
min_activation_threshold: int | ||
Minimum total data points before model runs. Must be at least max_num_components. | ||
|
||
min_new_points_to_run: int | ||
Minimum number of new data points that must be collected before running model. | ||
|
||
max_num_components: int | ||
Max number of real landing pads. Must be at least 1. | ||
|
||
random_state: int | ||
Seed for randomizer, to get consistent results. | ||
|
||
local_logger: Logger | ||
For logging error and debug messages. | ||
|
||
METHODS | ||
------- | ||
run() | ||
Cluster estimation filtered by label. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this.
RETURNS | ||
------- | ||
model_ran: bool | ||
True if ClusterEstimation object successfully ran its estimation model, False otherwise. | ||
|
||
labels_to_objects: dict[int, list[object_in_world.ObjectInWorld] or None. | ||
Dictionary where the key is a label and the value is a list of all cluster detections with that label. | ||
ObjectInWorld objects don't have a label property, but they are sorted into label categories in the dictionary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplify:
Return: Success, labels and their associated objects.
|
||
if not label in labels_to_objects: | ||
labels_to_objects[label] = [] | ||
labels_to_objects[label] += clusters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add empty line above this line.
|
||
if not label in labels_to_objects: | ||
labels_to_objects[label] = [] | ||
labels_to_objects[label] += clusters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the desired behaviour? The cluster estimation objects already hold a record of all points, so they will always generate an updated version of the cluster centres. I think this should be an unconditional assignment instead.
""" | ||
|
||
import random | ||
import numpy as np |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add empty line between system and 3rd party imports.
|
||
from modules.cluster_estimation import cluster_estimation_by_label | ||
from modules.common.modules.logger import logger | ||
from modules import detection_in_world |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this above (shorter is higher in alphabetical order).
from modules.common.modules.logger import logger | ||
from modules import detection_in_world | ||
|
||
MIN_TOTAL_POINTS_THRESHOLD = 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 empty lines total between imports and global constants.
RNG_SEED = 0 | ||
CENTRE_BOX_SIZE = 500 | ||
|
||
# Test functions use test fixture signature names and access class privates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 empty lines total above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of this testing is unnecessary and brittle (it will break if the implementation of ClusterEstimation
is changed). The goal of these tests is ensuring that the labelled points go to the correct cluster estimation objects. The tests themselves are for various conditions. For example:
- Never before seen labelled point (is the cluster estimation object created correctly?)
- Existing labelled point goes to the correct cluster estimation object
- Multiple points go to the correct places
- Non consecutive labels (i.e. label values that skip numbers (e.g.
{0, 1, 2, 5}
)
Verifying that the points go to the correct objects can be done by accessing the ClusterEstimationByLabel
and ClusterEstimation
members.
Verifying the outputs is a little trickier, but can be done by making the objects return a number of cluster centres corresponding to their label (e.g. the object at 3
returns 3 centres when run()
is called). When constructing them, the objects can be directly modified by creating a huge number of points exactly at the same location at each of the 3 centres, no need for any fancy cluster generation.
There is also no need to check whether the object actually ran with the thresholds either (basically, this test should still pass if someone removes __decide_to_run()
from cluster estimation).
No description provided.