Improve tag stability, counter loss and flipping #19
Comments
I've noticed it as well. This is definitely a bug. If I'm correct, the Z axis should, on the contrary, be very stable.
I suspect it has to do with the order of the corners... Guilty until proven innocent.
Sounds like a good suspect, indeed.
Sorry to give bad news, guys, but I think I've figured it out. It's probably the same phenomenon as the famous upside-down optical illusion (or whatever it's called). I think this picture summarizes the situation perfectly: http://www.mindmotivations.com/images/optical-illusion1.jpg The order of the corners is the same in both the up and down cases. I guess the only way to solve this problem with a single tag is to determine the perspective of the tag well enough to tell whether the "up" corners are closer to us than the "down" corners. Currently the only way to do this is to check whether the line between the upper-left and upper-right corners is longer or shorter than the line between the lower-left and lower-right corners. This becomes more and more noise-prone as the tag gets flatter in the perspective, i.e. as all the lines get shorter, hence the flipping we observe. Note that the flipping diminishes and then stops as you bring the camera closer to the tag, and doesn't happen at all when the tag is "looking towards" the camera, i.e. is not too flat. In my application, I'm planning to solve this issue by using multiple tags whose positions relative to each other are fixed and known beforehand, all referenced to the camera, plus outlier detection and elimination.
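To make the edge-length comparison described above concrete, here is a minimal sketch in Python. It is not Chilitags code: the function names, the corner ordering and the 5% threshold are assumptions chosen for illustration. It relies only on the fact that, under perspective projection, the nearer of two parallel tag edges appears longer in the image.

```python
import math


def edge_length(p, q):
    """Euclidean length in pixels of the segment between image points p and q."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def nearer_edge(corners, min_relative_difference=0.05):
    """Guess which tag edge is closer to the camera.

    corners: (top_left, top_right, bottom_right, bottom_left) in pixels
             (hypothetical ordering, chosen for this sketch).
    Returns "top", "bottom", or "ambiguous" when the two edge lengths
    differ by less than min_relative_difference (an assumed noise margin).
    """
    top_left, top_right, bottom_right, bottom_left = corners
    top = edge_length(top_left, top_right)
    bottom = edge_length(bottom_left, bottom_right)
    longest = max(top, bottom)
    if longest == 0.0:
        return "ambiguous"
    if abs(top - bottom) / longest < min_relative_difference:
        # The difference is within the noise margin: this is the
        # situation in which the estimated pose can flip.
        return "ambiguous"
    return "top" if top > bottom else "bottom"
```

When the tag is seen at a grazing angle or from far away, both edges shrink and their difference drops below any reasonable noise margin, which matches the flipping behaviour reported in this thread.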
There is no bug, as we're currently doing nothing wrong. This will be more of an enhancement if we manage to solve this some other way, so I'm changing the labels.
Hum, interesting. Considering the following tag:
What about computing the cross-product of
I drew some geometric diagrams to convince myself. If my reasoning is right, this is due to slight misdetection of the corner locations on the screen (e.g. due to pixel resolution, lighting, etc.), which would give the same result no matter which method of calculation we use. Further, this should result in calculating the following cross product to find the +Z axis: ( But, if there is misdetection only on the I currently see two "solutions":
If it is the case that the corners are such that |
Please note that this whole issue is also caused by the actual bending of the paper. On second thought, the major culprit is probably the bending of the paper, which means that the above method (voting 4 corners) could actually work.
After discussion, the current proposal is:
Suits me well.
FYI, my initial experiments on the Cellulo side suggest that median filtering on a time window is very robust against flipping.
Very interesting... That could very well replace the average filtering, and "fix" the z-flipping issue more simply than the proposal above. The advantage is that it also works with a single tag; the disadvantage is that it needs a few frames... which is OK, because tag flipping means that there are several frames already. So do you just take a median of the last values for each component of the transformation matrix, or is it a bit fancier? How big is your time window?
It is a bit fancier :) Here is what I do: Get the translations (3-vector) and rotations (quaternion) in their own windows and calculate their respective medians. The median in more than one dimension is defined as the "geometric median" (the point in space whose sum of Euclidean distances to the window points is the least). The catch is that the geometric median is proven to have neither an explicit formula nor an exact algorithm in general, but computing it amounts to minimizing a convex function. I calculate it with the Weiszfeld-Ostresh algorithm, which is iterative and basically a case of gradient descent, which might be off-putting in terms of performance. I've had runs that converged in 5 iterations and runs that converged in 20 iterations. It can be tuned by setting the initial point to the mean and playing with the step size. This works well for 3-vectors, but quaternions need to be expressed as points on a Riemannian manifold for it to work. This is a bit beyond my mathematical knowledge, but it turns out to be, again, a convex optimization problem. You only need special treatment for them, such as different distance measures and different maps. Once you have both medians, stick them into a transform matrix and you're good to go. You can find my implementations in here (there are also references to the papers I got the algorithms from):
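For readers who want to see the translation part spelled out, here is a minimal Python sketch of the plain Weiszfeld iteration for the geometric median of a window of 3-vectors. It is not the implementation referred to above: it omits Ostresh's step-size modification and the quaternion case, and the function name, the tolerance and the `min_distance` guard are assumptions made for this illustration.

```python
import math


def geometric_median(points, max_iterations=100, tolerance=1e-9,
                     min_distance=1e-12):
    """Approximate the geometric median of equally sized point tuples.

    Plain Weiszfeld iteration: each step replaces the estimate by the
    average of the points weighted by the inverse of their Euclidean
    distance to the current estimate.
    """
    count = len(points)
    dimension = len(points[0])
    # Start from the mean, as suggested in the comment above.
    estimate = [sum(p[i] for p in points) / count for i in range(dimension)]
    for _ in range(max_iterations):
        weighted_sum = [0.0] * dimension
        weight_total = 0.0
        for p in points:
            # Clamp the distance so that a sample coinciding with the
            # estimate does not cause a division by zero (assumed guard).
            distance = max(math.dist(p, estimate), min_distance)
            weight = 1.0 / distance
            weight_total += weight
            for i in range(dimension):
                weighted_sum[i] += weight * p[i]
        new_estimate = [s / weight_total for s in weighted_sum]
        shift = math.dist(new_estimate, estimate)
        estimate = new_estimate
        if shift < tolerance:
            break
    return estimate
```

For a 10-sample window in which nine translations agree and one is an outlier, the result stays at the agreeing value, whereas the mean would be pulled towards the outlier.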
By the way, I used a 10-sample window, but I think it can be lowered a bit more. This can also be applied to the scale (3-vector) of a transform matrix. Scale doesn't make sense in the Chilitags world, but I just wanted to put it out there.
Time for this issue to rise from the dead: During the NCCR meetings, I had the chance to talk to a guy from ETHZ's Agile and Dexterous Robotics Lab who implemented a similar tag-based application (based on another tag library). They are using a Kalman filter on the tag pose + IMU data when available in order to counter flipping issues as well as the loss of the tag due to blurry camera image. He said that they are getting very good results from this. I think @severin-lemaignan also mentioned trying a Kalman filter at some point. He said the code was open source too. We should definitely look at this at some point, it will be very cheap to calculate. The exact same thing goes for chilitrack: chili-epfl/chilitrack#4 Also changing the name to reflect the issue better.
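As a rough idea of how cheap such a filter is, here is a minimal Python sketch of a scalar Kalman filter with a random-walk model, applied to a single pose component. It is not the ETHZ implementation and does not fuse IMU data; the class name and the variance defaults are assumptions for illustration. Passing `None` as the measurement stands for a frame in which the tag was lost.

```python
class ScalarKalmanFilter:
    """Random-walk Kalman filter for one pose component (illustrative only)."""

    def __init__(self, initial_value=0.0, initial_variance=1.0,
                 process_variance=1e-3, measurement_variance=1e-1):
        self.value = initial_value
        self.variance = initial_variance
        self.process_variance = process_variance          # assumed tuning value
        self.measurement_variance = measurement_variance  # assumed tuning value

    def step(self, measurement=None):
        """Advance one frame; measurement is None when the tag is not seen."""
        # Predict: the state is assumed constant, so only uncertainty grows.
        self.variance += self.process_variance
        if measurement is not None:
            # Update: blend prediction and measurement by their variances.
            gain = self.variance / (self.variance + self.measurement_variance)
            self.value += gain * (measurement - self.value)
            self.variance *= 1.0 - gain
        return self.value
```

One filter instance per translation component would be the simplest use. Rotations need more care than a per-component filter, and a real tracker would probably use a motion model with velocity.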
Related issue in OpenCV: opencv/opencv#8813 And here's a workaround.
Thanks for the tip :)
I haven't investigated yet, but the Z-axis seems to flip on estimate3d-gui, especially when the camera is almost perpendicular to the tag.
This issue is a reminder to investigate more ;)