
doc - Update nullspace projection formula. #2366

Merged

Conversation

abussy-aldebaran
Contributor

I think there's a missing $P_1$ in the null-space projection formula in the "Inverse Kinematics" tutorial.
I did the math and it checks out. And the simulation works better.


@github-actions bot left a comment


👋 Hi,
This is a reminder message to please assign a proper label to this Pull Request.
The possible labels are:

  • build_collision (build pinocchio with coal support)
  • build_casadi (build pinocchio with casadi support)
  • build_autodiff (build pinocchio with cppad support)
  • build_codegen (build pinocchio with cppadcg support)
  • build_extra (build pinocchio with extra algorithms)
  • build_mpfr (build pinocchio with Boost.Multiprecision support)
  • build_sdf (build pinocchio with sdf parser)
  • build_accelerate
  • build_all (build pinocchio with all the options stated above)

Thanks.

@stephane-caron
Collaborator

Thank you for raising this point. Can you detail the math you've done and what problem you see with this formula?

$$vq_2 = vq_1 + (J_2 P_1)^+ ( v_2^* - J_2 vq_1)$$

Here is one way it checks out: we want $J_2 (vq_1 + x) = v_2^*$ with $x \in \mathrm{Ker}(J_1)$. Let $x = P_1 z \in \mathrm{Ker}(J_1)$. Then $J_2 x = J_2 P_1 z = v_2^* - J_2 vq_1$, whose least-squares solution is $z = (J_2 P_1)^+ (v_2^* - J_2 vq_1)$.

@abussy-aldebaran
Contributor Author

abussy-aldebaran commented Aug 12, 2024

I did the same calculations. It seems to me that you are replacing $x$ with $z$ in the end. If you replace $x$ with $P_1 z$ as defined, you get
$$vq_2 = vq_1 + P_1 (J_2 P_1)^+ (v_2^* - J_2 vq_1)$$
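
For concreteness, here is a minimal NumPy sketch of this two-task resolution; the Jacobians and task velocities below are random placeholders, not the tutorial's actual code:

import numpy as np

nv, n1, n2 = 9, 3, 3
rng = np.random.default_rng(0)
J1 = rng.standard_normal((n1, nv))  # task-1 Jacobian (placeholder)
J2 = rng.standard_normal((n2, nv))  # task-2 Jacobian (placeholder)
v1_star = rng.standard_normal(n1)   # desired task-1 velocity (placeholder)
v2_star = rng.standard_normal(n2)   # desired task-2 velocity (placeholder)

vq_1 = np.linalg.pinv(J1) @ v1_star        # solve task 1
P1 = np.eye(nv) - np.linalg.pinv(J1) @ J1  # projector onto Ker(J1)
vq_2 = vq_1 + P1 @ np.linalg.pinv(J2 @ P1) @ (v2_star - J2 @ vq_1)

# Task 1 is unaffected by the task-2 correction since it lies in Ker(J1):
print(np.linalg.norm(J1 @ vq_2 - J1 @ vq_1))  # ~0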

@stephane-caron self-requested a review on August 12, 2024 12:20
@stephane-caron force-pushed the feature/doc_null_space branch from e989b6b to 9ddb70e on August 12, 2024 12:28
@stephane-caron
Collaborator

Ah yes OK, that's right 👍 Thank you for this correction 😃

@stephane-caron
Collaborator

This is correct 👌

@jcarpent force-pushed the feature/doc_null_space branch from 9ddb70e to 4bf3fca on August 13, 2024 08:32
@jcarpent
Contributor

Thanks a lot, @abussy-aldebaran, for this documentation fix. Very helpful.

@jcarpent merged commit 0ad681a into stack-of-tasks:devel on Aug 13, 2024
5 checks passed
@ymontmarin

ymontmarin commented Aug 16, 2024

@abussy-aldebaran Your calculation is indeed correct; however, the formula given by @stephane-caron is simpler and also correct, because we have the property $P_1(J_2P_1)^+ = (J_2P_1)^+$.

Indeed, $P_1$ is an orthogonal projector, so we have $Im(P_1)=Ker(P_1)^\perp$.
By the definition of the Moore-Penrose inverse, we have $Im((J_2 P_1)^+) = Ker(J_2 P_1)^\perp$.
We naturally have $Ker(P_1)\subset Ker(J_2P_1)$, so $Ker(J_2P_1)^\perp \subset Ker(P_1)^\perp$.
Wrapping up, we have $Im((J_2 P_1)^+)\subset Im(P_1)$, and by definition of a projector, $P_1|_{Im(P_1)} = Id$.
Thus we have $P_1(J_2 P_1)^+ = (J_2 P_1)^+$.

However, I have no idea why using $P_1(J_2 P_1)^+$ would stabilize the numerics.

@stephane-caron
Collaborator

stephane-caron commented Aug 16, 2024

Thanks @ymontmarin for double-checking 👍

I followed your reasoning and I agree with each step. (The Moore-Penrose property is listed e.g. in Proposition 6.1.6 of Matrix mathematics: theory, facts, and formulas.) Here is a sample script to test this numerically:

import numpy as np

def sample(nv: int, nj: int):
    J1 = np.random.random((nj, nv))  # random task-1 Jacobian
    J2 = np.random.random((nj, nv))  # random task-2 Jacobian
    P1 = np.eye(nv) - np.linalg.pinv(J1) @ J1  # projector onto Ker(J1)
    pinv_J2P1 = np.linalg.pinv(J2 @ P1)
    norm_pinv = np.linalg.norm(pinv_J2P1)          # ||(J2 P1)^+||
    norm_P1_pinv = np.linalg.norm(P1 @ pinv_J2P1)  # ||P1 (J2 P1)^+||
    # ||(I - P1)(J2 P1)^+|| should be ~0 whenever P1 (J2 P1)^+ = (J2 P1)^+
    residual = np.linalg.norm((np.eye(nv) - P1) @ pinv_J2P1, ord=1)
    print(f"| {norm_pinv:.3} | {norm_P1_pinv:.3} | {residual:.2} |")

If we run it with $2 n_j \leq n_v$, there is no difference between the two formulae:

for _ in range(10):
    sample(10, 5)
| $\Vert(J_2 P_1)^+\Vert$ | $\Vert P_1 (J_2 P_1)^+ \Vert$ | $\Vert (I - P_1) (J_2 P_1)^+ \Vert$ |
| --- | --- | --- |
| 15.2 | 15.2 | 3e-13 |
| 9.52 | 9.52 | 7.9e-14 |
| 9.18 | 9.18 | 6.9e-14 |
| 12.8 | 12.8 | 3.2e-13 |
| 6.05 | 6.05 | 2.5e-14 |
| 31.5 | 31.5 | 9.9e-13 |
| 66.2 | 66.2 | 3.1e-12 |
| 8.67 | 8.67 | 5.9e-14 |
| 12.5 | 12.5 | 1.4e-13 |
| 19.3 | 19.3 | 1.1e-13 |

But if we run it with $2 n_j > n_v$, the pseudo-inverse computed by np.linalg.pinv can take very large values. In that case, it is beneficial to reproject by $P_1$:

for _ in range(10):
    sample(9, 5)
| $\Vert(J_2 P_1)^+\Vert$ | $\Vert P_1 (J_2 P_1)^+ \Vert$ | $\Vert (I - P_1) (J_2 P_1)^+ \Vert$ |
| --- | --- | --- |
| 9.23e+14 | 11.4 | 1.8e+15 |
| 3.97 | 3.97 | 1.6e-14 |
| 6.19 | 6.19 | 2.7e-14 |
| 3.48e+14 | 20.4 | 6.1e+14 |
| 4.86 | 4.86 | 3.7e-14 |
| 6.03 | 6.03 | 4.3e-14 |
| 7.39 | 7.39 | 1.4e-13 |
| 4.77e+14 | 6.97 | 1.3e+15 |
| 9.28e+14 | 6.78 | 1.9e+15 |
| 2.26e+14 | 5.01 | 4.5e+14 |

I guess then, for the introductory material in Pinocchio's documentation, the formula with the extra projection is better? (Fewer steps to get there, and the reprojection has some practical benefits.)

@stephane-caron changed the title from "doc - Correct null space projection formula." to "doc - Update nullspace projection formula." on Aug 16, 2024
@ymontmarin

ymontmarin commented Aug 16, 2024

@abussy-aldebaran, after some discussions, indeed $P_1(J_2P_1)^+$ is more stable: the null singular values of $J_2P_1$ induced by $P_1$ will always be treated as $0$ thanks to the reuse of $P_1$, even if numerically they are above the cutoff used by NumPy's pinv function (see the doc).

This stabilization comes at the cost of an additional matrix multiplication. Another solution is to tune the NumPy parameters rtol or rcond that control the cutoff used to determine the null singular values.

Note that you could still get some explosion coming from null singular values of $J_2P_1$ outside the nullspace of $P_1$ that are not detected by the cutoff. In those situations, tuning the NumPy parameters rtol or rcond is the only way to go.
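
For instance, with NumPy's pinv this tuning looks like the following sketch (the 1e-2 cutoff below is purely illustrative):

import numpy as np

nj, nv = 5, 9
J1 = np.random.random((nj, nv))
J2 = np.random.random((nj, nv))
P1 = np.eye(nv) - np.linalg.pinv(J1) @ J1

# Default cutoff (rcond=1e-15) may invert near-zero singular values of J2 @ P1.
pinv_default = np.linalg.pinv(J2 @ P1)
# A larger cutoff truncates them instead (recent NumPy versions also accept rtol).
pinv_tuned = np.linalg.pinv(J2 @ P1, rcond=1e-2)
print(np.linalg.norm(pinv_default), np.linalg.norm(pinv_tuned))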

@antoine-bussy

antoine-bussy commented Aug 19, 2024

@ymontmarin Thanks for the clarifications.

First I'd like to apologize. I didn't do enough testing, and it was luck that it worked better with the reuse of $P_1$, as I used random initial configurations. Using the same initial configuration produced identical results for both cases.

As you pointed out, the explosions probably come from the low threshold on singular values. I increased the absolute tolerance to 1e-2 and the explosions stopped (I used the SciPy version of linalg.pinv). Thanks for that tip. I think it would be helpful to add it to the tutorial.
And if I understood correctly, this would also apply to the singular values inside the null space of $P_1$, so the reprojection wouldn't be necessary?
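
For reference, here is a sketch of that tolerance setting with SciPy (recent versions expose an atol cutoff; the matrices below are random placeholders):

import numpy as np
from scipy import linalg

nj, nv = 5, 9
J1 = np.random.random((nj, nv))
J2 = np.random.random((nj, nv))
P1 = np.eye(nv) - linalg.pinv(J1) @ J1
# Absolute singular-value cutoff of 1e-2, as mentioned above (illustrative value).
pinv_J2P1 = linalg.pinv(J2 @ P1, atol=1e-2)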

About $Im(A^+) = Ker(A)^\perp$: I found a simple proof, but it doesn't strike me as obvious. Am I missing something?

Finally, the simplification of $P_1$ doesn't apply to more general tasks. Wouldn't it be more didactic to keep the reprojection in the tutorial?

I can do a PR.

@ymontmarin

ymontmarin commented Aug 20, 2024

@antoine-bussy Thank you for your proposal.

Yes, it would be a good idea to have a PR with these clarifications:

  • Straight calculation gives $vq_2 = vq_1 + P_1(J_2P_1)^+(v_2^*-J_2vq_1)$
  • We have the mathematical property $P_1(J_2P_1)^+=(J_2P_1)^+$
  • The first expression ensures a re-projection onto the image of $P_1$, hence prevents interpreting null singular values due to $P_1$ as non-null
  • It is important to tune the tolerance of pinv to handle the other null singular values and to avoid the additional re-projection with $P_1$.

Could you give me an example of a general task where it does not apply?
With this reasoning, it encompasses every task that leads to a least-squares problem with a vector-subspace constraint:
$\min_{x\in F}\Vert Ax - b\Vert$, where $F$ is a vector subspace of $\mathbb{R}^n$. If we denote by $P_F$ the orthogonal projection onto $F$, the solution is $(AP_F)^+b$, by the reasoning already detailed.
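
As a quick numerical check of this claim (all matrices below are arbitrary placeholders):

import numpy as np

m, n, k = 4, 6, 3
rng = np.random.default_rng(0)
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
B, _ = np.linalg.qr(rng.standard_normal((n, k)))  # orthonormal basis of F
P_F = B @ B.T                                     # orthogonal projector onto F

x = np.linalg.pinv(A @ P_F) @ b     # claimed solution of min_{x in F} ||A x - b||
print(np.linalg.norm(P_F @ x - x))  # ~0: x indeed lies in F
# Same optimal residual as the explicit parameterization x = B w:
w, *_ = np.linalg.lstsq(A @ B, b, rcond=None)
print(np.linalg.norm(A @ x - b) - np.linalg.norm(A @ B @ w - b))  # ~0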

Finally, the property $Im(A^+) = Ker(A)^\perp$ can indeed be obtained from the algebraic properties of the Moore-Penrose inverse. However, the intuition comes better, at least for me, from the geometric interpretation of the Moore-Penrose inverse: it gives the minimum-norm solution among the solutions of a linear least-squares problem. As an image is often better than a long text, here is a slide I use in my class:
slide.pdf (attached)

Mathematically, $A^+y = \mathrm{argmin}_{x\in LS(A,y)} \Vert x\Vert^2$, where $LS(A,y)$ is the set of solutions of $\min_{x\in \mathbb{R}^n}\Vert Ax - y\Vert$. In other words, $A^+y$ is the orthogonal projection of $0$ onto $LS(A,y)$, and $LS(A,y)$ is an affine subspace with direction $Ker(A)$, so $\langle 0 - A^+y, z\rangle = 0$ for any $z$ in $Ker(A)$, which implies $A^+y \in Ker(A)^\perp$, i.e. $Im(A^+) \subset Ker(A)^\perp$; matching dimensions gives the equality.

Indeed, $LS(A,y)$ is an affine subspace with direction $Ker(A)$: for any $x$ in $LS(A,y)$ and any $z$ in $Ker(A)$ we have $A(x+z)=Ax$, so $x+z$ is also a solution and hence lies in $LS(A,y)$; whereas for any $z$ not in $Ker(A)$ we have $A(x+z)\neq Ax$, and since $Ax$ is the unique closest point of $Im(A)$ to $y$, necessarily $\Vert Ax - y\Vert < \Vert A(x+z) - y\Vert$, so $x+z$ is not in $LS(A,y)$.
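
A quick NumPy check of this interpretation, using the fact that np.linalg.lstsq also returns the minimum-norm least-squares solution:

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))  # wide matrix: infinitely many least-squares solutions
y = rng.standard_normal(3)
x_pinv = np.linalg.pinv(A) @ y
x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)  # minimum-norm solution via LAPACK gelsd
print(np.linalg.norm(x_pinv - x_lstsq))  # ~0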

@abussy-aldebaran
Contributor Author

I submitted PR #2415.

@ymontmarin Thanks for the explanations. I was thinking about a joint-space task, such as joint-space impedance control.
