optimize vector component access and use GLM_ASSERT_LENGTH in dual_quaternion #1308

Open
wants to merge 2 commits into
base: master

Tr1NgleDev

the title says it all
the explanation for the optimization:

  • accessing a component through a pointer is faster than a switch/case (which is also clearly visible if you ever try to reverse-engineer some code that uses that operator[])
  • it's a single line
  • this was technically already done in type_quat so i don't understand why it wasn't done in type_vecs

also dual_quaternion operator[] for some reason didn't use the GLM_ASSERT_LENGTH macro, so i fixed that.
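
For readers unfamiliar with the two approaches, here is a minimal sketch of the difference. The types `Vec3Switch` and `Vec3Pointer` are hypothetical stand-ins, not GLM's actual templates, and a plain `assert` stands in for the role a length-assertion macro would play.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 3-component vector using switch-based access.
struct Vec3Switch {
    float x, y, z;

    // Well-defined for every valid index, but the compiler may keep the
    // branch or jump table when the index is only known at runtime.
    float& operator[](std::size_t i) {
        assert(i < 3); // bounds check, standing in for a length assertion
        switch (i) {
        default:
        case 0: return x;
        case 1: return y;
        case 2: return z;
        }
    }
};

// Hypothetical 3-component vector using pointer-based access.
struct Vec3Pointer {
    float x, y, z;

    // A single line. It assumes x, y, z are laid out contiguously, and
    // indexing past x is not sanctioned by the C++ standard.
    float& operator[](std::size_t i) {
        assert(i < 3);
        return (&x)[i];
    }
};
```

Both operators return a reference, so either form can be used for reads and writes, for example `v[1] = 2.0f;`.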

@Mashpoe

Mashpoe commented Jul 15, 2024

I'm the developer of the hit game 4D Miner (which uses GLM). The vector operator[] calls are getting inlined as switch statements, which aren't getting optimized. There is a lot of code in the game that deals with component indices, so there will probably be a considerable speedup for certain functions if this is merged, and some functions will actually be inlined just because of the changes in this pr.

@ZXShady

ZXShady commented Aug 7, 2024

But this technically has UB, because the standard allows padding between members of the same data type, although no implementation actually inserts any.

@Mashpoe

Mashpoe commented Aug 7, 2024

> but this technically has UB though because the standard allows padding of members of same data type although no implementation does

I would argue that the consistency and performance improvements far outweigh the risks of UB here then. If this can't be merged, then this optimization should probably be removed from type_quat. I was originally against this for the same reason, but I was unable to find a better solution, and then I learned that the same thing has been done elsewhere in the project for at least 8 years without causing any issues.

If any major compiler ever breaks this, preprocessor checks can be added in the future for that compiler version, or this optimization can be removed altogether. For now at least, considering the large performance gains which have already been demonstrated for a real-world use case, and that this optimization has already been used elsewhere in the project for years, I see no reason this shouldn't be merged.
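
As one possible shape for such a guard, the sketch below uses a compile-time layout check instead of a compiler-version macro. That is a substitution on my part, not something proposed in this thread. `Vec3`, `kVec3IsPacked`, and `component` are hypothetical names. The check only rules out padding; it does not make the pointer arithmetic itself standard-conforming.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical vector type used only for this illustration.
struct Vec3 {
    float x, y, z;
};

// True when the three members sit back to back with no padding.
constexpr bool kVec3IsPacked =
    sizeof(Vec3) == 3 * sizeof(float) &&
    offsetof(Vec3, x) == 0 &&
    offsetof(Vec3, y) == sizeof(float) &&
    offsetof(Vec3, z) == 2 * sizeof(float);

// Hypothetical accessor: uses the pointer path only if the layout check
// passes, and otherwise falls back to the portable switch.
inline float& component(Vec3& v, std::size_t i) {
    assert(i < 3);
    if (kVec3IsPacked) {
        return (&v.x)[i]; // relies on contiguous members
    }
    switch (i) {          // portable fallback
    default:
    case 0: return v.x;
    case 1: return v.y;
    case 2: return v.z;
    }
}
```

Because `kVec3IsPacked` is a constant, an optimizing compiler can drop whichever branch is unused.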

@ZXShady

ZXShady commented Aug 8, 2024

We already have UB because of union type punning, but all compilers define it the way C does, and all compilers also define the behavior in your case. Even if one didn't, we could add a fallback, so this is worth implementing. I agree with you.

Sad that squeezing out performance is technically "UB".

@ZXShady

ZXShady commented Aug 13, 2024

To note: the thing we lose here is constexpr evaluation for any index but 0. But I would say the performance benefits are worth the change.

@Mashpoe

Mashpoe commented Aug 13, 2024

I would agree. From my experience, the vast majority of use cases for component indexing are when you have to determine which component you are accessing at runtime. Constexpr component indexing seems like a very niche use case that could be done manually with a switch statement if it is ever needed.
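
A minimal sketch of what that manual switch could look like, assuming C++14 or later (needed for a `switch` inside a constexpr function). `Vec3`, `get`, and `kUp` are hypothetical names for illustration only.

```cpp
#include <cstddef>

// Hypothetical vector type used only for this illustration.
struct Vec3 {
    float x, y, z;
};

// Read-only component access that stays usable in constant expressions,
// because it never forms a pointer past the first member.
constexpr float get(const Vec3& v, std::size_t i) {
    switch (i) {
    case 0: return v.x;
    case 1: return v.y;
    default: return v.z; // index 2 (and anything out of range)
    }
}

constexpr Vec3 kUp{0.0f, 1.0f, 0.0f};

// Evaluated entirely at compile time.
static_assert(get(kUp, 1) == 1.0f, "component 1 of kUp is 1");
```

Code that needs a compile-time component could call a helper like this, while runtime code keeps using the pointer-based `operator[]`.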

@ZXShady

ZXShady commented Sep 19, 2024

@Mashpoe this actually has a huge impact on other constexpr functions, like all of GLM's matrix multiplications, so a lot of code needs to be rewritten

template<typename T, qualifier Q>
GLM_FUNC_QUALIFIER GLM_CONSTEXPR mat<4, 2, T, Q> operator*(mat<2, 2, T, Q> const& m1, mat<4, 2, T, Q> const& m2)
{
	return mat<4, 2, T, Q>(
		m1[0][0] * m2[0][0] + m1[1][0] * m2[0][1],
		m1[0][1] * m2[0][0] + m1[1][1] * m2[0][1],
		m1[0][0] * m2[1][0] + m1[1][0] * m2[1][1],
		m1[0][1] * m2[1][0] + m1[1][1] * m2[1][1],
		m1[0][0] * m2[2][0] + m1[1][0] * m2[2][1],
		m1[0][1] * m2[2][0] + m1[1][1] * m2[2][1],
		m1[0][0] * m2[3][0] + m1[1][0] * m2[3][1],
		m1[0][1] * m2[3][0] + m1[1][1] * m2[3][1]);
}

This function is no longer constexpr and needs to be rewritten in terms of direct member access.
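
To give a rough idea of what such a rewrite involves, here is a sketch for the simpler 2x2 case. `Vec2`, `Mat2`, and `mul` are hypothetical minimal types and names, not GLM's real templates, and the columns are stored as named members `c0` and `c1`.

```cpp
// Hypothetical minimal types; columns are stored as named members.
struct Vec2 {
    float x, y;
};

struct Mat2 {
    Vec2 c0, c1; // column 0 and column 1
};

// 2x2 product written only with named members, so it never goes through
// operator[] and remains a constant expression.
constexpr Mat2 mul(const Mat2& a, const Mat2& b) {
    return Mat2{
        Vec2{a.c0.x * b.c0.x + a.c1.x * b.c0.y,
             a.c0.y * b.c0.x + a.c1.y * b.c0.y},
        Vec2{a.c0.x * b.c1.x + a.c1.x * b.c1.y,
             a.c0.y * b.c1.x + a.c1.y * b.c1.y}};
}

constexpr Mat2 kIdentity{{1.0f, 0.0f}, {0.0f, 1.0f}};
constexpr Mat2 kM{{1.0f, 2.0f}, {3.0f, 4.0f}};

// The product is computed at compile time.
static_assert(mul(kIdentity, kM).c1.y == 4.0f, "identity * M == M");
```

The real rewrite would have to repeat this pattern for every matrix size, which is the amount of work the comment above is pointing at.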
