Status of smartOS support and what future holds #1663

Open
anonrig opened this issue Dec 11, 2024 · 3 comments

Comments

@anonrig
Member

anonrig commented Dec 11, 2024

Hi everybody,

Unfortunately, I'm opening this issue in the TSC repository rather than in nodejs/build, in order to get more visibility and share the urgency and state of things with the rest of the @nodejs/tsc members. Even though I tagged the @nodejs/platform-smartos team multiple times, I haven't seen any progress (meaning an updated GCC version), which concerned me enough to create this issue. My goal is not to offend or blame anyone, but to discuss our path forward for operating systems that are blocking future improvements to Node.js due to a lack of support or of personnel working on them.

The current smartOS machines use old GCC versions even though the "Supported toolchains" documentation lists higher versions. For example, our documentation says we support GCC 12 or higher, but our smartOS machine is using GCC 10.

I have an active pull request from 3 months ago that is still failing to build: nodejs/node#54990. This concerns me the most, because 3 months is long enough to expect a compiler version update on a Tier 1 supported OS.

Today, I opened another pull request to test whether updating Ada to its main branch breaks anything, and it seems smartOS (as well as macOS) failed to build due to an old compiler. (Ref: nodejs/node#56218)

As a result, I recommend lowering SmartOS to "Tier 3/Experimental" and unblocking the existing pull requests. That is my recommendation, but I'm open to all suggestions.

Happy holidays!

cc @nodejs/build @nodejs/platform-smartos @nodejs/tsc

@anonrig anonrig changed the title Status of smartOS machines and what future holds Status of smartOS support and what future holds Dec 11, 2024
@bahamat

bahamat commented Dec 11, 2024

I'm on the smartos team. I did receive the notification for nodejs/node#54990. You received an answer from @richardlau that the migration was in progress. This migration was being handled by Ryan Aslett. As of August 22, to my knowledge, Ryan was able to make progress without further assistance from me. I can say that after that date no further assistance was requested. I don't have access to the Jenkins build infra, so aside from assisting in designating which image versions should be used, and provisioning in MNX (which we provide for free), there's not anything I can do. And since I don't have access to the build system, I am unable to even verify whether the task was completed.

As for the compiler version, I discussed this with you today on Slack.

To recap what we discussed on Slack (and to expand a bit):

As of August, I had arranged with Ryan Aslett (via Slack) to have SmartOS build agents for v21.4.1 and v23.4.0.

As far as a newer version of gcc goes:

  • 21.4.1 will be unsupported as of next month when 24.4.0 is released
  • 22.4.0 has gcc12, which is newer than the requested gcc11
  • 23.4.0 has gcc13, which is newer than the requested gcc11
  • The upcoming 24.4.0 will have gcc13, which is newer than the requested gcc11

Given this, the best thing to do right now is to replace the 21.4.1 image with 22.4.0, which will be supported for another 13 months. Then, once 24.4.0 is released, the 23.4.0 image can be replaced with 24.4.0, or kept as-is. To the extent that I am able to assist with this, I am ready, able, and willing to provide that assistance.

I think it's a mischaracterization that we have been unresponsive. We've addressed every issue that's been asked of us (to the extent that we have permission to contribute) and in a timely manner. We have provided fixes for bugs in node and v8, and also helped uncover bugs in other operating systems before those bugs could land in a release (i.e., bugs were discovered in a failed build on SmartOS, but further analysis showed that other operating systems were affected by the same bug, even though they did not exhibit a build failure).

As I stated earlier, for the one issue you mentioned that we didn't directly respond to, you had already received an answer from someone more directly involved in and responsible for the task than we are, so a specific reply from us didn't seem necessary.

@mcollina
Member

cc @ryanaslett what's the status of the migration @bahamat is referring to?

@ryanaslett

Apologies for the delay, and for not understanding the urgency and impact.

I have provisioned a Smartos23 instance, and got as far as evaluating whether or not it was ready to put into the testing matrix.

When running the tests, they were failing intermittently, or, when they did succeed, the tests were taking over 7 hours to complete, even on subsequent runs.

The next step is to investigate whether they're misconfigured, or underprovisioned, or what is causing them to intermittently fail, as well as underperform.

I had intended to begin that investigation myself so that I would have better questions to ask, but other priorities got my attention, and I haven't yet returned to this initiative to finish it off.

Jenkins had automatically cleaned up the old builds I was testing with in the meantime, so I'll have to kick off another build to start troubleshooting this. (Running here: https://ci.nodejs.org/job/node-test-commit-smartos-test-ryan/nodes=smartos23-64/9/consoleFull)

I should have communicated the status of this much sooner and tried to involve more assistance, as I'm neither fully versed in the intricacies of SmartOS nor deeply experienced with building node specifically. As such, I was struggling with where to begin investigating to get these runners into a state that resembles the existing node18/20 performance (30-40 minutes).
