Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More Frequent Releases | Moving towards CI/CD #481

Open
LongLiveCHIEF opened this issue Sep 6, 2021 · 14 comments
Open

More Frequent Releases | Moving towards CI/CD #481

LongLiveCHIEF opened this issue Sep 6, 2021 · 14 comments
Assignees

Comments

@LongLiveCHIEF
Copy link

LongLiveCHIEF commented Sep 6, 2021

I was away from my work that was using atsamd crates for a few months, and came back to find many of the things I was waiting on that were in the works previously, have now been merged!

Unfortunately though, no new BSP/PAC/HAL's have been released. This has made it difficult to use these features and continue to patch in features from other contributors for things in the works, as I have to do a lot of merging and cherry-picking locally against the upstream master.

If there were more frequent releases (say for example a minor/patch ), it would be easier to patch in prospective features as I develop laterally on things like BSP's.

I've taken a look at existing actions, and think they are a good start to this. I have some questions, and based on those answers, have some suggestions.

I'd also be willing to contribute much of the work I propose. This issue will serve as a central issue to some of the smaller things that are capable of being isolated into scoped issues/PR's.

Questions

  • Am I right in seeing that releasing a PAC or BSP releases all of them? There's no workflow that allows to release a BSP independent of the other BSP's?
    • Ditto for bumping versions of the BSP/PAC?
  • Do we want to keep that behavior where the version number's are synced/incremented at the same time, or was that only a limitation of what we could implement with GH Actions at the time? (if the latter, I would be happy to start work on a PR to implement deterministic build/release action updates... see more info below)

Suggestions

Filtered Build/Release workflows It looks like each PR builds and tests all BSP's, regardless of which BSP's (if any) are changed. We could make it possible to build and release individual BSP's based on the file changes detected in the push/PR. For example, I've done this in the Octoprint/octoprint-docker project, where I selectively build and release the various image variants based on detected file changes.

Add input to manual release workflow to allow individual crates to be bump-released. We can use a free-form input for the Release crates workflow to allow a privileged maintainer to enter the name of the crate to be released

Add a step to aggregate/update changelogs and release notes We can grab a list of all PR's since the last release of an affected crate, and automatically generate a changelog and release notes based on these changes. This is something I did back when I maintained some official puppet modules, and it worked really well. Here's how that worked. Ignore the Ruby pieces, and just pay attention to their groups similarity to our project structure and the logic behind their releases and changelog generation. That group serves as the "official" modules of the puppet forge ecosystem, in much the same way the atsamd project is a monorepo with many packages.

We could do the same thing, but with github actions.

Closing Thoughts

Overall, my general thought is that we should release BSP's and PAC's as frequently as possible, even with every change. I think we should embrace semver. I firmly believe that there's no problem with having a 0.18.x this week, and a 0.23.x 2 weeks later.

This would have the added benefit of keeping master in sync with the latest docs generated and pushed on crates.io... which right now is a struggle because you go through source code in master and want to use it, only to realize it's not published and therefore docs not generated. It creates a disonance that makes working in the ecosystem slower and more painful than working in the C++ embedded ecosystem, and IMO Rust is leaps and bounds capable of being more user/developer friendly than C++ embedded ecosystem hands down.

I think that's what we all are striving for, and I'm hoping this discussion will help further that. Depending on what discussion/decisions shake out on how often we are comfortable releasing BSP's, PAC's, and HAL... we can mix and match from the suggestions above to implement github actions that seamlessly implement those decisions.

@LongLiveCHIEF
Copy link
Author

particularly, if i get enough support for build/release of BSP/PAC's individually, I'll create an issue and get started on that first.

@twitchyliquid64
Copy link
Contributor

+1 to being able to release a BSP individually, but I think we should keep the existing behavior for PACs & the HAL, as they end up being dependent anyway.

The original intent behind the bump crates workflow and the release workflow was that we would run it frequently; so all crates would get updated regularly and we would never not do all at once. With the desire to get a lot of stuff into the 0.13 release this hasnt happened for a while.

+1 Add a step to aggregate/update changelogs and release notes
+1 Add input to manual release workflow to allow individual crates to be bump-released

-1 Filtered Build workflows (fine with filtered release workflows):
I might be the only one who wants to die on this hill, but I think we get a good amount of value for running all tests/builds on all PRs. Thanks to github its massively parallel, so CI usually only takes a few minutes (we used to use travis, which would take hours). Given its only a few minutes, i personally dont believe we need the complexity of filtering.

@LongLiveCHIEF
Copy link
Author

What about BSP's not all having the same version number? If there are updates to a BSP, and it gets released individually, the assumption I'm making is that it's ok for its version number to be bumped while the other BSP's remain at their current version.

@LongLiveCHIEF
Copy link
Author

+1 to being able to release a BSP individually, but I think we should keep the existing behavior for PACs & the HAL, as they end up being dependent anyway.

I've actually been going back and forth on this one since I started using atsamd-rs. My current line of thought is this; while the PAC's are dependent upon the HAL, they aren't dependent upon each other. If we're then using features, I wonder if the release relationship between HAL and PAC can be completely decoupled as well? I actually think this one needs some critical thought. I think we're artificially tightly coupling the PAC's to the HAL, and the need to release PAC's together that have nothing to do with each other is a code smell of that problem.

-1 Filtered Build workflows (fine with filtered release workflows):
I might be the only one who wants to die on this hill, but I think we get a good amount of value for running all tests/builds on all PRs. Thanks to github its massively parallel, so CI usually only takes a few minutes (we used to use travis, which would take hours). Given its only a few minutes, i personally dont believe we need the complexity of filtering.

I don't really have a problem with this stand, but my thought was toward whether or not we can ever see a scenario where a contributor submits a change to one BSP/PAC, and a test fails for a different one (maybe due to rust language updates?)

How fair would it be to that contributor that all of their code being submitted works fine, but their PR can't be merged because a test fails for PAC/BSP they have no ability to even attempt to fix?

The run times of the action, while being drastically reduced, aren't really a problem, so there's no benefit to optimization. My primary reason for suggesting this was to make it more contributor friendly.

The easier it is to contribute to something, the more you increase the likelihood of contributors submitting future PR's, or even becoming maintainers.

@LongLiveCHIEF
Copy link
Author

Often times "waiting to release" can create more technical debt than "releasing on merge". Consider if the v2 stuff was released as it came in, and semver numbers ticked up. Isn't cargo built to handle that? Users could just say "I need the v1 stuff because I'm not ready".

Would we be in a place now where we have a complicated mix of v1/v2 in each PAC, and a (0.13 release won't mean anything to cargo) if we released via immediately and used true semver logic? wouldn't be stuck in a place now where we have a mix of v1 and v2 in PAC'.

@bradleyharden
Copy link
Contributor

bradleyharden commented Sep 6, 2021

@LongLiveCHIEF, overall, I would say I agree with all of your proposed changes.

I think what you're ultimately getting at is the difference between prescriptive and descriptive version numbers. Right now, we're using version numbers very prescriptively. We define some arbitrary goal for a release number, and we work towards that. Instead, you're proposing that we just work on master and let the version number describe the relative compatibility between releases.

I've had debates in my head about this concept for a while, and I don't think I've come up with many compelling arguments for the prescriptive approach. Most of it comes down to the psychology of version numbers, which is not compelling to me at all. Ultimately, I think we would all be better off if we just got used to seeing large major versions. It would certainly be more informative.

For instance, if the Linux kernel actually held itself to SemVer, I think it would still be at 1.x, because they never ever break userland. If you know that a project truly follows SemVer, but it is still at 1.x after 30 years, that tells you something immense about its stability. On the other hand, a small project on version 57.x after 2 years also tells you something significant about its stability.

With respect to the particular suggestions, I'll go down the list:

Synchronized releases

I think we're artificially tightly coupling the PAC's to the HAL, and the need to release PAC's together that have nothing to do with each other is a code smell of that problem.

Agreed. Right now, if someone wants to patch a PAC and include that change in the HAL, it requires us to release all of the PACs, even if they haven't changed. With the new approach, bumping the major version of a single PAC would probably require us to bump the major version of the HAL as well, which is a potential problem. But I don't see how that's any worse than what we're doing right now. We're still bumping everything, it's just happening all at once. I don't see any reason to bump a PAC if nothing has changed.

I would favor desynchronizing all of the version numbers, PAC, HAL and BSP.

Filtered workflows

Given its only a few minutes, i personally dont believe we need the complexity of filtering.

While I understand it only takes a few minutes -- and it's free -- I don't think we're being responsible stewards of resources when we test everything on every PR. Sure those resources are paid for by GitHub/Microsoft, which doesn't impact us. But we are still consuming resources. And if every project on GitHub took that same approach, it would have a very real impact.

Once we get the BSP tier system in order, testing every BSP on every commit could mean we waste 50+% of every CI job re-running the same set of tests over and over. I just see that as fundamentally wasteful, regardless of who is paying the bill. And while I agree that filtered workflows will add some level of complexity, it seems like we have someone here who is willing to take on that complexity and wants to build a robust system. I think we should let him.

Manual release of individual crates

+1

Aggregate release notes

+1 as well. I don't think you'll find anyone who disagrees.

@bradleyharden
Copy link
Contributor

There has been some discussion on Matrix that I would like to capture.

From @jbeaurivage:

I've debated this a bit, and here's a strategy I'd like to adopt:

  • pull the trigger on v0.13 soon (can someone look over Get ready for v0.13 release #462?)
  • figure out the remaining BSP tier stuff, fix the pygamer/SPI bug, and get setup with CI/versioning. Yank all the v1 cruft. Release that as v1.0.0

I agreed with this appraoch.

From @Sympatron:

Regarding the 1.0 discussion: Any breaking change needs to bump the "left-most non-zero digit" to play nice with Cargo. Breaking changes include removing anything in the public API. As far as I know you plan to remove a lot of v1 stuff in the near future which would result in another major version bump right after releasing 1.0. This would be the end of the world, but should be considered. Since we are currently changing a lot of the public API including the upcoming changes to the clocking API I would argue atsamd-rs should stay 0.x for a little longer. Frequent releases combined with the current frequency of API changes would mean that we would get to quite high version numbers in a short timespan. Again, not the end of the world, but personally I think it wouldn't convey the correct state of the project and you can achieve the same thing now with minor bumps for breaking changes or major additions and patch bumps for non-breaking changes.

My response:

@Sympatron, I think your post gets at the heart of my argument.

the current frequency of API changes would mean that we would get to quite high version numbers in a short timespan ... I think it wouldn't convey the correct state of the project

I've had a similar aversion to large major-version numbers myself. But every time I've tried to subject my aversion to more scrutiny, I've realized that there's no real technical merit to it. It's mostly psychological.

I would counter and say that large version numbers would, in fact, convey the correct state of the project. Holding off on 1.0 implies that there is some significance associated with it. If we want to stay at 1.x for a long time, to create a very stable API, then we will eventually need to reject changes that are not backwards compatible. To me, that doesn't reflect our philosophy.

From my perspective, the general mood has been, "if this change improves the HAL, we are willing to accept it, regardless of breaking changes". As a mostly hobbyist library, with occasional professional users, I think this is the better approach. Stability can be a worthy goal, but it comes with technical debt. You can't fix or improve things, because it would break existing code.

For example, If you had asked me 6 months ago, I would have told you that the gpio::v2 module is ready and worthy of the traditional 1.0 label. But then @ian_rees came along and identified a minor shortcoming with the way we handled external interrupts. We ended up making a small, yet still breaking change.

If we want to commit to a very stable 1.x, then we should get used to the idea of rejecting such changes. But I don't think we should. The changes were fairly minor, and they improved the HAL overall.

So if we're not going to commit to stability over improvement, then why carry a leading zero indefinitely? Ultimately, I think moving forward with 1.0 and beyond would more accurately reflect our philosophy to new users.

Also, I would like to note that the current plan proposed by @jbeaurivage would have us release now as 0.13, tear out the v1 bits, and then release 1.0. We wouldn't have an immediate 2.0. That being said, I wouldn't be surprised if 2.0 comes along shortly after anyway.

@LongLiveCHIEF
Copy link
Author

LongLiveCHIEF commented Sep 9, 2021 via email

@bradleyharden
Copy link
Contributor

A few more contributions from the chat.

@Sympatron interprets the leading zero, not the large version numbers, as the indication that stability is not a priority. And @ianrrees pointed out that there is precedence, in OpenOCD, for staying at 0.x indefinitely. That's a perspective I hadn't really thought of.

@ianrrees also pointed out that this decision is prone to bike-shedding, so I think we should probably cut the debate off here and just make a decision.

I think the four most active maintainers right now are probably myself, @jbeaurivage, @sajattack and @twitchyliquid64. Should we take a vote?

I think the two major points of contention are:

  1. Should we start incrementing major version numbers or stick to minor numbers for the foreseeable future?
  2. Should we implement filtered CI workflows? Perhaps we should see a prototype from @LongLiveCHIEF before we make a decision on this one?

Personally, I would say:

  1. I still vote for major numbers, even if it's a bit different than the traditional approach. To me, it makes more sense.
  2. I would vote for filtered CI workflows, unless it proves to be massively complex.

@jbeaurivage
Copy link
Contributor

  1. I'm also in favour of major versions.
  2. Filtered workflows also seem more adequate to me.

@LongLiveCHIEF
Copy link
Author

I had to pause work on this for a week, but will be starting back up tomorrow. How do you guys feel about javascript in the github actions? It's kinda the "native" actions language, and the work I've done so far uses that to find all affected crates to create the matrix.

This means we'll be able to add/remove PACs/BSP's at any time without having to maintain a list of filters, and they will automatically get built and tested from the moment they're added.

My work for this one so far is here:
the composite action: https://github.com/LongLiveCHIEF/atsamd/blob/add-workflow-filters/.github/actions/affected-crates/action.yml, which calls -> https://github.com/LongLiveCHIEF/atsamd/blob/add-workflow-filters/.github/actions/affected-crates/gen-matrix.js.

What would be interesting, is instead of javascript, using a docker composite action and doing this with rust. That would take me longer though, since I'm still a Rust newbie.

@twitchyliquid64
Copy link
Contributor

I'm totally okay with js, we already have some actions in python lol

I think of the other maintainers, python is the most well known so that would be preferred, but I don't think anyone I know believes in writing CI in rust xD

@LongLiveCHIEF
Copy link
Author

o... I've got the build filtering done, to the point where it automatically determines affected crates and creates a build and testing matrix for both the stable and nightly toolchains for each affected crate

however, to really implement it, I think we need to make changes to some of the other workflows, since they're tightly coupled with the idea that all the crates/docs are built/published/versioned together

basically, we can't use the filtered workflows until we make a definitive decision on de-syncing PAC and HAL version numbers

@LongLiveCHIEF
Copy link
Author

forgot to throw in a link: https://github.com/LongLiveCHIEF/atsamd/actions/runs/1304428221. Check out that run, which is an example change to a single crate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants