Skip to content

Merge changes made by NGINXaaS #193

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 110 commits into from

Conversation

arussellf5
Copy link

This PR brings in all the changes made by the NGINXaaS for Azure team.

puneetsarna and others added 28 commits July 17, 2025 17:28
The primary intent behind this is to keep retrying updates which may be made
before the controlplane has registered the existence of a named upstream in
the customer's NGINX configuration.
NOTE: This commit was accidentally missed out in
the first iteration of this fork and is present in
upstream: https://github.com/nginxinc/nginx-loadbalancer-kubernetes/blob/main/internal/configuration/settings.go#L189

The code needs to move forward to start the
watchers and health server (side note: we should
also think about the ordering of these at some
point) and not running the infomer in a go routine
prevents the program from further execution until
the context is canceled.

In the current iteration on main, the controller
is stuck on the informer, and then k8s kills the
service and restarts it since the health server is
not up.
The user specifies the ingress service whose events the application should watch through setting the "service-annotation-match" annotation on the application's config map. Only events with a matching service annotation will be passed by the watcher to the handlers. The informer now listens to events from all namespaces. This frees the end user from the restriction of only using the nginx ingress controller.
…eam name

The port name should now be formatted like this: "http-tea", where the first part
of the string is the context type (either "http" or "stream") and the second part
of the string after the hyphen is the name of the upstream.
We need to be able to publish the operator to
dockerhub in order to be publicly available for
customers.

Following what we have in ARP as a release
strategy where a release tag action would publish the
image to dockerhub.
go test does not have a good way to capture
unit-test coverage as part of test runs. This
commit captures the error code of the unit test
run, runs the coverage generation and then exits
based on the test status.
Helm will be used as part of the user story to
deploy the operator but it is also a good tool to
deploy the operator while developing it.

This commit adds the ability to publish helm
charts:
- to the dev registry for local iteration.
- to the regular devops registry for CI iteration
  and testing.

This will also help us test the helm chart itself.
0.0.1 is okay but it's not really a bug fix.
We should keep a single release script that will
publish docker images and helm charts for the
official release. This commit just updates the
current release script to handle both artifact
types. Helm logic will follow.
Previously, owing to a bug, if the name of the upstream included hyphens it would be rejected by the operator.
We are going to release with `0.x.y` and keeping
the internal version inline with what will get
published out will reduce confusion.
This repository is going to produce artifacts that
will be available publicly and the end users will
care about semantic versioning.

We need to be able to map a public facing version
to internally produced artifacts easily and having
semver internally eases that work.

This commit does not enforce the versioning but
adds a version file that has the semver,
which will be used to version the product.

We can follow a workflow where during release
time, we cut a release, which creates a tag and we
retag existing dev artifacts to be shipped as an
official artifact.
While a cosmetic change, it does impact how
dokerhub repo needs to be setup to publish the
helm chart. In addition to that, it impacts what
the user see on k8s itself and it should not be
nginx-loadbalancer-kubernetes as that is
confusing.
While the chartname (which was renamed in the
prior commit) gets used for naming stuff, it makes
sense to also change the name value in the chart
itself to use the new name.
Now that official images exist on docker hub, we
should use those images in our charts.
Now that the chart has been renamed, gitignore
needs to know about ignoring new charts.
This lets us test the operator from the private
registry. Also, in case a customer does not want
to pull from docker hub and instead use their own
registry, they can do so and specify a pull secret
for the image.
With the change to make the operator read a config
file (to reduce code complexity), we need to make
sure that the file exists for the service to
start.
arussellf5 and others added 26 commits July 17, 2025 17:31
NLK listens for changes to any kubernetes service of type LoadBalancer and updates the NGINXaaS upstreams with the external IPs and ports of the LB service. We are using the external IPs and not the node IPs so as to provide seamless plug-and-play functionality with the customer's networking framework.
This will help fix vulnerabilities discoved by mend in k8s.io/apiMAChinery-v0.26.0.
This enables concurrency-safe access by multiple goroutines.
The latest versions of the kubernetes libraries recommend using a typed workqueue and this reduces a bit of boilerplate and error handling, because we no longer have to cast the workitems returned by the queue into the desired types.
Multiple parallel tests were all accessing the same pointer to a single variable for the DefaultTransport in the http package. This was leading to data races in unit tests.
There's a syntax issue in the "listen" directive. Should be "listen 9113;", not "listen 9113:". Using the current file, I got an error upon reloading the NGINX service ("nginx: [emerg] invalid port in "9113:" of the "listen" directive in /etc/nginx/conf.d/prometheus.conf:11")
There was a change in the API for the NGINX Plus Client that was missed when updating to the latest version. This corrects that.
These functions were being used to create IDs that were not really
necessary for the business logic and which were generating security
alerts because of weak cryptography techniques.
The http client is processing requests created by the nginx plus
client library, and that library should always include a sensible number
of headers. But the lack of change on the number of headers was causing
security vulnerability flags to be raised over denial of service
resource exhaustion attacks.
Go cache in the CI is seeded in the project
working directory. We should skip the mod cache
from lint/formatting as it's upstream code and
there are high chances of the linting failing as
upstream lint rules != our lint rules.
…inx hosts

In order for the nginx-hosts yaml field to be parsed correctly by viper the template needs to:

1. not put double quotes around the value (this causes viper to interpret it as a string)
2. render it as a JSON array rather than a go representation of a slice.
The biggest change here is to remove most the TLS modes to enable mTLS and self-signed certificates. Product decided that this was too complex and there was not enough user demand for most of these options. We decided to pare down the code and remove tests that were no longer well maintained. The remaining configuration allows users to toggle a single switch: whether to make the http client verify the NGINX host's certificate chain and host name if https is being used. If the user wishes to enable https with self-signed certs they can use the "skip-verify-tls" setting to allow this. The default behavior is to perform this verification.

We are maintaining the deprecated "no-tls" and "ca-tls" inputs for NGINXaaS backwards comptability reasons. The "no-tls" setting name was highly misleading, because all it did was disable TLS verification: it DID NOT disable TLS altogether in https mode. Similarly, the "ca-tls" setting did not enable TLS itself. TLS is enabled by default when the URL of the NGINX host includes the https protocol. The user setting merely enforced the verification of the certificate chain and host as described above.
Now that the plus go client allows users to check the http status code of the error, handle the upstream not found case by doing nothing.
@arussellf5 arussellf5 force-pushed the nginxaas-merge-devops branch from 36ec676 to 24a3c17 Compare July 17, 2025 16:32
@arussellf5 arussellf5 closed this Jul 18, 2025
@arussellf5 arussellf5 deleted the nginxaas-merge-devops branch July 18, 2025 09:01
@arussellf5 arussellf5 restored the nginxaas-merge-devops branch July 18, 2025 09:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants