WIP: Optimize OpenEMR 7.0.5 container startup and memory usage #509
base: master
Conversation
Refactor the OpenEMR 7.0.5 Docker image for significantly improved startup performance and reduced memory footprint. This is a work-in-progress that requires community testing to verify all container functionality before merging.

## Performance Improvements

Benchmarking against the current Docker Hub image (`openemr/openemr:7.0.5`) shows:

| Metric        | Optimized   | Original    | Improvement       |
|---------------|-------------|-------------|-------------------|
| Startup Time  | 15.0s       | 73.1s       | **4.9x faster**   |
| Memory (Avg)  | 92.8 MB     | 304.2 MB    | **69% reduction** |
| Memory (Peak) | 117.1 MB    | 326.6 MB    | **64% reduction** |
| Throughput    | 114.9 req/s | 117.2 req/s | Equivalent        |

## Technical Changes

### Build-Time Permission Optimization (Primary)

- Moved file permission operations (`chmod`/`chown`) from runtime to build time
- The original container scanned and modified ~15,000 files on every startup (40-60s)
- Now permissions are baked into the image; only setup-modified files need runtime adjustment

### Startup Script Modernization

- Rewrote `openemr.sh` from POSIX sh to bash with `set -euo pipefail`
- Added comprehensive inline documentation and clear section organization
- Improved database wait logic with exponential backoff
- Simplified permission handling for runtime efficiency

### Dockerfile Documentation

- Added extensive comments explaining each build stage
- Documented package purposes, permission schemes, and build targets

## Testing Needed

This PR needs help verifying that all OpenEMR container functions work correctly:

- [ ] Fresh installation (auto-configure)
- [ ] Manual setup mode
- [ ] SSL/TLS certificate configuration
- [ ] Redis session handling
- [ ] Kubernetes mode (admin/worker roles)
- [ ] Swarm mode coordination
- [ ] Upgrade path from previous versions
- [ ] XDebug configuration
- [ ] Let's Encrypt integration
- [ ] Document upload/storage
- [ ] Multi-site configurations

## How to Test

Build and test the optimized image:

    cd docker/openemr/7.0.5
    docker build -t openemr:7.0.5-optimized .
    docker compose up -d

Or use the benchmarking utility to compare performance:

    cd utilities/container_benchmarking
    ./benchmark.sh

## Related Files

- `docker/openemr/7.0.5/*` - Optimized container files
- `utilities/container_benchmarking/` - Benchmarking suite for performance validation

---

## Summary of the Container Benchmarking Utility

The **Container Benchmarking Utility** (`utilities/container_benchmarking/`) is a valuable development tool for the OpenEMR project:

### What It Does

- **Compares two container images** side-by-side (local build vs. Docker Hub reference)
- **Measures key performance metrics**:
  - **Startup time**: How long until the container is healthy
  - **Throughput**: Requests per second under load (Apache Bench)
  - **Resource usage**: CPU and memory during operation

### Why It's Useful for OpenEMR Container Development

1. **Quantifies optimizations**: Instead of guessing, you get hard numbers (e.g., "4.9x faster startup")
2. **Prevents regressions**: Compare any changes against the baseline Docker Hub image
3. **Quick feedback loop**: Full benchmark suite runs in 3-5 minutes
4. **Reproducible results**: Documented methodology with timestamp-organized output
5. **Analysis tools**: Includes `summary.sh`, `compare_results.sh`, and CSV export for deeper analysis

### Recommended Use Cases

- **Before PRs**: Verify performance impact of container changes
- **CI/CD integration**: Automated regression detection in GitHub Actions
- **Version comparisons**: Test across OpenEMR versions (7.0.3, 7.0.4, 7.0.5, etc.)
- **Optimization validation**: Prove that changes actually improve performance

The utility democratizes performance testing—any contributor can run benchmarks locally and share quantified results, making it easier to evaluate and merge container improvements.
Fixes shellcheck issues in compare_results.sh
kojiromike left a comment
This is great. I'm looking forward to getting more benchmarking data out of the gate.
I hope you don't mind my reviewing it early despite the WIP marker. I just don't know when I'll get another chance.
Mainly my comments are around taking better advantage of bash. If we're not going to try to be POSIX compliant then we should take full advantage of bash features, particularly integer arithmetic and shopts like nullglob, globstar and maybe extglob.
The other stuff is about taking better advantage of current Docker build-time features like caching.
And maybe it's not a conversation for today, but maybe we don't need to have bash, python and php on the image. Some day, we should consider rewriting these tools in a single language. We can take advantage of Symfony CLI and other libs already included in openemr to avoid adding more overhead or dependency management.
RUN apk add --no-cache build-base \
    # Clone OpenEMR repository (shallow clone to reduce image size)
    && git clone https://github.com/openemr/openemr.git --depth 1 \
    && rm -rf openemr/.git \
If we're cloning master and nuking .git, we could fetch the archive from GitHub and unpack it instead.
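For illustration, a rough sketch of the archive approach; the tarball URL is the standard GitHub archive pattern, but the target directory here is just a placeholder, not taken from this Dockerfile:

```dockerfile
# Sketch: fetch and unpack a source tarball instead of cloning.
# /var/www/openemr is a placeholder target path.
ADD https://github.com/openemr/openemr/archive/refs/heads/master.tar.gz /tmp/openemr.tar.gz
RUN tar -xzf /tmp/openemr.tar.gz -C /tmp \
    && mv /tmp/openemr-master /var/www/openemr \
    && rm /tmp/openemr.tar.gz
```

That avoids pulling the full set of git objects only to delete them afterwards.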
    # Install PHP dependencies (production mode, no dev dependencies)
    && composer install --no-dev \
    # Install Node.js dependencies for frontend build
    && npm install --unsafe-perm \
npm install/build could be done in a previous stage (docker multistage) and only the assets copied over.
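Roughly this shape, as a sketch; the stage names, target paths, and whether `npm run build` is the right entry point for OpenEMR's asset pipeline are all assumptions:

```dockerfile
# Hypothetical asset stage: build frontend dependencies separately
FROM node:20-alpine AS assets
WORKDIR /build
COPY package.json package-lock.json ./
RUN npm install --unsafe-perm    # assumes the same flags as the current build
COPY . .
RUN npm run build                # placeholder for OpenEMR's actual asset build step

# Final image copies only the generated assets, not node_modules or npm itself
FROM base AS final
COPY --from=assets /build/public /var/www/openemr/public
```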
    && cd ../ \
    # Clean up vendor and asset files using Phing build tool
    && composer global require phing/phing \
    && /root/.composer/vendor/bin/phing vendor-clean \
Doing composer install in a stage and copying only the needed bits could also reduce the need for manual cleaning applications.
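Along the same lines as the Node sketch above, a hypothetical Composer stage; paths and flags are illustrative, and OpenEMR may rely on composer scripts that `--no-scripts` would skip:

```dockerfile
# Hypothetical dependency stage: only the resulting vendor/ tree is copied forward
FROM composer:2 AS php-deps
WORKDIR /app
COPY composer.json composer.lock ./
RUN composer install --no-dev --no-scripts --optimize-autoloader

FROM base AS final
COPY --from=php-deps /app/vendor /var/www/openemr/vendor
```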
    && composer dump-autoload --optimize --apcu \
    # Clear Composer and npm caches to reduce image size
    && composer clearcache \
    && npm cache clear --force \
We should take advantage of docker's support for build-time caching instead of disabling or nuking them at build time.
OpenCoreEMR can share how we're doing this for our own builds some time next week. Cc: @msummers42
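For reference, the BuildKit cache-mount form of this would look something like the following; the cache paths are the tools' defaults, and how it slots into this Dockerfile is an assumption:

```dockerfile
# syntax=docker/dockerfile:1
# Cache mounts persist across builds but never land in image layers,
# so there is nothing to clear afterwards.
RUN --mount=type=cache,target=/root/.composer/cache \
    composer install --no-dev
RUN --mount=type=cache,target=/root/.npm \
    npm install --unsafe-perm
```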
# Install kcov dependencies
# Install kcov build dependencies
# kcov is a code coverage tool that requires compilation from source
RUN apk add --no-cache bash \
We have bash above, now.
local files=()
while IFS= read -r file; do
    files+=("${file}")
done < <(find "${RESULTS_DIR}" -name "benchmark_*.txt" -type f 2>/dev/null | sort || true)
Suggest using globstar/nullglob here instead of find.
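Something like this, as a sketch; it assumes the results stay under `${RESULTS_DIR}` just as the `find` call does:

```bash
# Globs expand in sorted order, and nullglob yields an empty array when
# nothing matches, so the while/read loop and the `|| true` are not needed.
shopt -s nullglob globstar
local files=("${RESULTS_DIR}"/**/benchmark_*.txt)
```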
for file in "${recent_files[@]}"; do
    local timestamp
    timestamp=$(basename "${file}" | sed 's/benchmark_//' | sed 's/\.txt$//')
I suggest using parameter expansion here instead of basename/sed/sed
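For example, something along these lines:

```bash
# Pure parameter expansion: strip the directory, the prefix, and the suffix
# without spawning basename or sed.
local timestamp
timestamp=${file##*/}                # basename
timestamp=${timestamp#benchmark_}    # drop the "benchmark_" prefix
timestamp=${timestamp%.txt}          # drop the ".txt" suffix
```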
RESULTS_DIR="${RESULTS_DIR:-./results}"

# Colors for output
GREEN='\033[0;32m'
I suggest using tput in all these cases.
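A sketch of the tput variant; the `[[ -t 1 ]]` guard is an assumption about how the script should behave when output is piped:

```bash
# Derive colors from terminfo; fall back to empty strings when stdout
# is not a terminal so piped output stays clean.
if [[ -t 1 ]]; then
    GREEN=$(tput setaf 2)
    BLUE=$(tput setaf 4)
    NC=$(tput sgr0)
else
    GREEN='' BLUE='' NC=''
fi
```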
NC='\033[0m' # No Color

log_info() {
    echo -e "${BLUE}ℹ${NC} $*"
I suggest using printf instead of echo -e
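For instance, keeping the same output but with printf; `%b` expands the backslash escapes stored in the color variables:

```bash
log_info() {
    # %b interprets escape sequences like \033 in the color variables;
    # %s prints the message itself without interpreting backslashes.
    printf '%bℹ%b %s\n' "${BLUE}" "${NC}" "$*"
}
```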
extract_metric() {
    local file="${1}"
    local metric="${2}"
    grep "^${metric}=" "${file}" | cut -d'=' -f2 | sed 's/s$//' | sed 's/ms$//' || echo ""
This can be replaced with a single awk command, see above.
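A sketch of the single-awk version; it keys on the metric name and strips a trailing `s` or `ms` in one pass:

```bash
extract_metric() {
    # Match "metric=value" lines, drop an "s" or "ms" unit suffix, print the value.
    awk -F'=' -v m="${2}" '$1 == m { sub(/m?s$/, "", $2); print $2 }' "${1}"
}
```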
Thank you for taking the time to go through all this @kojiromike; I really appreciate your time. I'll be busy for a little bit but I should have time in ~2 weeks to start going through all the comments you left and making edits. If anyone wants to step in before then and start making edits themselves, please feel free; I welcome all the help I can get on this PR.

The time from container start to when it passes healthchecks is really important to me because the shorter that time is, the better our autoscaling will be when running multiple horizontal replicas. If it takes 5 minutes to spin up a new node it doesn't mean you can't autoscale, but it means you won't be able to autoscale as well as if it took 1 minute. There are things you can do with Karpenter and other autoscalers to autoscale performantly even when startup time is relatively long (for example, keeping pre-spun-up hosts ready to serve traffic rather than spinning up hosts from scratch every time there's a scaling event), but reducing the "time from start to healthy" will improve horizontal autoscaling no matter where OpenEMR is run.

While reducing the size of the container image is good in general (especially because it helps reduce the hardware specs you'd need to run OpenEMR using Docker and the time it takes to pull the container), the rise of container caching technologies (specifically for Kubernetes and basically anywhere else you'd host this in a large public cloud provider) means I'm planning to prioritize it less than reducing the amount of work the container has to perform when it starts, because I think that is going to impact startup time far more than image size.

I feel the same way about reducing container build times. Reducing build times is never a bad thing (especially because it allows developers to test more easily), but I think that prioritizing performing the fewest operations, and performing those operations as quickly as possible on container start (especially if those operations are the same each time and can be accomplished at Docker build time or by configuring the repositories where build materials are pulled from), is going to give us the greatest value for effort for the time being. I think the fastest way to start up would basically be to ...
What I'd really like to do is introduce automated CI/CD for container builds that verifies that all of this functionality works as it's supposed to:
I've managed to make a "high availability" docker compose setup (I put high availability in quotes because even if you have two containers running, running both of them on the same host is not really high availability) that launches two containers behind a load balancer in docker-compose and can be used to test that Swarm node configuration works, so I'm making some progress in this regard, but, again, I'll happily take all the help I can get in pushing this forward.

The benchmarking is a good first step because it allows us to quickly verify the performance impact of a change to one of the containers, but as long as we have to more or less manually verify that the functionality above works as expected, there's only so fast we'll be able to iterate.

By the way, thanks for all your work introducing the unit tests you've added so far. I'm a big fan of what's going on in the "tests" folder in the "openemr/openemr" repo and everything going on in the ".github" folder in this repo. If we have CI/CD that verifies all the functionality the container should have and uses the benchmarking utilities, then the process of finding optimizations like this becomes much easier. Anyone would be able to fork the repository, make a change, and know within minutes whether or not that change improved the container and maintained the functionality we need to properly run OpenEMR.
Agreed
Let me see if I understand. Is openemr.sh slow when the application is already set up and ready to go, and we're just scaling it horizontally by adding another node in basically ...
Both.

Most of my contributions to the OpenEMR containers come from a perspective similar to an operator of a cluster (be it Kubernetes or otherwise, in the case of ECS) who runs OpenEMR on behalf of a larger organization; sort of like a DevOps engineer hired by a hospital and tasked with operating a cluster running OpenEMR. The difference is that rather than being concerned about a specific installation, I'm basically trying to make machines that print out highly reliable setups for other operators. Most of the time I do that by acting like I'm an operator myself and stress testing the setups I'm printing out to see what I need to improve in the machine that makes them.

When we first started making OpenEMR on ECS we noticed that the deployment was failing because the healthchecks were timing out. At first we thought this was because the containers had failed some setup task and were just sitting idle until they failed healthchecks and were terminated, but then we realized the containers were simply taking a really long time to start up.

When we run on ECS we use Fargate and run on a collection of Graviton 2 and Graviton 3 processors in whatever AWS region you decide to use. When we run on EKS we tend to use Graviton 3 or 4 (although auto mode defines node pools with multiple instance types). I ended up deciding on Graviton because 1/ it was cheaper than other comparable instance types and 2/ there's been a bunch of optimization work done for running PHP on Graviton and I found it generally performed better than comparable x86-based instances. Also, AWS has a lot of ARM chips, so from the perspective of autoscaling I think the chance that one of our clusters asks an AWS data center for another ARM vCPU and the data center says "sorry, we're all out" is pretty low. This allows us to operate OpenEMR at an incredibly large scale and to do so relatively affordably. The initial load testing results we got from the ECS version alone (which is arguably less performant overall than the EKS setup) show us comfortably serving the equivalent of ~20K virtual users or ~4,000 req/sec with the equivalent of $0.079/hour worth of Fargate compute that never went beyond 19% avg. CPU utilization or 31% avg. RAM utilization. I truly believe that both the ECS and EKS setups as they exist today could have specific configurations set that would allow them to meet the demands of even national or global scale deployments and serve many, many concurrent users.

I'd say it takes about ~9-12 minutes for a leader pod to complete configuration on ECS and pass healthchecks, and about ~6-8 minutes for new follower pods to spin up. There's some variation here because it's luck of the draw whether you end up running on Graviton 2 vs 3, and you can notice the difference in testing. I'd say it takes about ~7-9 minutes for a leader pod to complete configuration on EKS and pass healthchecks, and about ~4-6 minutes for a follower pod to spin up. Again there's some variation because it's luck of the draw which specific processor you end up running on, but in general I think the processors we end up on in our EKS architecture are a little more performant than those we get allocated on average when we run on ECS.
The ECS version is cheaper and easier to manage than the EKS version, while the EKS version offers integration paths to a lot more tools and a lot more out-of-the-box functionality around really granular observability, so I think it makes sense to offer both. In general I'd advise using the ECS-based one for smaller to mid-size installations and the EKS-based one for mid-size to large installations, but you can definitely use either at any scale. While it's improved over time as we've figured out ways to ...
we've still had to set pretty generous healthchecks to get OpenEMR to run in a container orchestrator on AWS. I imagine that as other architectures are created in the future to run OpenEMR on container orchestrators elsewhere, we'll probably run into similar issues. It's super normal for applications to take a few minutes to start, either as replicas or, especially, when they boot for the first time. However, applications that take a while to boot and are going to be scheduled on Kubernetes or a similar orchestrator generally try to, as soon as is reasonably possible, start responding to healthchecks on some endpoint that says "this application might not be ready to serve traffic but it is operating normally". An example of this would be an application that starts almost immediately responding to healthchecks at

If we can get the containers to reliably respond to healthchecks in <2 minutes when they boot, we'd be able to map our autoscaling capacity very closely to the amount of compute we'd need at any one time to serve traffic when running in a cloud, and basically all of the nodes would be healthy at any one time because the longest an unhealthy node would be allowed to serve traffic would be <2 minutes. This would make the system cheaper and more reliable overall.

You can get around the long startup times today by setting the autoscaling thresholds lower than you'd probably want to. Instead of setting average CPU utilization to trigger a scaling event at 70-90% and worrying about potentially running up against more traffic than you can serve while you spin up new instances, you can set it at 40-60% or something similar so that you're more conservative in allocating the amount of compute you have ready to serve traffic at any one time. Aside from requiring you to spend more money, you can get away with using this method to reliably serve even large volumes of OpenEMR traffic when running in a public cloud.

What's perhaps more concerning than cost is the case where a node fails and we may not have many nodes up at one time, because we can serve a lot of traffic with just a few nodes. Because we've had to configure the healthchecks to be really lenient to accommodate a ~5-7 minute window where a container is running but not responding to healthchecks, this can hypothetically lead to scenarios where a user experiences degraded functionality over a period of a few minutes without really knowing why. Imagine a hypothetical situation where 1 of 4 hosts is doing nothing but serving 5xx responses to users. Neither of our setups needs to use sticky sessions, and most of the time we're just evenly distributing traffic to our replicas behind a load balancer. That means about 1 in 4 requests sent by a user don't work, so from their perspective they're clicking around in OpenEMR but 1 in 4 clicks don't work correctly in the UI, and they're left scratching their head as to what's going on, and it may take a few minutes for that to resolve. It's not the biggest deal in the world, but I think if we lowered the "time to healthy" by even a few minutes it would create a scenario in which basically all of the nodes, minus a minute or two here and there, were healthy at all times.
The faster a container starts responding to healthchecks, the faster we can set up the orchestrator to remove unhealthy nodes and replace them with healthy ones, which greatly improves the reliability of the system overall. We'd also save money on autoscaling because we could set higher CPU and RAM thresholds for triggering scale-out events. In most orchestrators (including ECS and EKS) you can set a startup grace period that says something to the effect of "this container might take 5 minutes to boot initially, but after that first 5-minute grace period do more aggressive healthchecks", which is something we do; so it definitely doesn't take us ~9-11 minutes to identify and remove unhealthy nodes, but it's still pretty high relative to most applications you'd run in a cluster, so it's something I'm still thinking about. This PR aims to address three things I've noticed when running OpenEMR as a horizontal replica:
I think we're actually pretty close to the point where any further startup optimizations would be nice to have but not as critical as they are today. That's all while acknowledging that the current setups can perform pretty well today and host OpenEMR with autoscaling for even incredibly large deployments. What I'd like to introduce in a subsequent PR (or I'd be happy to support anyone who'd like to lead this) is to modify the containers so that they start passing some basic healthcheck as quickly as possible. We can still layer more aggressive and thorough sets of healthchecks on top of that (and even give them grace periods if need be), but having an indication as early as possible that the container is progressing normally, even if it's not finished with everything it needs to do, would be super helpful. I think this PR plus maybe one or two more is all we'd need to ensure that startup times in the production container are fast enough to pass healthchecks quickly (Flex containers wouldn't be in scope for this effort), and everything else beyond that would be more of a "nice to have" rather than something that would substantially impact the overall functioning of the system.
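As a very rough illustration of the direction, a Docker-level sketch of "cheap check early, stricter checks after a grace period"; the endpoint, the use of curl, and the timings are placeholders, not how the image is currently wired:

```dockerfile
# Hypothetical: answer a lightweight liveness probe as soon as the web server
# is up, and let --start-period cover the initial setup window before the
# orchestrator starts counting failures.
HEALTHCHECK --start-period=120s --interval=10s --timeout=5s --retries=3 \
    CMD curl -fsS http://localhost/ >/dev/null || exit 1
```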