Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added configuration for using specific SQLite versions. #50

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

laymonage
Copy link

@laymonage laymonage commented Jan 29, 2025

Fixes #11, built on top of #49.

There are a few caveats as noted in the README. Will add more details in PR comments.

Marking as draft for now, as ideally the SpatiaLite service should also pick up the SQLite version, and possibly allow for a specific SpatiaLite version. I spent hours digging into it but I encountered either segfaults, uninitialized DB errors, or missing library errors.

Here's my WIP patch for SpatiaLite if you want to look into it, but please note that it contains commands that are redundant as I'm still experimenting different combinations.

Details
diff --git a/.env b/.env
index c1c4223..c396b59 100644
--- a/.env
+++ b/.env
@@ -6,6 +6,9 @@ ORACLE_VERSION=23.5.0.0
 POSTGRESQL_VERSION=14
 POSTGIS_VERSION=3.1
 SQLITE_VERSION=
+SPATIALITE_VERSION=
 SQLITE_CFLAGS="-DSQLITE_ENABLE_DESERIALIZE \
                -DSQLITE_ENABLE_JSON1 \
+               -DSQLITE_ENABLE_RTREE \
+               -DSQLITE_ENABLE_COLUMN_METADATA=1 \
                -DSQLITE_MAX_VARIABLE_NUMBER=32766"
diff --git a/compose.yml b/compose.yml
index 301fb3e..1c5f83c 100644
--- a/compose.yml
+++ b/compose.yml
@@ -287,7 +287,7 @@ services:
       - DATABASE_ENGINE=django.db.backends.postgresql
       - DATABASE_HOST=postgresql-db

-  sqlite:
+  sqlite: &sqlite
     <<: *base
     image: django-docker-box:${PYTHON_IMPLEMENTATION}-${PYTHON_VERSION}-sqlite${SQLITE_VERSION}
     pull_policy: never
@@ -310,7 +310,6 @@ services:
             else
               cp .libs/libsqlite3.so /tmp/
             fi
-            rm -rf /tmp/sqlite
           fi
         EOF
         SHELL ["/bin/bash", "-c"]
@@ -369,9 +368,54 @@ services:
       - DATABASE_HOST=postgresql-gis-db

   sqlite-gis:
-    <<: *base
-    depends_on:
-      <<: *depends-on-caches
+    <<: *sqlite
+    image: django-docker-box:${PYTHON_IMPLEMENTATION}-${PYTHON_VERSION}-sqlite${SQLITE_VERSION}-spatialite${SPATIALITE_VERSION}
+    pull_policy: never
+    build:
+      context: .
+      dockerfile_inline: |
+        FROM django-docker-box:${PYTHON_IMPLEMENTATION}-${PYTHON_VERSION}-sqlite${SQLITE_VERSION}
+        SHELL ["/bin/bash", "-o", "errexit", "-o", "nounset", "-o", "pipefail", "-o", "xtrace", "-c"]
+        # Only compile SpatiaLite if a version is specified.
+        USER root
+        RUN apt-get update --quiet --yes && apt-get install --no-install-recommends --yes libfreexl-dev libproj-dev
+        RUN <<EOF
+          if [[ "${SQLITE_VERSION}" && "${SPATIALITE_VERSION}" ]]; then
+            cd /tmp/sqlite
+            make install
+            cd /tmp
+            curl -Lo libspatialite.tar.gz https://www.gaia-gis.it/gaia-sins/libspatialite-sources/libspatialite-${SPATIALITE_VERSION}.tar.gz
+            tar xzf libspatialite.tar.gz
+            cd libspatialite-${SPATIALITE_VERSION}
+            ls -al
+            curl -Lo config.guess https://git.savannah.gnu.org/cgit/config.git/plain/config.guess
+            curl -Lo config.sub https://git.savannah.gnu.org/cgit/config.git/plain/config.sub
+            ls -al
+            chmod 755 config.guess
+            chmod 755 config.sub
+            ./configure
+            make
+            ls -al
+          fi
+        EOF
+        RUN <<EOF
+          cd /tmp/libspatialite-${SPATIALITE_VERSION}/
+          ls -al src/.libs
+          apt-get remove -y libsqlite3-mod-spatialite
+          make install
+        EOF
+        RUN cp /tmp/libspatialite-${SPATIALITE_VERSION}/src/.libs/mod_spatialite.so /tmp/
+        RUN ls -al /tmp
+        USER django:django
+        SHELL ["/bin/bash", "-c"]
+        ENV LD_PRELOAD="$$LD_PRELOAD ${SPATIALITE_VERSION:+/tmp/mod_spatialite.so}"
+      args:
+        - PYTHON_IMPLEMENTATION=${PYTHON_IMPLEMENTATION}
+        - PYTHON_VERSION=${PYTHON_VERSION}
+        - SQLITE_VERSION=${SQLITE_VERSION}
+        - SQLITE_CFLAGS=${SQLITE_CFLAGS}
+      additional_contexts: *additional-contexts
+
     environment:
       - DATABASE_ENGINE=django.contrib.gis.db.backends.spatialite

diff --git a/packages.txt b/packages.txt
index 86f8505..763c65d 100644
--- a/packages.txt
+++ b/packages.txt
@@ -1,5 +1,7 @@
+autoconf
 binutils
 build-essential
+curl
 default-libmysqlclient-dev
 default-mysql-client
 gdal-bin

I'm not a Docker expert, so any suggestions to improve the setup would be very much appreciated.

included in the `sqlite` module. A mismatch in the module and the dynamically
loaded library may result in Python failing to load, which may happen if we
use an SQLite version that is older than the system version.
- Debian and Ubuntu use a custom `CFLAGS` variable to compile their distributed
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.env Outdated
Comment on lines 9 to 11
SQLITE_CFLAGS="-DSQLITE_ENABLE_DESERIALIZE \
-DSQLITE_ENABLE_JSON1 \
-DSQLITE_MAX_VARIABLE_NUMBER=32766"
Copy link
Author

@laymonage laymonage Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explained this before in more detail in django/django#18899 (comment), but here's a breakdown of each flag:

SQLITE_ENABLE_DESERIALIZE

This flag was not enabled by default until SQLite 3.36. Without this flag, you'll encounter an error like pyenv/pyenv#2625.

  • This is because Python checks for the availability of the deserialize functions at compile time.
  • If we don't want to add this flag, we can work around this by using an older version of Python that didn't have support for serialize() (< 3.11), or by using bullseye (which ships with 3.34.1 and thus Python 3.12 wasn't built to have SQLite deserialize functions). Or, of course, compiling Python from source ourselves.

SQLITE_ENABLE_JSON1

This flag was not enabled by default until SQLite 3.38.

We have the following test that fails if you run it with a database backend that does not support JSONField, because it's missing the @skipUnlessDBFeature decorator:

https://github.com/django/django/blob/8a6b4175d790424312965ec77e4e9b072fba188b/tests/schema/tests.py#L2425-L2443

You can verify this by setting SQLITE_VERSION to a version < 3.38 and removing -DSQLITE_ENABLE_JSON1 from SQLITE_CFLAGS, or by using a version >= 3.38 and adding -DSQLITE_OMIT_JSON.

Happy to raise a ticket and PR for that.

Edit: I've raised a ticket https://code.djangoproject.com/ticket/36156#ticket

SQLITE_MAX_VARIABLE_NUMBER

There are a few tests that would fail when it does ContentType.objects.all().delete() due to a row count query exceeding the variable limit. This only happens when the whole test suite is run, but not when the failing tests are run in isolation (likely because other tests would create more ContentType instances).

This flag defaulted to 999 in < 3.32 (and 32766 in >= 3.32.0).

This is something that Debian/Ubuntu has customized, even in 3.31.

Related:

compose.yml Outdated
Comment on lines 308 to 313
if [ -f libsqlite3.so ]; then
cp libsqlite3.so /tmp/
else
cp .libs/libsqlite3.so /tmp/
fi
rm -rf /tmp/sqlite
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "configure" script underwent major refactoring in 3.48.0: https://www.sqlite.org/releaselog/3_48_0.html

As a result, the compiled library is placed directly in the source root. In older versions, it's placed inside a .libs directory.

The rm -rf is just a cleanup step to keep the image size small.

@@ -285,6 +289,38 @@ services:

sqlite:
<<: *base
image: django-docker-box:${PYTHON_IMPLEMENTATION}-${PYTHON_VERSION}-sqlite${SQLITE_VERSION}
pull_policy: never
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't push the image to Docker Hub. This ensures doing FROM django-docker-box:${PYTHON_IMPLEMENTATION}-${PYTHON_VERSION} won't make Docker try to pull it from Docker Hub if the image isn't available locally.

Could also try build instead, but if I recall correctly it doesn't work in this context unless you already built the base image. Using build is even worse as it will run the build process even if the image has already been built. It does use the cache, but it also adds a few seconds to the run, compared to never that would just skip it.

Copy link
Member

@charettes charettes Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there might be something off here. Here's what I get trying to build the image against Docker 4.38

$ PYTHON_VERSION=3.12 docker compose build sqlite
[+] Building 0.5s (3/3) FINISHED                                                                                                                                docker:desktop-linux
 => [sqlite internal] load build definition from Dockerfile                                                                                                                     0.0s
 => => transferring dockerfile: 734B                                                                                                                                            0.0s
 => ERROR [sqlite internal] load metadata for docker.io/library/django-docker-box:python-3.12                                                                                   0.4s
 => [sqlite auth] library/django-docker-box:pull token for registry-1.docker.io                                                                                                 0.0s
------
 > [sqlite internal] load metadata for docker.io/library/django-docker-box:python-3.12:
------
failed to solve: django-docker-box:python-3.12: failed to resolve source metadata for docker.io/library/django-docker-box:python-3.12: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh I see. I need to build the base first with PYTHON_VERSION=3.12 docker compose build base.

Might be worth documenting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I investigated how we could define x-service image dependencies and I stumbled upon this page (which was deleted not too long along to encourage users to move to bake).

Unfortunately the solution they propose doesn't work when the image that you depend upon is parametrized

diff --git a/compose.yml b/compose.yml
index 301fb3e..e5d99f6 100644
--- a/compose.yml
+++ b/compose.yml
@@ -320,7 +320,9 @@ services:
         - PYTHON_VERSION=${PYTHON_VERSION}
         - SQLITE_VERSION=${SQLITE_VERSION}
         - SQLITE_CFLAGS=${SQLITE_CFLAGS}
-      additional_contexts: *additional-contexts
+      additional_contexts:
+        <<: *additional-contexts
+        "django-docker-box:${PYTHON_IMPLEMENTATION}-${PYTHON_VERSION}": "service:base"
     depends_on:
       <<: *depends-on-caches
     environment:

The only solution I could find was to use depends_on to ensure the image was successfully

diff --git a/compose.yml b/compose.yml
index 301fb3e..10b4a03 100644
--- a/compose.yml
+++ b/compose.yml
@@ -142,6 +142,7 @@ services:
   # Base service to allow building the image with `docker compose build base`.
   base:
     <<: *base
+    command: --help

   # Services: Databases

@@ -323,6 +324,8 @@ services:
       additional_contexts: *additional-contexts
     depends_on:
       <<: *depends-on-caches
+      base:
+        condition: service_completed_successfully
     environment:
       - DATABASE_ENGINE=django.db.backends.sqlite3

Copy link
Member

@charettes charettes Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is also mentioned in these docs

additional_contexts can also refer to an image built by another service. This allows a service image to be built using another service image as a base image, and to share layers between service images.

But I can't get it work either with

diff --git a/compose.yml b/compose.yml
index 301fb3e..9f761fb 100644
--- a/compose.yml
+++ b/compose.yml
@@ -294,7 +294,7 @@ services:
     build:
       context: .
       dockerfile_inline: |
-        FROM django-docker-box:${PYTHON_IMPLEMENTATION}-${PYTHON_VERSION}
+        FROM base
         SHELL ["/bin/bash", "-o", "errexit", "-o", "nounset", "-o", "pipefail", "-o", "xtrace", "-c"]
         # Only compile SQLite and set LD_PRELOAD if a version is specified.
         RUN <<EOF
@@ -320,7 +320,9 @@ services:
         - PYTHON_VERSION=${PYTHON_VERSION}
         - SQLITE_VERSION=${SQLITE_VERSION}
         - SQLITE_CFLAGS=${SQLITE_CFLAGS}
-      additional_contexts: *additional-contexts
+      additional_contexts:
+        <<: *additional-contexts
+        base: service:base
     depends_on:
       <<: *depends-on-caches
     environment:

It results in

[+] Building 0.0s (0/0)                                                                                                                                         docker:desktop-linux
failed to get build context base: stat /path/to/django-docker-box/service:base: no such file or directory

Which is unsurprising as even their minimal example fails to run (when fixing their duplicate base service name 🤦 )

services:
 base:
  build:
    context: .
    dockerfile_inline: |
      FROM alpine
      RUN ...
 other: -- This was base wrongly set to base in their example
  build:
    context: .
    dockerfile_inline: |
      FROM base # image built for service base
      RUN ...
    additional_contexts:
      base: service:base
$ docker compose build service
[+] Building 0.0s (0/0)                                                                                                                                         docker:desktop-linux
failed to get build context base: stat /tmp/service:base: no such file or directory

Copy link
Author

@laymonage laymonage Feb 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this, Simon!

I've refactored the setup to use multi-stage build in 1ab68b9.

It allows us to reuse the layers from the base image. Unfortunately, they're treated as two completely different images though, so Docker will still run all the steps from scratch instead of just the steps from after the base image. It's fairly quick as the layers will be reused (as long as it doesn't think the cached apt install layer didn't miss...), but technically it can be made faster by somehow using the built image.

For example, the following trick seems to work last time I tested it:

diff --git a/Containerfile b/Containerfile
index 3e5cf8a..6d3fa95 100644
--- a/Containerfile
+++ b/Containerfile
@@ -1,5 +1,6 @@
 # syntax=docker/dockerfile:1.12

+ARG BASE_IMAGE=pass
 ARG PYTHON_IMPLEMENTATION=python
 ARG PYTHON_VERSION=3.12
 FROM ${PYTHON_IMPLEMENTATION}:${PYTHON_VERSION}-slim-bookworm AS base
@@ -65,7 +66,7 @@ VOLUME /django/source
 WORKDIR /django/source/tests
 ENTRYPOINT ["/django/entrypoint.bash"]

-FROM base AS sqlite
+FROM ${BASE_IMAGE} AS sqlite
 ARG SQLITE_VERSION
 ARG SQLITE_CFLAGS
 SHELL ["/bin/bash", "-o", "errexit", "-o", "nounset", "-o", "pipefail", "-o", "xtrace", "-c"]
diff --git a/compose.yml b/compose.yml
index c18d32a..49a4157 100644
--- a/compose.yml
+++ b/compose.yml
@@ -7,6 +7,7 @@ x-base: &base
     dockerfile: ./Containerfile
     target: base
     args:
+      - BASE_IMAGE=pass
       - PYTHON_IMPLEMENTATION=${PYTHON_IMPLEMENTATION}
       - PYTHON_VERSION=${PYTHON_VERSION}
     additional_contexts:
@@ -297,6 +298,7 @@ services:
       <<: *base-build
       target: sqlite
       args:
+        - BASE_IMAGE=django-docker-box:${PYTHON_IMPLEMENTATION}-${PYTHON_VERSION}
         - PYTHON_IMPLEMENTATION=${PYTHON_IMPLEMENTATION}
         - PYTHON_VERSION=${PYTHON_VERSION}
         - SQLITE_VERSION=${SQLITE_VERSION}

but I'm not sure if that'd be acceptable. The ARG BASE_IMAGE=pass in the base image is just a placeholder as otherwise Docker would complain if an initial value is not provided, despite only being used for the SQLite image.

Also that approach would couple the Containerfile with the compose.yml file unless you provide BASE_IMAGE yourself. Without this optimisation, the Containerfile can be used standalone (I think) if one so wish.

@charettes
Copy link
Member

Thanks so much for your work on this @laymonage!

From reviewing your changes I think the tradeoff of sticking SQLITE_CFLAGS to flags that work with the currently bundled version of Python in the image makes sense particularly considering the amount of work that would be required to compile a compatible version of Python otherwise.

This is way better than what we ever had when I was pulling my hairs compiling multiple versions of SQLite and managing them to pinpoint the exact version that regressed and filed orf#25.

I'll give it a shot tomorrow trying to spike a solution to the the SQLite max query parameters issue but from a my relative exposure to Docker everything seemed right.

@ngnpope ngnpope self-requested a review January 31, 2025 12:48
@ngnpope
Copy link
Member

ngnpope commented Jan 31, 2025

Amazing! I'll try to have a play with this next week.

@@ -14,4 +14,5 @@ libpq-dev
libproj-dev
libsqlite3-mod-spatialite
pkg-config
tcl-dev
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires re-building base first otherwise you'll run into /bin/sh: 1: tclsh: not found.

Copy link
Author

@laymonage laymonage Feb 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep – I wasn't sure what to do about it. Adding a note in the README seems redundant if it's only needed once if you've already built the image before.

Although after refactoring it to the multi-stage build, this is no longer the case as they're now two separate images (unless we use the optimisation trick I described in #50 (comment)).

I've verified this by deleting all existing images and containers, rebuilding from current main, and then only running something like SQLITE_VERSION=3.41.0 SQLITE_CFLAGS="-DSQLITE_OMIT_JSON" docker compose run --rm sqlite db_functions.json without rebuilding base.

@laymonage laymonage force-pushed the specific-sqlite branch 3 times, most recently from 648f804 to c8d15b9 Compare February 16, 2025 17:15
@laymonage laymonage marked this pull request as ready for review February 16, 2025 18:06
Copy link
Author

@laymonage laymonage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know when I'll be able to get specific SpatiaLite versions working, so I've updated the README with a note instead. I think we can get this in first, and do SpatiaLite separately later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support different versions of SQLite (libsqlite)
3 participants