29 commits
2128faf
Created CentOS 8 minimal image
Nov 16, 2021
8f3f21f
Added Python, FUSE, SSH and standard containers
Nov 16, 2021
5afe45f
Added README
Nov 16, 2021
9d0c9b2
Updated Python-container per https://github.com/databricks/containers…
Dec 31, 2021
36187a3
initial ganglia testing
rportilla-databricks May 27, 2021
84bb1ea
ganglia_img
rportilla-databricks May 28, 2021
dfa53ad
latest ganglia docker attempt
rportilla-databricks May 29, 2021
ae1d5d4
pushing ganglia.conf
rportilla-databricks May 29, 2021
cc50d34
added comments
rportilla-databricks May 29, 2021
fce75fa
latest ganglia changes
rportilla-databricks Jun 27, 2021
1a4fd66
adding gconf configs
rportilla-databricks Nov 24, 2021
c9b365c
adding gconf configs
rportilla-databricks Nov 24, 2021
668cba5
adding files to support ADD instead of inline shell commands
rportilla-databricks Nov 26, 2021
9f06cdd
Merge branch 'databricks:master' into centos8
HQJaTu Jan 3, 2022
8e3e00b
Changed minimal base image into CentOS 8 Stream. See: https://www.cen…
Jan 3, 2022
9f0aea4
Bugfix: Added missing ps-command into Databricks:minimal for Spark Ma…
Feb 6, 2022
5f6e276
Bugfix: Added missing ip-command to make Spark Driver working properly
Feb 7, 2022
0a8bf1f
Merge branch 'databricks:master' into centos8
HQJaTu Feb 7, 2022
9c9cdae
Merge branch 'databricks:master' into centos8
HQJaTu Jun 27, 2022
54c7e92
Changed Dockerhub repository name / tag system into something sustain…
Jun 27, 2022
617a304
Initial version on CentOS 9 Stream
Oct 17, 2022
fb558e9
Changed Python module versions to match official 11.2 images
Oct 17, 2022
1658829
Entirely new hierarchy for CentOS 9 images
Oct 19, 2022
c6f3dc0
chmod a-x on all files
Oct 19, 2022
8695e97
Merge branch 'databricks:master' into centos9
HQJaTu Oct 20, 2022
907c6a0
Added setup for SSHd, still doesn't seem to work correctly
Oct 23, 2022
3ae6321
Improvement: Upgrade pyarrow, added Databricks SQL-connector for expen…
Oct 23, 2022
f5e540e
Improvement on README
Oct 23, 2022
f8b2486
Merge branch 'databricks:master' into centos9
HQJaTu Feb 18, 2024
16 changes: 16 additions & 0 deletions experimental/centos/centos-8/README.md
@@ -0,0 +1,16 @@
# Databricks Container Services - CentOS 8 Containers

This is a Databricks container runtime that uses CentOS Stream 8 as its base image.

### Info
- [DockerHub](https://hub.docker.com/_/centos) CentOS images
- The [Minimal](minimal) image sets the system-wide crypto policy to `LEGACY`, which re-enables TLSv1, TLSv1.1 and CBC ciphers.
  This lowers security and is done only to allow connections to AWS RDS MySQL / MariaDB.

## Images

- [Standard](standard): FUSE + OpenSSH server
- [Minimal](minimal): base, OpenJDK 1.8
- [Python](python): Python 3.8
- [DBFS FUSE](dbfsfuse): FUSE
- [SSH](ssh): OpenSSH server
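
The images build on one another, so they have to be built and pushed parents first. A minimal Python sketch of that ordering follows, assuming the parent relationships read from the `FROM` lines of the CentOS 8 Dockerfiles in this change. The `PARENTS` mapping and the `build_order` helper are illustrative names, not part of the repository, and `graphlib` requires Python 3.9 or newer.

```python
from graphlib import TopologicalSorter

# Parent image of each CentOS 8 image, read off the FROM lines in this change.
# "minimal" builds on quay.io/centos/centos:stream8 and has no local parent.
PARENTS = {
    "minimal": [],
    "python": ["minimal"],
    "ssh": ["minimal"],
    "dbfsfuse": ["python"],
    "standard": ["dbfsfuse"],
}


def build_order(parents):
    """Return image names ordered so that every parent precedes its children."""
    return list(TopologicalSorter(parents).static_order())


if __name__ == "__main__":
    print(build_order(PARENTS))
```

Any order the helper returns is valid for building; images with the same parent (here `python` and `ssh`) may appear in either order.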
9 changes: 9 additions & 0 deletions experimental/centos/centos-8/dbfsfuse/Dockerfile
@@ -0,0 +1,9 @@
FROM kingjatu/databricks-centos-8-python:latest

# Fuse:
RUN dnf install -y \
fuse

# Clean-up:
RUN dnf clean all \
&& rm -rf /tmp/* /var/tmp/*
19 changes: 19 additions & 0 deletions experimental/centos/centos-8/minimal/Dockerfile
@@ -0,0 +1,19 @@
FROM quay.io/centos/centos:stream8

# Import keys to suppress warnings about GPG-keys
RUN rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-centosofficial \
&& rpm --import https://packages.microsoft.com/keys/microsoft.asc

# Minimal:
# WARNING! The LEGACY crypto policy set below lowers security by re-enabling TLSv1 and TLSv1.1
# Docs: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/considerations_in_adopting_rhel_8/security_considerations-in-adopting-rhel-8#tls-v10-v11_security
RUN dnf install -y \
java-1.8.0-openjdk \
sudo \
procps iproute \
&& update-ca-trust \
&& update-crypto-policies --set LEGACY

# Clean-up:
RUN dnf clean all \
&& rm -rf /tmp/* /var/tmp/*
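
To confirm from inside a running container that the `LEGACY` policy is the configured one, the policy name can be read back. The sketch below assumes the policy name is stored in `/etc/crypto-policies/config`, as on RHEL-family systems; `parse_policy` and `active_policy` are hypothetical helper names. Running `update-crypto-policies --show` in the container should report the same value.

```python
from pathlib import Path

# Assumed location of the configured policy name on RHEL-family systems.
POLICY_FILE = Path("/etc/crypto-policies/config")


def parse_policy(text):
    """Return the first non-blank, non-comment line of a policy file, or None."""
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            return line
    return None


def active_policy(path=POLICY_FILE):
    """Read the configured policy name, or None when the file cannot be read."""
    try:
        return parse_policy(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print(active_policy())
```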
30 changes: 30 additions & 0 deletions experimental/centos/centos-8/python/Dockerfile
@@ -0,0 +1,30 @@
FROM kingjatu/databricks-centos-8-minimal:latest

# Python:
ARG python_dir=/databricks/python3/bin
RUN dnf install -y \
python38 \
python3-virtualenv

# Initialize the default environment that Spark and notebooks will use
RUN virtualenv --python python3.8 --system-site-packages /databricks/python3

# Clean-up:
RUN dnf clean all \
&& rm -rf /tmp/* /var/tmp/*

# These python libraries are used by Databricks notebooks and the Python REPL
# You do not need to install pyspark - it is injected when the cluster is launched
# Versions are intended to reflect DBR 9.0
RUN $python_dir/pip install \
six==1.15.0 \
# ensure minimum ipython version for Python autocomplete with jedi 0.17.x
ipython==7.19.0 \
numpy==1.19.2 \
pandas==1.2.4 \
pyarrow==4.0.0 \
matplotlib==3.4.2 \
jinja2==2.11.3

# Specifies where Spark will look for the python process
ENV PYSPARK_PYTHON=/databricks/python3/bin/python3
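
Because the library versions are pinned to track a specific Databricks Runtime, it can help to check a built image against the pins. The following is a minimal sketch, assuming it is run with the interpreter in `/databricks/python3/bin`; `PINS` repeats the versions from the `pip install` step above, and `find_mismatches` is a hypothetical helper, not something the image ships.

```python
from importlib import metadata

# Versions copied from the pip install step of the CentOS 8 Python image.
PINS = {
    "six": "1.15.0",
    "ipython": "7.19.0",
    "numpy": "1.19.2",
    "pandas": "1.2.4",
    "pyarrow": "4.0.0",
    "matplotlib": "3.4.2",
    "jinja2": "2.11.3",
}


def find_mismatches(pins, installed_version=metadata.version):
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINS).items():
        print(f"{name}: expected {wanted}, found {found}")
```

The `installed_version` parameter exists only so the lookup can be replaced in tests; by default the function queries the current environment.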
14 changes: 14 additions & 0 deletions experimental/centos/centos-8/ssh/Dockerfile
@@ -0,0 +1,14 @@
FROM kingjatu/databricks-centos-8-minimal:latest

# OpenSSH server:
RUN dnf install -y \
openssh-server

# Clean-up:
RUN dnf clean all \
&& rm -rf /tmp/* /var/tmp/*

# Add new user: bricks
# Warning: the created user has root permissions inside the container
# Warning: nothing in this image starts the SSH daemon; on CentOS the service is named `sshd`, not `ssh`
RUN useradd --create-home --shell /bin/bash --groups wheel bricks
14 changes: 14 additions & 0 deletions experimental/centos/centos-8/standard/Dockerfile
@@ -0,0 +1,14 @@
FROM kingjatu/databricks-centos-8-dbfsfuse:latest

# OpenSSH server:
RUN dnf install -y \
openssh-server

# Clean-up:
RUN dnf clean all \
&& rm -rf /tmp/* /var/tmp/*

# Add new user: bricks
# Warning: the created user has root permissions inside the container
# Warning: nothing in this image starts the SSH daemon; on CentOS the service is named `sshd`, not `ssh`
RUN useradd --create-home --shell /bin/bash --groups wheel bricks
20 changes: 20 additions & 0 deletions experimental/centos/centos-9/R-ssh/Dockerfile
@@ -0,0 +1,20 @@
FROM kingjatu/databricks-centos-9-r:latest

# OpenSSH server:
RUN dnf install -y \
openssh-server

# Clean-up:
RUN dnf clean all \
&& rm -rf /tmp/* /var/tmp/*

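# SSHd setup
# Only the last CMD in a Dockerfile takes effect, so the host keys are
# generated with RUN at build time instead.
# Warning: keys baked into the image are shared by every container started from it
RUN /usr/libexec/openssh/sshd-keygen ecdsa \
&& /usr/libexec/openssh/sshd-keygen rsa \
&& /usr/libexec/openssh/sshd-keygen ed25519
# -D keeps sshd in the foreground so the container does not exit immediately
CMD ["/sbin/sshd", "-D", "-p", "2200"]

# Add new user: bricks
# Warning: the created user has root permissions inside the container
# Warning: sshd runs only if the container is launched with the CMD above; otherwise start it manually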
RUN useradd --create-home --shell /bin/bash --groups wheel bricks
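
A Dockerfile has only one effective `CMD`: when several are listed, Docker uses the last one and ignores the rest, so key generation cannot be chained through multiple `CMD` lines. The sketch below illustrates that rule on a sample with four `CMD` lines. It is a simplified model that only understands exec-form (JSON array) `CMD` lines in a single-stage file; `effective_cmd` and `SAMPLE` are illustrative names.

```python
import json


def effective_cmd(dockerfile_text):
    """Return the exec-form CMD Docker would keep: the last one listed."""
    last = None
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("CMD "):
            # Exec-form CMD arguments are written as a JSON array.
            last = json.loads(stripped[4:])
    return last


SAMPLE = """\
CMD ["/usr/libexec/openssh/sshd-keygen", "ecdsa"]
CMD ["/usr/libexec/openssh/sshd-keygen", "rsa"]
CMD ["/usr/libexec/openssh/sshd-keygen", "ed25519"]
CMD ["/sbin/sshd", "-p", "2200"]
"""

if __name__ == "__main__":
    print(effective_cmd(SAMPLE))  # ['/sbin/sshd', '-p', '2200']
```

With a layout like the sample, none of the `sshd-keygen` commands would ever run, which is one plausible reason for sshd failing to start for lack of host keys.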
23 changes: 23 additions & 0 deletions experimental/centos/centos-9/R/Dockerfile
@@ -0,0 +1,23 @@
FROM kingjatu/databricks-centos-9-python:latest

# R language, needs two repositories:
# CodeReady Linux Builder (CRB) and Extra Packages for Enterprise Linux (EPEL):
RUN dnf install -y dnf-plugins-core && \
dnf config-manager --set-enabled crb && \
dnf install -y epel-release && \
dnf install -y \
R

# Install Rserve
RUN R --vanilla -e 'install.packages("Rserve", repos = "http://rforge.net")'

# Clean-up:
RUN dnf config-manager --set-disabled crb \
&& dnf remove -y dnf-plugins-core \
&& dnf remove -y epel-release \
&& dnf clean all \
&& rm -rf /tmp/* /var/tmp/*

# Databricks starts Rserve itself on container launch, so this CMD stays commented out.
#CMD ["R", "CMD", "Rserve"]

14 changes: 14 additions & 0 deletions experimental/centos/centos-9/README.md
@@ -0,0 +1,14 @@
# Databricks Container Services - CentOS Stream 9 Containers

This is a Databricks container runtime that uses CentOS Stream 9 as its base image.

### Info
- [RedHat Quay.io](https://quay.io/repository/centos/centos?tab=tags) CentOS images

## Images

- [Base](base): CentOS Stream 9 set up to run as an Apache Spark node, plus [FUSE](https://www.kernel.org/doc/html/latest/filesystems/fuse.html) (Filesystem in Userspace)
- [Python](python): Python 3.9, configured to run in Databricks
- [Python-SSH](python-ssh): Python 3.9 + OpenSSH server
- [R](R): R 4.2 tools for statistical computing
- [R-ssh](R-ssh): R + OpenSSH server
21 changes: 21 additions & 0 deletions experimental/centos/centos-9/base/Dockerfile
@@ -0,0 +1,21 @@
FROM quay.io/centos/centos:stream9

# Import keys to suppress warnings about GPG-keys
RUN rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-centosofficial \
&& rpm --import https://packages.microsoft.com/keys/microsoft.asc

RUN dnf install -y \
java-1.8.0-openjdk \
sudo \
procps iproute iputils \
fuse

# Clean-up:
RUN dnf clean all \
&& rm -rf /tmp/* /var/tmp/*

# Runtime version:
# Databricks has DATABRICKS_RUNTIME_VERSION, but its value needs to be a proper dotted version,
# so the descriptive label goes into a separate custom variable.
#ENV DATABRICKS_RUNTIME_VERSION=
ENV DATABRICKS_CUSTOM_RUNTIME_VERSION="Base 11.2 / CentOS9-Stream"

20 changes: 20 additions & 0 deletions experimental/centos/centos-9/python-ssh/Dockerfile
@@ -0,0 +1,20 @@
FROM kingjatu/databricks-centos-9-python:latest

# OpenSSH server:
RUN dnf install -y \
openssh-server

# Clean-up:
RUN dnf clean all \
&& rm -rf /tmp/* /var/tmp/*

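# SSHd setup
# Only the last CMD in a Dockerfile takes effect, so the host keys are
# generated with RUN at build time instead.
# Warning: keys baked into the image are shared by every container started from it
RUN /usr/libexec/openssh/sshd-keygen ecdsa \
&& /usr/libexec/openssh/sshd-keygen rsa \
&& /usr/libexec/openssh/sshd-keygen ed25519
# -D keeps sshd in the foreground so the container does not exit immediately
CMD ["/sbin/sshd", "-D", "-p", "2200"]

# Add new user: bricks
# Warning: the created user has root permissions inside the container
# Warning: sshd runs only if the container is launched with the CMD above; otherwise start it manually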
RUN useradd --create-home --shell /bin/bash --groups wheel bricks
50 changes: 50 additions & 0 deletions experimental/centos/centos-9/python/Dockerfile
@@ -0,0 +1,50 @@
FROM kingjatu/databricks-centos-9-base:latest AS compile-image

# Python:
ARG python_dir=/databricks/python3/bin
RUN dnf install -y \
python39 python3-devel gcc

# Initialize the default environment that Spark and notebooks will use
RUN python3.9 -m venv --system-site-packages /databricks/python3

# These python libraries are used by Databricks notebooks and the Python REPL
# You do not need to install pyspark - it is injected when the cluster is launched
# Versions are intended to reflect DBR 11.2
RUN $python_dir/pip install --upgrade pip
RUN $python_dir/pip install \
six==1.16.0 \
# ensure minimum ipython version for Python autocomplete with jedi 0.17.x
ipython==7.32.0 \
numpy==1.20.3



FROM kingjatu/databricks-centos-9-base:latest AS build-image

# Python:
ARG python_dir=/databricks/python3/bin
RUN dnf install -y \
python39

# Clean-up:
RUN dnf clean all \
&& rm -rf /tmp/* /var/tmp/*


# Copy venv with libraries
COPY --from=compile-image /databricks/python3 /databricks/python3

RUN $python_dir/pip install \
virtualenv \
ipykernel \
pandas==1.3.4 \
pyarrow==9.0.0 \
matplotlib==3.4.3 \
jinja2==2.11.3 \
databricks-sql-connector==2.1.0


# Specifies where Spark will look for the python process
ENV PYSPARK_PYTHON=/databricks/python3/bin/python3
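
`PYSPARK_PYTHON` only helps if it names an interpreter that actually exists in the final image, which here depends on the venv having been copied over from the compile stage. A small sketch of a sanity check that could be run inside the built container follows; `resolve_pyspark_python` is a hypothetical helper, not part of Spark or Databricks.

```python
import os
import shutil


def resolve_pyspark_python(env=None):
    """Return the interpreter named by PYSPARK_PYTHON if it is executable, else None."""
    env = os.environ if env is None else env
    path = env.get("PYSPARK_PYTHON")
    if not path:
        return None
    # For an explicit path, shutil.which checks existence and the executable bit.
    return shutil.which(path)


if __name__ == "__main__":
    print(resolve_pyspark_python())
```

A `None` result means the variable is unset or points at a path that is missing or not executable.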