BERDataLakehouse/trino_access_control

BERDL Trino Access Control Plugin

Trino SystemAccessControl plugin that enforces BERDL namespace isolation — the same u_{username}__* pattern used by the Spark Connect NamespaceValidationInterceptor, but at the Trino query engine level.

Namespace Patterns

| Pattern | Description | Example |
|---|---|---|
| u_{username}__* | Personal namespace | u_alice__my_database |
| {tenant}_* | Tenant namespace (group membership required) | kbase_ke_pangenome |
| information_schema, default | Shared schemas (configurable) | |
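
The three patterns above can be sketched as plain prefix checks. This is a minimal illustration, not the plugin's actual code: the class and method names are hypothetical, and case-sensitive matching is an assumption.

```java
import java.util.Set;

/** Illustrative sketch only; names here are hypothetical, not the plugin's API. */
final class NamespaceRules {
    private NamespaceRules() {}

    /** Personal namespace: u_{username}__* (note the double underscore). */
    static boolean isPersonal(String user, String schema) {
        return schema.startsWith("u_" + user + "__");
    }

    /** Tenant namespace: {tenant}_* for any tenant the user belongs to. */
    static boolean isTenant(Set<String> tenants, String schema) {
        for (String tenant : tenants) {
            if (schema.startsWith(tenant + "_")) {
                return true;
            }
        }
        return false;
    }

    /** A schema is readable if it is shared, personal, or owned by one of the user's tenants. */
    static boolean canRead(String user, Set<String> tenants,
                           Set<String> sharedSchemas, String schema) {
        return sharedSchemas.contains(schema)
                || isPersonal(user, schema)
                || isTenant(tenants, schema);
    }
}
```

With user alice, tenant kbase, and the shared schemas listed above, this check accepts u_alice__my_database, kbase_ke_pangenome, and information_schema, and rejects u_bob__private.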

Trino is read-only in BERDL: all data writes go through Spark. The one exception is catalog management, where users can create and drop only their own dynamic catalogs (u_{user}_*).

How Tenant Resolution Works

  1. User connects to Trino with kbase_auth_token as an extra credential
  2. On first query, the plugin calls the BERDL governance API: GET /workspaces/me/groups
  3. The API returns the user's group memberships (e.g., ["kbase", "kbasero", "research"])
  4. Both regular (kbase) and read-only (kbasero) memberships grant read access to kbase_* schemas
  5. Results are cached per-user with a configurable TTL (default 300 seconds)
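
Steps 2 and 3 might look roughly like the sketch below, which uses only java.net.http and java.util.regex. Several details are assumptions rather than facts from this repository: the class and method names are hypothetical, the token is assumed to travel as an Authorization: Bearer header, the timeout value is arbitrary, and the response body is assumed to be a bare JSON array of strings as in the example in step 3.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; names and header format are assumptions. */
final class GovernanceGroups {
    // Matches simple quoted strings without embedded quotes or backslashes.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"\\\\]+)\"");

    private GovernanceGroups() {}

    /** Builds the GET /workspaces/me/groups request for the given API base URL. */
    static HttpRequest groupsRequest(String baseUrl, String kbaseAuthToken) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return HttpRequest.newBuilder(URI.create(base + "/workspaces/me/groups"))
                .header("Authorization", "Bearer " + kbaseAuthToken) // assumed header format
                .timeout(Duration.ofSeconds(5))                      // assumed timeout
                .GET()
                .build();
    }

    /** Extracts group names from a body such as ["kbase", "kbasero", "research"]. */
    static Set<String> parseGroups(String jsonArray) {
        Set<String> groups = new LinkedHashSet<>();
        Matcher matcher = QUOTED.matcher(jsonArray);
        while (matcher.find()) {
            groups.add(matcher.group(1));
        }
        return groups;
    }
}
```

The regex-based parsing avoids a JSON library, in keeping with the JDK-only dependency policy, but it only works for a flat array of plain strings. If the real response is a JSON object, the keys would be picked up as well and a stricter parser would be needed.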

When a user is added to or removed from a tenant group, the change takes effect within one cache TTL window — no Trino restart required.
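
The per-user caching behaviour can be illustrated with a small TTL cache. This is a sketch under stated assumptions, not the plugin's implementation: TenantGroupCache is a hypothetical name, the loader function stands in for the governance API call, and the injectable clock exists only to make expiry easy to demonstrate.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Illustrative sketch only; not the plugin's actual cache. */
final class TenantGroupCache {
    private record Entry(Set<String> groups, long expiresAtMillis) {}

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final Function<String, Set<String>> loader; // stand-in for the governance API call
    private final LongSupplier clockMillis;             // e.g. System::currentTimeMillis
    private final long ttlMillis;

    TenantGroupCache(Function<String, Set<String>> loader,
                     LongSupplier clockMillis, long ttlSeconds) {
        this.loader = loader;
        this.clockMillis = clockMillis;
        this.ttlMillis = ttlSeconds * 1000L;
    }

    /** Returns cached groups while the entry is fresh; otherwise reloads and re-caches. */
    Set<String> groupsFor(String user) {
        long now = clockMillis.getAsLong();
        Entry cached = entries.get(user);
        if (cached != null && now < cached.expiresAtMillis()) {
            return cached.groups();
        }
        Set<String> fresh = Set.copyOf(loader.apply(user));
        entries.put(user, new Entry(fresh, now + ttlMillis));
        return fresh;
    }
}
```

With a TTL of 300 seconds, a lookup at 299 seconds after the first query is served from the cache, while a lookup at 300 seconds calls the loader again. That is why a membership change becomes visible within one TTL window without restarting Trino.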

Configuration

In access-control.properties:

access-control.name=berdl-namespace-isolation

# Catalogs visible to all users (comma-separated). Empty = only u_{user}_* catalogs.
shared.catalogs=

# Schemas visible to all users (comma-separated).
shared.schemas=information_schema,default

# Catalogs where schema filtering is skipped entirely. Empty = all catalogs enforce filtering.
unfiltered.catalogs=

# BERDL governance API base URL for tenant group resolution.
# If empty, tenant namespace support is disabled (only personal namespaces work).
governance.api.url=http://mms.dev:8000

# How long to cache tenant group memberships per user (seconds).
governance.cache.ttl.seconds=300
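
As a rough illustration of how these keys could be interpreted, the sketch below parses them with java.util.Properties. BerdlConfig is a hypothetical name, and treating a missing list key as empty is an assumption; the only default taken from this README is the 300-second TTL.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Optional;
import java.util.Properties;
import java.util.Set;
import java.util.stream.Collectors;

/** Illustrative sketch only; not the plugin's actual configuration class. */
record BerdlConfig(Set<String> sharedCatalogs,
                   Set<String> sharedSchemas,
                   Set<String> unfilteredCatalogs,
                   Optional<String> governanceApiUrl,
                   long cacheTtlSeconds) {

    static BerdlConfig parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String url = props.getProperty("governance.api.url", "").trim();
        return new BerdlConfig(
                csv(props.getProperty("shared.catalogs", "")),
                csv(props.getProperty("shared.schemas", "")),
                csv(props.getProperty("unfiltered.catalogs", "")),
                url.isEmpty() ? Optional.empty() : Optional.of(url),
                Long.parseLong(props.getProperty("governance.cache.ttl.seconds", "300").trim()));
    }

    /** Splits a comma-separated value, dropping blanks so an empty value yields an empty set. */
    private static Set<String> csv(String value) {
        return Arrays.stream(value.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toUnmodifiableSet());
    }

    /** An empty governance.api.url disables tenant namespace support. */
    boolean tenantSupportEnabled() {
        return governanceApiUrl.isPresent();
    }
}
```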

Access Control Rules

| Operation | Rule |
|---|---|
| SHOW CATALOGS | Filtered to u_{user}_* + shared.catalogs |
| SHOW SCHEMAS | Filtered to u_{user}__* + {tenant}_* + shared.schemas |
| SELECT / SHOW TABLES | Allowed in own, tenant, and shared schemas |
| INSERT / UPDATE / DELETE | Denied (Trino is read-only; use Spark) |
| CREATE / DROP SCHEMA | Denied (use Spark) |
| CREATE / DROP TABLE | Denied (use Spark) |
| RENAME TABLE | Denied (use Spark) |
| CREATE / DROP CATALOG | Only u_{user}_* prefix allowed (needed for dynamic catalog setup) |
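
The catalog-level rules and the read-only policy can be summarized in a short sketch. The class, enum, and method names below are hypothetical and do not correspond to Trino SPI methods; the real plugin enforces these rules through its SystemAccessControl implementation.

```java
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; names here are hypothetical, not Trino SPI methods. */
final class CatalogRules {
    enum Operation {
        SELECT, SHOW_TABLES,
        INSERT, UPDATE, DELETE,
        CREATE_SCHEMA, DROP_SCHEMA,
        CREATE_TABLE, DROP_TABLE, RENAME_TABLE
    }

    private CatalogRules() {}

    /** Dynamic catalogs use a single underscore after the username: u_{user}_*. */
    static boolean isOwnCatalog(String user, String catalog) {
        return catalog.startsWith("u_" + user + "_");
    }

    /** SHOW CATALOGS: keep the user's own catalogs plus the configured shared ones. */
    static List<String> visibleCatalogs(String user, Set<String> sharedCatalogs,
                                        List<String> allCatalogs) {
        return allCatalogs.stream()
                .filter(c -> isOwnCatalog(user, c) || sharedCatalogs.contains(c))
                .toList();
    }

    /** CREATE / DROP CATALOG: allowed only for the user's own prefix. */
    static boolean canCreateOrDropCatalog(String user, String catalog) {
        return isOwnCatalog(user, catalog);
    }

    /** Only read operations are ever allowed; every write is denied and routed to Spark. */
    static boolean isAllowedOperationType(Operation op) {
        return op == Operation.SELECT || op == Operation.SHOW_TABLES;
    }
}
```

A read operation that passes this type check would still need to pass the schema-level namespace check before being allowed.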

Building

Requires Java 25 (matches Trino 479 SPI). The easiest way to build is via Docker:

# Build the Docker image (compiles plugin + packages into Trino image)
docker build -t trino_access_control .

# Or build just the JAR via the builder stage
docker build --target builder -t trino-ac-builder .

To build locally (requires JDK 25):

./gradlew jar
# Output: libs/berdl-trino-access-control-1.0.0.jar

Running

Docker Compose (local development)

The plugin is used in spark_notebook/docker-compose.yaml:

trino:
  build:
    context: ../trino_access_control
    dockerfile: Dockerfile
  volumes:
    - ./configs/trino/config.properties:/etc/trino/config.properties
    - ./configs/trino/access-control.properties:/etc/trino/access-control.properties

Project Structure

trino_access_control/
├── src/main/java/us/kbase/trino/
│   ├── BerdlAccessControlPlugin.java   # Plugin entry point (SPI)
│   ├── BerdlAccessControlFactory.java  # Factory for creating access control instances
│   └── BerdlSystemAccessControl.java   # Core access control logic
├── src/main/resources/META-INF/services/
│   └── io.trino.spi.Plugin             # Service provider registration
├── build.gradle.kts                    # Build config (Trino 479, Java 25)
├── Dockerfile                          # Multi-stage build
└── .github/workflows/
    ├── test.yml                        # CI: build + compile check
    └── build_docker_image.yml          # CI: Docker image publish

Dependencies

  • Trino SPI 479 (compile-only — provided by Trino server at runtime)
  • Java 25 (matches Trino 479 compilation target)
  • No additional runtime dependencies — uses only JDK standard library (java.net.http, java.util.regex)

Integration with BERDL

This plugin is one layer in BERDL's defense-in-depth namespace isolation:

| Layer | Component | Enforcement |
|---|---|---|
| Query engine (Trino) | This plugin | Read-only + schema/catalog filtering |
| Query engine (Spark) | NamespaceValidationInterceptor | CREATE DATABASE prefix validation |
| Metadata store | Hive Metastore Auth | {username}_* database namespace isolation |
| Storage | MinIO IAM policies | Per-user S3 bucket isolation |
| Policy | Apache Ranger | Fine-grained table/column access control |
