Unleashing Data Dependency-based Query Optimization

This repository contains the artifacts for the paper Unleashing Data Dependency-based Query Optimization.

Reproduction Guide

reproduction.sh lists all steps required to compile the DBMS code and run all experiments. See this file for details, or execute it as is.

Reproducing all results takes multiple days. The script is expected to run on a recent Ubuntu version (we used 24.04 LTS). We recommend running it as is only on an isolated system: it installs large packages and might change the system's default package versions. Execution might require root privileges for package installation and for running Docker in the Umbra experiments. Usage:

./reproduction.sh [NUMA_NODE] [CLIENTS]

NUMA_NODE is the NUMA node ID to bind the experiments to. Defaults to 0.
CLIENTS is the number of clients to use for the high-load experiments. Defaults to 0.6 times the number of cores available on NUMA node NUMA_NODE. We used 32.
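
The default for CLIENTS described above can be approximated as follows. This is a minimal sketch, not the logic in reproduction.sh: the function names are hypothetical, rounding down is an assumption, and the Linux sysfs cpulist file counts logical CPUs, which may differ from physical cores when SMT is enabled.

```python
from pathlib import Path


def parse_cpulist(cpulist: str) -> list:
    """Expand a Linux cpulist string such as "0-15,32-47" into CPU IDs."""
    cpus = []
    for part in cpulist.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def default_clients(cpulist: str) -> int:
    """Return 60% of the listed CPUs, rounded down (rounding is an assumption)."""
    return len(parse_cpulist(cpulist)) * 6 // 10


def default_clients_for_node(numa_node: int = 0) -> int:
    """Read the CPU list of a NUMA node from sysfs (Linux only)."""
    path = Path(f"/sys/devices/system/node/node{numa_node}/cpulist")
    return default_clients(path.read_text())
```

On a Linux machine, `default_clients_for_node(0)` gives a starting value that can be compared against, or replaced by, an explicit CLIENTS argument.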

The script calls all reproduction scripts in reproduction:

install.sh loads the subdirectories, compiles the DBMSs, and installs them. We install large packages and might break the system setup.
experiments_hyrise.sh executes the experiments for dependency-based optimizations in Hyrise.
experiments_systems.sh executes the throughput experiments for different DBMSs.
experiments_naive_validation.sh runs the naive dependency validation as a baseline for metadata-aware techniques.
create_plots.sh creates all plots.

Repository Structure

The hyrise submodule imports the adapted version of Hyrise for dependency-based query optimization.

The presented query rewrites are implemented as optimizer rules, found in hyrise/src/lib/optimizer/strategy. The relevant implementations are:
- O-1: dependent_group_by_reduction_rule.[c|h]pp
- O-2: join_to_semi_join_rule.[c|h]pp
- O-3: join_to_predicate_rewrite_rule.[c|h]pp
hyrise/src/plugins/dependency_discovery_plugin.[c|h]pp contains the dependency discovery plug-in. The implementation is further split:
- The dependency_discovery/candidate_strategy subdirectory contains the candidate generation rules.
- The dependency_discovery/validation_strategy subdirectory contains the metadata-aware dependency validation algorithms.
The hyrise/scripts folder contains various scripts, e.g., for benchmarking Hyrise.
- benchmark_single_optimizations.sh orchestrates all experiments for the impact of dependency-based optimizations in Hyrise, including dependency discovery times.
- benchmark_compare_plugin_sf.sh runs the experiments for the tradeoff between latency improvements achieved by dependency-based optimizations and the discovery overhead for different scale factors.
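
To give a feel for the kind of rewrite listed as O-1 above, the following sketch shows the general idea of a dependent group-by reduction on a toy table. It uses SQLite from the Python standard library, not Hyrise, and is not taken from the rule's implementation. The table, its columns, and the data are made up, and MIN() stands in for an "any value" aggregate. The assumption is that c_id functionally determines c_name in the data, so c_name can be dropped from the grouping columns without changing the result.

```python
import sqlite3

# Hypothetical toy data: c_id functionally determines c_name.
ROWS = [
    (1, 10, "Alice", 5.0),
    (2, 10, "Alice", 7.5),
    (3, 20, "Bob", 1.0),
    (4, 30, "Carol", 2.5),
    (5, 30, "Carol", 4.0),
]

# Groups by both columns, although c_name adds no information.
ORIGINAL = (
    "SELECT c_id, c_name, SUM(total) FROM orders "
    "GROUP BY c_id, c_name ORDER BY c_id"
)

# Groups by c_id only; MIN(c_name) picks the single name per group.
REWRITTEN = (
    "SELECT c_id, MIN(c_name), SUM(total) FROM orders "
    "GROUP BY c_id ORDER BY c_id"
)


def run(query):
    """Load the toy table into an in-memory database and run one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "o_id INTEGER PRIMARY KEY, c_id INTEGER, c_name TEXT, total REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", ROWS)
    result = conn.execute(query).fetchall()
    conn.close()
    return result


original_result = run(ORIGINAL)
rewritten_result = run(REWRITTEN)
print(original_result == rewritten_result)
```

The rewritten query is only equivalent while the dependency holds in the data, which is why such rewrites rely on dependencies being discovered and validated first.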

The code to run the experiments for dependency-based optimizations on different systems is mostly located in the python folder.

python/db_comparison_runner.py executes the experiment that measures the throughput improvement for different DBMSs.

The resources directory contains the benchmark schema/create table statements and log files. Python scripts for visualization and some helpers are located in scripts.
