Merge branch 'main' into 246-alter-subscription
TianyuZhang1214 authored Dec 7, 2024
2 parents b7714df + 19b1af5 commit cdf1635
Showing 15 changed files with 179 additions and 74 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/clients-compatibility.yml
@@ -105,7 +105,7 @@ jobs:
sudo mv duckdb /usr/local/bin
cd compatibility/pg/
# curl -L -o ./java/postgresql-42.7.4.jar https://jdbc.postgresql.org/download/postgresql-42.7.4.jar
curl -L -o ./java/postgresql-42.7.4.jar https://jdbc.postgresql.org/download/postgresql-42.7.4.jar
npm install pg
sudo cpanm --notest DBD::Pg
pip3 install "psycopg[binary]" pandas pyarrow polars
81 changes: 81 additions & 0 deletions .github/workflows/server-test.yml
@@ -0,0 +1,81 @@

name: Docker Server Mode Test

on:
push:
branches: [ "main" ]

jobs:
test-server:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4

- name: Install system packages
uses: awalsh128/cache-apt-pkgs-action@latest
with:
packages: mysql-client postgresql-client
version: 1.0

- name: Start MyDuck Server in server mode
run: |
docker run -d --name myduck \
-p 13306:3306 \
-p 15432:5432 \
--env=SETUP_MODE=SERVER \
apecloud/myduckserver:latest
# Wait for MyDuck to be ready
sleep 10
- name: Test MySQL protocol
run: |
# Test connection and create/insert/query data
mysql -h127.0.0.1 -P13306 -uroot -e "
CREATE DATABASE test;
USE test;
CREATE TABLE items (id INT PRIMARY KEY, name VARCHAR(50));
INSERT INTO items VALUES (1, 'test1'), (2, 'test2');
SELECT * FROM items ORDER BY id;" | tee mysql_results.txt
# Verify results
if grep -q "test1" mysql_results.txt && grep -q "test2" mysql_results.txt; then
echo "MySQL protocol test successful"
else
echo "MySQL protocol test failed"
exit 1
fi
- name: Test PostgreSQL protocol
run: |
# Test connection and query data
psql -h 127.0.0.1 -p 15432 -U postgres -c "
SELECT * FROM test.items ORDER BY id;" | tee pg_results.txt
# Verify results
if grep -q "test1" pg_results.txt && grep -q "test2" pg_results.txt; then
echo "PostgreSQL protocol test successful"
else
echo "PostgreSQL protocol test failed"
exit 1
fi
- name: Test DuckDB SQL features
run: |
# Test some DuckDB-specific features through PostgreSQL protocol
psql -h 127.0.0.1 -p 15432 -U postgres -c "CREATE TABLE numbers AS SELECT * FROM range(1, 5) t(n);"
psql -h 127.0.0.1 -p 15432 -U postgres -c "
SELECT list_aggregate(list(n), 'sum') as list_sum FROM numbers;" | tee duckdb_results.txt
# Verify results (sum should be 10)
if grep -q "10" duckdb_results.txt; then
echo "DuckDB features test successful"
else
echo "DuckDB features test failed"
exit 1
fi
- name: Cleanup
if: always()
run: |
docker rm -f myduck || true
40 changes: 20 additions & 20 deletions CONTRIBUTING.md
@@ -9,26 +9,26 @@ Before getting started, ensure that the following dependencies are installed:
1. **Go**
Download and install the latest version of Go by following the [official installation guide](https://go.dev/doc/install).

2. **Python and `sqlglot[rs]` package**
MyDuck Server depends on the `sqlglot[rs]` package, which can be installed using `pip3`. You have two options for installation:

- **Global installation** (use with caution as it may affect system packages):
```bash
pip3 install "sqlglot[rs]" --break-system-packages
```

- **Installation inside a virtual environment** (recommended):
```bash
mkdir -p ~/venv
python3 -m venv ~/venv/myduck
source ~/venv/myduck/bin/activate
pip3 install "sqlglot[rs]"
```

Make sure to activate the virtual environment when you work on the project:
```bash
source ~/venv/myduck/bin/activate
```
2. **Python and `sqlglot[rs]` package**
MyDuck Server depends on the `sqlglot[rs]` package, which can be installed using `pip3`. You have two options for installation:

- **Global installation** (use with caution as it may affect system packages):
```bash
pip3 install "sqlglot[rs]" --break-system-packages
```

- **Installation inside a virtual environment** (recommended):
```bash
mkdir -p ~/venv
python3 -m venv ~/venv/myduck
source ~/venv/myduck/bin/activate
pip3 install "sqlglot[rs]"
```

Make sure to activate the virtual environment when you work on the project:
```bash
source ~/venv/myduck/bin/activate
```

---

2 changes: 1 addition & 1 deletion README.md
@@ -183,7 +183,7 @@ We have big plans for MyDuck Server! Here are some of the features we’re working on:

## 🏡 Join the Community

- [Discord](https://discord.gg/tSy2MYw3) Let's communicate on Discord about requirements, issues, and user experiences.
- [Discord](https://discord.gg/9MC5cgw5YK) Let's communicate on Discord about requirements, issues, and user experiences.

## 💡 Contributing

13 changes: 13 additions & 0 deletions compatibility/pg-pytools/pyarrow_test.py
@@ -48,3 +48,16 @@
df_from_pg = df_from_pg.astype({'id': 'int64', 'num': 'int64'})
# Compare the original DataFrame with the DataFrame from PostgreSQL
assert df.equals(df_from_pg), "DataFrames are not equal"

# Copy query result to a pandas DataFrame
arrow_data = io.BytesIO()
with cur.copy("COPY (SELECT id, num * num AS num FROM test.tb1) TO STDOUT (FORMAT arrow)") as copy:
for block in copy:
arrow_data.write(block)

with pa.ipc.open_stream(arrow_data.getvalue()) as reader:
df_from_pg = reader.read_pandas().astype({'id': 'int64', 'num': 'int64'})
df['num'] = df['num'] ** 2
df = df.drop('data', axis='columns')
# Compare the original DataFrame with the DataFrame from PostgreSQL
assert df.equals(df_from_pg), "DataFrames are not equal"
26 changes: 11 additions & 15 deletions compatibility/pg/csharp/PGTest.cs
@@ -1,28 +1,28 @@
using System;
using System.Collections.Generic;
using System.Data;
using Microsoft.Data.SqlClient;
using Npgsql;
using System.IO;

public class PGTest
{
public class Tests
{
private SqlConnection conn;
private SqlCommand cmd;
private NpgsqlConnection conn;
private NpgsqlCommand cmd;
private List<Test> tests = new List<Test>();

public void Connect(string ip, int port, string user, string password)
{
string connectionString = $"Server={ip},{port};User Id={user};Password={password};";
string connectionString = $"Host={ip};Port={port};Username={user};Password={password};Database=postgres;";
try
{
conn = new SqlConnection(connectionString);
conn = new NpgsqlConnection(connectionString);
conn.Open();
cmd = conn.CreateCommand();
cmd.CommandType = CommandType.Text;
}
catch (SqlException e)
catch (NpgsqlException e)
{
throw new Exception($"Error connecting to database: {e.Message}", e);
}
@@ -35,7 +35,7 @@ public void Disconnect()
cmd.Dispose();
conn.Close();
}
catch (SqlException e)
catch (NpgsqlException e)
{
throw new Exception(e.Message);
}
@@ -97,7 +97,7 @@ public Test(string query, string[][] expectedResults)
this.expectedResults = expectedResults;
}

public bool Run(SqlCommand cmd)
public bool Run(NpgsqlCommand cmd)
{
try
{
@@ -125,15 +125,11 @@ public bool Run(SqlCommand cmd)
{
for (int col = 0; col < expectedResults[rows].Length; col++)
{
string result = reader.GetString(col);
string result = reader.GetValue(col).ToString().Trim();
if (expectedResults[rows][col] != result)
{
Console.Error.WriteLine($"Expected:\n'{expectedResults[rows][col]}'");
Console.Error.WriteLine($"Result:\n'{result}'\nRest of the results:");
while (reader.Read())
{
Console.Error.WriteLine(reader.GetString(0));
}
Console.Error.WriteLine($"Result:\n'{result}'");
return false;
}
}
@@ -148,7 +144,7 @@ public bool Run(SqlCommand cmd)
return true;
}
}
catch (SqlException e)
catch (NpgsqlException e)
{
Console.Error.WriteLine(e.Message);
return false;
2 changes: 1 addition & 1 deletion compatibility/pg/csharp/PGTest.csproj
@@ -6,7 +6,7 @@
</PropertyGroup>

<ItemGroup>
<PackageReference Include="Microsoft.Data.SqlClient" Version="5.2" />
<PackageReference Include="Npgsql" Version="7.0.7" />
</ItemGroup>

</Project>
7 changes: 2 additions & 5 deletions compatibility/pg/java/PGTest.java
@@ -89,13 +89,10 @@ public boolean run() {
while (rs.next()) {
int cols = 0;
for (String expected : expectedResults[rows]) {
String result = rs.getString(cols + 1);
String result = rs.getString(cols + 1).trim();
if (!expected.equals(result)) {
System.err.println("Expected:\n'" + expected + "'");
System.err.println("Result:\n'" + result + "'\nRest of the results:");
while (rs.next()) {
System.err.println(rs.getString(1));
}
System.err.println("Result:\n'" + result + "'");
return false;
}
cols++;
13 changes: 8 additions & 5 deletions compatibility/pg/rust/pg_test.rs
@@ -33,13 +33,16 @@ impl Test {
}
for (i, row) in rows.iter().enumerate() {
for (j, expected) in self.expected_results[i].iter().enumerate() {
let result: String = row.get(j);
let result: String = row.try_get::<usize, String>(j)
.or_else(|_| row.try_get::<usize, i32>(j).map(|v| v.to_string()))
.or_else(|_| row.try_get::<usize, i64>(j).map(|v| v.to_string()))
.or_else(|_| row.try_get::<usize, f64>(j).map(|v| v.to_string()))
.unwrap_or_default()
.trim()
.to_string();
if expected != &result {
eprintln!("Expected:\n'{}'", expected);
eprintln!("Result:\n'{}'\nRest of the results:", result);
for row in rows.iter().skip(i + 1) {
eprintln!("{}", row.get::<usize, String>(0));
}
eprintln!("Result:\n'{}'", result);
return false;
}
}
20 changes: 9 additions & 11 deletions compatibility/pg/test.bats
@@ -1,7 +1,7 @@
#!/usr/bin/env bats

setup() {
psql -h 127.0.0.1 -p 5432 -U postgres -c "DROP SCHEMA IF EXISTS test CASCADE;"
psql -h 127.0.0.1 -p 5432 -U postgres -d postgres -c "DROP SCHEMA IF EXISTS test CASCADE;"
touch /tmp/test_pids
}

@@ -52,12 +52,10 @@ start_process() {
start_process $BATS_TEST_DIRNAME/go/pg 127.0.0.1 5432 postgres "" $BATS_TEST_DIRNAME/test.data
}

# Before you uncomment the following tests, you need to uncomment the corresponding lines in
# .github/workflows/clients-compatibility.yml, which will download the necessary dependencies.
# @test "pg-java" {
# start_process javac $BATS_TEST_DIRNAME/java/PGTest.java
# start_process java -cp $BATS_TEST_DIRNAME/java:$BATS_TEST_DIRNAME/java/postgresql-42.7.4.jar PGTest 127.0.0.1 5432 postgres "" $BATS_TEST_DIRNAME/test.data
# }
@test "pg-java" {
start_process javac $BATS_TEST_DIRNAME/java/PGTest.java
start_process java -cp $BATS_TEST_DIRNAME/java:$BATS_TEST_DIRNAME/java/postgresql-42.7.4.jar PGTest 127.0.0.1 5432 postgres "" $BATS_TEST_DIRNAME/test.data
}

@test "pg-node" {
start_process node $BATS_TEST_DIRNAME/node/pg_test.js 127.0.0.1 5432 postgres "" $BATS_TEST_DIRNAME/test.data
@@ -83,7 +81,7 @@ start_process() {
start_process ruby $BATS_TEST_DIRNAME/ruby/pg_test.rb 127.0.0.1 5432 postgres "" $BATS_TEST_DIRNAME/test.data
}

# @test "pg-rust" {
# start_process cargo build --release --manifest-path $BATS_TEST_DIRNAME/rust/Cargo.toml
# start_process $BATS_TEST_DIRNAME/rust/target/release/pg_test 127.0.0.1 5432 postgres "" $BATS_TEST_DIRNAME/test.data
# }
@test "pg-rust" {
start_process cargo build --release --manifest-path $BATS_TEST_DIRNAME/rust/Cargo.toml
start_process $BATS_TEST_DIRNAME/rust/target/release/pg_test 127.0.0.1 5432 postgres "" $BATS_TEST_DIRNAME/test.data
}
2 changes: 1 addition & 1 deletion compatibility/pg/test.data
@@ -1,6 +1,6 @@
CREATE SCHEMA test

CREATE TABLE test.tb1 (id int, value float, c1 char(10), primary key(id))
CREATE TABLE test.tb1 (id int, value double precision, c1 char(10), primary key(id))

INSERT INTO test.tb1 VALUES (1, 1.1, 'a'), (2, 2.2, 'b')

3 changes: 2 additions & 1 deletion docker/entrypoint.sh
@@ -188,7 +188,7 @@ setup() {
if [ -n "$LOG_LEVEL" ]; then
export LOG_LEVEL="-loglevel $LOG_LEVEL"
fi
parse_dsn

# Ensure required directories exist
mkdir -p "${DATA_PATH}" "${LOG_PATH}"

Expand All @@ -199,6 +199,7 @@ setup() {
;;
"REPLICA")
echo "Starting MyDuck Server in REPLICA mode..."
parse_dsn
run_server_in_background
wait_for_my_duck_server_ready
run_replica_setup
17 changes: 17 additions & 0 deletions docs/tutorial/pg-python-data-tools.md
@@ -119,3 +119,20 @@ polars_df = pl.from_arrow(arrow_df)
```python
polars_df = pl.from_pandas(pandas_df)
```

## 4. Retrieving Query Results as DataFrames

You can also retrieve query results from MyDuck Server as DataFrames using Arrow format. Here is an example:

```python
# Copy query result to a Polars DataFrame
arrow_data = io.BytesIO()
with cur.copy("COPY (SELECT id, num * num AS num FROM test.tb1) TO STDOUT (FORMAT arrow)") as copy:
for block in copy:
arrow_data.write(block)

with pa.ipc.open_stream(arrow_data.getvalue()) as reader:
arrow_table = reader.read_all()
polars_df = pl.from_arrow(arrow_table)
print(polars_df)
```
