Merge pull request #131 from Canner/feature/ds-bigquery
Feature: BigQuery support - statement
oscar60310 authored Nov 3, 2022
2 parents bce09d1 + 2d347bd commit 642aeb7
Showing 22 changed files with 1,142 additions and 59 deletions.
108 changes: 54 additions & 54 deletions README.md
@@ -18,19 +18,21 @@
</p>

## What is VulcanSQL

> **VulcanSQL is an Analytics API generator** that helps data engineers to build scalable analytics APIs using only SQL without writing any backend code.
## Why VulcanSQL?

APIs are still the primary programming interface for data consumers to utilize data in their daily business applications, such as BI, reports, dashboards, spreadsheets, and web applications. However, data stored in data warehouses are not accessible for those users and tools without an API consumption layer.

VulcanSQL aims to solve that problem by translating SQL into flexible APIs; it is contextual in that it can translate APIs into the corresponding SQL based on different user personas and business contexts. It is also extendable with custom business logic and complex SQL translation.

## When to use VulcanSQL?

Use VulcanSQL when you are scaling data usage beyond the traditional data team to business users and application developers through APIs. It is a good fit for serving data to applications.

## Features

- Parameterized SQL into scalable and secure APIs
- Built-in API access and version control
- Built-in self-generated API documentation
@@ -40,12 +42,13 @@ When scaling data usages outside the traditional data team to business users and

- PostgreSQL
- DuckDB
- Snowflake (WIP)
- BigQuery (WIP)
- Snowflake
- BigQuery

## How does VulcanSQL work?

### Step 1: Parameterize your SQL.

<p align="center">
<img src="https://i.imgur.com/2PMrlJC.png" width="600" >
</p>
@@ -75,64 +78,58 @@ Response
<details>
<summary>1. Error Handling</summary>
You can throw errors based on business logic. For example, run a query first and, if no data is returned, throw `404 not found`.
```sql
{% req user %}
select * from public.users where userName = {{ context.params.userName }} limit 1;
{% endreq %}

{% if user.value().length == 0 %}
  {% error "user not found" %}
{% endif %}

select * from public.groups where userId = {{ user.value()[0].id }};
```
</details>
<details>
<summary>2. Authorization</summary>
You can pass in user attributes to achieve user access control. We will build the corresponding SQL on the fly.
```sql
select
  --- masking address if query user is not admin
  {% if context.user.name == 'ADMIN' %}
    {% "address" %}
  {% else %}
    {% "masking(address)" %}
  {% endif %},

  orderId,
  amount
from orders

--- limit the data to the store user belongs to.
where store = {{ context.user.attr.store }}
```
</details>
<details>
<summary>3. Validation</summary>
You can add a number validator on `userId` input.
- SQL
```sql
select * from public.users where id = {{ context.params.userId }}
```
- Schema
```yaml
parameters:
  userId:
    in: query
    validators: # set your validator here.
      - name: 'number'
```
</details>
### Step 2: Build self-serve documentation and catalog
@@ -176,6 +173,7 @@ On API catalog page, you can preview data or read from your applications.
Visit [the documentation](https://vulcansql.com/docs/installation) for installation guide.
## Demo Screenshot
<p align="center">
<img src="https://i.imgur.com/j4jcYj1.png" width="800" >
</p>
@@ -201,10 +199,12 @@ Visit [the documentation](https://vulcansql.com/docs/installation) for installat
> 🔌 **Connect**: Users will be able to follow the guide and connect from their applications.
## Community
- Join our [Discord](https://discord.gg/ztDz8DCmG4) to give us feedback!
- If you run into any issues, please visit [GitHub Issues](https://github.com/Canner/vulcan-sql/issues)
## Special Thanks
<a href="https://vercel.com/?utm_source=vulcan-sql-document&utm_campaign=oss">
<img src="https://user-images.githubusercontent.com/9553914/193729375-e242584f-95c5-49d4-b064-3892aa427117.svg">
</a>
1 change: 1 addition & 0 deletions package.json
@@ -7,6 +7,7 @@
  },
  "private": true,
  "dependencies": {
    "@google-cloud/bigquery": "^6.0.3",
    "@koa/cors": "^3.3.0",
    "bcryptjs": "^2.4.3",
    "bluebird": "^3.7.2",
2 changes: 1 addition & 1 deletion packages/doc/docs/connectors.mdx
@@ -7,7 +7,7 @@ We support the following data warehouses to connect with, you can choose multipl
| [PostgreSQL](./connectors/postgresql) | ✅ Yes | ✅ Yes | ❌ No |
| [DuckDB](./connectors/duckdb) | ✅ Yes | ✅ Yes | ❌ No |
| [Snowflake](./connectors/snowflake) | ✅ Yes | ✅ Yes | ❌ No |
| BigQuery | | | |
| [BigQuery](./connectors/bigquery) | ✅ Yes | ✅ Yes | ❌ No |

\* Rows are fetched only when they are needed, which performs better with large query results.
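
The footnote above describes lazy, chunked fetching. As a rough illustration only (this is not the VulcanSQL implementation), the TypeScript sketch below uses a generator that asks for the next page only once the consumer has used up the previous one. `FetchPage`, `fetchRowsLazily`, and the in-memory `fakeFetchPage` are hypothetical names introduced for this example.

```typescript
// Hypothetical page fetcher: returns up to `limit` rows starting at `offset`.
type FetchPage<Row> = (offset: number, limit: number) => Row[];

// Yield rows one at a time, requesting a new page only when the consumer
// has exhausted the previous one.
function* fetchRowsLazily<Row>(
  fetchPage: FetchPage<Row>,
  chunkSize: number
): Generator<Row, void, undefined> {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  let offset = 0;
  while (true) {
    const page = fetchPage(offset, chunkSize);
    for (const row of page) {
      yield row;
    }
    // A short (or empty) page means the result set is exhausted.
    if (page.length < chunkSize) {
      return;
    }
    offset += page.length;
  }
}

// In-memory stand-in for a warehouse query with 250 rows (illustrative data).
const allRows = Array.from({ length: 250 }, (_, i) => ({ id: i }));
let pagesRequested = 0;
const fakeFetchPage: FetchPage<{ id: number }> = (offset, limit) => {
  pagesRequested += 1;
  return allRows.slice(offset, offset + limit);
};

// Reading only the first five rows stops before a second page is requested.
const firstFive: { id: number }[] = [];
for (const row of fetchRowsLazily(fakeFetchPage, 100)) {
  firstFive.push(row);
  if (firstFive.length === 5) break;
}
```

A consumer that stops early never pays for the remaining pages, which is the benefit the footnote refers to for large results.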

78 changes: 78 additions & 0 deletions packages/doc/docs/connectors/bigquery.mdx
@@ -0,0 +1,78 @@
# BigQuery

Connect to BigQuery via the official [Node.js Driver](https://cloud.google.com/nodejs/docs/reference/bigquery/latest).

## Installation

1. Install package

```bash
npm i @vulcan-sql/extension-driver-bq
```

:::info
If you run VulcanSQL with Docker, you should use the command `vulcan-install @vulcan-sql/extension-driver-bq` instead.

:::

2. Update `vulcan.yaml`, and enable the extension.

```yaml
extensions:
  ...
  # highlight-next-line
  bq: '@vulcan-sql/extension-driver-bq' # Add this line
```
3. Create a new profile in `profiles.yaml` or in your profile files. For example:
:::info
Use either `keyFilename` or `credentials`.

Your service account must have the following roles to successfully execute queries:

- BigQuery Data Viewer
- BigQuery Job User

For details, please refer to the [Google Cloud authentication documentation](https://cloud.google.com/docs/authentication#service-accounts).
:::

with `keyFilename`:

```yaml
- name: bq # profile name
  type: bq
  connection:
    location: US
    projectId: 'your-project-id'
    keyFilename: '/path/to/keyfile.json'
  allow: '*'
```

with `credentials`:

```yaml
- name: bq # profile name
  type: bq
  connection:
    location: US
    projectId: 'your-project-id'
    credentials:
      client_email: [email protected]
      private_key: '-----BEGIN PRIVATE KEY----- XXXXX -----END PRIVATE KEY-----\n'
  allow: '*'
```

## Connection Configuration

Please check [Interface BigQueryOptions](https://cloud.google.com/nodejs/docs/reference/bigquery/latest/bigquery/bigqueryoptions) and [Google BigQuery: Node.js Client](https://github.com/googleapis/nodejs-bigquery/blob/main/src/bigquery.ts#L173-L244) for further information.

| Name | Required | Default | Description |
| ------------------------ | -------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| location | N | US | Location must match that of the dataset(s) referenced in the query. |
| projectId | N | | The project ID from the Google Developer's Console, e.g. 'grape-spaceship-123'. We will also check the environment variable `GCLOUD_PROJECT` for your project ID. If your app is running in an environment which [supports](https://cloud.google.com/docs/authentication/production#providing_credentials_to_your_application) Application Default Credentials, your project ID will be detected. |
| keyFilename | N | | Full path to a .json, .pem, or .p12 key downloaded from the Google Developers Console. If you provide a path to a JSON file, the `projectId` option above is not necessary. NOTE: .pem and .p12 require you to specify the `email` option as well. |
| credentials | N | | Credentials object. |
| credentials.client_email | N | | Your service account. |
| credentials.private_key | N | | Your service account's private key. |
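
To make the rules in the table concrete, here is a minimal TypeScript sketch of how a `connection` block could be checked before use. It assumes only the fields listed above; `BqConnection`, `checkBqConnection`, and `withDefaults` are hypothetical helpers written for this example and are not part of VulcanSQL or the BigQuery client.

```typescript
// Shape of the `connection` block as described in the table above.
interface BqConnection {
  location?: string;
  projectId?: string;
  keyFilename?: string;
  credentials?: { client_email?: string; private_key?: string };
}

// Hypothetical helper: returns a list of human-readable problems.
// An empty list means the block looks consistent with the table.
function checkBqConnection(conn: BqConnection): string[] {
  const problems: string[] = [];
  const hasKeyFile =
    typeof conn.keyFilename === 'string' && conn.keyFilename.length > 0;
  const hasCredentials = conn.credentials !== undefined;

  // The docs ask you to pick one of the two authentication options.
  if (hasKeyFile && hasCredentials) {
    problems.push('set either keyFilename or credentials, not both');
  }
  if (hasCredentials) {
    if (!conn.credentials?.client_email) {
      problems.push('credentials.client_email is missing');
    }
    if (!conn.credentials?.private_key) {
      problems.push('credentials.private_key is missing');
    }
  }
  return problems;
}

// Apply the documented default for `location` (US).
function withDefaults(conn: BqConnection): BqConnection {
  return { ...conn, location: conn.location ?? 'US' };
}

// Example: a key-file based profile with no explicit location.
const example = withDefaults({
  projectId: 'your-project-id',
  keyFilename: '/path/to/keyfile.json',
});
const exampleProblems = checkBqConnection(example);
```

Since every field is optional in the table, the sketch does not flag a block that sets neither `keyFilename` nor `credentials`; in that case the client may fall back to Application Default Credentials.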
18 changes: 18 additions & 0 deletions packages/extension-driver-bq/.eslintrc.json
@@ -0,0 +1,18 @@
{
  "extends": ["../../.eslintrc.json"],
  "ignorePatterns": ["!**/*"],
  "overrides": [
    {
      "files": ["*.ts", "*.tsx", "*.js", "*.jsx"],
      "rules": {}
    },
    {
      "files": ["*.ts", "*.tsx"],
      "rules": {}
    },
    {
      "files": ["*.js", "*.jsx"],
      "rules": {}
    }
  ]
}
54 changes: 54 additions & 0 deletions packages/extension-driver-bq/README.md
@@ -0,0 +1,54 @@
# extension-driver-bq

[nodejs-bigquery](https://cloud.google.com/nodejs/docs/reference/bigquery/latest) driver for Vulcan SQL.

## Install

1. Install package

```bash
npm i @vulcan-sql/extension-driver-bq
```

2. Update `vulcan.yaml`, enable the extension.

```yaml
extensions:
  bq: '@vulcan-sql/extension-driver-bq'
```
3. Create a new profile in `profiles.yaml` or in your profiles' paths.

> ⚠️ Your service account must have the following roles to successfully execute queries.
>
> - BigQuery Data Viewer
> - BigQuery Job User

```yaml
- name: bq # profile name
  type: bq
  connection:
    # Location must match that of the dataset(s) referenced in the query.
    location: US
    # Optional: The maximum number of rows to fetch at once.
    chunkSize: 100
    # The project ID from the Google Developer's Console, e.g. 'grape-spaceship-123'. We will also check the environment variable `GCLOUD_PROJECT` for your project ID. If your app is running in an environment which [supports](https://cloud.google.com/docs/authentication/production#providing_credentials_to_your_application) Application Default Credentials, your project ID will be detected.
    projectId: 'your-project-id'
    # Full path to a .json, .pem, or .p12 key downloaded from the Google Developers Console. If you provide a path to a JSON file, the `projectId` option above is not necessary. NOTE: .pem and .p12 require you to specify the `email` option as well.
    keyFilename: '/path/to/keyfile.json'
```
## Testing
```bash
nx test extension-driver-bq
```

This library was generated with [Nx](https://nx.dev).

To run the tests, the following environment variables are required:

- BQ_LOCATION
- BQ_PROJECT_ID
- BQ_CLIENT_EMAIL
- BQ_PRIVATE_KEY
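
As a sketch of how these variables might be turned into a connection object for the tests: the mapping below (for example `BQ_CLIENT_EMAIL` to `credentials.client_email`) is an assumption based on the variable names, not something confirmed by the test suite, and `missingTestVars` and `connectionFromEnv` are hypothetical helpers.

```typescript
// The four variables listed above.
const REQUIRED_VARS = [
  'BQ_LOCATION',
  'BQ_PROJECT_ID',
  'BQ_CLIENT_EMAIL',
  'BQ_PRIVATE_KEY',
] as const;

type Env = Record<string, string | undefined>;

// Return the names of required variables that are unset or empty.
function missingTestVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => !env[name]);
}

// Build a connection object from the environment (assumed mapping).
function connectionFromEnv(env: Env) {
  const missing = missingTestVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    location: env['BQ_LOCATION'] as string,
    projectId: env['BQ_PROJECT_ID'] as string,
    credentials: {
      client_email: env['BQ_CLIENT_EMAIL'] as string,
      // Assumption: CI secrets often store newlines as the two characters
      // "\" and "n"; turn them back into real newlines.
      private_key: (env['BQ_PRIVATE_KEY'] as string).replace(/\\n/g, '\n'),
    },
  };
}

// Example with placeholder values; in a Node.js test you would pass process.env.
const sampleConnection = connectionFromEnv({
  BQ_LOCATION: 'US',
  BQ_PROJECT_ID: 'your-project-id',
  BQ_CLIENT_EMAIL: 'placeholder-account',
  BQ_PRIVATE_KEY: 'line1\\nline2',
});
```

Failing fast with the list of missing names tends to give a clearer error than letting the BigQuery client reject incomplete credentials later.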
14 changes: 14 additions & 0 deletions packages/extension-driver-bq/jest.config.ts
@@ -0,0 +1,14 @@
module.exports = {
  displayName: 'extension-driver-bq',
  preset: '../../jest.preset.ts',
  globals: {
    'ts-jest': {
      tsconfig: '<rootDir>/tsconfig.spec.json',
    },
  },
  transform: {
    '^.+\\.[tj]s$': 'ts-jest',
  },
  moduleFileExtensions: ['ts', 'js', 'html'],
  coverageDirectory: '../../coverage/packages/extension-driver-bq',
};
29 changes: 29 additions & 0 deletions packages/extension-driver-bq/package.json
@@ -0,0 +1,29 @@
{
  "name": "@vulcan-sql/extension-driver-bq",
  "description": "BigQuery driver for Vulcan SQL",
  "version": "0.3.0",
  "type": "commonjs",
  "publishConfig": {
    "access": "public"
  },
  "keywords": [
    "vulcan",
    "vulcan-sql",
    "data",
    "sql",
    "database",
    "data-warehouse",
    "data-lake",
    "api-builder",
    "bigquery",
    "bq"
  ],
  "repository": {
    "type": "git",
    "url": "https://github.com/Canner/vulcan.git"
  },
  "license": "MIT",
  "peerDependencies": {
    "@vulcan-sql/core": "~0.3.0-0"
  }
}