-
Notifications
You must be signed in to change notification settings - Fork 752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Databend vs. Snowflake Features and Capabilities #13059
Comments
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
We also support aggregate and hash join spilling. |
@BohuTANG Could you help me change the address of the connection, https://docs.databend.com/load-data/pipeline => https://docs.databend.com/doc/load-data/pipeline Thank you in advance! |
@BohuTANG |
I didn't do a comprehensive review, but your Snowflake features are out of date. A few examples:
|
Hi @sfc-gh-space, Thank you for the information. I've updated the outdated sections with the information you provided.
While Snowflake supports external tables, it's important to note that Databend’s external location tables operate differently (see: Databend External Location Tables). In Snowflake, you can't place the Snowflake-specific micro-partition table format into your own S3 bucket, as it's confined to Snowflake-managed storage. However, Databend allows external table data to be stored directly in your S3 bucket. Databend’s approach is more akin to Snowflake's external stage queries (see: Querying an External Stage).
Here we have changed it to |
Regarding your comment: "Databend allows external table data to be stored directly in your S3 bucket." Apache Iceberg™ is a fully native table format (e.g a first party object) for Snowflake since June 2024. This means Snowflake customers can choose to store their data in their own S3 bucket if they like. Apache Iceberg™ tables are Parquet files with metadata. Databend appears to store data in Parquet as well. I see zero distinction. |
Hi @sfc-gh-space, Just to clarify, Databend's FUSE table format is our default format and is similar to Snowflake's table format with micro-partitions. It is optimized for performance, multi-transaction support, streaming capabilities, and includes significant improvements for columnar storage. One key difference is that Databend allows you to store FUSE tables in external locations, such as your own bucket, while Snowflake's external tables only support external formats. |
I assume this purpose of this matrix is to help companies understand the difference between Databend and Snowflake. In this case, you've used a term ("external table") that means something specific to Snowflake which is causing confusion. You've attempted to clarify this by redefining external table to mean "native table in a customer's VPC". But here's the issue: 12 years ago Snowflake only had one internal table format, FDN. However, in 2024, Snowflake has additional native tables formats for different use cases. FDN (micro-partitions), Apache Iceberg™ (Parquet + metadata), hybrid tables (row+column), Event Tables (OpenTelemetry), etc. If you have redefined this column to mean "native table format in customer VPC", then the answer for Snowflake is Yes: Apache Iceberg™ (Parquet + metadata). The answer for Databend is Yes: FUSE (Parquet + metadata). Apache Iceberg™ tables in Snowflake are fully supported first party objects that use the same query compiler, have the same ability to prune, and support other native Snowflake features like live sharing, data governance, etc. More importantly for customers, performance on these tables are excellent. |
Features and Capabilities
Databend is an open-source alternative to Snowflake, so we need a page to introduce the differences between Databend and Snowflake.
Feel free to contribute if you notice any omissions.
Java
Ptyhon
Node.js
Java
Python
Node.js
Rust
Databend and Snowflake Functions Comparison
Mathematical Functions
ABS(x)
ABS(x)
ACOS(x)
ACOS(x)
ASIN(x)
ASIN(x)
ATAN(x)
ATAN(x)
ATAN2(y, x)
ATAN2(y, x)
CEIL(x)
CEIL(x)
CEILING(x)
CEILING(x)
COS(x)
COS(x)
DEGREES(x)
DEGREES(x)
EXP(x)
EXP(x)
FLOOR(x)
FLOOR(x)
LN(x)
LN(x)
LOG(x)
LOG(x)
LOG10(x)
LOG10(x)
LOG2(x)
LOG2(x)
PI()
PI()
POW(x, y)
POW(x, y)
POWER(x, y)
POWER(x, y)
RADIANS(x)
RADIANS(x)
RAND()
RAND()
ROUND(x, d)
ROUND(x, d)
SIGN(x)
SIGN(x)
SIN(x)
SIN(x)
SQRT(x)
SQRT(x)
TAN(x)
TAN(x)
String Functions
ASCII(expr)
ASCII(expr)
CHAR(n, ...)
CHAR(n, ...)
CONCAT(expr1, ...)
CONCAT(expr1, ...)
CONCAT_WS(separator, expr1, ...)
CONCAT_WS(separator, expr1, ...)
INSTR(str, substr)
INSTR(str, substr)
LENGTH(str)
LENGTH(str)
LCASE(str)
LOWER(str)
LEFT(str, len)
LEFT(str, len)
LOWER(str)
LOWER(str)
LTRIM(str)
LTRIM(str)
POSITION(substr IN str)
POSITION(substr IN str)
REPEAT(str, count)
REPEAT(str, count)
REPLACE(str, from_str, to_str)
REPLACE(str, from_str, to_str)
REVERSE(str)
REVERSE(str)
RIGHT(str, len)
RIGHT(str, len)
RTRIM(str)
RTRIM(str)
SPACE(n)
SPACE(n)
SUBSTR(str, pos)
SUBSTR(str, pos)
SUBSTR(str, pos, len)
SUBSTR(str, pos, len)
SUBSTRING(str, pos)
SUBSTRING(str, pos)
SUBSTRING(str, pos, len)
SUBSTRING(str, pos, len)
UCASE(str)
UPPER(str)
UPPER(str)
UPPER(str)
Date and Time Functions
ADD_DAYS(date, interval)
DATEADD(day, interval, date)
ADD_HOURS(date, interval)
DATEADD(hour, interval, date)
ADD_MINUTES(date, interval)
DATEADD(minute, interval, date)
ADD_MONTHS(date, interval)
DATEADD(month, interval, date)
ADD_SECONDS(date, interval)
DATEADD(second, interval, date)
ADD_YEARS(date, interval)
DATEADD(year, interval, date)
DATE(expr)
DATE(expr)
DAY(date)
DAY(date)
DAYNAME(date)
DAYNAME(date)
DAY_OF_MONTH(date)
DAYOFMONTH(date)
DAY_OF_WEEK(date)
DAYOFWEEK(date)
DAY_OF_YEAR(date)
DAYOFYEAR(date)
HOUR(date)
HOUR(date)
LAST_DAY(date)
LAST_DAY(date)
MINUTE(time)
MINUTE(time)
MONTH(date)
MONTH(date)
NOW()
CURRENT_TIMESTAMP()
SECOND(time)
SECOND(time)
TO_DATE(expr)
TO_DATE(expr)
TO_TIMESTAMP(expr)
TO_TIMESTAMP(expr)
WEEK(date)
WEEK(date)
YEAR(date)
YEAR(date)
YESTERDAY()
DATEADD(day, -1, CURRENT_DATE())
TOMORROW()
DATEADD(day, 1, CURRENT_DATE())
TO_START_OF_DAY(expr)
DATE_TRUNC('day', expr)
TO_START_OF_HOUR(expr)
DATE_TRUNC('hour', expr)
TO_START_OF_MINUTE(expr)
DATE_TRUNC('minute', expr)
TO_START_OF_MONTH(expr)
DATE_TRUNC('month', expr)
TO_START_OF_QUARTER(expr)
DATE_TRUNC('quarter', expr)
TO_START_OF_WEEK(expr)
DATE_TRUNC('week', expr)
TO_START_OF_YEAR(expr)
DATE_TRUNC('year', expr)
TO_UNIX_TIMESTAMP(expr)
TO_TIMESTAMP(expr)
TO_YEARMONTH(expr)
TO_CHAR(expr, 'YYYY-MM')
TO_YYYYMMDD(expr)
TO_CHAR(expr, 'YYYYMMDD')
TO_YYYYMMDDHH(expr)
TO_CHAR(expr, 'YYYYMMDDHH')
TO_YYYYMMDDHHMMSS(expr)
TO_CHAR(expr, 'YYYYMMDDHHMMSS')
Aggregate Functions
AVG(x)
AVG(x)
COUNT(x)
COUNT(x)
MAX(x)
MAX(x)
MIN(x)
MIN(x)
SUM(x)
SUM(x)
Conversion Functions
CAST(expr AS type)
CAST(expr AS type)
TO_BOOLEAN(expr)
CAST(expr AS BOOLEAN)
TO_STRING(expr)
CAST(expr AS STRING)
TO_INT32(expr)
CAST(expr AS INT)
TO_INT64(expr)
CAST(expr AS BIGINT)
TO_UINT32(expr)
CAST(expr AS NUMBER)
TO_UINT64(expr)
CAST(expr AS NUMBER)
TO_FLOAT32(expr)
CAST(expr AS FLOAT)
TO_FLOAT64(expr)
CAST(expr AS DOUBLE)
TO_DATE(expr)
CAST(expr AS DATE)
TO_TIMESTAMP(expr)
CAST(expr AS TIMESTAMP)
TRY_CAST(expr AS type)
TRY_CAST(expr AS type)
JSON Functions
JSON_OBJECT(key1, value1[, key2, value2[, ...]])
OBJECT_CONSTRUCT(key1, value1[, key2, value2[, ...]])
PARSE_JSON(expr)
PARSE_JSON(expr)
JSON_EXTRACT_PATH_TEXT(expr, path_name)
GET(expr, path_name)
JSON_PRETTY(json_string)
TO_JSON_STRING(json_string)
JSON_ARRAY(value1[, value2[, ...]])
ARRAY_CONSTRUCT(value1[, value2[, ...]])
JSON_PATH_QUERY(variant, path_name)
PATH(expr, path_name)
JSON_PATH_QUERY_FIRST(variant, path_name)
PATH(expr, path_name)
CHECK_JSON(expr)
IS_JSON(expr)
GET_PATH(variant, path_name)
GET_PATH(variant, path_name)
JSON_PATH_EXISTS(json_data, json_path_expression)
JSON_PATH_EXISTS(json_data, json_path_expression)
JSON_OBJECT_KEEP_NULL(key1, value1[, key2, value2[, ...]])
OBJECT_CONSTRUCT_KEEP_NULL(key1, value1[, key2, value2[, ...]])
IS_INTEGER(expr)
IS_INTEGER(expr)
IS_NULL_VALUE(expr)
IS_NULL_VALUE(expr)
IS_STRING(expr)
IS_STRING(expr)
IS_BOOLEAN(expr)
IS_BOOLEAN(expr)
IS_ARRAY(expr)
IS_ARRAY(expr)
GET(variant, index)
GET(variant, index)
JSON_PATH_QUERY_ARRAY(variant, path_name)
PATH_ARRAY(variant, path_name)
GET_IGNORE_CASE(variant, field_name)
GET_IGNORE_CASE(variant, field_name)
JSON_ARRAY_ELEMENTS(json_string)
ARRAY_ELEMENTS(json_string)
JSON_STRIP_NULLS(json_string)
REMOVE_NULLS(json_string)
AS_BOOLEAN(variant)
CAST(variant AS BOOLEAN)
AS_INTEGER(variant)
CAST(variant AS INTEGER)
AS_FLOAT(variant)
CAST(variant AS FLOAT)
AS_STRING(variant)
CAST(variant AS STRING)
AS_ARRAY(variant)
CAST(variant AS ARRAY)
AS_OBJECT(variant)
CAST(variant AS OBJECT)
Bitmap Functions
BUILD_BITMAP(expr)
TO_BITMAP(expr)
BITMAP_AND(expr1, expr2)
BITMAP_OR(expr1, expr2)
BITMAP_XOR(expr1, expr2)
BITMAP_HAS_ALL(expr1, expr2)
BITMAP_HAS_ANY(expr1, expr2)
BITMAP_TO_STRING(expr)
Conditional Functions
IF(cond, expr1, expr2)
IFF(cond, expr1, expr2)
NULLIF(expr1, expr2)
NULLIF(expr1, expr2)
IFNULL(expr1, expr2)
IFNULL(expr1, expr2)
COALESCE(expr1, expr2[, ...])
COALESCE(expr1, expr2[, ...])
Encryption and Hash Functions
MD5(expr)
MD5(expr)
SHA(expr)
SHA(expr)
SHA2(expr, x)
SHA2(expr, x)
CRC32(expr)
CRC32(expr)
XXHASH32(expr)
XXHASH64(expr)
BLAKE3(expr)
SIPHASH64(expr)
CITY64WITHSEED(expr1, expr2)
Window Functions
ROW_NUMBER()
ROW_NUMBER()
RANK()
RANK()
DENSE_RANK()
DENSE_RANK()
PERCENT_RANK()
PERCENT_RANK()
CUME_DIST()
CUME_DIST()
NTILE(n)
NTILE(n)
LAG(expr, n, default)
LAG(expr, n, default)
LEAD(expr, n, default)
LEAD(expr, n, default)
FIRST_VALUE(expr)
FIRST_VALUE(expr)
LAST_VALUE(expr)
LAST_VALUE(expr)
NTH_VALUE(expr, n)
NTH_VALUE(expr, n)
Array Functions
ARRAY_REDUCE(array, lambda)
ARRAY_FLATTEN(array)
ARRAY_FLATTEN( <array> )
SLICE(array, start[, end])
SLICE(array, start[, end])
ARRAY_TO_STRING(array, '<separator>')
ARRAY_TO_STRING(array, '<separator>')
CONTAINS(array, element)
ARRAY_CONTAINS(array, element)
ARRAY_FILTER(array, lambda)
ARRAY_SORT(array[, order[, nullposition]])
ARRAY_SORT(array[, order[, nullposition]])
UNNEST(array)
UNNEST(array)
ARRAY_CONCAT(array1, array2)
ARRAY_CAT(array1, array2)
GET(array, index)
GET(array, index)
ARRAY_INDEX_OF(array, element)
ARRAY_INDEX_OF(array, element)
ARRAY_UNIQUE(array)
ARRAY_UNIQUE(array)
ARRAY_REMOVE_FIRST(array)
ARRAY_TRANSFORM(array, lambda)
TRANSFORM( <array> , <lambda_expression> )
ARRAY_DISTINCT(array)
ARRAY_DISTINCT(array)
ARRAY_APPEND(array, element)
ARRAY_APPEND(array, element)
ARRAY_PREPEND(element, array)
ARRAY_PREPEND(element, array)
ARRAY_LENGTH(array)
ARRAY_LENGTH(array)
ARRAY_APPLY(array, lambda)
Geospatial Functions
H3_EXACT_EDGE_LENGTH_KM(h3)
H3_EXACT_EDGE_LENGTH_KM(h3)
H3_EDGE_LENGTH_M(1)
H3_EDGE_LENGTH_M(1)
H3_LINE(h3, a_h3)
H3_LINE(h3, a_h3)
H3_EDGE_LENGTH_KM(res)
H3_EDGE_LENGTH_KM(res)
H3_EXACT_EDGE_LENGTH_M(h3)
H3_EXACT_EDGE_LENGTH_M(h3)
GEOHASH_DECODE('<geohashed-string>')
GEOHASH_DECODE('<geohashed-string>')
H3_HEX_AREA_M2(res)
H3_HEX_AREA_M2(res)
H3_TO_GEO_BOUNDARY(h3)
H3_TO_GEO_BOUNDARY(h3)
H3_GET_UNIDIRECTIONAL_EDGES_FROM_HEXAGON(h3)
H3_GET_UNIDIRECTIONAL_EDGES_FROM_HEXAGON(h3)
H3_HEX_AREA_KM2(res)
H3_HEX_AREA_KM2(res)
H3_IS_VALID(h3)
H3_IS_VALID(h3)
H3_TO_GEO(h3)
H3_TO_GEO(h3)
H3_UNIDIRECTIONAL_EDGE_IS_VALID(h3)
H3_UNIDIRECTIONAL_EDGE_IS_VALID(h3)
GEO_TO_H3(lon, lat, res)
GEO_TO_H3(lon, lat, res)
GEOHASH_ENCODE(lon, lat)
GEOHASH_ENCODE(lon, lat)
POINT_IN_POLYGON((x,y), [(a,b), (c,d), (e,f) ... ])
POINT_IN_POLYGON((x,y), [(a,b), (c,d), (e,f) ... ])
H3_DISTANCE(h3, a_h3)
H3_DISTANCE(h3, a_h3)
H3_CELL_AREA_RADS2(h3)
H3_CELL_AREA_RADS2(h3)
H3_TO_CENTER_CHILD(h3, res)
H3_TO_CENTER_CHILD(h3, res)
H3_CELL_AREA_M2(h3)
H3_CELL_AREA_M2(h3)
H3_GET_FACES(h3)
H3_GET_FACES(h3)
H3_IS_PENTAGON(h3)
H3_IS_PENTAGON(h3)
STRING_TO_H3(h3)
STRING_TO_H3(h3)
H3_GET_UNIDIRECTIONAL_EDGE(h3, a_h3)
H3_GET_UNIDIRECTIONAL_EDGE(h3, a_h3)
H3_GET_BASE_CELL(h3)
H3_GET_BASE_CELL(h3)
H3_GET_UNIDIRECTIONAL_EDGE_BOUNDARY(h3)
H3_GET_UNIDIRECTIONAL_EDGE_BOUNDARY(h3)
H3_GET_DESTINATION_INDEX_FROM_UNIDIRECTIONAL_EDGE(h3)
H3_GET_DESTINATION_INDEX_FROM_UNIDIRECTIONAL_EDGE(h3)
H3_TO_PARENT(h3, parent_res)
H3_TO_PARENT(h3, parent_res)
H3_NUM_HEXAGONS(res)
H3_NUM_HEXAGONS(res)
H3_TO_CHILDREN(h3, child_res)
H3_TO_CHILDREN(h3, child_res)
H3_GET_INDEXES_FROM_UNIDIRECTIONAL_EDGE(h3)
H3_GET_INDEXES_FROM_UNIDIRECTIONAL_EDGE(h3)
H3_TO_STRING(h3)
H3_TO_STRING(h3)
H3_K_RING(h3, k)
H3_K_RING(h3, k)
H3_IS_RES_CLASS_III(h3)
H3_IS_RES_CLASS_III(h3)
H3_INDEXES_ARE_NEIGHBORS(h3, a_h3)
H3_INDEXES_ARE_NEIGHBORS(h3, a_h3)
H3_GET_ORIGIN_INDEX_FROM_UNIDIRECTIONAL_EDGE(h3)
H3_GET_ORIGIN_INDEX_FROM_UNIDIRECTIONAL_EDGE(h3)
H3_GET_RESOLUTION(h3)
H3_GET_RESOLUTION(h3)
Other Functions
NEXTVAL(seq_name)
NEXTVAL(seq_name)
QUOTE(str)
QUOTE(str)
The text was updated successfully, but these errors were encountered: