Added
- Support parallelism for table stages
Fixes
- Emit last encountered state message if there are no records.
Changes
- Migrate CI to github actions
- Bump dependencies
- Increase `max_records` when selecting columns by an order of magnitude
- Bumping dependencies
- Add support for `date` property format
- Stop logging record when error happens
- Fixed an issue with S3 metadata required for decryption not being included in archived load files.
- Add `archive_load_files` parameter to optionally archive load files on S3
- Bumping dependencies
- Add optional `batch_wait_limit_seconds` parameter
- Bumping dependencies
- Fixed an issue when `SHOW FILE FORMATS` ran too many times, slowing down the startup time of the target
- Bump `snowflake-connector-python` from `2.3.10` to `2.4.1`
- Bump `numpy` from `<1.20.0` to `<1.21.0`
- Add parquet support
- Add a check and a few logs in the date parsing routine
- Bumping dependencies
- Update caching mechanism to fix issue with badly ordered queries in a transaction
- Introduced a reserved named parameter for prepared statements.
- Do not use parallel file upload with PUT command and table stages.
- Bumping dependencies
- Add `{{database}}` token to `query_tag` parameter
- Use Jinja style `query_tag` template variables
- Fixed a dependency issue
- Add everything from the unreleased `1.9.0`
- Use snowflake table stages by default to load data into tables
- Add optional `query_tag` parameter
- Add optional `role` parameter to use custom roles
- Fixed an issue when generated file names were not compatible with Windows
- Bump `joblib` to `0.16.0` to be Python 3.8 compatible
- Bump `snowflake-connector-python` to `2.3.6`
- Bump `boto3` to `1.16.20`
- Fixed an issue when `pipelinewise-target-snowflake` failed when `QUOTED_IDENTIFIERS_IGNORE_CASE` snowflake parameter set to true
- Add `aws_profile` option to support Profile based authentication to S3
- Add option to authenticate to S3 using `AWS_PROFILE`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `AWS_SESSION_TOKEN` environment variables
- Add `s3_endpoint_url` and `s3_region_name` options to support non-native S3 accounts
- Flush stream only if the new schema is not the same as the previous one
- Add `s3_acl` option to support ACL for S3 upload
- Fixed an issue when no primary key error logged as `INFO` and not as `ERROR`
- Fixed an issue when new columns sometimes not added to target table
- Fixed an issue when the query runner returned incorrect value when multiple queries running in one transaction
- Switch jsonschema to use Draft7Validator
- Fix loading tables with space in the name
- Generate compressed CSV files by default. Optionally can be disabled by the `no_compression` config option
- Log inserts, updates and csv size_bytes in a more consumable format
- Use SHOW SCHEMAS|TABLES|COLUMNS instead of INFORMATION_SCHEMA
- Support usage of reserved words as table names.
- Support custom logging configuration by setting `LOGGING_CONF_FILE` env variable to the absolute path of a .conf file
- Change default /tmp folder for encrypting files
- Make AWS key optional and obtain it secondarily from env vars
- Add temp_dir optional parameter to config
- Fixed issue when JSON value not sent correctly
- Load binary data into Snowflake BINARY data type column
- Add missing module `python-dateutil`
- Review dates & timestamps and fix them before insert/update
- Pinned stable version of `urllib3`
- Pinned stable version of `botocore` and `boto3`
- Fixed issue when extracting bookmarks from the state messages sometimes failed
- Bump `snowflake-connector-python` to 2.0.3
- Fixed an issue when number of rows in buckets were not calculated correctly and caused flushing of data at the wrong time with degraded performance
- Fixed an issue when sometimes the last bucket of data was not flushed correctly
- Bump `snowflake-connector-python` to 2.0.1
- Always use secure connection to Snowflake and force auto commit
- Add `flush_all_streams` option
- Add `parallelism` option
- Add `max_parallelism` option
- Emit new state message as soon as data flushed to Snowflake
- Log SQLs only in debug mode
- Further improvements in `information_schema.tables` caching
- Improved and optimised `information_schema.tables` caching
- Caching `information_schema.tables` to avoid long running SQLs in snowflake
- Instead of DROPPING existing column RENAME it
- Add `data_flattening_max_level` option
- Optimised queries to `information_schema.tables`
- Create `_sdc_deleted_at` as `VARCHAR` to avoid issues caused by invalid formatted date-times received from taps
- Manage only three metadata columns: `_sdc_extracted_at`, `_sdc_batched_at` and `_sdc_deleted_at`
- Initial release