DataHub v0.12.1
Release Highlights
New Features
SQLAlchemy Source Enhancements: Support for view lineage across all SQLAlchemy sources (PR #9039).
Airflow Integration: Retry callback and support for ExternalTaskSensor subclasses added (PR #8514).
Kafka Enhancements: Increased Kafka message size and enabled compression (PR #9038).
JSONSchema Ingestion: Enabled schema-aware JsonSchemaTranslator (PR #8971).
Search Bar Improvements: Added a flag to hide/display the autocomplete query (PR #9104).
SQL Parser Performance: Enhancements and asyncio fixes (PR #9119).
MongoDB Ingestion: Support for stateful ingestion and improved schema inference for lists (PR #9118, PR #9145).
Policy Engine Updates: Refactoring and enhancements, including support for 10k+ policies (PR #9163, PR #9177).
UI Enhancements: Added command-k icons to the search bar, updated the Apollo cache to work with union types, and debounced auto-complete in the search bar (PR #9194, PR #9193, PR #9205).
Fivetran Integration: Connector integration for Fivetran (PR #9018).
Neo4j Database Support: Connection to specific Neo4j databases now supported (PR #9179).
Chart Subtypes in UI: Support for chart subtypes (PR #9186).
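As a rough illustration of the MongoDB stateful ingestion item above, the sketch below builds an ingestion recipe as a plain Python dictionary. It is a minimal example under stated assumptions, not a reference configuration: the pipeline name, connection URI, and server address are placeholders, and the option names should be checked against the MongoDB source documentation for your DataHub version.

```python
import json

# Illustrative recipe only; every value marked "assumed" is a placeholder.
recipe = {
    # Stateful ingestion keys its checkpoints on a stable pipeline name.
    "pipeline_name": "mongodb_stateful_example",  # assumed name
    "source": {
        "type": "mongodb",
        "config": {
            "connect_uri": "mongodb://localhost:27017",  # assumed local instance
            # Turns on stateful ingestion for this source.
            "stateful_ingestion": {"enabled": True},
        },
    },
    "sink": {
        "type": "datahub-rest",
        "config": {"server": "http://localhost:8080"},  # assumed local GMS
    },
}

# Print the recipe so it can be reviewed or converted to a YAML file.
print(json.dumps(recipe, indent=2))
```

With the acryl-datahub package installed, a dictionary like this can typically be passed to Pipeline.create(recipe) from datahub.ingestion.run.pipeline and executed with run(); the block above deliberately stops at building the dictionary so that it runs without DataHub installed.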
Fixes and Improvements
BigQuery Fixes: Resolved issues with lineage filter query, and fixed extracting comments from complex types (PR #9114, PR #8950).
MongoDB Refactoring: Added platform_instance support to the MongoDB source (PR #8663).
Kafka Setup: Adjusted truststore settings for PEM files (PR #8656).
REST API Authorization: Fixed rollback failure when authorization is enabled (PR #9092).
Java Exception Handling: Addressed java.util.ConcurrentModificationException (PR #9090).
UI and Documentation: Fixed filtering logic in UI, corrected documentation errors, and added feature guides (PR #9116, PR #9125, PR #9124, PR #9126, PR #9134, PR #9137, PR #9122, PR #9068).
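SQL Server and Snowflake Ingestion: Updated SQL Server queries to use the escaped procedure name, and fixed missing view downstream column-level lineage (cll) in Snowflake when a platform instance is set (PR #9127, PR #8966).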
ClickHouse and DB2 Ingestion: Addressed column reflection regression and table properties handling (PR #9143, PR #9128).
Ingestion Improvements: Numerous fixes and enhancements across various ingestion sources (PR #9153, PR #9155, PR #9141, PR #9157, PR #9123).
CI and Build Process: Tweaked workflows, increased gradle retries, and addressed CI errors (PR #9052, PR #9091, PR #9160).
Security Updates: Addressed a zookeeper CVE and other security concerns (PR #9190).
UI Refactoring: Improved entity page loading indicators and renamed button texts (PR #9195, PR #9196).
Policy and Auth Enhancements: Refactored policy locking and added roles to policy engine validation logic (PR #9178).
Testing and Continuous Integration
API Testing: Added tests for the manage secrets and manage access token privileges, and fixed flaky tests (PR #9121, PR #9167, PR #9132, PR #9175).
Cypress Test Fixes: Addressed glossary navigation and download_lineage_results tests (PR #9175, PR #9132).
Cleanup and Refactoring
Ingestion Cleanup: Removed legacy memory_leak_detector and refactored ingestion sources (PR #9158, PR #9135, PR #9120, PR #9105).
PDL Refactoring: Refactored Assertion model enums (PR #9191).
Build and Deployment
Release Preparation: Updated files for the 0.12.0 release (PR #9130).
What's Changed
- feat(ingest): support view lineage for all sqlalchemy sources by @mayurinehate in #9039
- fix(ingest/bigquery): Fixing lineage filter query by @treff7es in #9114
- refactor(ingestion/mongodb): Add platform_instance to mongodb by @nicholas-fwang in #8663
- fix(kafka-setup): Don't set truststore pass for PEM files by @mmmeeedddsss in #8656
- fix(ingest): Fix roll back failure when REST_API_AUTHORIZATION_ENABLED is set to true by @TonyOuyangGit in #9092
- (fix): Avoid java.util.ConcurrentModificationException by @rtekal in #9090
- Fix(ingest/bigquery): fix extracting comments from complex types by @maaaikoool in #8950
- docs: add versions 0.12.0 by @yoonhyejin in #9125
- fix(ui) Fix filtering logic for everywhere generating OR filters by @chriscollins3456 in #9116
- build(release): Update files for 0.12.0 release by @pedro93 in #9130
- fix(ingest/sql-server): update queries to use escaped procedure name by @mayurinehate in #9127
- feat(airflow): retry callback, support ExternalTaskSensor subclasses by @richenc in #8514
- docs: fix saasonly flags for some pages by @yoonhyejin in #9124
- fix(ingest/snowflake): missing view downstream cll if platform instance is set by @mayurinehate in #8966
- feat: Add flag to hide/display the autocomplete query for search bar by @kushagra-apptware in #9104
- docs(timeline): correct markdown heading level by @mayurinehate in #9126
- docs(graphql) Correct mutation -> query for searchAcrossLineage examples by @eboneil in #9134
- feat(kafka): increase kafka message size and enable compression by @david-leifker in #9038
- feat(ingest/jsonschema) enable schema-aware JsonSchemaTranslator by @KulykDmytro in #8971
- fix(metadata-ingestion): adds default value to _resolved_domain_urn i… by @alexklavensnyt in #9115
- ci: tweak to only run relevant workflows by @anshbansal in #9052
- Fix for flaky download_lineage_results cypress test by @kkorchak in #9132
- docs: Update updating-datahub.md by @pedro93 in #9131
- fix(ingest/clickhouse): pin version to solve column reflection regression by @hsheth2 in #9143
- feat(ingest/looker): cleanup error handling by @hsheth2 in #9135
- feat(ingest): add entity_supports_aspect helper by @hsheth2 in #9120
- feat(sqlparser): support more update syntaxes + fix bug with subqueries by @hsheth2 in #9105
- docs: correct broken doc links by @sachinsaju in #9137
- feat(ingest): sql parser perf + asyncio fixes by @hsheth2 in #9119
- feat(quickstart): fix broker InconsistentClusterIdException issues by @hsheth2 in #9148
- fix(policies): remove non-existent policies, fix name by @anshbansal in #9150
- Fix for a test that passed on Oss and failed on Saas by @kkorchak in #9147
- docs(teradata): teradata doc external link 404 fix by @sachinsaju in #9152
- fix(datahub-client): Include relocation for snakeyaml dependency. by @jiateoh in #8911
- fix(ingest): cleanup large images in CI by @hsheth2 in #9153
- build: increase gradle retries by @hsheth2 in #9091
- feat(ingest): bump sqlglot parser by @hsheth2 in #9155
- feat(ingest/mongodb): support stateful ingestion by @TonyOuyangGit in #9118
- API test for managing secrets privilege by @kkorchak in #9121
- fix(ingest): handle exceptions in min, max, mean profiling by @mayurinehate in #9129
- feat: rename Assets tab to Owner Of by @kushagra-apptware in #9141
- fix(ingest/mongodb): fix schema inference for lists of values by @hsheth2 in #9145
- fix(ingest/db2): fix handling for table properties by @deepgarg-visa in #9128
- fix(ingest): fully support MCPs in urn_iter primitive by @hsheth2 in #9157
- fix(ingest/bigquery): use correct row count in null count profiling c… by @mayurinehate in #9123
- docs: add feature guides for subscriptions and notifications by @yoonhyejin in #9122
- docs: unify oidc guides using tabs by @yoonhyejin in #9068
- chore(ingest): remove legacy memory_leak_detector by @hsheth2 in #9158
- feat(ingest/looker): support emitting unused explores by @hsheth2 in #9159
- refactor(policy): refactor policy locking, no functional difference by @david-leifker in #9163
- API test for managing access token privilege by @kkorchak in #9167
- fix(mysql-setup): quote database name by @darnaut in #9169
- fix(health): fix health check url authentication by @david-leifker in #9117
- fix(elasticsearch): fix elasticsearch-setup for dropped 000001 index by @david-leifker in #9074
- Origin/fix flaky glossary navigation cypress test by @kkorchak in #9175
- fix: bad lineage link in LineageGraphOnboardingConfig.tsx by @walter9388 in #9162
- OBS-191 | Viewing domains page should not require Manage Domains priv… by @sumitappt in #9156
- fix: expand the stats row in search preview cards by @gaurav2733 in #9140
- docs(ingest): clarify adding source guide by @hsheth2 in #9161
- chore: stop ingestion-smoke CI errors on forks by @hsheth2 in #9160
- docs(ingest): inherit capabilities from superclasses by @hsheth2 in #9174
- fix(ingest/datahub-source): Order by version in memory by @asikowitz in #9185
- lint(frontend): fix HeaderLinks lint error by @david-leifker in #9189
- refactor(ui): Refactor entity page loading indicators by @jjoyce0510 in #9195
- fix(security): fix for zookeeper CVE-2023-44981 by @david-leifker in #9190
- refactor(ui): Rename "dataset details" button text to "view details" on lineage sidebar profile by @jjoyce0510 in #9196
- feat(ui): Add command-k icons to search bar by @jjoyce0510 in #9194
- feat(ui) Update Apollo cache to work with union types by @chriscollins3456 in #9193
- feat(policy): enable support for 10k+ policies by @david-leifker in #9177
- feat(browsepathv2): Allow system-update to reprocess browse paths v2 by @david-leifker in #9200
- feat(integration/fivetran): Fivetran connector integration by @shubhamjagtap639 in #9018
- feat(neo4j): Allow datahub to connect to specific neo4j database by @deepgarg-visa in #9179
- feat(subtypes): support subtypes for charts in the UI by @gabe-lyons in #9186
- feat(ui) Debounce auto-complete in search bar by @chriscollins3456 in #9205
- fix(lineage): magical lineage layout fix by @gabe-lyons in #9187
- refactor(pdl): Refactoring Assertion model enums out by @jjoyce0510 in #9191
- feat(auth): Add roles to policy engine validation logic by @pedro93 in #9178
- style(ingest/tableau): Rename tableau_constant to c by @asikowitz in #9207
- docs: update broken link in metadata-modelling by @sachinsaju in #9184
- Test policy to create and manage privileges by @kkorchak in #9173
- docs(security): add security doc to website by @RyanHolstien in #9209
- docs(java-sdk-dataset): add dataset via java sdk example by @sachinsaju in #9136
- doc(java-sdk-example):example to create tag via java-sdk by @sachinsaju in #9151
- fix(ingest/powerbi): use dataset workspace id as key for parent container by @looppi in #8994
- refactor(schema tab): Remove last observed timestamps from schema tab by @jjoyce0510 in #9188
- docs: adjust sidebar & create new admin section by @yoonhyejin in #9064
- fix(metadata-io): in Neo4j service use proper algorithm to get lineage by @lix-mms in #8687
- Managed Ingestion UX Improvements by @purnimagarg1 in #9216
- chore(ingest): start working on pydantic v2 support by @hsheth2 in #9220
- feat(ingestion): file-based state checkpoint provider by @shubhamjagtap639 in #9029
- feat(ingestion/airflow): support datajobs as task inlets by @shubhamjagtap639 in #9211
- fix(build): set @cliMajorVersion@ correctly by @hsheth2 in #9228
- fix(datahub-ingestion): remove old jars, sync pyspark version by @david-leifker in #9217
- fix: add security.md to sidebar by @yoonhyejin in #9229
- feat(policies): reduce default access for all users by @RyanHolstien in #9067
- Update add new company s7 airlines by @YuriyGavrilov in #9019
- docs(debug): add debug information for cli by @RyanHolstien in #9208
- fix(datahub-ingestion): prevent transitive deps, bump addtional pyspa… by @david-leifker in #9233
- feat(ingest/dbt): dbt column-level lineage by @hsheth2 in #8991
- chore(ingest): cleanup various methods by @hsheth2 in #9221
- docs: clarify how to disable telemetry by @hsheth2 in #9236
- feat(ingest/mongodb): support AWS DocumentDB for MongoDB by @TonyOuyangGit in #9201
- feat(airflow): make RUN_IN_THREAD configurable by @hsheth2 in #9226
- fix(signup): prevent invalid email signup by @RyanHolstien in #9234
- chore(security): version adjustments for security vulns by @david-leifker in #9243
- docs(ingest): fix typo in snowflake ingestion docs by @PGuiv in #9239
- chore(security): jre to headless, removes x11 dependency by @david-leifker in #9245
- feat(recomendations): Make top platforms account only for searchable entities by @pedro93 in #9240
- Feature/prd 770 by @gaurav2733 in #9224
- fix:fix search on paginated lists by @Salman-Apptware in #9198
- fix: increase the search bar highlight border to double the width by @gaurav2733 in #9251
- feat: Add loading indicator to Manage Domains sidebar by @sumitappt in #9142
- fix(ui): show external url also in entity profile of containers by @Masterchen09 in #8834
- feat(ingest/unity): Support specifying catalogs directly; pass env correctly by @asikowitz in #9110
- refactor(datahub-web-react): allows proxying to external datahub-frontend servers by @PatrickfBraz in #9250
- chore(node): update node to non-EOL version by @david-leifker in #9252
- fix(ingest): drop redshift-legacy and redshift-usage-legacy sources by @hsheth2 in #9244
- feat(ingest): support advanced configs for aws by @hsheth2 in #9237
- fix(sql-parser): convert platform instance to lowercase when building table urns by @Starkie in #9181
- test(ingest/unity): Update goldens by @asikowitz in #9254
- build(ingest/hive): Update thrift pin by @asikowitz in #8964
- docs(airflow): update plugin setup docs to include UI setup approach by @jiateoh in #9253
- feat(usageclient): updates for usageclient by @david-leifker in #9255
- fix(graphql): prevent duplicate index queries for dataproducts by @david-leifker in #9260
- logging(search): log level highlight value urn detection by @david-leifker in #9262
- Add Python version in Developer README by @kevin1chun in #9268
- Sync datahub-head on merge by @noggi in #9267
- PRD-742/fix:Settings tab should have 2 scrollable sections by @Salman-Apptware in #9218
- feat: add ingestion overview pages by @yoonhyejin in #9210
- fix(ingest/athena): detect decimal type correctly by @bossenti in #9270
- Fix/prd 787 by @gaurav2733 in #9261
- build(deps): bump @babel/traverse from 7.22.11 to 7.23.2 in /docs-website by @dependabot in #9022
- fix(gha): fix gha for single tag by @david-leifker in #9283
- fix(node): fix node_options by @david-leifker in #9281
- fix: Revamp features page by @yoonhyejin in #8839
- docs(acryl cloud): release notes 0.2.13 by @anshbansal in #9291
- fix: stats are spaced out too far by @gaurav2733 in #9292
- feat(mysql): upgrade to version 8.2 for quickstart by @RyanHolstien in #9241
- feat: add townhall RSVP link on the main page by @yoonhyejin in #9277
- fix(ingest/snowflake): Apply email filter on all usage metrics by @treff7es in #9269
- docs(ingestion): Added mention of host without protocol by @SimonOsipov in #9301
- fix(ingest/teradata): Teradata speed up changes by @treff7es in #9059
- fix(kafka): fix consumer properties on due consumer by @david-leifker in #9304
- fix(dbt-cloud): do not pass macros to sorting nodes by @anshbansal in #9302
- fix(ingest/lookml): emit all views with same name and different file path by @mayurinehate in #9279
- fix(deprecation): bring frontend in-sync with model by @anshbansal in #9303
- fix: fix the settings height when there are not many items by @Salman-Apptware in #9294
- docs: update recommended CLI by @anshbansal in #9307
- feat(ui): bump frontend dependencies by @ngamanda in #8353
- fix(java) Fixes NPE ES service by @chriscollins3456 in #9311
- feat(config): Configurable bootstrap of ownership types by @skrydal in #9308
- feat: update the "json-schema" version from package.json to solve json-schema vulnerability by @kushagra-apptware in #9289
- fix(ingest/mssql): Add MONEY and SMALLMONEY data types as Number by @terratrue-daniel in #9313
- fix(ingest): drop deprecated database_alias from sql sources by @mayurinehate in #9299
- Make repositories configurable for enterprise developers by @githendrik in #9230
- fix(ingest/sql): improve handling of views with dots in their names by @Starkie in #9183
- docs(ingest): update docs on adding stateful ingestion by @hsheth2 in #9327
- fix(docker): docker compose health checks port fix by @david-leifker in #9326
- fix : vulnerability (React): Inefficient Regular Expression Complexit… by @gaurav2733 in #9324
- fix(ui) Fix UI glitch in policies creator by @chriscollins3456 in #9266
- fix(sidebar): remove a space reserved for scroll bars when sidebar is collapsed by @allizex in #9322
- feat(ingest/mssql): enable TLS encryption for SQLServer using pytds by @terratrue-daniel in #9256
- fix(datahub-frontend): Add playCaffeine as replacement for removed playEhcache dependency by @MideO in #8344
- fix(ingest): bump pyhive to fix headers issue by @hsheth2 in #9328
- feat(gradle): quickstart postgres gradle task by @david-leifker in #9329
- Upload metadata model to s3 by @noggi in #9325
- fix(ui) Set explicit height on logo images to fix render bug by @chriscollins3456 in #9344
- fix(ingest/browse): Re-emit browse path v2 aspects to avoid race condition by @asikowitz in #9227
- feat(ingest/ldap): make ingestion robust to string departmentId by @hsheth2 in #9258
- doc(ingest/teradata): Adding Teradata to list of Integrations by @treff7es in #9336
- Complexity in chalk/ansi-regex and minimatch ReDoS Vulnerability solution by @kushagra-apptware in #9323
- build(deps): bump tmpl from 1.0.4 to 1.0.5 in /datahub-web-react by @dependabot in #9345
- fix:Address @babel/traverse vulnerabilities by @Salman-Apptware in #9343
- docs(ingest/looker): mark platform instance as a supported capability by @hsheth2 in #9347
- fix:Address HIGH vulnerability with Axios by @Salman-Apptware in #9353
- fix: show formatted total result count in Search by @gaurav2733 in #9356
- feat(sdk): autogenerate urn types by @hsheth2 in #9257
- fix(airflow): support inlet datajobs correctly in v1 plugin by @hsheth2 in #9331
- feat(ingest): clean up DataHubRestEmitter return type by @hsheth2 in #9286
- feat(ingest/dbt): support custom ownership types in dbt meta by @hsheth2 in #9332
- docs(ingest/lookml): clarify that ssh key has no passphrase by @hsheth2 in #9348
- fix(migrate): connect with token without dry-run by @anshbansal in #9317
- fix(ui): Minor: fix unnecessary lineage tab scroll by removing -1 margin on lists by @jjoyce0510 in #9364
- Prd 196 dynamic tabname by @kushagra-apptware in #9352
- docs: add setup instructions for mac dependencies by @hsheth2 in #9346
- feat(ui): Add caching to search, entity profile for better UX by @jjoyce0510 in #9362
- refactor(ui): Remove primary color for sort selector + add t… by @jjoyce0510 in #9363
- feat(ui): Supporting subtypes for data jobs by @jjoyce0510 in #9361
- fix(ingest/bigquery): Fix format arguments for table lineage test (#9340) by @middagj in #9341
- fix(siblingsHook): add logic to account for non dbt upstreams by @ethan-cartwright in #9154
- feat: Support CSV ingestion through the UI by @purnimagarg1 in #9280
- fix: node-fetch forwards secure headers to untrusted sites by @Salman-Apptware in #9375
- fix(ingest/powerbi): Allow old parser to parse [db].[schema].[table] table references by @asikowitz in #9360
- feat(ingest): support stdin in datahub put by @hsheth2 in #9359
- fix(ingest): resolve issue with caplog and asyncio by @hsheth2 in #9377
- fix(ingest/airflow): compat with pluggy 1.0 by @hsheth2 in #9365
- feat(ingest/athena): Enable Athena view ingestion and view lineage by @treff7es in #9354
- fix(ingest/redshift): Identify materialized views properly + fix connection args support by @treff7es in #9368
- test(ingest/unity): Unity catalog data generation by @asikowitz in #8949
- fix(elasticsearch): set datahub usage events shard & replica count by @david-leifker in #9388
- feat(gms/search): Adding support for DOUBLE Searchable type by @siladitya2 in #9369
- feat(lint): add spotless for java lint by @anshbansal in #9373
- feat(ci): split no cypress test suite by @anshbansal in #9387
- fix(ingest/redshift): too many values unpack by @anshbansal in #9394
- fix(ingest/redshift): Fix psycopg2 removal from Redshift Source by @treff7es in #9395
- fix(ui): fixed font src spelling mistake by @accso-jo in #9204
- feat(ingest/unity): GE Profiling by @asikowitz in #8951
- feat(ui/last-updated): Calculate last updated time as max(properties time, operation time) by @asikowitz in #9242
- docs: add youtube link to townhall button on docs by @yoonhyejin in #9381
- fix: set new sidebar section by @yoonhyejin in #9393
- fix(json-schema): take into account environment by @matthiasdg in #9385
- feat(datahub-frontend): make Java memory options configurable via ENV variable by @haeniya in #9215
- docs(ingest/sql-queries): Add documentation by @asikowitz in #9406
- docs: fix duplicated overview link for api section by @yoonhyejin in #9402
- feat(glossary): add toggle sidebar button and functionality to Busine… by @olgadimova in #9222
- refactor(ui): Refactor entity registry to be inside App Providers by @jjoyce0510 in #9399
- feat(ui): handle content prop changes in Editor component by @hsheth2 in #9400
- fix(ingest/profiling): Add back db_name to sql_generic_profiler methods by @asikowitz in #9407
- feat(observability): add actor urn to GraphQL spans by @ngamanda in #9382
- fix(ingest/lookml): make deploy key optional by @hsheth2 in #9378
- fix(ingest/powerbi): fix powerbi chart input handling by @looppi in #9415
- fix(ingest): fix metadata for custom python packages by @hsheth2 in #9391
- fix(ingest): bug fixes and docs updates by @hsheth2 in #9422
- Pin alpine base image version to 3.18 by @noggi in #9421
- fix(cypress) Fix flakiness of cypress test for glossary navigation by @chriscollins3456 in #9410
New Contributors
- @nicholas-fwang made their first contribution in #8663
- @richenc made their first contribution in #8514
- @kushagra-apptware made their first contribution in #9104
- @alexklavensnyt made their first contribution in #9115
- @sachinsaju made their first contribution in #9137
- @jiateoh made their first contribution in #8911
- @deepgarg-visa made their first contribution in #9128
- @darnaut made their first contribution in #9169
- @walter9388 made their first contribution in #9162
- @sumitappt made their first contribution in #9156
- @gaurav2733 made their first contribution in #9140
- @purnimagarg1 made their first contribution in #9216
- @YuriyGavrilov made their first contribution in #9019
- @PGuiv made their first contribution in #9239
- @Salman-Apptware made their first contribution in #9198
- @kevin1chun made their first contribution in #9268
- @noggi made their first contribution in #9267
- @SimonOsipov made their first contribution in #9301
- @terratrue-daniel made their first contribution in #9313
- @allizex made their first contribution in #9322
- @MideO made their first contribution in #8344
- @middagj made their first contribution in #9341
- @accso-jo made their first contribution in #9204
- @matthiasdg made their first contribution in #9385
- @haeniya made their first contribution in #9215
- @olgadimova made their first contribution in #9222
Full Changelog: v0.12.0...v0.12.1