
Qlik Cloud ingestion not working for dashboards and sheets #11355

Open
SindreKjetsa opened this issue Sep 11, 2024 · 4 comments
Labels
bug Bug report

Comments

@SindreKjetsa

Describe the bug
Ingestion of Qlik Cloud data is not working properly: spaces and a few text files are ingested, but no dashboards or sheets.

To Reproduce
Steps to reproduce the behavior:
Ingest qlik cloud
Default recipe with a datahub-gms sink and a token from a Qlik user that is a tenant admin:

    source:
      type: qlik-sense
      config:
        tenant_hostname: hostname.eu.qlikcloud.com
        api_key: '${qlik_api_token}'
        ingest_owner: true
    pipeline_name: qlik_sense_ingestion_pipeline
    sink:
      type: datahub-rest
      config:
        server: 'http://datahub-gms:8080'

Expected behavior
Dashboards and sheets should also be ingested into datahub


Desktop (please complete the following information):

  • OS: Windows
  • Browser: Chrome
  • Version: Latest

Additional context
Error message from datahub-actions:
It ingests 57 spaces, and then ends.

[2024-09-11 10:49:36,108] INFO {datahub.ingestion.run.pipeline:296} - Source configured successfully.
[2024-09-11 10:49:36,109] INFO {datahub.cli.ingest_cli:130} - Starting metadata ingestion
[2024-09-11 10:49:36,109] INFO {datahub.ingestion.source.qlik_sense.qlik_sense:602} - Qlik Sense plugin execution is started
[2024-09-11 10:49:36,780] WARNING {datahub.ingestion.source.qlik_sense.qlik_api:43} - Unable to fetch spaces. Exception: 1 validation error for Space
root
time data '2024-06-10T11:49:01Z' does not match format '%Y-%m-%dT%H:%M:%S.%fZ' (type=value_error)
[2024-09-11 10:49:36,781] INFO {datahub.ingestion.source.qlik_sense.qlik_sense:180} - Number of spaces = 57
[2024-09-11 10:49:36,781] INFO {datahub.ingestion.source.qlik_sense.qlik_sense:182} - Number of allowed spaces = 57
[2024-09-11 10:49:37,436] WARNING {datahub.ingestion.source.qlik_sense.qlik_api:43} - Unable to fetch items. Exception: 'personal-space-id'
[2024-09-11 10:49:37,444] INFO {datahub.ingestion.run.pipeline:570} - Processing commit request for DatahubIngestionCheckpointingProvider. Commit policy = CommitPolicy.ALWAYS, has_errors=False, has_warnings=False
[2024-09-11 10:49:37,474] WARNING {datahub.ingestion.source.state_provider.datahub_ingestion_checkpointing_provider:95} - No state available to commit for DatahubIngestionCheckpointingProvider
[2024-09-11 10:49:37,497] INFO {datahub.ingestion.run.pipeline:590} - Successfully committed changes for DatahubIngestionCheckpointingProvider.
[2024-09-11 10:49:37,520] INFO {datahub.ingestion.reporting.file_reporter:54} - Wrote SUCCESS report successfully to <_io.TextIOWrapper name='/tmp/datahub/ingest/0cdf09e9-f0ff-4096-8da5-5033f49f75f7/ingestion_report.json' mode='w' encoding='UTF-8'>
[2024-09-11 10:49:38,666] INFO {datahub.cli.ingest_cli:143} - Finished metadata ingestion

@SindreKjetsa SindreKjetsa added the bug Bug report label Sep 11, 2024
@SindreKjetsa SindreKjetsa changed the title A short description of the bug Qlik Cloud ingestion not working for dashboards and sheets Sep 11, 2024
@henningwold

We have been debugging this for a little while. It turns out the JSON for one of the spaces looks like this (some irrelevant info redacted):

{
      "id": "idhash",
      "type": "managed",
      "ownerId": "ownerhash",
      "tenantId": "tenantId",
      "name": "Name",
      "description": "Description",
      "meta": {
        "actions": ["change_owner", "create", "delete", "read", "update"],
        "roles": [],
        "assignableRoles": [
          "basicconsumer",
          "consumer",
          "contributor",
          "dataconsumer",
          "facilitator",
          "operator",
          "publisher"
        ]
      },
      "links": {
        "self": {
          "href": "https://ourworkspace.eu.qlikcloud.com/api/v1/spaces/spaceid"
        },
        "assignments": {
          "href": "https://ourworkspace.eu.qlikcloud.com/api/v1/spaces/spaceid/assignments"
        }
      },
      "createdAt": "2023-09-28T11:14:51.92Z",
      "createdBy": "createdbyhash",
      "updatedAt": "2024-06-10T11:49:01Z"
    }

Note that the "updatedAt" field is somehow missing fractional seconds, which breaks the parsing at

values[Constant.UPDATEDAT] = datetime.strptime(

since that call only accepts one exact format, which has to contain milliseconds. Whether this is a bug in Qlik or in DataHub, however, I am unsure.
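To illustrate, here is a minimal standalone reproduction of the parsing failure (a sketch of the behavior, not DataHub's actual code; `QLIK_FORMAT` is just a local name for the strict format string quoted in the warning above):

```python
from datetime import datetime

# The strict format from the warning in the ingestion log; '%f' requires
# fractional seconds to be present in the input string.
QLIK_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"

# A timestamp with fractional seconds parses fine:
created = datetime.strptime("2023-09-28T11:14:51.92Z", QLIK_FORMAT)

# The rogue "updatedAt" value from the space above does not:
try:
    datetime.strptime("2024-06-10T11:49:01Z", QLIK_FORMAT)
except ValueError as e:
    # time data '2024-06-10T11:49:01Z' does not match format
    # '%Y-%m-%dT%H:%M:%S.%fZ'
    print(e)
```

This is the same `ValueError` that surfaces as the pydantic validation error for `Space` in the log.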

@henningwold

As an added bonus, the code at

spaces.append(Space.parse_obj(PERSONAL_SPACE_DICT))

only appends the personal space after all other spaces have been added. Because the fetch above caught an exception before this line was reached, the personal space was never added to the spaces holder in the first place, so ingestion also terminates as soon as it encounters data belonging to any personal space.
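The ordering problem can be sketched as follows (this is an illustrative toy, not DataHub's actual code; `fetch_spaces`, `space_map`, and the dict shapes are all hypothetical):

```python
# Illustrative sketch of how an exception while fetching spaces leaves
# the personal space unregistered, so later lookups by its id fail --
# matching the log line "Unable to fetch items. Exception: 'personal-space-id'".
PERSONAL_SPACE_DICT = {"id": "personal-space-id", "name": "personal_space"}

def fetch_spaces(raw_spaces):
    spaces = []
    for raw in raw_spaces:
        # Stand-in for the strict strptime call: a timestamp without
        # fractional seconds makes parsing raise, aborting the loop
        # before the personal space below is ever appended.
        if "." not in raw["updatedAt"]:
            raise ValueError(f"time data {raw['updatedAt']!r} does not match format")
        spaces.append(raw)
    spaces.append(PERSONAL_SPACE_DICT)  # never reached if the loop raised
    return spaces

try:
    spaces = fetch_spaces([{"id": "s1", "updatedAt": "2024-06-10T11:49:01Z"}])
except ValueError:
    spaces = []  # the source logs a warning and carries on with what it has

space_map = {s["id"]: s for s in spaces}
print("personal-space-id" in space_map)  # False
```

Any later item that belongs to a personal space then triggers a `KeyError: 'personal-space-id'` when looked up in the holder.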

@henningwold

For the record: we seem to have (at least temporarily) solved the issue by changing the description back and forth to touch the updatedAt field, but this would naturally have been a lot more difficult to do if the rogue timestamp had instead been for the createdAt field.

@hsheth2
Collaborator

hsheth2 commented Sep 12, 2024

Looks like we should probably be a bit more lenient with our timestamp parsing logic. I suspect that we should use something like dateutil.parser.parse instead of strictly parsing with a specific format.

We made a similar change for dbt #10223 - @henningwold would you be open to creating a PR to fix that code in the Qlik source?
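As a stdlib-only sketch of the lenient-parsing idea (the helper name `parse_qlik_timestamp` and the format list are hypothetical; `dateutil.parser.parse`, as suggested above, would subsume all of these variants in one call):

```python
from datetime import datetime

# Try the strict format first, then fall back to the variant without
# fractional seconds that appeared in the rogue "updatedAt" value.
FORMATS = [
    "%Y-%m-%dT%H:%M:%S.%fZ",  # e.g. 2023-09-28T11:14:51.92Z
    "%Y-%m-%dT%H:%M:%SZ",     # e.g. 2024-06-10T11:49:01Z
]

def parse_qlik_timestamp(value: str) -> datetime:
    for fmt in FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unrecognized timestamp: {value!r}")

parse_qlik_timestamp("2024-06-10T11:49:01Z")     # now parses
parse_qlik_timestamp("2023-09-28T11:14:51.92Z")  # still parses
```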
