
Commit a32ec69

Merge pull request #4769 from NVIDIA/branch-22.02

Merge remote-tracking branch 'origin/branch-22.02' into main [skip ci]

2 parents: e56343b + a707759

20 files changed: +65 -83 lines

CHANGELOG.md (+8 -1)

@@ -1,5 +1,5 @@
 # Change log
-Generated on 2022-02-07
+Generated on 2022-02-14
 
 ## Release 22.02
 

@@ -117,6 +117,13 @@ Generated on 2022-02-07
 ### PRs
 |||
 |:---|:---|
+|[#4771](https://github.com/NVIDIA/spark-rapids/pull/4771)|revert cudf api links from legacy to stable[skip ci]|
+|[#4767](https://github.com/NVIDIA/spark-rapids/pull/4767)|Update 22.02 changelog to latest [skip ci]|
+|[#4750](https://github.com/NVIDIA/spark-rapids/pull/4750)|Updated doc for decimal support|
+|[#4757](https://github.com/NVIDIA/spark-rapids/pull/4757)|Update qualification tool to remove DECIMAL 128 as potential problem|
+|[#4755](https://github.com/NVIDIA/spark-rapids/pull/4755)|Fix databricks doc for limitations.[skip ci]|
+|[#4751](https://github.com/NVIDIA/spark-rapids/pull/4751)|Fix broken hyperlinks in documentation [skip ci]|
+|[#4706](https://github.com/NVIDIA/spark-rapids/pull/4706)|Update 22.02 changelog to latest [skip ci]|
 |[#4700](https://github.com/NVIDIA/spark-rapids/pull/4700)|Update cudfjni version to released 22.02.0|
 |[#4701](https://github.com/NVIDIA/spark-rapids/pull/4701)|Decrease nighlty tests upper limitation to 7 [skip ci]|
 |[#4639](https://github.com/NVIDIA/spark-rapids/pull/4639)|Update changelog for 22.02 and archive info of some older releases [skip ci]|

docs/demo/AWS-EMR/Mortgage-ETL-GPU-EMR.ipynb (+1 -1)

@@ -12,7 +12,7 @@
 "\n",
 "Dataset is derived from Fannie Mae’s [Single-Family Loan Performance Data](http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html) with all rights reserved by Fannie Mae. This processed dataset is redistributed with permission and consent from Fannie Mae. For the full raw dataset visit [Fannie Mae]() to register for an account and to download\n",
 "\n",
-"Instruction is available at NVIDIA [RAPIDS demo site](https://rapidsai.github.io/demos/datasets/mortgage-data).\n",
+"Instruction is available at NVIDIA [RAPIDS demo site](https://docs.rapids.ai/datasets/mortgage-data).\n",
 "\n",
 "## Prerequisite\n",
 "\n",

docs/demo/GCP/Mortgage-ETL-CPU.ipynb (+1 -1)

@@ -8,7 +8,7 @@
 "\n",
 "Dataset is derived from Fannie Mae’s [Single-Family Loan Performance Data](http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html) with all rights reserved by Fannie Mae. This processed dataset is redistributed with permission and consent from Fannie Mae. For the full raw dataset visit [Fannie Mae]() to register for an account and to download\n",
 "\n",
-"Instruction is available at NVIDIA [RAPIDS demo site](https://rapidsai.github.io/demos/datasets/mortgage-data).\n",
+"Instruction is available at NVIDIA [RAPIDS demo site](https://docs.rapids.ai/datasets/mortgage-data).\n",
 "\n",
 "### Prerequisite\n",
 "\n",

docs/demo/GCP/Mortgage-ETL-GPU.ipynb (+1 -1)

@@ -12,7 +12,7 @@
 "\n",
 "Dataset is derived from Fannie Mae’s [Single-Family Loan Performance Data](http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html) with all rights reserved by Fannie Mae. This processed dataset is redistributed with permission and consent from Fannie Mae. For the full raw dataset visit [Fannie Mae]() to register for an account and to download\n",
 "\n",
-"Instruction is available at NVIDIA [RAPIDS demo site](https://rapidsai.github.io/demos/datasets/mortgage-data).\n",
+"Instruction is available at NVIDIA [RAPIDS demo site](https://docs.rapids.ai/datasets/mortgage-data).\n",
 "\n",
 "### Prerequisite\n",
 "\n",

docs/download.md (+2 -2)

@@ -619,8 +619,8 @@ account the scenario where input data can be stored across many small files. By
 CPU threads v0.2 delivers up to 6x performance improvement over the previous release for small
 Parquet file reads.
 
-The RAPIDS Accelerator introduces a beta feature that accelerates [Spark shuffle for
-GPUs](get-started/getting-started-on-prem.md#enabling-rapidsshufflemanager). Accelerated
+The RAPIDS Accelerator introduces a beta feature that accelerates
+[Spark shuffle for GPUs](get-started/getting-started-on-prem.md#enabling-rapids-shuffle-manager). Accelerated
 shuffle makes use of high bandwidth transfers between GPUs (NVLink or p2p over PCIe) and leverages
 RDMA (RoCE or Infiniband) for remote transfers.
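For context, a minimal sketch of what the doc text above describes, turning on the plugin and its beta accelerated shuffle, might look like the following. The shim package in the shuffle manager class name (`spark312` here) is an assumption and must match the actual Spark build; the linked getting-started-on-prem page is the authority for the exact class and the UCX setup.

```scala
// Illustrative sketch only: enable the RAPIDS Accelerator plus the beta shuffle manager.
// The shim package (spark312) is assumed; pick the one matching your Spark version.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("rapids-accelerated-shuffle-sketch")
  .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
  .config("spark.shuffle.manager",
    "com.nvidia.spark.rapids.spark312.RapidsShuffleManager") // assumed shim class
  .getOrCreate()
```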

docs/get-started/getting-started-databricks.md (+9 -9)

@@ -26,12 +26,12 @@ The number of GPUs per node dictates the number of Spark executors that can run
 1. Adaptive query execution(AQE) and Delta optimization write do not work. These should be disabled
 when using the plugin. Queries may still see significant speedups even with AQE disabled.
 
-```bash
-spark.databricks.delta.optimizeWrite.enabled false
-spark.sql.adaptive.enabled false
-```
+```bash
+spark.databricks.delta.optimizeWrite.enabled false
+spark.sql.adaptive.enabled false
+```
 
-See [issue-1059](https://github.com/NVIDIA/spark-rapids/issues/1059) for more detail.
+See [issue-1059](https://github.com/NVIDIA/spark-rapids/issues/1059) for more detail.
 
 2. Dynamic partition pruning(DPP) does not work. This results in poor performance for queries which
 would normally benefit from DPP. See

@@ -42,10 +42,10 @@ when using the plugin. Queries may still see significant speedups even with AQE
 
 4. Cannot spin off multiple executors on a multi-GPU node.
 
-Even though it is possible to set `spark.executor.resource.gpu.amount=N` (where N is the number
-of GPUs per node) in the in Spark Configuration tab, Databricks overrides this to
-`spark.executor.resource.gpu.amount=1`. This will result in failed executors when starting the
-cluster.
+Even though it is possible to set `spark.executor.resource.gpu.amount=1` in the in Spark
+Configuration tab, Databricks overrides this to `spark.executor.resource.gpu.amount=N`
+(where N is the number of GPUs per node). This will result in failed executors when starting the
+cluster.
 
 5. Databricks makes changes to the runtime without notification.
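As an illustration of limitation 1 above: the two settings are meant for the cluster's Spark config tab, but a session-level override along these lines may also work at the top of a notebook, assuming the Databricks runtime allows these confs to be changed after startup.

```scala
// Hedged sketch: disable AQE and Delta optimized writes from a notebook session,
// mirroring the cluster-level settings quoted in the hunk above.
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "false")
spark.conf.set("spark.sql.adaptive.enabled", "false")
```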

docs/get-started/getting-started-gcp.md (+3 -3)

@@ -85,9 +85,9 @@ If you'd like to further accelerate init time to 4-5 minutes, create a custom Da
 ## Run PySpark or Scala Notebook on a Dataproc Cluster Accelerated by GPUs
 To use notebooks with a Dataproc cluster, click on the cluster name under the Dataproc cluster tab
 and navigate to the "Web Interfaces" tab. Under "Web Interfaces", click on the JupyterLab or
-Jupyter link to start to use sample [Mortgage ETL on GPU Jupyter
-Notebook](../demo/GCP/Mortgage-ETL-GPU.ipynb) to process full 17 years [Mortgage
-data](https://rapidsai.github.io/demos/datasets/mortgage-data).
+Jupyter link to start to use sample
+[Mortgage ETL on GPU Jupyter Notebook](../demo/GCP/Mortgage-ETL-GPU.ipynb) to process full 17 years
+[Mortgage data](https://docs.rapids.ai/datasets/mortgage-data).
 
 ![Dataproc Web Interfaces](../img/GCP/dataproc-service.png)

docs/get-started/getting-started-workload-qualification.md (+6 -6)

@@ -30,8 +30,8 @@ This article describes the tools we provide and how to do gap analysis and workl
 ### How to use
 
 If you have Spark event logs from prior runs of the applications on Spark 2.x or 3.x, you can use
-the [Qualification tool](../spark-qualification-tool.md) and [Profiling
-tool](../spark-profiling-tool.md) to analyze them. The qualification tool outputs the score, rank
+the [Qualification tool](../spark-qualification-tool.md) and
+[Profiling tool](../spark-profiling-tool.md) to analyze them. The qualification tool outputs the score, rank
 and some of the potentially not-supported features for each Spark application. For example, the CSV
 output can print `Unsupported Read File Formats and Types`, `Unsupported Write Data Format` and
 `Potential Problems` which are the indication of some not-supported features. Its output can help

@@ -119,8 +119,8 @@ the driver logs with `spark.rapids.sql.explain=all`.
 
 This log can show you which operators (on what data type) can not run on GPU and the reason.
 If it shows a specific RAPIDS Accelerator parameter which can be turned on to enable that feature,
-you should first understand the risk and applicability of that parameter based on [configs
-doc](../configs.md) and then enable that parameter and try the tool again.
+you should first understand the risk and applicability of that parameter based on
+[configs doc](../configs.md) and then enable that parameter and try the tool again.
 
 Since its output is directly based on specific version of `rapids-4-spark` jar, the gap analysis is
 pretty accurate.

@@ -213,8 +213,8 @@ which is the same as the driver logs with `spark.rapids.sql.explain=all`.
 
 This log can show you which operators (on what data type) can not run on GPU and the reason.
 If it shows a specific RAPIDS Accelerator parameter which can be turned on to enable that feature,
-you should first understand the risk and applicability of that parameter based on [configs
-doc](../configs.md) and then enable that parameter and try the tool again.
+you should first understand the risk and applicability of that parameter based on
+[configs doc](../configs.md) and then enable that parameter and try the tool again.
 
 Since its output is directly based on specific version of `rapids-4-spark` jar, the gap analysis is
 pretty accurate.
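To make the driver-log gap analysis mentioned in these hunks concrete, here is a small sketch of running a workload with the plugin's explain output enabled; the input path and column name are made up.

```scala
// Run a query with spark.rapids.sql.explain=ALL, then read the driver log to see
// which operators (and data types) stayed on the CPU and why.
spark.conf.set("spark.rapids.sql.explain", "ALL")
spark.read.parquet("/data/events")   // hypothetical input path
  .groupBy("event_type")             // hypothetical column
  .count()
  .show()
```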

docs/spark-profiling-tool.md (+2 -2)

@@ -406,8 +406,8 @@ SQL Duration and Executor CPU Time Percent
 +--------+-------------------+-----+------------+--------------------------+------------+---------------------------+-------------------------+
 |appIndex|App ID             |sqlID|SQL Duration|Contains Dataset or RDD Op|App Duration|Potential Problems         |Executor CPU Time Percent|
 +--------+-------------------+-----+------------+--------------------------+------------+---------------------------+-------------------------+
-|1       |local-1626104300434|0    |1260        |false                     |131104      |DECIMAL:NESTED COMPLEX TYPE|92.65                    |
-|1       |local-1626104300434|1    |259         |false                     |131104      |DECIMAL:NESTED COMPLEX TYPE|76.79                    |
+|1       |local-1626104300434|0    |1260        |false                     |131104      |NESTED COMPLEX TYPE        |92.65                    |
+|1       |local-1626104300434|1    |259         |false                     |131104      |NESTED COMPLEX TYPE        |76.79                    |
 ```
 
 - Shuffle Skew Check:

docs/spark-qualification-tool.md (+3 -6)

@@ -318,8 +318,7 @@ Its summary report outputs the following information:
 2. Application duration
 3. SQL/DF duration
 4. Problematic Duration, which indicates potential issues for acceleration.
-Some of the potential issues include unsupported data formats such as Decimal 128-bit
-or User Defined Function (UDF) or any Dataset APIs.
+Some of the potential issues include User Defined Function (UDF) or any Dataset APIs.
 
 Note: the duration(s) reported are in milli-seconds.
 Sample output in text:

@@ -335,13 +334,11 @@ In the above example, two application event logs were analyzed. “app-202105071
 than the “app-20210507174503-1704” because the score(in the csv output) for “app-20210507174503-2538”
 is higher than “app-20210507174503-1704”.
 Here the `Problematic Duration` is zero but please keep in mind that we are only able to detect certain issues.
-This currently includes some UDFs, some decimal operations and nested complex types.
+This currently includes some UDFs and nested complex types.
 The tool won't catch all UDFs, and some of the UDFs can be handled with additional steps.
 
 Please refer to [supported_ops.md](./supported_ops.md)
 for more details on UDF.
-For decimals, the tool tries to parse for decimal operations but it may not capture all of the decimal operations
-if they aren’t in the event logs.
 
 The second output is a more detailed output.
 Here is a sample output requesting csv style output:

@@ -358,7 +355,7 @@ Here is a brief description of each of column that is in the CSV:
 2. App ID: Spark Application ID.
 3. Score : A score calculated based on SQL Dataframe Task Duration and gets negatively affected for any unsupported operators.
 Please refer to [Qualification tool score algorithm](#Qualification-tool-score-algorithm) for more details.
-4. Potential Problems : Some UDFs, some decimal operations and nested complex types.
+4. Potential Problems : Some UDFs and nested complex types.
 5. SQL DF Duration: Time duration that includes only SQL/Dataframe queries.
 6. SQL Dataframe Task Duration: Amount of time spent in tasks of SQL Dataframe operations.
 7. App Duration: Total Application time.
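Purely as an illustration of the CSV columns listed in the last hunk, the summary file could be loaded and ranked by Score as below; the output path is hypothetical and the file name is an assumption about the tool's default.

```scala
// Sketch: rank analyzed applications by the Score column described above.
import org.apache.spark.sql.functions.desc

val qual = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("/tmp/qual_output/rapids_4_spark_qualification_output.csv") // assumed output name
qual.select("App ID", "Score", "Potential Problems", "SQL DF Duration")
  .orderBy(desc("Score"))
  .show(truncate = false)
```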

docs/supported_ops.md (+6 -8)

@@ -14,14 +14,12 @@ apply to other versions of Spark, but there may be slight changes.
 
 # General limitations
 ## `Decimal`
-The `Decimal` type in Spark supports a precision
-up to 38 digits (128-bits). The RAPIDS Accelerator in most cases stores values up to
-64-bits and will support 128-bit in the future. As such the accelerator currently only
-supports a precision up to 18 digits. Note that
-decimals are disabled by default in the plugin, because it is supported by a relatively
-small number of operations presently. This can result in a lot of data movement to and
-from the GPU, slowing down processing in some cases.
-Result `Decimal` precision and scale follow the same rule as CPU mode in Apache Spark:
+The `Decimal` type in Spark supports a precision up to 38 digits (128-bits).
+The RAPIDS Accelerator supports 128-bit starting from version 21.12 and decimals are
+enabled by default.
+Please check [Decimal Support](compatibility.md#decimal-support) for more details.
+
+`Decimal` precision and scale follow the same rule as CPU mode in Apache Spark:
 
 ```
  * In particular, if we have expressions e1 and e2 with precision/scale p1/s1 and p2/s2
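A worked instance of the precision/scale rule referenced above, run on the CPU (the doc states the accelerator follows the same rule); the chosen precisions are arbitrary.

```scala
// e1 + e2 with p1/s1 = 10/2 and p2/s2 = 12/4:
//   scale     = max(s1, s2) = 4
//   precision = max(p1 - s1, p2 - s2) + scale + 1 = 8 + 4 + 1 = 13
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.DecimalType

val df = spark.range(1).select(
  col("id").cast(DecimalType(10, 2)).as("e1"),
  col("id").cast(DecimalType(12, 4)).as("e2"))
df.select((col("e1") + col("e2")).as("sum")).printSchema()
// expected: sum: decimal(13,4)
```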

docs/tuning-guide.md (+1 -1)

@@ -337,7 +337,7 @@ Custom Spark SQL Metrics are available which can help identify performance bottl
 
 Not all metrics are enabled by default. The configuration setting `spark.rapids.sql.metrics.level` can be set
 to `DEBUG`, `MODERATE`, or `ESSENTIAL`, with `MODERATE` being the default value. More information about this
-configuration option is available in the <a href="configs.md#sql.metrics.level">configuration</a> documentation.
+configuration option is available in the [configuration documentation](configs.md#sql.metrics.level).
 
 Output row and batch counts show up for operators where the number of output rows or batches are
 expected to change. For example a filter operation would show the number of rows that passed the
expected to change. For example a filter operation would show the number of rows that passed the

sql-plugin/src/main/scala/com/nvidia/spark/rapids/TypeChecks.scala (+6 -8)

@@ -1706,14 +1706,12 @@ object SupportedOpsDocs {
 println()
 println("# General limitations")
 println("## `Decimal`")
-println("The `Decimal` type in Spark supports a precision")
-println("up to 38 digits (128-bits). The RAPIDS Accelerator in most cases stores values up to")
-println("64-bits and will support 128-bit in the future. As such the accelerator currently only")
-println(s"supports a precision up to ${DType.DECIMAL64_MAX_PRECISION} digits. Note that")
-println("decimals are disabled by default in the plugin, because it is supported by a relatively")
-println("small number of operations presently. This can result in a lot of data movement to and")
-println("from the GPU, slowing down processing in some cases.")
-println("Result `Decimal` precision and scale follow the same rule as CPU mode in Apache Spark:")
+println("The `Decimal` type in Spark supports a precision up to 38 digits (128-bits). ")
+println("The RAPIDS Accelerator supports 128-bit starting from version 21.12 and decimals are ")
+println("enabled by default.")
+println("Please check [Decimal Support](compatibility.md#decimal-support) for more details.")
+println()
+println("`Decimal` precision and scale follow the same rule as CPU mode in Apache Spark:")
 println()
 println("```")
 println(" * In particular, if we have expressions e1 and e2 with precision/scale p1/s1 and p2/s2")

tools/src/main/scala/org/apache/spark/sql/rapids/tool/AppBase.scala (+2 -9)

@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2021, NVIDIA CORPORATION.
+ * Copyright (c) 2021-2022, NVIDIA CORPORATION.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.

@@ -128,17 +128,10 @@ abstract class AppBase(
 }
 }
 
-// Decimal support on the GPU is limited to less than 18 digits and decimals
-// are configured off by default for now. It would be nice to have this
-// based off of what plugin supports at some point.
-private val decimalKeyWords = Map(".*promote_precision\\(.*" -> "DECIMAL",
-".*decimal\\([0-9]+,[0-9]+\\).*" -> "DECIMAL",
-".*DecimalType\\([0-9]+,[0-9]+\\).*" -> "DECIMAL")
-
 private val UDFKeywords = Map(".*UDF.*" -> "UDF")
 
 protected def findPotentialIssues(desc: String): Set[String] = {
-val potentialIssuesRegexs = UDFKeywords ++ decimalKeyWords
+val potentialIssuesRegexs = UDFKeywords
 val issues = potentialIssuesRegexs.filterKeys(desc.matches(_))
 issues.values.toSet
 }
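A small sketch of what the code kept by this change does, using the same names as the diff; the sample plan description string is invented.

```scala
// After this change only the UDF pattern is checked by findPotentialIssues.
val udfKeywords = Map(".*UDF.*" -> "UDF")
val desc = "Project [UDF(value#12) AS out#13]"           // made-up plan description
val issues = udfKeywords.filterKeys(desc.matches(_)).values.toSet
println(issues)                                           // Set(UDF)
```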
Test expectation CSV (file path not shown)

@@ -1,9 +1,9 @@
 appIndex,App ID,sqlID,SQL Duration,Contains Dataset or RDD Op,App Duration,Potential Problems,Executor CPU Time Percent
-1,local-1626104300434,0,1260,false,131104,DECIMAL:NESTED COMPLEX TYPE,92.65
-1,local-1626104300434,1,259,false,131104,DECIMAL:NESTED COMPLEX TYPE,76.79
+1,local-1626104300434,0,1260,false,131104,NESTED COMPLEX TYPE,92.65
+1,local-1626104300434,1,259,false,131104,NESTED COMPLEX TYPE,76.79
 1,local-1626104300434,2,130,false,131104,NESTED COMPLEX TYPE,90.48
-1,local-1626104300434,3,76,false,131104,DECIMAL:NESTED COMPLEX TYPE,97.56
+1,local-1626104300434,3,76,false,131104,NESTED COMPLEX TYPE,97.56
 1,local-1626104300434,4,65,false,131104,NESTED COMPLEX TYPE,100.0
 1,local-1626104300434,5,479,false,131104,NESTED COMPLEX TYPE,87.32
-1,local-1626104300434,6,95,false,131104,DECIMAL:NESTED COMPLEX TYPE,96.3
-1,local-1626104300434,7,65,false,131104,DECIMAL:NESTED COMPLEX TYPE,95.24
+1,local-1626104300434,6,95,false,131104,NESTED COMPLEX TYPE,96.3
+1,local-1626104300434,7,65,false,131104,NESTED COMPLEX TYPE,95.24
Test expectation CSV (file path not shown)

@@ -1,2 +1,2 @@
 App Name,App ID,Score,Potential Problems,SQL DF Duration,SQL Dataframe Task Duration,App Duration,Executor CPU Time Percent,App Duration Estimated,SQL Duration with Potential Problems,SQL Ids with Failures,Read Score Percent,Read File Format Score,Unsupported Read File Formats and Types,Unsupported Write Data Format,Complex Types,Nested Complex Types
-Spark shell,local-1626104300434,1469.0,DECIMAL:NESTED COMPLEX TYPE,2429,1469,131104,88.35,false,160,"",20,100.0,"","",struct<firstname:string;middlename:array<string>;lastname:string>;struct<current:struct<state:string;city:string>;previous:struct<state:map<string;string>;city:string>>;array<struct<city:string;state:string>>;map<string;string>;map<string;array<string>>;map<string;map<string;string>>;array<array<string>>;array<string>,struct<firstname:string;middlename:array<string>;lastname:string>;struct<current:struct<state:string;city:string>;previous:struct<state:map<string;string>;city:string>>;array<struct<city:string;state:string>>;map<string;array<string>>;map<string;map<string;string>>;array<array<string>>
+Spark shell,local-1626104300434,1469.0,NESTED COMPLEX TYPE,2429,1469,131104,88.35,false,0,"",20,100.0,"","",struct<firstname:string;middlename:array<string>;lastname:string>;struct<current:struct<state:string;city:string>;previous:struct<state:map<string;string>;city:string>>;array<struct<city:string;state:string>>;map<string;string>;map<string;array<string>>;map<string;map<string;string>>;array<array<string>>;array<string>,struct<firstname:string;middlename:array<string>;lastname:string>;struct<current:struct<state:string;city:string>;previous:struct<state:map<string;string>;city:string>>;array<struct<city:string;state:string>>;map<string;array<string>>;map<string;map<string;string>>;array<array<string>>

tools/src/test/resources/QualificationExpectations/decimal_part_expectation.csv (-2)

This file was deleted.
