Scanner error while joining two tables. #9

@ajaysant

I am trying to read two tables from Kudu and join them in a single query.

I followed the example steps for reading a table into a DataFrame and registering it as a temp table. I repeated the same steps for a second table and then queried both.

I then use dbGetQuery() to run a query that joins the two tables and returns the result as a data frame.

I get the following error:

Failed to fetch data: org.apache.spark.SparkException: Job aborted due to stage failure: Task 19 in stage 8.0 failed 1 times, most recent failure: Lost task 19.0 in stage 8.0 (TID 163, localhost, executor driver): org.apache.kudu.client.NonRecoverableException: Scanner not found
    at org.apache.kudu.client.KuduException.transformException(KuduException.java:110)
    at org.apache.kudu.client.KuduClient.joinAndHandleException(KuduClient.java:352)
    at org.apache.kudu.client.KuduScanner.nextRows(KuduScanner.java:58)
    at org.apache.kudu.spark.kudu.RowIterator.hasNext(KuduRDD.scala:120)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:148)
    at org.apache.spark.schedule

The sample query is:
```r
test_query <- paste(
  "SELECT * FROM tbl1 n0 FULL OUTER JOIN tbl2 n1 on n0.id = n1.id ",
  "WHERE n0.id LIKE CONCAT(cast(default.getJulianFromDate('yyyy-MM-dd hh:mm:ss', '",
  Sys.getenv("START"),
  "') AS STRING),'%') ",
  "AND n1.id LIKE CONCAT(cast(default.getJulianFromDate('yyyy-MM-dd hh:mm:ss', '",
  Sys.getenv("START"),
  "') AS STRING),'%') LIMIT 100",
  sep = ""
)

table_df <- dbGetQuery(sc, test_query)
```
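
For anyone trying to reproduce this, the query string above is hard to read because the same LIKE prefix is pasted in twice. A minimal sketch that builds the identical string with base R `sprintf()` might look like the following. The helper name `build_join_query` and its `limit` argument are hypothetical and not part of the original report; it only changes how the string is assembled, not what is sent to Spark.

```r
# Hypothetical helper (not from the original report): assembles the same join
# query, writing the repeated LIKE prefix only once.
build_join_query <- function(start, limit = 100L) {
  # "%%" is a literal percent sign in sprintf(); "%s" receives the timestamp.
  prefix <- sprintf(
    "CONCAT(cast(default.getJulianFromDate('yyyy-MM-dd hh:mm:ss', '%s') AS STRING),'%%')",
    start
  )
  sprintf(
    "SELECT * FROM tbl1 n0 FULL OUTER JOIN tbl2 n1 on n0.id = n1.id WHERE n0.id LIKE %s AND n1.id LIKE %s LIMIT %d",
    prefix, prefix, as.integer(limit)
  )
}

# START is read from the environment, as in the original snippet.
test_query <- build_join_query(Sys.getenv("START"))
```

Printing `test_query` before passing it to `dbGetQuery()` makes it easy to confirm that the timestamp was substituted as expected.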
