We use the executeSelect API to run a SQL query and read results from BigQuery. We expected a good speed based on
Reading data using the executeSelect API is extremely slow.
Reading 100_000 rows takes 23930 ms.
Profiling showed no obvious hotspot where most of the time was spent.
Are there any recent changes that might cause performance degradation for such an API?
Do you have a benchmark to understand what performance we should expect?
Thanks!
I've created a simplified test to show performance:
@Test
fun `test read`() {
    val sql = """ SELECT * FROM `pr`""".trimIndent().replace("\n", "")
    val connectionSettings = ConnectionSettings.newBuilder()
        .setRequestTimeout(300)
        .setUseReadAPI(true)
        .setMaxResults(5000)
        .setUseQueryCache(true)
        .build()
    val connection = bigQueryOptionsBuilder.build().service.createConnection(connectionSettings)
    val bqResult = connection.executeSelect(sql)
    val resultSet = bqResult.resultSet
    var n = 1
    var lastTime = Instant.now()
    // Iterate over the result set and log the elapsed time every 30_000 rows.
    while (++n < 1_000_000 && resultSet.next()) {
        if (n % 30_000 == 0) {
            val now = Instant.now()
            val duration = Duration.between(lastTime, now)
            println("ROW $n Time: ${duration.toMillis()} ms ${DateTimeFormatter.ISO_INSTANT.format(now)}")
            lastTime = now
        }
    }
}
ROW 30000 Time: 5516 ms 2024-11-14T12:35:54.354169Z
ROW 60000 Time: 11230 ms 2024-11-14T12:36:05.585005Z
ROW 90000 Time: 5645 ms 2024-11-14T12:36:11.230378Z
ROW 120000 Time: 5331 ms 2024-11-14T12:36:16.561915Z
ROW 150000 Time: 5458 ms 2024-11-14T12:36:22.019994Z
ROW 180000 Time: 5391 ms 2024-11-14T12:36:27.411807Z
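To put the log above into rows per second, here is a rough calculation sketch. The helper name `rowsPerSecond` and the constant `ROWS_PER_INTERVAL` are only illustrative, and each logged interval is treated as exactly 30,000 rows, which is an approximation of what the loop counts.

```kotlin
// Approximate number of rows read between two log lines (assumption: exactly 30_000).
const val ROWS_PER_INTERVAL = 30_000

// Converts a row count and an elapsed time in milliseconds into rows per second.
fun rowsPerSecond(rows: Int, millis: Long): Double = rows * 1000.0 / millis

fun main() {
    // Interval durations in ms, copied from the log output above.
    val intervalsMs = listOf(5516L, 11230L, 5645L, 5331L, 5458L, 5391L)
    for (ms in intervalsMs) {
        println("%d ms -> %.0f rows/s".format(ms, rowsPerSecond(ROWS_PER_INTERVAL, ms)))
    }
    // Overall throughput across all logged intervals.
    val totalRows = ROWS_PER_INTERVAL * intervalsMs.size
    val totalMs = intervalsMs.sum()
    println("overall: %.0f rows/s".format(rowsPerSecond(totalRows, totalMs)))
}
```

By this estimate the test reads roughly 4,700 rows per second overall, and most intervals sit near 5,500 rows per second, with the second interval about twice as slow as the others.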
Environment details
com.google.cloud:google-cloud-bigquery:2.43.3