
Commit fe87bfc

Add query of subsetted stats file.
1 parent eed0e84 commit fe87bfc

1 file changed (+26, -0 lines)

sections/parquet-arrow.qmd

Lines changed: 26 additions & 0 deletions
@@ -198,6 +198,32 @@ os.stat("low_coverage.parquet").st_size / (1024 * 1024)
 1.4038171768188477
 ```

+And last, we can query this new table that we have created:
+
+```{python}
+#| eval: false
+duckdb.sql("SELECT * from low_coverage order by sum desc").limit(10).show()
+```
+```
+┌──────────────────────────────────────────────────────────────────────────────────┬───────────────────┐
+│ bounds │ sum │
+│ varchar │ double │
+├──────────────────────────────────────────────────────────────────────────────────┼───────────────────┤
+│ [116.99340820312375, 117.00439453124875, 71.32324218750009, 71.33422851562509] │ 9.999709300915525 │
+│ [116.99340820312375, 117.00439453124875, 71.32324218750009, 71.33422851562509] │ 9.999709300915525 │
+│ [-77.78320312500044, -77.76123046875044, 81.47460937500003, 81.49658203125003] │ 9.999367985118123 │
+│ [-154.6655273437501, -154.6435546875001, 70.88378906250009, 70.90576171875009] │ 9.998618022818269 │
+│ [-116.36169433593756, -116.35620117187506, 77.39868164062501, 77.40417480468751] │ 9.998248625671277 │
+│ [95.05920410156227, 95.06469726562477, 71.65832519531251, 71.66381835937501] │ 9.997617767005346 │
+│ [-100.64575195312506, -100.64025878906256, 78.31054687500001, 78.31604003906251] │ 9.997102780461583 │
+│ [123.91479492187477, 123.92028808593727, 73.11950683593751, 73.12500000000001] │ 9.996871067852531 │
+│ [-167.684326171875, -167.6788330078125, 65.73669433593751, 65.74218750000001] │ 9.995733779230603 │
+│ [158.52172851562472, 158.52722167968722, 69.99938964843751, 70.00488281250001] │ 9.995298698978356 │
+├──────────────────────────────────────────────────────────────────────────────────┴───────────────────┤
+│ 10 rows 2 columns │
+└──────────────────────────────────────────────────────────────────────────────────────────────────────┘
+```
+
 Amazingly, DuckDB and parquet handle all of this high-performance access without any server-side services running. Typically, remote data access would be provided through a server-side service like a Postgres database or some other heavyweight process. But in this case, all we have is a file on disk being served up by a standard web server, and all of the querying is done completely client-side, and quickly, because of the beauty of the parquet file format.

 Of course, you can also download the whole parquet file and access it locally through DuckDB!
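A minimal sketch of the point made in the last two paragraphs: the same query can target either a local copy of the Parquet file or one served by a plain web server. The helper names `top_rows_sql` and `top_rows` and the URL in the comments are hypothetical, introduced only for illustration. The `bounds` and `sum` columns and the `low_coverage.parquet` filename come from the diff above. Reading over HTTP(S) relies on DuckDB's `httpfs` extension, which recent releases are expected to load automatically; on older versions it may need `INSTALL httpfs; LOAD httpfs;` first.

```python
def top_rows_sql(source: str, limit: int = 10) -> str:
    """Build the 'top rows by sum' query against a Parquet file.

    `source` can be a local path or an http(s) URL; DuckDB's read_parquet
    accepts both (URLs require the httpfs extension).
    """
    escaped = source.replace("'", "''")  # escape single quotes for the SQL literal
    return (
        f"SELECT * FROM read_parquet('{escaped}') "
        f'ORDER BY "sum" DESC LIMIT {int(limit)}'
    )


def top_rows(source: str, limit: int = 10) -> list:
    """Run the query and return the rows as a list of tuples."""
    import duckdb  # imported here so the SQL helper works without duckdb installed

    return duckdb.sql(top_rows_sql(source, limit)).fetchall()


# Usage (not executed here):
#   top_rows("low_coverage.parquet")                      # downloaded local copy
#   top_rows("https://example.org/low_coverage.parquet")  # hypothetical URL
```

Only the `source` string differs between the two calls, which mirrors the text: no server-side query service is involved, and the client does the work in both cases.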
