lincc-frameworks/scipy-2025-lsdb-nested

SciPy 2025 Tutorial: Large Astronomical Survey Analysis with LSDB & Nested

This repository contains all materials for the LSDB tutorial [TODO: LINK] prepared for the SciPy 2025 conference in Tacoma, WA.

Main references

Abstract

The exponential growth of large survey catalogs has introduced new challenges for analyzing astronomical datasets. This session showcases LSDB, an analysis framework built on hierarchically sharded, spatially partitioned Parquet data for efficient cross-matching and analysis. We also demonstrate nested-pandas and nested-dask for time-domain and spectral data, and highlight real-world applications across wide-sky datasets, with a nod toward the upcoming Rubin Survey.

Installation

>> conda create --name lincc python=3.12
>> conda activate lincc
>> pip install lsdb
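
After installing, it can help to confirm that the packages are visible from the new environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical, and checking `nested-pandas` assumes it was pulled in alongside `lsdb` (install it separately if it is reported missing).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Package names are the PyPI distribution names.
for name in ("lsdb", "nested-pandas"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```

Run this with the `lincc` environment active; a "not installed" line usually means the wrong environment is active.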

Hands-On Notebooks

Notebook 1: Basic LSDB Queries

This notebook steps through the basics of the LSDB interface. We query large catalogs for a small chunk of data, and perform some basic filtering and cross-matching between multiple surveys.
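
To make the two core operations concrete, here is a small plain-Python sketch of what a cone search and a nearest-neighbor cross-match compute. It is not LSDB's API: the function names, the toy catalogs, and the brute-force loops are all assumptions chosen for illustration, whereas LSDB runs the equivalent operations lazily over spatially partitioned catalogs.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds; inputs are in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for very small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600


def cone_search(catalog, ra, dec, radius_arcsec):
    """Keep the rows that fall within radius_arcsec of (ra, dec)."""
    return [row for row in catalog
            if angular_separation_arcsec(row["ra"], row["dec"], ra, dec) <= radius_arcsec]


def crossmatch(left, right, radius_arcsec=1.0):
    """For each left row, find the nearest right row within the radius."""
    matches = []
    for l_row in left:
        best, best_sep = None, None
        for r_row in right:
            sep = angular_separation_arcsec(l_row["ra"], l_row["dec"],
                                            r_row["ra"], r_row["dec"])
            if best_sep is None or sep < best_sep:
                best, best_sep = r_row, sep
        if best is not None and best_sep <= radius_arcsec:
            matches.append((l_row["id"], best["id"], best_sep))
    return matches


# Toy catalogs (made-up positions, in degrees).
survey_a = [
    {"id": 1, "ra": 10.0, "dec": 20.0},
    {"id": 2, "ra": 10.5, "dec": 20.5},
    {"id": 3, "ra": 50.0, "dec": -30.0},
]
survey_b = [
    {"id": "x", "ra": 10.0001, "dec": 20.0},
    {"id": "y", "ra": 10.5, "dec": 20.5003},
]

nearby = cone_search(survey_a, ra=10.25, dec=20.25, radius_arcsec=3600)
matches = crossmatch(survey_a, survey_b, radius_arcsec=1.0)
print([row["id"] for row in nearby])
print(matches)
```

The brute-force loop compares every pair of rows, which is only workable for toy inputs. Spatial partitioning exists to avoid exactly that all-pairs comparison at survey scale.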

In this notebook, we ramp up the scale to analyze a considerably larger section of the sky. We show how to use Dask for large-scale analysis, covering the available tooling and some tips and tricks for optimizing computationally intensive workflows.

This notebook explores the Nested-Pandas API, showing the basics of nesting data and touring the various ways of working with nested data.
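
As a rough picture of what "nesting" means, the sketch below packs a flat table of per-observation rows into a per-object column of lists, then filters and reduces that nested column. It uses plain dictionaries rather than the Nested-Pandas API; the names `nest`, `filter_nested`, and `reduce_nested`, and the toy data, are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def nest(objects, sources, key="object_id", name="lightcurve"):
    """Attach each object's source rows as a nested list under `name`."""
    grouped = defaultdict(list)
    for src in sources:
        # Drop the join key from the nested rows; it lives on the object.
        grouped[src[key]].append({k: v for k, v in src.items() if k != key})
    return [{**obj, name: grouped.get(obj[key], [])} for obj in objects]


def filter_nested(nested, name, predicate):
    """Filter rows inside the nested column while keeping every object."""
    return [{**obj, name: [row for row in obj[name] if predicate(row)]}
            for obj in nested]


def reduce_nested(nested, name, column, func, key="object_id"):
    """Apply func to one nested column per object; None when it is empty."""
    return {obj[key]: (func([row[column] for row in obj[name]]) if obj[name] else None)
            for obj in nested}


# Toy data: two objects and their individual detections.
objects = [
    {"object_id": 1, "ra": 10.0, "dec": 20.0},
    {"object_id": 2, "ra": 10.5, "dec": 20.5},
]
sources = [
    {"object_id": 1, "mjd": 60000.1, "band": "g", "flux": 12.0},
    {"object_id": 1, "mjd": 60001.1, "band": "r", "flux": 15.0},
    {"object_id": 1, "mjd": 60002.1, "band": "g", "flux": 14.0},
    {"object_id": 2, "mjd": 60000.2, "band": "g", "flux": 3.0},
]

nested = nest(objects, sources)
g_only = filter_nested(nested, "lightcurve", lambda row: row["band"] == "g")
mean_g_flux = reduce_nested(g_only, "lightcurve", "flux", mean)
print(mean_g_flux)
```

Keeping one row per object with its observations attached avoids repeated group-by operations on a long flat table, which is the motivation for a nested layout.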

This notebook showcases the use of LSDB and Nested for large-scale time-domain analysis. We build a dataset from multiple input surveys and select a subset of interesting objects from large sections of the sky. We compute periodograms (or any other function of interest to the user!) on our objects of interest and conclude by working with their spectra.
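
For readers new to periodograms, the following is a minimal pure-Python sketch of the classical Lomb-Scargle power for an unevenly sampled light curve. It is an illustration under stated assumptions, not the implementation used in the notebook: the light curve is synthetic and noise-free, the frequency grid is arbitrary, and the function names are hypothetical. For real data, an established library implementation is usually the better choice.

```python
import math


def lomb_scargle_power(times, values, frequency):
    """Unnormalized classical Lomb-Scargle power at one frequency.

    `frequency` is in cycles per unit of `times` (cycles/day for MJD).
    """
    mean = sum(values) / len(values)
    y = [v - mean for v in values]
    omega = 2 * math.pi * frequency
    # The time offset tau makes the sine and cosine terms orthogonal.
    s2 = sum(math.sin(2 * omega * t) for t in times)
    c2 = sum(math.cos(2 * omega * t) for t in times)
    tau = math.atan2(s2, c2) / (2 * omega)
    cos_terms = [math.cos(omega * (t - tau)) for t in times]
    sin_terms = [math.sin(omega * (t - tau)) for t in times]
    yc = sum(v * c for v, c in zip(y, cos_terms))
    ys = sum(v * s for v, s in zip(y, sin_terms))
    cc = sum(c * c for c in cos_terms)
    ss = sum(s * s for s in sin_terms)
    return 0.5 * (yc * yc / cc + ys * ys / ss)


def best_frequency(times, values, frequencies):
    """Return the trial frequency with the highest power."""
    return max(frequencies, key=lambda f: lomb_scargle_power(times, values, f))


# Synthetic, unevenly sampled light curve with a 2-day period (assumed values).
true_frequency = 0.5  # cycles per day
times = [0.37 * i + 0.05 * ((7 * i) % 5) for i in range(60)]
fluxes = [10.0 + 2.0 * math.sin(2 * math.pi * true_frequency * t) for t in times]

# Trial frequencies from 0.05 to 2.0 cycles per day.
frequencies = [k / 20 for k in range(1, 41)]
peak = best_frequency(times, fluxes, frequencies)
print(f"peak frequency: {peak} cycles/day, period: {1 / peak} days")
```

A function like `best_frequency` takes one object's times and fluxes and returns a single number. That is the general shape of a per-object function that can be applied across many light curves.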

LINCC Tech Talks

Watch the following LINCC Tech Talk to learn more about LSDB. Other relevant talks can be found on the LSST Discovery Alliance website.

Acknowledgements

This project is supported by Schmidt Sciences.

This project is based upon work supported by the National Science Foundation under Grant No. AST-2003196.

This project acknowledges support from the DIRAC Institute in the Department of Astronomy at the University of Washington. The DIRAC Institute is supported through generous gifts from the Charles and Lisa Simonyi Fund for Arts and Sciences, and the Washington Research Foundation.

About

Proposal for Scipy 2025 tutorial: Large Scale Database (LSDB) + NESTED project
