Streaming LAS files

@tmontaigu

Moving us over here for now since this is not lazperf-specific.

> I don't know much about LAX files and what they allow us to do in terms of queries,
but if what we can get from the LAX file queries is a bunch of point indexes then we might be able to get something with acceptable performance in pure Python for LAS files.

So far my solution is to take advantage of the memory mapping that laspy provides. Here is a little example. This is not how the package is structured yet, but it is where I think I am headed.

Something I will be able to get from `laxpy` is the set of point indices that fall in a given quadtree cell. We provide the cell index as an argument:

```{python}
import laxpy

my_cell_index = 512  # index of the quadtree cell to query
my_point_indices = laxpy.query(my_cell_index)
```
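For illustration, here is a rough sketch of what such a query might do internally. It assumes the parsed index boils down to a mapping from cell index to a list of inclusive `(start, end)` point-index intervals. The `cell_intervals` dictionary and this standalone `query` function are hypothetical stand-ins, not the actual `laxpy` layout.

```python
import numpy as np

# Hypothetical in-memory form of a parsed .lax index (an assumption for
# illustration): cell index -> list of inclusive (start, end) point intervals.
cell_intervals = {
    512: [(0, 3), (10, 12)],
    513: [(4, 9)],
}


def query(cell_index, intervals=cell_intervals):
    """Return an array of the point indices stored for one cell."""
    spans = intervals.get(cell_index, [])
    if not spans:
        # Unknown or empty cell: return an empty index array.
        return np.empty(0, dtype=np.int64)
    # Expand each inclusive interval into its individual point indices.
    return np.concatenate(
        [np.arange(start, end + 1, dtype=np.int64) for start, end in spans]
    )


my_point_indices = query(512)
```

Expanding intervals this way keeps the parsing cost proportional to the number of intervals, which is part of why a small `.lax` file is cheap to handle in pure Python.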

`.lax` files are rather small relative to their "parent" `.las` files, so parsing these in Python is no problem so far, although I have only used fairly small `.las` files.

`query` returns a NumPy array of all of the point indices associated with that quadtree cell (not all of those points fall entirely within the cell; see [this video](https://www.youtube.com/watch?v=FMcBywhPgdg) for an explanation). I can then just instantiate the `laspy` memory map and get the points. This seems to be reasonably efficient:

```{python}
import laspy.file

my_las = laspy.file.File('my_las_file.las', mode='r')  # open read-only
my_points = my_las.points[my_point_indices]  # fetch only the queried points
```
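To show why indexing into a memory map is cheap, here is a self-contained sketch that uses `numpy.memmap` on a synthetic binary file rather than a real `.las` file. The `point_dtype` record layout is a simplified assumption (real LAS point formats carry more fields), and the file name is made up for the example.

```python
import os
import tempfile

import numpy as np

# Hypothetical fixed-size point record; real LAS point formats have more fields.
point_dtype = np.dtype(
    [("X", "<i4"), ("Y", "<i4"), ("Z", "<i4"), ("intensity", "<u2")]
)

# Write 1000 synthetic records to a temporary binary file.
n_points = 1000
records = np.zeros(n_points, dtype=point_dtype)
records["X"] = np.arange(n_points)
path = os.path.join(tempfile.mkdtemp(), "points.bin")
records.tofile(path)

# Map the file instead of reading it; pages are loaded only when touched.
mapped = np.memmap(path, dtype=point_dtype, mode="r")

# Fancy indexing copies just the requested records into memory.
my_point_indices = np.array([3, 500, 999])
subset = mapped[my_point_indices]
```

Only the pages that hold the requested records need to be read from disk, so the cost scales with the size of the query rather than the size of the file.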

**So What is the Problem?**

This is all well and good, but we have to rely on `lasindex` itself to construct the `.lax` file. What would be nice is to do this ourselves via Python or Cython or something, so that it is a "complete package". If you refer to the video I linked above, Martin streams the points in one by one to construct the tree. As you have pointed out, doing this in Python itself is incredibly slow. For reference, `lasindex` can do this for a ~600 MB `.las` tile in about 5-10 seconds. With the Python version, I got tired of waiting!

The other solution is to just throw up our hands and load the entire point cloud into memory, construct the tree, and write it to file. But this is silly, I think. The whole motivation behind creating the index in the first place is to query large `.las` files.
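One possible middle ground between the two options above, sketched only as an illustration, is to read the points in fixed-size chunks and assign cells with vectorised NumPy operations instead of a per-point Python loop. This sketch uses a flat regular grid as a stand-in for one quadtree level; it is not how `lasindex` builds its tree, and `cell_ids`, `build_index`, and all of their parameters are hypothetical names.

```python
import numpy as np


def cell_ids(x, y, min_x, min_y, cell_size, n_cols):
    """Vectorised cell lookup on a regular grid (a stand-in for one quadtree level)."""
    col = ((x - min_x) // cell_size).astype(np.int64)
    row = ((y - min_y) // cell_size).astype(np.int64)
    return row * n_cols + col


def build_index(x, y, min_x, min_y, cell_size, n_cols, chunk_size=100_000):
    """Group point indices by cell, touching only one chunk of points at a time."""
    index = {}
    for start in range(0, len(x), chunk_size):
        stop = min(start + chunk_size, len(x))
        ids = cell_ids(x[start:stop], y[start:stop], min_x, min_y, cell_size, n_cols)
        # Sort the chunk by cell id, then split it into one group per cell.
        order = np.argsort(ids, kind="stable")
        unique_ids, first = np.unique(ids[order], return_index=True)
        for cell, group in zip(unique_ids, np.split(order, first[1:])):
            # Shift chunk-local positions back to global point indices.
            index.setdefault(int(cell), []).append(group + start)
    return {cell: np.concatenate(parts) for cell, parts in index.items()}


# Synthetic coordinates; with a real file these could be memory-mapped arrays.
rng = np.random.default_rng(0)
x = rng.uniform(0, 100, 10_000)
y = rng.uniform(0, 100, 10_000)
index = build_index(x, y, min_x=0.0, min_y=0.0, cell_size=25.0, n_cols=4, chunk_size=1_000)
```

Peak memory for the cell assignment is bounded by `chunk_size`, although the resulting index itself still grows with the number of points unless it is compressed into intervals.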

Any ideas you might have would be great.
