DataFrame extraction from Geometry and friends #220

Closed
zerothi opened this issue May 15, 2020 · 6 comments

@zerothi
Owner

zerothi commented May 15, 2020

I don't know if this belongs in this discussion, but since I already brought it up, I'll comment on it here. It's regarding this:

By the way, this kind of functionality is very similar to what a pandas dataframe would do, isn't it?

This is a wild idea, I know, but maybe it could be useful to generate dataframes containing some attributes of the geometry:

df = geom.to_df(cols=["Z", "neighs", "angles", "force"])

Geometry would know how to build these dataframes because it knows what the attributes mean. Then the user would have all the power of dataframes to do not only complex filters but also groupby, describe, etc...
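To make the "power of dataframes" point concrete, here is a minimal sketch with a hand-built table. The column names and values are made up for illustration; in the proposal they would come from something like `geom.to_df(...)`, which does not exist yet at this point in the thread.

```python
import pandas as pd

# Hypothetical per-atom table standing in for the output of the proposed to_df
df = pd.DataFrame({
    "Z": [6, 6, 7, 5, 6],
    "neighs": [3, 2, 3, 3, 2],
    "z": [0.0, 0.0, 1.5, 1.5, 3.0],
})

# Complex filter: carbon atoms with fewer than three neighbours
edge_carbons = df[(df["Z"] == 6) & (df["neighs"] < 3)]

# groupby: mean neighbour count per atomic number
mean_neighs = df.groupby("Z")["neighs"].mean()

# describe: summary statistics for every numeric column
summary = df.describe()
```

None of this needs any knowledge of sisl once the dataframe exists, which is the appeal of the idea.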

I just thought that, apart from filtering, this feature in sisl objects could be extremely useful for visualization. Here's why: for the visualization module I am using plotly, which has a high-level API under plotly.express (https://plotly.com/python/plotly-express/). Plotly express implements plots like scatter, line, histogram, polar, maps and more. Basically, given a pandas DataFrame, you define your plot by naming columns of that dataframe. An example of this:

import plotly.express as px

px.scatter_3d(df, x="The x column", y="The y column", z="The z column",
              color="Column that defines color", animation_frame="The column that defines the frames",
              symbol=...)  # etc.

So, if sisl objects had a to_df method as proposed, it would be trivial to implement a "sisl.express" in the visualization module that would just parse the object into a dataframe before passing it to any plotly.express method. The possibilities would be endless with very simple code:

import sisl
import sisl.viz.express as sx

geom = sisl.geom.graphene()  # any Geometry
sx.scatter(geom, x="z", y="neighs", color="species")
# or
sx.histogram(geom, x="z", color="species",
             marginal="violin")  # ...or other very useful kwargs of plotly express

And really all that would be happening would be that this line:

sx.*(sisl_obj, x="z", y="neighs", color="species")

is converted into this other line:

px.*(sisl_obj.to_df(cols=["z", "neighs", "species"]), x="z", y="neighs", color="species")
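One way the `cols` list in that second line could be derived is from the plot keywords themselves. The sketch below is only an illustration under assumptions: `COLUMN_KEYWORDS` is a partial list chosen by hand, `forward` and `columns_from_kwargs` are hypothetical helpers, and `FakeGeometry` and `fake_scatter` are stand-ins so that it runs without sisl or plotly.

```python
# Keywords whose string values name dataframe columns (an assumed, partial list)
COLUMN_KEYWORDS = ("x", "y", "z", "color", "symbol", "size",
                   "animation_frame", "facet_col")


def columns_from_kwargs(kwargs):
    """Collect the column names the plot keywords refer to, without duplicates."""
    cols = []
    for key in COLUMN_KEYWORDS:
        value = kwargs.get(key)
        if isinstance(value, str) and value not in cols:
            cols.append(value)
    return cols


def forward(plot_func, sisl_obj, **kwargs):
    """Turn sx.<plot>(obj, ...) into px.<plot>(obj.to_df(cols=[...]), ...)."""
    df = sisl_obj.to_df(cols=columns_from_kwargs(kwargs))
    return plot_func(df, **kwargs)


class FakeGeometry:
    """Hypothetical stand-in for a sisl object with the proposed to_df method."""

    def to_df(self, cols):
        self.requested = list(cols)  # remember what was asked for
        return {col: [] for col in cols}


def fake_scatter(df, **kwargs):
    """Stand-in for a plotly.express function; it just echoes its input."""
    return df, kwargs


geom = FakeGeometry()
df, kwargs = forward(fake_scatter, geom, x="z", y="neighs", color="species")
```

Here `geom.requested` ends up holding only the three columns named in the call, so the object never has to compute attributes the plot does not use.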

I don't know, seems pretty exciting to me :)

Originally posted by @pfebrer96 in #218 (comment)

@pfebrer
Contributor

pfebrer commented May 15, 2020

My part of the deal is ready for when this is implemented :)

It's this simple:

from functools import wraps

import plotly.express as px
from sisl._dispatcher import AbstractDispatch


class WithSislManagement(AbstractDispatch):
    """Wrap plotly.express so that its functions accept sisl objects."""

    def __init__(self, px):
        # The object whose methods get wrapped (here the plotly.express module)
        self._obj = px

    def dispatch(self, method):

        @wraps(method)
        def with_dataframe(*args, **kwargs):
            if len(args) > 0:
                # The first arg is the sisl object
                args = list(args)
                # In fact, we should generate only the corresponding df
                args[0] = args[0].to_df()

            return method(*args, **kwargs)

        return with_dataframe


sx = WithSislManagement(px)
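For readers who would rather not import from the private `sisl._dispatcher` module, the same idea can be sketched with a plain `__getattr__` proxy. This is an alternative illustration, not what sisl does: `FirstArgToDataFrame` is a hypothetical name, the `hasattr` check is an added assumption so that real dataframes pass through untouched, and `fake_px` and `fake_geom` are stand-ins so the block runs without plotly or sisl.

```python
from functools import wraps
from types import SimpleNamespace


class FirstArgToDataFrame:
    """Proxy that converts the first positional argument with to_df() when possible."""

    def __init__(self, namespace):
        self._namespace = namespace

    def __getattr__(self, name):
        # Only called for names that are not found on the proxy itself
        attr = getattr(self._namespace, name)
        if not callable(attr):
            return attr

        @wraps(attr)
        def wrapper(first, *args, **kwargs):
            # Objects without to_df (e.g. an actual DataFrame) pass through unchanged
            if hasattr(first, "to_df"):
                first = first.to_df()
            return attr(first, *args, **kwargs)

        return wrapper


# Stand-ins for plotly.express and a sisl geometry (both hypothetical)
fake_px = SimpleNamespace(scatter=lambda data, **kwargs: (data, kwargs), version="0.0")
fake_geom = SimpleNamespace(to_df=lambda: {"x": [0.0, 1.4]})

sx_sketch = FirstArgToDataFrame(fake_px)
data, kwargs = sx_sketch.scatter(fake_geom, x="x")
```

Swapping `fake_px` for the real `plotly.express` module should give the same behaviour as the dispatcher version, with the difference that plain dataframes are also accepted.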

And this powerful (you can try it):

import sisl
import pandas as pd

geom = sisl.geom.graphene_nanoribbon(20, atom=("N", "B"))

# Fake implementation of to_df
geom.to_df = lambda: pd.DataFrame({"x": geom.xyz[:,0], "y": geom.xyz[:,1], "z": geom.xyz[:,2], 
                                   "Z": geom.atom.Z, "tag": [atom.tag for atom in geom.atom]})

And the magic begins:

sx.histogram(geom, x="x", color="tag", nbins=20, marginal="violin")

image

sx.scatter(geom, x="x", y="y", size="y", color="tag", facet_col="tag")

image

fig = sx.scatter_3d(geom, x="x", y="y", z="z", color="tag")
# Set aspect-ratio to 1:1:1
fig.update_layout(scene={"aspectmode": "data"})

image

Other cool keywords that you can play with: animation_frame (it will automatically build an animation).

I hope that this visual demo can convince you of the capabilities and get you excited about it :)

@pfebrer
Contributor

pfebrer commented May 15, 2020

And by the way, this would also be supported by the GUI.

@zerothi
Owner Author

zerothi commented Jun 5, 2020

I think we should rename this function to to_dataframe, for clarity. :)

@pfebrer
Contributor

pfebrer commented Jun 5, 2020

Yes, sure 👍

@zerothi
Owner Author

zerothi commented Aug 26, 2021

@pfebrer isn't this already implemented in Geometry.to.dataframe? Is there something missing?

Once #196 is in the rest may be easier to add.

@pfebrer
Contributor

pfebrer commented Aug 26, 2021

Yes this is solved for now 👍

zerothi closed this as completed Aug 26, 2021