README Typo and Failure Running retrieve_open_map_data.py #11
@SpaceCowboy326 not sure I can assign these, so pinging you for advice.
@allella just committed some changes which should resolve both issues. I've never really run a public repository before; if there's anything I can do to give you some control over that I'd be happy to. I'm also happy to continue fixing issues as they come up.
Thanks. The access needed will depend on whether you want to continue to maintain this repo or have CFG fork it and continue any future development on the fork. There's nothing stopping folks from forking and rolling their own, so if CFG needs to tweak it then a fork may make more sense.
Now I'm getting the error below. Do we need to run the script inside of a Python environment? If so, is an environment.yml necessary to specify the external packages?
@allella yeah, I totally forgot to add that in the README. The script relies on two Python packages: requests and bs4 (the Beautiful Soup HTML parser). I haven't worked with Python much before, but I've been using pip to install packages. It looks like environment.yml is associated with conda; would you recommend I look into that for automating the process? I could probably rewrite it in JS pretty quickly if that makes things easier as well. I really only wrote it in Python for my own edification.
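For reference, a minimal conda environment.yml covering the two packages named above might look like the sketch below (the environment name is hypothetical, and a plain pip requirements.txt listing requests and beautifulsoup4 would work just as well):

```yaml
# Hypothetical environment file; only the two packages mentioned in
# this thread (requests and Beautiful Soup) are actually required.
name: open-map-data
dependencies:
  - python=3
  - requests
  - beautifulsoup4
```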
I wouldn't worry about it. I think what we can/should do is just cache the JSON data on the server and create additional views that allow people to get lists without having to spider the site. If I expose a list of .geojson links, would it be easy to have the script just reference those files? If they are cached/saved as flat files then the serving should be fast. Eventually I'd like to put Cloudflare in front of the site too, so the caching happens at a more robust level.
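The flat-file approach described above could be sketched roughly like this (a minimal sketch, not the actual script: the URL list, the cache directory, and the filename scheme are all assumptions):

```python
import os
import urllib.request


def filename_for(url):
    """Derive a flat-file name from a .geojson URL (query string dropped)."""
    return os.path.basename(url.split("?")[0]) or "layer.geojson"


def cache_geojson(urls, cache_dir="data/geojson"):
    """Download each .geojson URL and save it as a flat file,
    skipping any layer that has already been cached."""
    os.makedirs(cache_dir, exist_ok=True)
    for url in urls:
        path = os.path.join(cache_dir, filename_for(url))
        if not os.path.exists(path):
            with urllib.request.urlopen(url) as resp, open(path, "wb") as f:
                f.write(resp.read())
```

Serving the saved files directly then avoids re-spidering the site on every request.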
Totally agree regarding caching; wouldn't want this script running every time someone hits the site. I played around with using https://data.openupstate.org/rest/maps?_format=json to retrieve the data, but I think I ran into a CORS issue. I can set it up to pull from whatever works best for you. Ideally, I would love to have it be able to make a single request and get all the layers back at once.
@SpaceCowboy326 I've set the CORS to allow all origins. Will this work now?
@allella that did the trick. I will change the script to pull from that URL soon. I can pretty easily pull the data right from the browser now, but I'll probably want to add some way to cache the data before I start doing that.
I'm not sure it's worth caching on that end, because the API is already caching on the backend and the browser should cache the response for up to 1 day due to the HTTP header. I'm sure there are some advantages to what you're suggesting, but they may depend on me making the Drupal API caching strategy more flexible. I'm likely to put this domain behind Cloudflare, or similar, before long, and at that point it would be smart to allow the browser HTTP headers to deal with any data that's only used by/in the browser. That make sense?
I think that makes sense, but it should actually be pretty easy for me to detect changes and reduce the number of calls I need to make overall while keeping my data in sync with the API. Each of the map layers from https://data.openupstate.org/rest/maps?_format=json looks like it has an "updated" property, so I can avoid making calls to all of the geojson URLs if mine match. Does that seem worthwhile?
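The "updated"-based sync could be sketched like this (a minimal sketch under assumptions: the "id" and "updated" field names are read off the thread, not confirmed against the actual API response, and re-fetching the stale geojson is left to the caller):

```python
import json
import urllib.request

MAPS_URL = "https://data.openupstate.org/rest/maps?_format=json"


def fetch_layers(url=MAPS_URL):
    """Fetch the layer list from the maps endpoint."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def stale_layers(remote_layers, cached_updated):
    """Return only the layers whose 'updated' value differs from the
    one we last saw, so only their geojson URLs need re-fetching.

    cached_updated maps a layer's id to the 'updated' value stored the
    last time that layer's geojson was pulled."""
    return [
        layer for layer in remote_layers
        if cached_updated.get(layer["id"]) != layer["updated"]
    ]
```

One cheap call to the maps endpoint then decides which (if any) of the per-layer geojson requests are actually necessary.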
I haven't looked into the code to know if/when you'd want to cache things on the server running this application. If we're talking about your application needing access to the JSON for its own use, like if you wanted to cache the list of JSON URLs, then that's probably fine; the HTTP headers do nothing if you're using curl, wget, or some Python fetching method to get a URL on every initial load. However, if we're talking about the web browser making calls to get JSON, then I'd pull that directly from the data.openupstate.org URLs, since the browser will cache these responses and we'll continue to add more performance improvements to the API server as more applications come to depend on it.
Two issues
python retrieve_open_map_data.py > data/map_data.json
generates the error below.