[doc,#91,#123][l]: lots of new docs on publishing data packages.
rufuspollock committed Aug 17, 2014
1 parent ea130c5 commit 1732162
Showing 8 changed files with 221 additions and 31 deletions.
2 changes: 1 addition & 1 deletion app.js
@@ -46,7 +46,7 @@ app.get('/about/contribute', function(req, res) {
res.redirect('/contribute');
});
app.get('/contribute', routes.contribute);
-app.get('/publish', routes.publish);
+app.get('/publish', redirect('/doc/publish'));
app.get('/roadmap', routes.roadmap);
app.get('/roadmap/core-datasets', function(req, res) {
res.render('core-datasets.html', {title: 'Core Datasets'});
6 changes: 5 additions & 1 deletion doc/index.md
@@ -4,11 +4,15 @@

Tutorials on how to publish data as data packages.

* [Publish Data as Data Packages - Introduction][intro]
* [Publish Tabular Data][tabular]
* [Publish Geospatial Data (Geodata)][geodata]
* [Publish Any Kind of Data][any]

[intro]: /doc/publish
[tabular]: /doc/publish-tabular
-[geodata]: /doc/publish-geodata
+[geodata]: /doc/publish-geo
[any]: /doc/publish-any

## Guides to Standards

44 changes: 44 additions & 0 deletions doc/publish-any.md
@@ -0,0 +1,44 @@
# Publish Any Kind of Data

You can publish **any kind of data** as a Data Package. It's as simple as 1-2-3:

1. Get your data together
2. Add a `datapackage.json` file to wrap those data files up into a useful
whole (with key information like the license and title)
3. [optional] Share it with others, for example, by uploading the data package online

## 1. Get your data together

Get your data together in one folder (you can have data in subfolders of that
folder too if you wish).

## 2. Add a datapackage.json file

The `datapackage.json` is a small file in [JSON][] format that gives a bit of
information about your dataset. You'll need to create this file and then place
it in the directory you created.

<div class="alert">
Don't worry if you don't know what JSON is - we provide some tools that can
automatically create this file for you.
</div>

There are 2 options for creating the `datapackage.json`:

Option 1: Use the online [datapackage.json creator tool][creator] - just answer
a few questions and give it your data files, and it will generate a
`datapackage.json` for you to include in your project.

Option 2: Do it yourself - if you're familiar with JSON you can just create
this yourself. Take a look at the [Data Package][dp] tutorial.

[creator]: http://data.okfn.org/tools/create
[JSON]: http://en.wikipedia.org/wiki/JSON
[dp]: http://data.okfn.org/doc/data-package
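
If you take the do-it-yourself route, the sketch below shows roughly what a small `datapackage.json` could contain and one way to generate it with a script. It is an illustration only: `buildDescriptor`, the dataset name, the license identifier and the file paths are all made up for this example, and the exact metadata fields (the shape of the license entry in particular) should be checked against the Data Package tutorial linked above.

```javascript
// Hypothetical helper: builds a minimal descriptor object for a data package.
// Field names follow the commonly documented Data Package layout; treat them
// as assumptions and check them against the specification before relying on them.
function buildDescriptor(name, title, licenseName, filePaths) {
  return {
    name: name,                         // short, URL-friendly identifier
    title: title,                       // human-readable title
    licenses: [{ name: licenseName }],  // license information (shape is an assumption)
    resources: filePaths.map(function (path) {
      // Paths are relative to the folder that holds datapackage.json
      return { path: path };
    })
  };
}

// Example values only; replace them with your own dataset details.
const descriptor = buildDescriptor(
  'my-dataset',
  'My Dataset',
  'ODC-PDDL-1.0',
  ['data.csv', 'somedir/other-data.csv']
);

// Serialize with two-space indentation so the file stays readable.
const json = JSON.stringify(descriptor, null, 2);
console.log(json);
```

Save the printed text as `datapackage.json` in the top-level folder of your data package.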

## 3. Put the data package online

See the [step-by-step instructions for putting your Data Package online][online].

[online]: /doc/publish-online/

48 changes: 48 additions & 0 deletions doc/publish-geo.md
@@ -0,0 +1,48 @@
# Publishing Geospatial Data as Data Packages

Publishing your Geodata as Data Packages is very easy.

You have two options for publishing your geodata:

* **Geo Data Package** (Recommended). This is a basic Data Package with the
requirement that data be in GeoJSON and with a few special additions to the
metadata for geodata. See the next section for instructions on how to do
this.
* **Generic Data Package**. This allows you to publish geodata in any kind of
format (KML, Shapefiles, Spatialite etc). If you choose this option you will
want to follow the standard [instructions for packaging any kind of data as a
Data Package][any].

We recommend Geo Data Package if that is possible as it makes it much easier
for you to use 3rd party tools with your Data Package. For example, the [data
package viewer][viewer] on this site will automatically preview a Geo Data Package.

> *Note: this document focuses on **vector** geodata &ndash; i.e. points, lines, polygons etc.
> (not imagery or raster data).*

[any]: /doc/publish-any/
[viewer]: /tools/view
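
For readers new to GeoJSON, here is a minimal sketch of the kind of vector data a Geo Data Package holds: a `FeatureCollection` containing a single `Point` feature. The coordinates and property values are invented for illustration, and `isPointFeature` is a hypothetical helper rather than part of any Data Package tool.

```javascript
// Illustrative GeoJSON FeatureCollection with one Point feature.
// Coordinates are [longitude, latitude]; the values here are made up.
const featureCollection = {
  type: 'FeatureCollection',
  features: [
    {
      type: 'Feature',
      geometry: { type: 'Point', coordinates: [3.5, 51.0] },
      properties: { name: 'Example sign' }
    }
  ]
};

// Hypothetical helper: checks that a feature is a Point with at least
// a longitude and a latitude.
function isPointFeature(feature) {
  return feature.type === 'Feature' &&
    Boolean(feature.geometry) &&
    feature.geometry.type === 'Point' &&
    Array.isArray(feature.geometry.coordinates) &&
    feature.geometry.coordinates.length >= 2;
}

const allPoints = featureCollection.features.every(isPointFeature);
console.log(JSON.stringify(featureCollection, null, 2));
console.log('All features are points:', allPoints);
```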

## Geo Data Packages

### Examples

#### [Exemplar Geo Data Package](https://github.com/datasets/ex-geojson)

Demonstrates `multipolygon` and `point` geometry

[View it with the Data Package Viewer][view-1]

[view-1]: http://data.okfn.org/tools/view?url=https%3A%2F%2Fgithub.com%2Fdatasets%2Fex-geojson

<script src="http://gist-it.appspot.com/github/datasets/ex-geojson/blob/master/datapackage.json"></script>

#### [Traffic signs of Hansbeke, Belgium](https://github.com/peterdesmet/traffic-signs-hansbeke)

An example of using `point` geometries with described properties in a real-world situation.

[View it with the Data Package Viewer][view-2]

[view-2]: http://data.okfn.org/tools/view?url=https%3A%2F%2Fgithub.com%2Fpeterdesmet%2Ftraffic-signs-hansbeke

<script src="http://gist-it.appspot.com/github/peterdesmet/traffic-signs-hansbeke/blob/master/datapackage.json"></script>
28 changes: 0 additions & 28 deletions doc/publish-geodata.md

This file was deleted.

66 changes: 66 additions & 0 deletions doc/publish-online.md
@@ -0,0 +1,66 @@
# Put it Online

This tutorial is about how to publish your Data Package online for others to
find and use.

It assumes you have already finished packaging up your data as a Data Package
(if not, [check out the instructions here][publish]).

[publish]: /doc/publish/

## It's Just Files Online

Publishing your Data Package is incredibly simple: you just need to post it
online somewhere that others can access.

> Note: if you just want to share your Data Package with a few others you
> can just send it directly, for example via email.

Since a Data Package is just some files, there are as many ways to do
this as there are ways to put files online.

Here we will just provide some general tips and illustrate some of the most
popular publishing options.

## Key Tips

However you publish your Data Package there are a few key points to keep in
mind:

* All the files in the Data Package should be accessible online
* The structure of your Data Package should be preserved. Specifically, the
  paths between your `datapackage.json` and the data files must be preserved.
  For example, if your Data Package directory looked like this on disk:

      datapackage.json
      data.csv
      somedir/other-data.csv

  then online it should look like:

      http://your.website.com/mydatapackage/datapackage.json
      http://your.website.com/mydatapackage/data.csv
      http://your.website.com/mydatapackage/somedir/other-data.csv

This can be a problem with services such as Google Drive, where files in a
given folder don't have a web address that relates to that folder. Relative
paths must be preserved because client software using the Data Package
computes the full location of each file from the location of the
`datapackage.json` itself plus the relative path given for that file in the
`resources` section of the `datapackage.json`.
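
The path computation described above can be sketched in a few lines of JavaScript. This is an illustration of the idea, not the code of any particular client: `resolveResourceUrls` is a hypothetical helper, and the descriptor is a cut-down example matching the directory layout shown earlier.

```javascript
// Hypothetical helper: given the URL of a datapackage.json and its parsed
// contents, compute the full URL of every resource listed in it.
function resolveResourceUrls(datapackageUrl, descriptor) {
  return (descriptor.resources || []).map(function (resource) {
    // The standard URL constructor resolves a relative path against a base URL.
    return new URL(resource.path, datapackageUrl).href;
  });
}

// Cut-down example descriptor with two relative resource paths.
const descriptor = {
  name: 'mydatapackage',
  resources: [
    { path: 'data.csv' },
    { path: 'somedir/other-data.csv' }
  ]
};

const urls = resolveResourceUrls(
  'http://your.website.com/mydatapackage/datapackage.json',
  descriptor
);
console.log(urls);
```

If the hosting service does not keep `data.csv` and `somedir/other-data.csv` at those locations relative to the `datapackage.json`, the computed URLs will not point at the files.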

## GitHub, Bitbucket etc

One nice option for more experienced users is to manage your Data Package in a
Git or Mercurial repository and push it to GitHub, Gitorious, Bitbucket or similar.

## S3, Google Storage etc

Cloud storage services like S3 and Google Storage are well suited to storing
your Data Packages.

## Dropbox, Google Drive etc

Dropbox and similar providers are a little more problematic because they do not
replicate your local file structure at online URLs.

2 changes: 1 addition & 1 deletion doc/publish-tabular.md
@@ -1,6 +1,6 @@
## Publishing Tabular Data

-Here's how to publish your tabular data as [Simple Data Format Data
+Here's how to publish your tabular data as [Tabular Data
Packages][sdf]. There are 4 simple steps:

1. Create a folder (directory) - this folder will hold your "data package"
56 changes: 56 additions & 0 deletions doc/publish.md
@@ -0,0 +1,56 @@
# Publish Data as Data Packages

You can publish **any kind of data** as a Data Package.

Turning your existing data into a Data Package is incredibly simple:

1. Get your data together in one place
2. Add a single `datapackage.json` file

That's it!

And once you have packaged up your data, making it available to others is as simple as [putting it online][online].

[online]: /doc/publish-online

<div class="row">
<div class="span2">
<h3>
I want to
<br />
package up
<br />
and publish &hellip;
</h3>
</div>
<div class="span4">
<div class="well">
<h3>
Tabular Data
</h3>
<p>If it's in rows and columns it's tabular &ndash; think spreadsheets!</p>
<a href="/doc/publish-tabular" class="btn btn-large">
Go &raquo;
</a>
</div>
<div class="well">
<h3>
Geospatial Data
</h3>
<p>Map or location related? It's geospatial &hellip;</p>
<a href="/doc/publish-geo" class="btn btn-large">
Go &raquo;
</a>
</div>
<div class="well">
<h3>
Any Kind of Data
</h3>
<p>Any kind of data you have &hellip;</p>
<a href="/doc/publish-any" class="btn btn-large">
Go &raquo;
</a>
</div>
</div>
</div>
