Commit 54486aa

Refactor, improve API, documentation and packaging
1 parent 6824721 commit 54486aa

17 files changed, +310 -2078 lines changed

CHANGELOG.rst

+14
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,14 @@
1+
==========
2+
Change Log
3+
==========
4+
5+
6+
Unreleased
7+
==========
8+
9+
Added
10+
-----
11+
12+
* Basic functionality to manipulate HSD-data in Python.
13+
14+
* Pip installation

LICENSE

+1 -1
@@ -1,4 +1,4 @@
1-
Copyright (c) 2011-2020 DFTB+ developers group
1+
Copyright (c) 2011-2021 DFTB+ developers group
22

33
All rights reserved.
44

README.rst

+80 -28
@@ -1,33 +1,40 @@
1-
************************************
2-
HSD — Human-friendly Structured Data
3-
************************************
1+
**********************************************
2+
HSD — Make your structured data human friendly
3+
**********************************************
44

5-
This Python package contains utilities to read and write files in
6-
the Human-friendly Structured Data (HSD) format.
5+
This package contains utilities to read and write files in the Human-friendly
6+
Structured Data (HSD) format.
77

8-
It is licensed under the *BSD 2-clause license*.
8+
The HSD-format is very similar to both JSON and YAML, but tries to minimize the
9+
effort for **humans** to read and write it. It omits special characters as much
10+
as possible (in contrast to JSON) and is not indentation dependent (in contrast
11+
to YAML). It was developed originally as the input format for the scientific
12+
simulation tool (`DFTB+ <https://github.com/dftbplus/dftbplus>`_), but is
13+
of general purpose. Data stored in HSD can be easily mapped to a subset of JSON
14+
or XML and vice versa.
915

1016

1117
Installation
1218
============
1319

14-
To install the python package in development mode use
20+
The package can be installed via conda-forge::
1521

16-
.. code::
22+
conda install hsd-python
1723

18-
pip install -e src
24+
Alternatively, the package can be downloaded and installed via pip into the
25+
active Python interpreter (preferably using a virtual python environment) by ::
1926

27+
pip install hsd
2028

21-
The HSD format
22-
==============
29+
or into the user space by issuing ::
30+
31+
pip install --user hsd
2332

24-
The HSD-format is very similar to both JSON and XML, but tries to minimize the
25-
effort for humans to read and write it. It ommits special characters as much as
26-
possible but (in contrast to YAML for example) is not indentation dependent.
2733

28-
It was developed originally as the input format for a scientific simulation tool
29-
(`DFTB+ <https://github.com/dftbplus/dftbplus>`_), but is absolutely general. A
30-
typical input written in HSD looks like ::
34+
Quick tutorial
35+
==============
36+
37+
A typical, self-explanatory input written in HSD looks like ::
3138

3239
driver {
3340
conjugate_gradients {
@@ -45,11 +52,13 @@ typical input written in HSD looks like ::
4552
}
4653
filling {
4754
fermi {
48-
temperature [kelvin] = 1e-8
55+
# This is a comment which will be ignored
56+
# Note the attribute (unit) of the field below
57+
temperature [kelvin] = 100
4958
}
5059
}
5160
k_points_and_weights {
52-
supercell_folding = {
61+
supercell_folding {
5362
2 0 0
5463
0 2 0
5564
0 0 2
@@ -59,13 +68,56 @@ typical input written in HSD looks like ::
5968
}
6069
}
6170

62-
Content in HSD format can be represented as JSON. Content in JSON format can
63-
similarly be represented as HSD, provided it satisfies one restriction for
64-
arrays: Either all elements of an array must be objects or none of them. (This
65-
allows for a clear separation of structure and data and allows for the very
66-
simple input format.)
71+
The above input can be parsed into a Python dictionary with::
72+
73+
import hsd
74+
hsdinput = hsd.load_file("test.hsd")
75+
76+
The dictionary ``hsdinput`` will then look like::
77+
78+
{
79+
"driver": {
80+
"conjugate_gradients": {
81+
"moved_atoms": [1, 2, "7:19"],
82+
"max_steps": 100
83+
}
84+
},
85+
"hamiltonian": {
86+
"dftb": {
87+
"scc": True,
88+
"scc_tolerance": 1e-10,
89+
"mixer": {
90+
"broyden": {}
91+
},
92+
"filling": {
93+
"fermi": {
94+
"temperature": 100,
95+
"temperature.attrib": "kelvin"
96+
}
97+
},
98+
"k_points_and_weights": {
99+
"supercell_folding": [
100+
[2, 0, 0],
101+
[0, 2, 0],
102+
[0, 0, 2],
103+
[0.5, 0.5, 0.5]
104+
]
105+
}
106+
}
107+
}
108+
}
109+
110+
Being a simple Python dictionary, it can be easily queried and manipulated in
111+
Python ::
112+
113+
hsdinput["driver"]["conjugate_gradients"]["max_steps"] = 200
114+
115+
and then stored again in HSD format ::
116+
117+
hsd.dump_file(hsdinput, "test2.hsd")
118+
119+
120+
License
121+
========
67122

68-
Content in HSD format can be represented as XML (DOM-tree). Likewise content in
69-
XML can be converted to HSD, provided it satisfies the restriction that every
70-
child has either data (text) or further children, but never both of
71-
them. (Again, this ensures the simplicity of the input format.)
123+
The hsd-python package is licensed under the `BSD 2-clause license <LICENSE>`_.
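
A note on the README tutorial above: it stores a tag's attribute under a sibling key carrying the ``.attrib`` suffix, and states that HSD data maps to a subset of JSON. The sketch below illustrates both points on a hand-written fragment of the tutorial dictionary. It deliberately does not import ``hsd``, so it assumes nothing about the package beyond the layout the README shows; the helper name ``get_with_attrib`` is hypothetical.

```python
import json

# Hand-written fragment of the dictionary shown in the README tutorial.
hsdinput = {
    "hamiltonian": {
        "dftb": {
            "scc": True,
            "filling": {
                "fermi": {
                    "temperature": 100,
                    "temperature.attrib": "kelvin",
                }
            },
        }
    }
}


def get_with_attrib(block, name, suffix=".attrib"):
    """Return (value, attribute) for a key; attribute is None if absent.

    Hypothetical helper, not part of the hsd package.
    """
    return block[name], block.get(name + suffix)


fermi = hsdinput["hamiltonian"]["dftb"]["filling"]["fermi"]
value, unit = get_with_attrib(fermi, "temperature")

# The nested dictionary only holds JSON-compatible types, so it round-trips.
as_json = json.dumps(hsdinput, indent=2)
restored = json.loads(as_json)
```

Keeping the attribute next to the value, rather than wrapping both in an extra object, means plain lookups such as ``fermi["temperature"]`` keep working for code that does not care about units.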

pyproject.toml

+3
@@ -0,0 +1,3 @@
1+
[build-system]
2+
requires = ["setuptools", "wheel"]
3+
build-backend = "setuptools.build_meta"

setup.cfg

+31
@@ -0,0 +1,31 @@
1+
[metadata]
2+
name = hsd-python
3+
version = 0.1
4+
author = DFTB+ developers group
5+
author_email = [email protected]
6+
url = https://github.com/dftbplus/hsd-python
7+
description =
8+
Tools for reading, writing and manipulating data stored in the human-friendly
9+
structured data (HSD) format
10+
long_description = file: README.rst
11+
long_description_content_type = text/x-rst
12+
license = BSD
13+
license_file = LICENSE
14+
platform = any
15+
classifiers =
16+
Intended Audience :: Developers
17+
License :: OSI Approved :: BSD License
18+
Programming Language :: Python :: 3 :: Only
19+
Programming Language :: Python :: 3.7
20+
Programming Language :: Python :: 3.8
21+
Programming Language :: Python :: 3.9
22+
Programming Language :: Python :: 3.7
23+
24+
[options]
25+
include_package_data = True
26+
package_dir =
27+
= src
28+
packages = hsd
29+
30+
[options.packages.find]
31+
where = src

src/LICENSE

-1
This file was deleted.

src/MANIFEST.in

-6
This file was deleted.

src/hsd/__init__.py

+3 -2
@@ -8,6 +8,7 @@
88
"""
99
Central module for the hsd package
1010
"""
11-
from .dump import dump, dumps
12-
from .parser import HsdParser
1311
from .dictbuilder import HsdDictBuilder
12+
from .eventhandler import HsdEventHandler
13+
from .io import load, load_string, load_file, dump, dump_string, dump_file
14+
from .parser import HsdParser

src/hsd/common.py

+18 -36
@@ -12,40 +12,10 @@
1212

1313
class HsdException(Exception):
1414
"""Base class for exceptions in the HSD package."""
15-
pass
16-
17-
18-
class HsdQueryError(HsdException):
19-
"""Base class for errors detected by the HsdQuery object.
20-
21-
22-
Attributes:
23-
filename: Name of the file where error occured (or empty string).
24-
line: Line where the error occurred (or -1).
25-
tag: Name of the tag with the error (or empty string).
26-
"""
27-
28-
def __init__(self, msg="", node=None):
29-
"""Initializes the exception.
30-
31-
Args:
32-
msg: Error message
33-
node: HSD element where error occured (optional).
34-
"""
35-
super().__init__(msg)
36-
if node is not None:
37-
self.tag = node.gethsd(HSDATTR_TAG, node.tag)
38-
self.file = node.gethsd(HSDATTR_FILE, -1)
39-
self.line = node.gethsd(HSDATTR_LINE, None)
40-
else:
41-
self.tag = ""
42-
self.file = -1
43-
self.line = None
4415

4516

4617
class HsdParserError(HsdException):
4718
"""Base class for parser related errors."""
48-
pass
4919

5020

5121
def unquote(txt):
@@ -56,11 +26,23 @@ def unquote(txt):
5626

5727

5828
# Name for default attribute (when attribute name is not specified)
59-
DEFAULT_ATTRIBUTE = "attribute"
29+
DEFAULT_ATTRIBUTE = "unit"
30+
31+
# Suffix to mark attribute
32+
ATTRIB_SUFFIX = ".attrib"
33+
34+
# Length of the attribute suffix
35+
LEN_ATTRIB_SUFFIX = len(ATTRIB_SUFFIX)
36+
37+
# Suffix to mark hsd processing attributes
38+
HSD_ATTRIB_SUFFIX = ".hsdattrib"
39+
40+
# Lengths of hsd processing attribute suffix
41+
LEN_HSD_ATTRIB_SUFFIX = len(HSD_ATTRIB_SUFFIX)
42+
43+
44+
HSD_ATTRIB_LINE = "line"
6045

46+
HSD_ATTRIB_EQUAL = "equal"
6147

62-
HSDATTR_PROC = "processed"
63-
HSDATTR_EQUAL = "equal"
64-
HSDATTR_FILE = "file"
65-
HSDATTR_LINE = "line"
66-
HSDATTR_TAG = "tag"
48+
HSD_ATTRIB_TAG = "tag"
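
The two suffix constants introduced in this file suggest that a block can hold three kinds of keys: plain data, user attributes, and HSD processing attributes. A minimal sketch of separating them is given below. The suffix values are copied from the diff; the helper ``split_block`` and the sample processing-attribute values are hypothetical and only serve as illustration.

```python
ATTRIB_SUFFIX = ".attrib"          # value copied from the diff above
HSD_ATTRIB_SUFFIX = ".hsdattrib"   # value copied from the diff above


def split_block(block):
    """Separate data, attributes and HSD processing attributes of one block.

    Hypothetical helper: returns three dicts keyed by the bare tag name.
    """
    data, attribs, hsdattribs = {}, {}, {}
    for key, value in block.items():
        if key.endswith(HSD_ATTRIB_SUFFIX):
            hsdattribs[key[:-len(HSD_ATTRIB_SUFFIX)]] = value
        elif key.endswith(ATTRIB_SUFFIX):
            attribs[key[:-len(ATTRIB_SUFFIX)]] = value
        else:
            data[key] = value
    return data, attribs, hsdattribs


# Sample block; the line number and "equal" flag are made-up values.
block = {
    "temperature": 100,
    "temperature.attrib": "kelvin",
    "temperature.hsdattrib": {"line": 12, "equal": True},
}
data, attribs, hsdattribs = split_block(block)
```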

src/hsd/dictbuilder.py

+10 -8
@@ -9,9 +9,8 @@
99
Contains an event-driven builder for dictionary based (JSON-like) structure
1010
"""
1111
import re
12-
from .parser import HsdEventHandler
13-
14-
__all__ = ['HsdDictBuilder']
12+
from .common import ATTRIB_SUFFIX, HSD_ATTRIB_SUFFIX
13+
from .eventhandler import HsdEventHandler
1514

1615

1716
_TOKEN_PATTERN = re.compile(r"""
@@ -29,21 +28,24 @@
2928
class HsdDictBuilder(HsdEventHandler):
3029
"""Deserializes HSD into nested dictionaries
3130
32-
Note: hsdoptions passed by the generating events are ignored.
31+
Note: hsdattrib passed by the generating events is ignored unless include_hsd_attribs is set.
3332
"""
3433

35-
def __init__(self, flatten_data=False):
36-
HsdEventHandler.__init__(self)
34+
def __init__(self, flatten_data=False, include_hsd_attribs=False):
35+
super().__init__()
3736
self._hsddict = {}
3837
self._curblock = self._hsddict
3938
self._parentblocks = []
4039
self._data = None
4140
self._flatten_data = flatten_data
41+
self._include_hsd_attribs = include_hsd_attribs
4242

4343

44-
def open_tag(self, tagname, attrib, hsdoptions):
44+
def open_tag(self, tagname, attrib, hsdattrib):
4545
if attrib is not None:
46-
self._curblock[tagname + '.attribute'] = attrib
46+
self._curblock[tagname + ATTRIB_SUFFIX] = attrib
47+
if self._include_hsd_attribs and hsdattrib is not None:
48+
self._curblock[tagname + HSD_ATTRIB_SUFFIX] = hsdattrib
4749
self._parentblocks.append(self._curblock)
4850
self._curblock = {}
4951
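
The hunk above shows only ``open_tag``; the rest of the builder is not visible in this diff. To make the block-stack idea concrete, here is a simplified stand-in that follows the same pattern (push the current block on open, attach the finished block to its parent on close). ``MiniDictBuilder`` is hypothetical and is not the package's ``HsdDictBuilder``: it keeps text as a raw string and does no token conversion.

```python
ATTRIB_SUFFIX = ".attrib"  # value copied from common.py in this commit


class MiniDictBuilder:
    """Simplified stand-in for an event-driven dict builder (illustration)."""

    def __init__(self):
        self.result = {}
        self._curblock = self.result
        self._parentblocks = []
        self._text = None

    def open_tag(self, tagname, attrib, hsdattrib):
        # The attribute is stored in the enclosing block, next to the tag.
        if attrib is not None:
            self._curblock[tagname + ATTRIB_SUFFIX] = attrib
        self._parentblocks.append(self._curblock)
        self._curblock = {}
        self._text = None

    def add_text(self, text):
        self._text = text

    def close_tag(self, tagname):
        parent = self._parentblocks.pop()
        # A tag holds either text or child tags; use text if any was seen.
        parent[tagname] = self._text if self._text is not None else self._curblock
        self._curblock = parent
        self._text = None


# Events fired by hand; a parser would normally generate them.
builder = MiniDictBuilder()
builder.open_tag("fermi", None, None)
builder.open_tag("temperature", "kelvin", None)
builder.add_text("100")
builder.close_tag("temperature")
builder.close_tag("fermi")
```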

src/hsd/eventhandler.py

+56
@@ -0,0 +1,56 @@
1+
"""Contains an event handler base class."""
2+
3+
4+
class HsdEventHandler:
5+
"""Base class for event handler implementing simple printing"""
6+
7+
def __init__(self):
8+
"""Initializes the default event handler"""
9+
self._indentlevel = 0
10+
self._indentstr = " "
11+
12+
13+
def open_tag(self, tagname, attrib, hsdattrib):
14+
"""Handler which is called when a tag is opened.
15+
16+
It should be overridden in the application to handle the event in a
17+
customized way.
18+
19+
Args:
20+
tagname: Name of the tag which had been opened.
21+
attrib: String containing the attribute of the tag or None.
22+
hsdattrib: Dictionary of the options created during the processing
23+
in the hsd-parser.
24+
"""
25+
indentstr = self._indentlevel * self._indentstr
26+
print("{}OPENING TAG: {}".format(indentstr, tagname))
27+
print("{}ATTRIBUTE: {}".format(indentstr, attrib))
28+
print("{}HSD OPTIONS: {}".format(indentstr, str(hsdattrib)))
29+
self._indentlevel += 1
30+
31+
32+
def close_tag(self, tagname):
33+
"""Handler which is called when a tag is closed.
34+
35+
It should be overridden in the application to handle the event in a
36+
customized way.
37+
38+
Args:
39+
tagname: Name of the tag which had been closed.
40+
"""
41+
indentstr = self._indentlevel * self._indentstr
42+
print("{}CLOSING TAG: {}".format(indentstr, tagname))
43+
self._indentlevel -= 1
44+
45+
46+
def add_text(self, text):
47+
"""Handler which is called with the text found inside a tag.
48+
49+
It should be overridden in the application to handle the event in a
50+
customized way.
51+
52+
Args:
53+
text: Text in the current tag.
54+
"""
55+
indentstr = self._indentlevel * self._indentstr
56+
print("{}Received text: {}".format(indentstr, text))
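
The docstrings above say each callback should be overridden for customized handling. One possible shape of such an override is sketched below. To stay runnable without the package, it subclasses a small stand-in, ``EventHandlerBase``, that only mirrors the three method signatures from the diff; ``PathCollector`` is a hypothetical example class.

```python
class EventHandlerBase:
    """Stand-in with the same three callbacks as the base class in the diff."""

    def open_tag(self, tagname, attrib, hsdattrib):
        pass

    def close_tag(self, tagname):
        pass

    def add_text(self, text):
        pass


class PathCollector(EventHandlerBase):
    """Records the slash-separated path of every tag that carries text."""

    def __init__(self):
        self._stack = []
        self.paths = {}

    def open_tag(self, tagname, attrib, hsdattrib):
        self._stack.append(tagname)

    def close_tag(self, tagname):
        self._stack.pop()

    def add_text(self, text):
        self.paths["/".join(self._stack)] = text


collector = PathCollector()
# Events are fired by hand here; a parser would normally generate them.
collector.open_tag("driver", None, None)
collector.open_tag("max_steps", None, None)
collector.add_text("100")
collector.close_tag("max_steps")
collector.close_tag("driver")
```

With the real package, the same subclass would presumably derive from ``HsdEventHandler`` instead of the stand-in; how a handler is attached to the parser is not shown in this diff.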
