Commit

fixed a few typos in docs
pedrofluxa committed Oct 18, 2023
1 parent d6ed8fa commit 8dedb63
Showing 2 changed files with 12 additions and 5 deletions.
15 changes: 11 additions & 4 deletions type_infer/base.py
@@ -7,13 +7,20 @@
@dataclass
class TypeInformation:
"""
-For a dataset, provides information on columns types, how they're used, and any other potential identifiers.
+For a dataset, provides information on columns types.
-``TypeInformation`` is generated within :py:func:`infer.infer_types`, where small samples of each column are evaluated in a custom framework to understand what kind of data type the model is. The user may override data types, but it is recommended to do so within a JSON-AI config file.
+``TypeInformation`` is generated within :py:func:`infer.infer_types`,
+where a small subset of samples of each column are evaluated in a custom
+framework to understand what kind of data type the model is. The user
+may override data types, but it is recommended to do so within a JSON-AI
+config file.
:param dtypes: For each column's name, the associated data type inferred.
-:param additional_info: Any possible sub-categories or additional descriptive information.
-:param identifiers: Columns within the dataset highly suspected of being identifiers or IDs. These do not contain informatic value, therefore will be ignored in subsequent training/analysis procedures unless manually indicated.
+:param additional_info: Any possible sub-categories or additional descriptive
+information.
+:param identifiers: Columns within the dataset highly suspected of being identifiers
+or IDs. These do not contain useful information, and should therefore be
+ignored in subsequent training/analysis procedures unless manually indicated.
""" # noqa

dtypes: Dict[str, str]
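
As a rough illustration of the docstring in this hunk, the sketch below mirrors the three documented fields in a stand-in dataclass and drops suspected identifier columns before further processing. Only `dtypes: Dict[str, str]` is visible in the diff; the class name `TypeInformationSketch`, the types of `additional_info` and `identifiers`, the helper `usable_columns`, and the sample column names and type strings are all assumptions made for the example, not the library's actual API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TypeInformationSketch:
    # Column name -> inferred data type (the only field shown in the diff).
    dtypes: Dict[str, str] = field(default_factory=dict)
    # Assumed shape: column name -> extra descriptive information.
    additional_info: Dict[str, object] = field(default_factory=dict)
    # Assumed shape: column name -> reason the column looks like an identifier.
    identifiers: Dict[str, str] = field(default_factory=dict)


def usable_columns(info: TypeInformationSketch) -> List[str]:
    """Return the columns not flagged as identifiers (hypothetical helper)."""
    return [col for col in info.dtypes if col not in info.identifiers]


# Illustrative data only; the type strings are placeholders.
info = TypeInformationSketch(
    dtypes={"user_id": "integer", "age": "integer", "city": "categorical"},
    identifiers={"user_id": "suspected ID column"},
)
print(usable_columns(info))
```

Keeping identifiers in their own mapping, rather than deleting them from `dtypes`, would let a user re-include a column manually, which matches the "unless manually indicated" wording of the docstring.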
2 changes: 1 addition & 1 deletion type_infer/dtype.py
@@ -9,7 +9,7 @@ class dtype:
- **Complex**: Data types that require custom techniques. Currently ``audio``, ``video`` and ``image`` are available, but highly experimental.
- **Array**: Data in the form of a sequence where order must be preserved. ``tsarray`` dtypes are for "normal" columns that will be transformed to arrays at a row-level because they will be treated as time series.
- **Miscellaneous**: Miscellaneous data descriptors include ``empty``, an explicitly unknown value versus ``invalid``, a data type not currently supported.
Custom data types may be implemented here as a flag for subsequent treatment and processing. You are welcome to include your own definitions, so long as they do not override the existing type names (alternatively, if you do, please edit subsequent parts of the preprocessing pipeline to correctly indicate how you want to deal with these data types).
""" # noqa
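
The note about custom data types can be sketched as follows. This is a minimal illustration under assumptions: `DtypeSketch` and `register_custom_dtype` are hypothetical names, and only the type names quoted in the docstring above are reproduced. It shows one way to add a new type flag while refusing to override an existing name.

```python
class DtypeSketch:
    # Type names quoted in the docstring; the real class may define more.
    audio = "audio"
    video = "video"
    image = "image"
    tsarray = "tsarray"
    empty = "empty"
    invalid = "invalid"


def register_custom_dtype(name: str) -> str:
    """Add a custom type flag, refusing to shadow an existing attribute."""
    if hasattr(DtypeSketch, name):
        raise ValueError(f"dtype name already in use: {name}")
    setattr(DtypeSketch, name, name)
    return name


# Hypothetical custom type used purely as a flag for later processing.
register_custom_dtype("geo_point")
print(DtypeSketch.geo_point)
```

A custom flag registered this way carries no behaviour of its own; downstream preprocessing would still need to be told how to handle it, as the docstring points out.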
