This repository was archived by the owner on Aug 9, 2024. It is now read-only.
(WIP) Session Orientation #450
Open: halfak wants to merge 17 commits into master from session_orientation
549c9ca (WIP) Implements session_oriented.session (halfak)
9395a1c Updates across the package for dealing with new DependentSet (halfak)
96834ef Minor fixes to session-oriented docs. (halfak)
fe99b9d Cleans up usage of DependentSet._name (halfak)
b381f79 Refactor feature modifiers to handle vectors natively (halfak)
955e258 Adds docs and tests for operators. (halfak)
846d3b7 Adds tests for vector modifiers. (halfak)
2666ae2 Adds some missing docs from aggregators. (halfak)
134b28d Fixes documentation generation Warning/Errors for CI. (halfak)
cd8833d Applies list_of_tree to temporal features (halfak)
9ff5ac1 Standardize structure for temporal and bytes (halfak)
cee9f33 Applied list_of_tree to features/wikibase (halfak)
762973f Removes unnecessary feature_exp.py file. (halfak)
42b4c61 Standardize the type of <feature group>.session.revisions (halfak)
235c156 Adds concept of a meta_dependent method to dependencies (halfak)
9c4b21e Use the meta_dependent in list_of_tree (halfak)
04bb9df Apply list_of_tree to wikitext feature tree. (halfak)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3,6 +3,7 @@ __pycache__/ | |
| *.py[cod] | ||
| *~ | ||
| ipython/.ipynb_checkpoints | ||
| .pytest_cache | ||
|
|
||
| # C extensions | ||
| *.so | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| revscoring.datasources.session_oriented | ||
| ======================================= | ||
|
|
||
| .. automodule:: revscoring.datasources.session_oriented |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| from ..datasource import Datasource | ||
|
|
||
|
|
||
| class list_of(Datasource): | ||
|
|
||
| def __init__(self, dependent, depends_on=None, name=None): | ||
| name = self._format_name(name, [dependent]) | ||
| super().__init__( | ||
| name, self.process, depends_on=depends_on) | ||
| self.dependency = dependent | ||
|
|
||
| def process(self, *lists_of_values): | ||
| return [self.dependency(*values) for values in zip(*lists_of_values)] | ||
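The `process` method above maps the wrapped dependent over parallel lists, one call per position. A minimal standalone sketch of that behaviour follows; it assumes that calling a dependent simply runs its processing function, and `list_of_process` and `chars_added` are hypothetical names used only for illustration, not part of revscoring.

```python
def list_of_process(dependency, *lists_of_values):
    """Apply `dependency` once per position across the parallel input lists."""
    return [dependency(*values) for values in zip(*lists_of_values)]


# Hypothetical per-revision computation: characters added between two texts.
def chars_added(parent_text, text):
    return max(len(text) - len(parent_text), 0)


# One entry per revision in the session.
parent_texts = ["", "foo", "foo bar"]
texts = ["foo", "foo bar", "foo"]

result = list_of_process(chars_added, parent_texts, texts)
print(result)  # [3, 4, 0]
```

Because `zip` stops at the shortest input, lists of unequal length are silently truncated rather than raising an error.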
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,181 @@ | ||
| """ | ||
|
||
| Implements a set of datasources oriented off of a single session (an ordered | |
| list of revisions). This is useful for extracting features that span the | |
| revisions in a session. | |
|
|
||
| .. autodata:: revscoring.datasources.session_oriented.session | ||
|
|
||
| Supporting classes | ||
| ++++++++++++++++++ | ||
|
|
||
| .. autoclass:: revscoring.datasources.session_oriented.Session | ||
| :members: | ||
| :member-order: bysource | ||
|
|
||
| Supporting functions | ||
| ++++++++++++++++++++ | ||
|
|
||
| .. autofunction:: revscoring.datasources.session_oriented.list_of_tree | ||
|
|
||
| .. autofunction:: revscoring.datasources.session_oriented.list_of_ify | ||
| """ | ||
| import logging | ||
| import re | ||
| from functools import wraps | ||
| from inspect import getmembers, ismethod | ||
|
|
||
| from revscoring import Feature, FeatureVector | ||
| from revscoring.features.meta import expanders as feature_expanders | ||
|
|
||
| from ..dependencies import DependentSet | ||
| from .datasource import Datasource | ||
| from .meta import expanders as datasource_expanders | ||
| from .revision_oriented import Revision, User | ||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
|
|
||
| def list_of_tree(dependent_set, rewrite_name=None, cache=None): | ||
|
||
| """ | ||
| Converts a :class:`~revscoring.DependentSet`, and every | |
| :class:`~revscoring.Dependent` it contains, in place so that each dependent | |
| has :func:`~revscoring.datasources.session_oriented.list_of_ify` applied. | |
| The modified :class:`~revscoring.DependentSet` is returned. | |
|
|
||
| :Parameters: | ||
| dependent_set : :class:`~revscoring.DependentSet` | ||
| A dependent set to convert | ||
| rewrite_name : function | ||
| A function to apply to the dependent's name when re-creating it. | ||
| cache : dict(:class:`~revscoring.Feature` | :class:`~revscoring.FeatureVector` | :class:`~revscoring.Datasource`) | ||
| A map of dependents that have already been converted. | ||
| """ | ||
| logger.debug("Applying list_of_tree to {0}".format(dependent_set.name)) | ||
| cache = cache if cache is not None else {} | ||
| rewrite_name = rewrite_name if rewrite_name is not None else \ | ||
| lambda name: name | ||
|
|
||
| # Rewrites all dependents. | ||
| for attr, dependent in dependent_set.dependents.items(): | ||
| new_dependent = list_of_ify(dependent, rewrite_name, cache) | ||
| setattr(dependent_set, attr, new_dependent) | ||
|
|
||
| # Iterate into all sub-DependentSets | ||
| for attr, sub_dependent_set in dependent_set.dependent_sets.items(): | ||
| if not attr.startswith("_"): | |
| logger.debug("Running list_of_tree on {0}".format(attr)) | ||
| new_dependent_set = list_of_tree( | ||
| sub_dependent_set, rewrite_name, cache) | ||
| setattr(dependent_set, attr, new_dependent_set) | ||
|
|
||
| # Iterate into all meta-dependents (methods that return a new dependent) | ||
| for attr, method in getmembers(dependent_set, ismethod): | ||
| if hasattr(method, "meta_dependent"): | |
| list_of_meta_method = meta_list_of_ify( | ||
|
Review comment (Member): 👍
||
| method, rewrite_name, cache) | ||
| setattr(dependent_set, attr, list_of_meta_method) | ||
|
|
||
| return dependent_set | ||
|
|
||
|
|
||
| def list_of_ify(dependent, rewrite_name, cache): | ||
| """ | ||
| Converts any :class:`~revscoring.Feature`, | ||
| :class:`~revscoring.FeatureVector`, or :class:`~revscoring.Datasource` into | ||
| an equivalent "list of" the same dependent. Dependencies are converted | ||
| recursively and a cache is maintained for memoization. | ||
|
|
||
| :Parameters: | ||
| dependent : (:class:`~revscoring.Feature` | :class:`~revscoring.FeatureVector` | :class:`~revscoring.Datasource`) | ||
| A dependent to convert | ||
| rewrite_name : function | ||
| A function to apply to the dependent's name when re-creating it. | ||
| cache : dict(:class:`~revscoring.Feature` | :class:`~revscoring.FeatureVector` | :class:`~revscoring.Datasource`) | ||
| A map of dependents that have already been converted. | ||
| """ | ||
|
|
||
| new_name = rewrite_name(dependent.name) | ||
| if new_name in cache: | ||
| logger.debug("list_of_ify {0} in the cache".format(dependent.name)) | ||
| return cache[new_name] | ||
| else: | ||
| logger.debug("list_of_ify is modifying {0} into a list_of".format(dependent.name)) | ||
| new_dependencies = [list_of_ify(dependency, rewrite_name, cache) | ||
| for dependency in dependent.dependencies] | ||
|
|
||
| if isinstance(dependent, Datasource): | ||
| new_dependent = datasource_expanders.list_of( | ||
| dependent, depends_on=new_dependencies, name=new_name) | ||
| elif isinstance(dependent, FeatureVector): | ||
| new_dependent = datasource_expanders.list_of( | ||
| dependent, depends_on=new_dependencies, name=new_name) | ||
| elif isinstance(dependent, Feature): | ||
| new_dependent = feature_expanders.list_of( | ||
| dependent, depends_on=new_dependencies, name=new_name) | ||
| else: | ||
| raise TypeError("Cannot convert type {0} into a list_of" | ||
| .format(type(dependent))) | ||
|
|
||
| cache[new_name] = new_dependent | ||
| return cache[new_name] | ||
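The recursion and memoization in `list_of_ify` can be seen in isolation with a small sketch. `Node`, `ListOf`, and `to_list_of` are hypothetical stand-ins invented for this illustration; they are not revscoring classes, and the type dispatch on `Datasource`/`FeatureVector`/`Feature` is left out.

```python
class Node:
    """Hypothetical stand-in for a dependent: a name plus dependencies."""

    def __init__(self, name, dependencies=()):
        self.name = name
        self.dependencies = list(dependencies)


class ListOf(Node):
    """Hypothetical "list of" wrapper around an original node."""

    def __init__(self, wrapped, dependencies, name):
        super().__init__(name, dependencies)
        self.wrapped = wrapped


def to_list_of(node, rewrite_name, cache):
    """Convert `node` and, recursively, its dependencies; memoize by new name."""
    new_name = rewrite_name(node.name)
    if new_name in cache:
        return cache[new_name]
    new_dependencies = [to_list_of(dep, rewrite_name, cache)
                        for dep in node.dependencies]
    cache[new_name] = ListOf(node, new_dependencies, new_name)
    return cache[new_name]


# A diamond-shaped dependency graph: `ratio` reaches `text` by two paths.
text = Node("revision.text")
words = Node("revision.words", [text])
chars = Node("revision.chars", [text])
ratio = Node("revision.ratio", [words, chars])

cache = {}
converted = to_list_of(ratio, lambda name: "session." + name, cache)

# Both converted branches point at the same converted `text` object.
shared = [dep.dependencies[0] for dep in converted.dependencies]
```

The cache is keyed by the rewritten name, so a dependency shared by several dependents is converted once and reused. The same keying means that two distinct dependents whose names rewrite to the same string would collapse into a single cache entry.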
|
|
||
|
|
||
| def meta_list_of_ify(method, rewrite_name, cache): | ||
| @wraps(method) | ||
| def wrapper(*args, **kwargs): | ||
| dependent = method(*args, **kwargs) | ||
| return list_of_ify(dependent, rewrite_name, cache) | ||
|
|
||
| return wrapper | ||
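`meta_list_of_ify` follows a common decorator shape: wrap a method so that whatever it returns is passed through a second function. The sketch below shows that shape with plain strings in place of dependents; `post_process` and `Tokens` are hypothetical names chosen for the example.

```python
from functools import wraps


def post_process(method, transform):
    """Return a wrapper that passes the result of `method` through `transform`."""
    @wraps(method)
    def wrapper(*args, **kwargs):
        return transform(method(*args, **kwargs))
    return wrapper


class Tokens:
    """Hypothetical feature group with a method that builds a dependent."""

    def matching(self, pattern):
        """Describe tokens matching `pattern`."""
        return "tokens matching " + pattern


tokens = Tokens()
# Replace the bound method on the instance with the wrapped version,
# in the same way list_of_tree uses setattr on the dependent set.
tokens.matching = post_process(tokens.matching,
                               lambda dep: "list_of(" + dep + ")")

print(tokens.matching("foo"))  # list_of(tokens matching foo)
```

Since the wrapper closes over the already-bound method, callers keep using `tokens.matching(...)` without passing the instance, and `functools.wraps` keeps the original name and docstring visible on the replacement.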
|
|
||
|
|
||
| def rewrite_name(name): | ||
| return re.sub(r"(^|\.)revision\.", r"\1session.revisions.", name) | ||
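To see what the pattern in `rewrite_name` does and does not match, it can be run on a few names. The function body is copied from the diff above; the example names are made up for illustration and are not taken from revscoring.

```python
import re


def rewrite_name(name):
    # Same substitution as in the diff: a "revision." segment at the start of
    # the name, or directly after a dot, becomes "session.revisions.".
    return re.sub(r"(^|\.)revision\.", r"\1session.revisions.", name)


examples = [
    "revision.text",                 # segment at the start of the name
    "feature.revision.parent.text",  # segment after a dot
    "session.revisions.text",        # already rewritten: "revisions." is no match
    "prerevision.text",              # not a whole segment: left alone
]
rewritten = [rewrite_name(name) for name in examples]
for before, after in zip(examples, rewritten):
    print(before, "->", after)
```

The `(^|\.)` group anchors the match to a whole name segment, and the backreference `\1` puts the leading dot (if any) back in the output.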
|
|
||
|
|
||
| class Session(DependentSet): | ||
| """ | ||
| Represents a session -- an ordered list of revisions | ||
| """ | ||
| def __init__(self, name): | ||
| super().__init__(name) | ||
| self.revisions = list_of_tree(Revision( | ||
| "session.revisions", | ||
| include_page_creation=True, | ||
| include_content=True, | ||
| include_user=False, | ||
| include_page_suggested=True), | ||
| rewrite_name=rewrite_name) | ||
| """ | ||
| :class:`revscoring.datasources.revision_oriented.Revision`: modified by | ||
| :func:`~revscoring.datasources.session_oriented.list_of_tree()` | ||
| """ | ||
|
|
||
| self.user = User( | ||
| name + ".user", | ||
| include_info=True, | ||
| include_last_revision=True | ||
| ) | ||
| """ | ||
| :class:`revscoring.datasources.revision_oriented.User` | ||
| """ | ||
|
|
||
| session = Session("session") | ||
| """ | ||
| Represents the session of interest. Implements this structure: | ||
|
|
||
| * session: :class:`~revscoring.datasources.session_oriented.Session` | ||
| * revisions: :class:`~revscoring.datasources.revision_oriented.Revision` | ||
| * diff: :class:`~revscoring.datasources.revision_oriented.Diff` | ||
| * page: :class:`~revscoring.datasources.revision_oriented.Page` | ||
| * namespace: :class:`~revscoring.datasources.revision_oriented.Namespace` | ||
| * creation: :class:`~revscoring.datasources.revision_oriented.Revision` | ||
| * parent: :class:`~revscoring.datasources.revision_oriented.Revision` | ||
| * user: :class:`~revscoring.datasources.revision_oriented.User` | ||
| * user: :class:`~revscoring.datasources.revision_oriented.User` | ||
| * info: :class:`~revscoring.datasources.revision_oriented.UserInfo` | ||
| * last_revision: :class:`~revscoring.datasources.revision_oriented.Revision` | ||
| """ # noqa | ||