Commit d3a3bd7

update Client to v 1.2.1, add fetchAnnotations() function
Change-Id: I8576ccc4ac7ea361a26f09c87ecbd381cffd7e26
1 parent a5046a4 commit d3a3bd7

4 files changed, 71 additions and 2 deletions

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -1,5 +1,10 @@
 # Version history
 
+## 1.2.1
+
+- Updates recommended RKorAPClient version to 1.2.1
+- fetchAnnotations() method added to KorAPQuery class, to fetch annotations for all collected matches
+
 ## 1.1.0
 
 - Updates recommended RKorAPClient version to 1.1.0

KorAPClient/__init__.py

Lines changed: 46 additions & 1 deletion
@@ -16,7 +16,7 @@
 from packaging import version
 from rpy2.robjects.methods import RS4
 
-CURRENT_R_PACKAGE_VERSION = "1.1.0"
+CURRENT_R_PACKAGE_VERSION = "1.2.1"
 
 KorAPClient = packages.importr('RKorAPClient')
 if version.parse(KorAPClient.__version__) < version.parse(CURRENT_R_PACKAGE_VERSION):
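
The version guard in this hunk compares parsed versions rather than raw strings. A minimal sketch of why that matters, assuming the `packaging` library is installed; `is_outdated` is a hypothetical helper for illustration only, and what the package does when the check fires lies outside this hunk:

```python
from packaging import version

CURRENT_R_PACKAGE_VERSION = "1.2.1"


def is_outdated(installed: str, required: str = CURRENT_R_PACKAGE_VERSION) -> bool:
    """Hypothetical helper: True if `installed` is older than `required`."""
    # Parsed versions compare component by component (1.10.0 > 1.2.1).
    return version.parse(installed) < version.parse(required)


print(is_outdated("1.1.0"))    # older than the recommended version
print(is_outdated("1.2.1"))    # exactly the recommended version
print(is_outdated("1.10.0"))   # newer, although it sorts lower as a string
print("1.10.0" < "1.2.1")      # plain string comparison gets this wrong
```

Plain string comparison would treat `1.10.0` as older than `1.2.1`, which is the failure that `version.parse` avoids.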
@@ -395,3 +395,48 @@ def fetchAll(self, *args, **kwargs):
         super().__init__(res)
         return self
 
+    def fetchAnnotations(self, *args, **kwargs):
+        """Fetches and parses linguistic annotations for the collected matches.
+
+        This method enriches the `collectedMatches` DataFrame with additional columns
+        containing linguistic annotations like lemma, part-of-speech (POS), and
+        morphology for each token in the match and its context.
+
+        Args:
+            foundry (str, optional): The foundry (annotation layer) to fetch.
+                Defaults to "tt" (TreeTagger).
+            overwrite (bool, optional): If True, existing annotation columns will be
+                overwritten. Defaults to False.
+            verbose (bool, optional): If True, prints progress information. Defaults
+                to the verbosity setting of the
+                KorAPConnection object.
+            *args: Positional arguments passed to the underlying R function.
+            **kwargs: Keyword arguments passed to the underlying R function.
+
+        Returns:
+            KorAPQuery: A new KorAPQuery object with the `collectedMatches` DataFrame
+                updated to include the fetched annotations.
+
+        Example:
+            ```
+            from KorAPClient import KorAPConnection
+
+            # Authentication might be required for snippets and annotations
+            kcon = KorAPConnection(verbose=True).auth()
+
+            # Perform a query and fetch all matches
+            q = kcon.corpusQuery("Ameisenplage", metadataOnly=False).fetchAll()
+
+            # Fetch annotations for the matches
+            q_annotated = q.fetchAnnotations()
+
+            # Display the collected matches with new annotation columns
+            print(q_annotated.slots['collectedMatches'][['snippet', 'lemma.left', 'lemma.match', 'lemma.right']].head())
+            ```
+        """
+        res = KorAPClient.fetchAnnotations(self, *args, **kwargs)
+        with localconverter(fix_lists_in_dataframes):
+            df = res.slots['collectedMatches']
+        res.slots['collectedMatches'] = df
+        super().__init__(res)
+        return self
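
Because the method ends with `super().__init__(res)` followed by `return self`, the call appears to update the query object in place and return that same object. A minimal pure-Python sketch of this pattern follows, with no R or rpy2 involved; `FakeS4`, `Query`, and `fake_fetch_annotations` are hypothetical stand-ins and not part of KorAPClient:

```python
class FakeS4:
    """Hypothetical stand-in for an S4 wrapper: it only stores slots."""

    def __init__(self, other):
        # Adopt the slots of another wrapper, or take a plain dict of slots.
        self.slots = dict(other.slots) if hasattr(other, "slots") else dict(other)


def fake_fetch_annotations(query):
    """Hypothetical stand-in for the R function: returns a new result object."""
    matches = [dict(row, lemma="Ameisenplage") for row in query.slots["collectedMatches"]]
    return FakeS4({"collectedMatches": matches})


class Query(FakeS4):
    def fetchAnnotations(self):
        res = fake_fetch_annotations(self)
        super().__init__(res)  # re-initialise this object from the result
        return self            # the same object, now carrying the annotations


q = Query({"collectedMatches": [{"snippet": "... Ameisenplage ..."}]})
q_annotated = q.fetchAnnotations()

print(q_annotated is q)
print(q.slots["collectedMatches"][0]["lemma"])
```

Under these assumptions, `q_annotated` and `q` refer to one object, so both assignment styles used in the docstring and the Readme (`q_annotated = q.fetchAnnotations()` and `q = q.fetchAnnotations()`) would see the annotated matches.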

Readme.md

Lines changed: 19 additions & 0 deletions
@@ -157,6 +157,25 @@ example:
 python -m KorAPClient -v --query "Hello World" "Hallo Welt" --vc "pubDate in 2017" "pubDate in 2018" "pubDate in 2019"
 ```
 
+### Querying and fetching annotations
+To fetch the annotations for all matches in a KorAPQuery object, use the fetchAnnotations() method (defaults to "tt" (TreeTagger) as foundry):
+```python
+from KorAPClient import KorAPConnection
+
+# Authentication might be required for snippets and annotations
+kcon = KorAPConnection(verbose=True).auth()
+
+# Perform a query and fetch all matches
+q = kcon.corpusQuery("Ameisenplage", metadataOnly=False).fetchAll()
+
+# Fetch annotations for the matches
+q = q.fetchAnnotations()
+
+# Display the collected matches with new annotation columns
+q.slots['collectedMatches'][['snippet', 'lemma.left', 'lemma.match', 'lemma.right']].head()
+```
+The annotations (here: lemmas) are stored in `q.slots['collectedMatches'][['lemma.left', 'lemma.match', 'lemma.right']]`.
+
 ### Accessed API Services
 By using the KorAPClient you agree to the respective terms of use of the accessed KorAP API services which will be printed upon opening a connection.
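
The Readme example selects the annotation columns directly, which would raise a `KeyError` if a column is absent, for instance when annotations could not be fetched. A minimal sketch of a more defensive selection, assuming pandas is installed; the DataFrame below is a hand-made stand-in for `q.slots['collectedMatches']` with invented values, and `pos.match` is a hypothetical column name used only to show a missing column:

```python
import pandas as pd

# Hand-made stand-in for q.slots['collectedMatches']; the real values come
# from KorAP and their exact format is not shown in this commit.
matches = pd.DataFrame({
    "snippet": ["eine Ameisenplage im Haus"],
    "lemma.left": ["eine"],
    "lemma.match": ["Ameisenplage"],
    "lemma.right": ["im Haus"],
})

# 'pos.match' is a hypothetical column that is absent from the stand-in data.
wanted = ["snippet", "lemma.left", "lemma.match", "lemma.right", "pos.match"]

# Keep only the columns that actually exist, preserving the requested order.
present = [col for col in wanted if col in matches.columns]
missing = [col for col in wanted if col not in matches.columns]

print(matches[present].head())
print("missing columns:", missing)
```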

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "KorAPClient"
-version = "1.1.0"
+version = "1.2.1"
 description = "Client package to access KorAP's web service API"
 authors = [
     {name = "Marc Kupietz",email = "[email protected]"},
