- cli
- server
- clone repository
- create virtual environment
windows:
python3 -m venv .ve
unix like:
python3 -m venv .ve
- activate environment / re-source terminal
windows:
& .ve/Scripts/Activate.ps1
unix like:
source .ve/bin/activate
- install requirements
pip install -r requirements.txt
pip install https://huggingface.co/huspacy/hu_core_news_trf/resolve/main/hu_core_news_trf-any-py3-none-any.whl
- start docker container for emtsv
docker run --rm -p5000:5000 -it mtaril/emtsv
- start application; the result is written to stdout
python .\anonimization.py --file-input "path/to/file" --format=[emagyar, huspacy]
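Before starting the application, it can help to confirm that the emtsv container is accepting connections. The following is a minimal sketch, assuming the container is published on localhost port 5000 as in the docker command above. The helper name `port_is_open` is hypothetical and not part of this repository.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 5000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_is_open()` with the defaults only checks that something is listening on port 5000; it does not verify that the service behind it is emtsv or that it has finished loading.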
First, create the .env file based on the example.env file, and set the PORT and the GPU ids in it.
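Copying example.env and editing it by hand is enough. If the step should be scripted, it could look roughly like the sketch below. It assumes example.env uses plain `KEY=value` lines. `PORT` is the key named above, while `GPU_IDS` is a hypothetical key name used only for illustration, so check example.env for the real one. The function names are also hypothetical.

```python
from pathlib import Path


def render_env(template: str, overrides: dict[str, str]) -> str:
    """Return the template text with KEY=value lines replaced from overrides.

    Keys that do not occur in the template are appended at the end.
    """
    remaining = dict(overrides)
    out = []
    for line in template.splitlines():
        key, sep, _ = line.partition("=")
        name = key.strip()
        if sep and name in remaining:
            out.append(f"{name}={remaining.pop(name)}")
        else:
            # comments, blank lines and keys we do not override stay as they are
            out.append(line)
    out.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"


def write_env(example_path: str, env_path: str, overrides: dict[str, str]) -> None:
    """Read example_path, apply overrides, and write the result to env_path."""
    template = Path(example_path).read_text(encoding="utf-8")
    Path(env_path).write_text(render_env(template, overrides), encoding="utf-8")
```

A call such as `write_env("example.env", ".env", {"PORT": "8000", "GPU_IDS": "0"})` would then produce the .env file; the values shown are placeholders.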
docker compose up -d --build
The server is available on the previously allocated port. Available endpoints:
- /docs : Swagger-based documentation of the API
- /anonymization : segment-based execution of the anonymization program
- /tokenize/emagyar : only tokenizes the input
- /tokenize/huspacy
- /swap/emagyar
- /swap/huspacy
All endpoints require either a file input or a body: {"text":"text to process"}
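As an illustration of the body variant, the sketch below sends the text to one of the endpoints listed above using only the Python standard library. The HTTP method (POST), the JSON content type, and the base URL are assumptions; the /docs page of the running server shows the actual request and response schema. The function names are hypothetical.

```python
import json
import urllib.request


def build_request(base_url: str, endpoint: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying {"text": ...} as a JSON body (assumed format)."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_endpoint(base_url: str, endpoint: str, text: str, timeout: float = 60.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_request(base_url, endpoint, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `call_endpoint("http://localhost:8000", "/anonymization", "text to process")` would target the anonymization endpoint, with 8000 standing in for whatever PORT was set in .env.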
component diagram
@startuml
agent text
queue "morphological analysis" as morpho
database "Hungarian given names" as given
queue "generate form of pseudo anonymized name" as gen
queue NER

component emtsv
component huspacy
component NerKor
component PseudoAnonimizator as pseu

text --> pseu
pseu -right-> NER
NER --> NerKor
NerKor --> pseu
pseu -right-> morpho
morpho -- emtsv
morpho -- huspacy
pseu -right-> given : select pseudo name
pseu --> gen
gen -- emtsv
gen -- huspacy
@enduml