This web interface is designed to convert words in Dutch dialects ("dialectopgaven") into standard Dutch keywords ("vernederlandste trefwoorden"), which can be used as the basis for interoperable searches across various dialect dictionaries.
You can upload a list of Dutch dialect words (in UTF-8 format). These are automatically converted to suggestions for keywords. This may take a while, but the tool will send you an email when it is done. After that, the tool displays a few suggestions for keywords, from which you can select the correct one. Alternatively, you can copy the closest one and correct it manually. When you are done, you can download the result as a text file.
This project was developed within the LaMachine virtual environment. If used within LaMachine, it requires no package installation. The dependencies are the following:
- You can clone this repository to your machine with the following command:
$ git clone https://github.com/LanguageMachines/dialect2keywords.git
- Activate your virtual environment. To activate a LaMachine environment, please consult the LaMachine Usage Documentation.
- Before running the program, you need to set the following environment variables each time:
  - SECRET_KEY: Random string value
  - EMAIL_HOST: Email service host
  - EMAIL_PORT: Email service connection port
  - EMAIL_HOST_USER: Sender email address
  - EMAIL_HOST_PASSWORD: Sender email password
Please consult the Django Cryptographic Signing and Sending Emails with Django documentation for a better understanding of these variables.
An example command for setting one of these variables:
$ export EMAIL_HOST_USER='< insert the custom value required >'
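For completeness, here is a sketch that sets all five variables in one go. Every value below is a placeholder and an assumption (the host name, port 587, and the addresses are illustrative, not defaults of this project); substitute the settings of your own email provider.

```shell
# Placeholder values only: replace each one with the settings of your own setup.
export SECRET_KEY='replace-with-a-long-random-string'
export EMAIL_HOST='smtp.example.org'
export EMAIL_PORT='587'
export EMAIL_HOST_USER='sender@example.org'
export EMAIL_HOST_PASSWORD='replace-with-the-sender-password'
```

Since the variables must be set each time, keeping these lines in a private file and loading it with `. ./your-file` before starting the server may be convenient; keep such a file out of version control because it contains a password.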
- Go to the main directory of the repository:
$ cd /path/to/the/repository/dialect2keyword/
- Run the Django server on a specific port number. In the example below, replace the PORT variable with a 4-digit number. If no port is given, the default port number is 8000. For further information, consult the Django Runserver Documentation.
$ python manage.py runserver PORT
Example:
$ python manage.py runserver 7658
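Because the server relies on the environment variables listed earlier, a small pre-flight check before starting it can catch a forgotten export. The following is only a sketch: `check_env` is a hypothetical helper and not part of the repository, and it assumes that an empty value counts as "not set".

```shell
# check_env is a hypothetical helper, not part of the repository.
# It reports every required variable that is unset or empty.
check_env() {
    missing=0
    for name in SECRET_KEY EMAIL_HOST EMAIL_PORT EMAIL_HOST_USER EMAIL_HOST_PASSWORD; do
        # eval reads the variable whose name is stored in $name (portable to POSIX sh).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name" >&2
            missing=1
        fi
    done
    # Return 0 when everything is set, 1 otherwise.
    return "$missing"
}
```

It could then be combined with the server start, for instance as `check_env && python manage.py runserver 7658`.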
- If you are running the program on a remote server and would like to reach the interface from the browser of your local machine, you can create an SSH tunnel to the remote server. If there is no domain name configured for the remote server, you can connect to the public IP address of the server.
$ ssh -L PORT:localhost:PORT username@host.domain.name.com
OR
$ ssh -L PORT:localhost:PORT username@public.ip.address
Example:
$ ssh -L 7658:localhost:7658 [email protected]
In the above examples, the username, host.domain.name.com, public.ip.address, and PORT variables need to be replaced with actual values. PORT needs to be the same port number you set when running the Django server (see the previous step).
- After the SSH tunneling, you can go to localhost:PORT in your browser to see the interface. Again, you need to replace PORT with the same 4-digit number that was set when running the Django server. In the case of our examples above, this would be localhost:7658.
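The one thing that must line up across the last steps is the port number. As a sketch of that relationship, the hypothetical helper below (`tunnel_commands`, not part of the repository) prints the three steps for a given port and remote login without running any of them.

```shell
# tunnel_commands is a hypothetical helper, not part of the repository.
# It only prints the three steps; it does not start a server or open a tunnel.
tunnel_commands() {
    port="${1:-}"     # port chosen for the Django server, e.g. 7658
    remote="${2:-}"   # username@host of the remote server, supplied by you
    case "$port" in
        ''|*[!0-9]*) echo "port must be a number" >&2; return 1 ;;
    esac
    if [ -z "$remote" ]; then
        echo "remote must not be empty" >&2
        return 1
    fi
    # The same port appears in all three places.
    echo "on the server:   python manage.py runserver $port"
    echo "on your machine: ssh -L $port:localhost:$port $remote"
    echo "in the browser:  http://localhost:$port"
}
```

For example, `tunnel_commands 7658 username@host.domain.name.com` prints the commands for port 7658, with the login still to be replaced by your own.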
Replacing the current Phonetisaurus models:
The Phonetisaurus model files used by the system are stored in the models folder. If you have created a new Phonetisaurus model and would like it to be used by the system, moving the new files to the models folder and replacing the old ones is enough. Keeping the names of the new files the same as the old ones (as listed below) avoids any change in the code.
phonetisaurus-model.corpus
phonetisaurus-model.fst
phonetisaurus-model.o8.arpa
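A minimal sketch of such a replacement, assuming you want to keep a copy of the old files: `install_models` is a hypothetical helper (not part of the repository), and the backup location next to the models folder is an arbitrary choice.

```shell
# install_models is a hypothetical helper, not part of the repository.
# Usage: install_models NEW_FILES_DIR MODELS_DIR
install_models() {
    src_dir="${1:-}"      # folder holding the newly trained model files
    models_dir="${2:-}"   # the models folder of the repository
    files="phonetisaurus-model.corpus phonetisaurus-model.fst phonetisaurus-model.o8.arpa"
    backup_dir="${models_dir%/}.backup"   # assumed backup location

    # Check first that all three new files exist, so no half-replaced set is left.
    for name in $files; do
        if [ ! -f "$src_dir/$name" ]; then
            echo "missing: $src_dir/$name" >&2
            return 1
        fi
    done

    mkdir -p "$backup_dir" || return 1
    for name in $files; do
        # Keep a copy of the old file, if there is one, before overwriting it.
        if [ -f "$models_dir/$name" ]; then
            cp "$models_dir/$name" "$backup_dir/$name" || return 1
        fi
        cp "$src_dir/$name" "$models_dir/$name" || return 1
    done
}
```

Called as `install_models /path/to/new/files models` from the main directory of the repository, it leaves the previous models in models.backup so they can be restored by copying them back.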