TranslatorSRI/NameResolution

Name resolution service

This service takes lexical strings and attempts to map them to identifiers (CURIEs) from a vocabulary or ontology.
The lookup is not limited to exact matches; partial matches are also returned.

Multiple results may be returned, each representing a possible conceptual match, but all returned identifiers have been normalized using the NodeNormalization service.

See the documentation notebook for examples of use.

Setting up NameRes locally

NameRes requires an Apache Solr database and the NameRes frontend running in Python. The easiest way to set this up is by using the Docker Compose setup included in this repository, although you will need either (1) a set of synonyms files generated by Babel to load into Solr, or (2) a Solr database backup to load into Solr. The sections below cover both approaches.

Starting NameRes locally with loading from a Solr backup

The simplest way to run NameRes locally is by using a Solr backup from another NameRes instance or from Translator.

  1. Make sure you have Docker installed; this should come with Docker Compose.
  2. Create the local directory where your Solr data will be stored -- by default, this is ./data/solr in this directory, but you can change this in docker-compose.yml. This directory will need approximately 400G of space at peak: 104G for the downloaded file (which can be deleted once decompressed), 147G for the uncompressed backup (which can be deleted once restored) and 147G for the Apache Solr databases.
  3. Download the Solr backup you want to use into your Solr data directory. It should be approximately 104G in size.
  4. Uncompress the Solr backup file. It should produce a var/solr/data/snapshot.backup directory in the Solr data directory (by default, ./data/solr/var/solr/data/snapshot.backup). You can delete the downloaded file (snapshot.backup.tar.gz) once it has been decompressed.
  5. Check the docker-compose.yml file to ensure that it is as you expect.
    • By default, the Docker Compose file will use the latest released version of NameRes as the frontend. To use the source code in this repository, you will need to change the build instructions for the nameres service in the Docker Compose file.
  6. Start the Solr and NameRes containers by running docker-compose up. By default, Docker Compose will download and start the relevant containers and show you logs from both. You may press Ctrl+C to stop them.
  7. Look for a line similar to Uvicorn running on http://0.0.0.0:2433 (Press CTRL+C to quit), which tells you where NameRes is running.
    • The web frontend (http://0.0.0.0:2433/docs) defaults to using the NameRes RENCI Dev server; you will need to change the "Servers" setting to use your local NameRes instance.
    • Note that looking up http://0.0.0.0:2433/status will give you an error (Expected core not found.). This is because the Solr database and indexes have not yet been loaded.
  8. Run the Solr restore script using bash, i.e. bash solr-restore/restore.sh. This script assumes that the Solr container is available on localhost:8983 and contains a var/solr/data/snapshot.backup directory with the data to restore.
  9. Wait for the script to finish successfully (it should end with Solr restore complete!). Look up http://localhost:2433/status to ensure that the database has been loaded as expected, and use http://localhost:2433/docs (after changing the server) to try some test queries to make sure NameRes is working properly.
  10. You can now delete the uncompressed database backup in $SOLR_DATA/var to save disk space.
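
The disk-space and status checks in the steps above can be scripted. The following is a minimal sketch rather than part of NameRes itself: the function names are hypothetical, the sizes are the approximate figures from step 2, and treating "G" as 10^9 bytes is an assumption. It makes no assumptions about the format of the /status response and simply returns the HTTP status code and the raw body.

```python
import shutil
import urllib.error
import urllib.request

# Approximate sizes in GB, taken from step 2 of the instructions above.
DOWNLOAD_GB = 104
UNCOMPRESSED_BACKUP_GB = 147
SOLR_INDEX_GB = 147


def peak_space_needed_gb():
    """Worst case: download, uncompressed backup and restored index all on disk at once."""
    return DOWNLOAD_GB + UNCOMPRESSED_BACKUP_GB + SOLR_INDEX_GB


def free_space_gb(path):
    """Free space (in GB, assuming 1 GB = 10^9 bytes) on the filesystem holding `path`."""
    return shutil.disk_usage(path).free / 1e9


def fetch_status(base_url="http://localhost:2433", timeout=10):
    """Return (HTTP status code, body text) for GET <base_url>/status.

    HTTP error responses are returned rather than raised, so the caller can
    inspect the error message that NameRes reports before the restore has run.
    """
    url = base_url.rstrip("/") + "/status"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8")
```

Before downloading, comparing free_space_gb("./data/solr") against peak_space_needed_gb() shows whether the default data directory has enough room. After the restore script finishes, fetch_status() returns whatever your local instance reports at /status.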

Loading from synonyms files

The best way to do this is by using the data-loading Docker image.

Python packaging

Currently, NameRes is only packaged as a Docker image (see Dockerfile), but you can also run it directly via Uvicorn.

$ python -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
$ bash main.sh

Kubernetes

Helm charts can be found at https://github.com/helxplatform/translator-devops/helm/name-lookup.

Examples

curl -X POST "http://localhost:2433/lookup?string=oxycod&offset=0&limit=10" -H "accept: application/json"
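
The same request can be made from Python using only the standard library. This is a sketch under the assumption that NameRes is running locally on port 2433 as described above; build_lookup_url and lookup are hypothetical helper names, and no assumption is made about the structure of the JSON that comes back.

```python
import json
import urllib.parse
import urllib.request


def build_lookup_url(base_url, string, offset=0, limit=10):
    """Build the /lookup URL with the same query parameters as the curl example."""
    query = urllib.parse.urlencode({"string": string, "offset": offset, "limit": limit})
    return f"{base_url.rstrip('/')}/lookup?{query}"


def lookup(base_url, string, offset=0, limit=10, timeout=30):
    """POST to /lookup and return the decoded JSON response as-is."""
    request = urllib.request.Request(
        build_lookup_url(base_url, string, offset, limit),
        headers={"accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example usage (requires a running NameRes instance):
#   results = lookup("http://localhost:2433", "oxycod")
#   print(json.dumps(results, indent=2))
```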

Configuration

NameRes can be configured by setting environment variables:

  • SOLR_HOST and SOLR_PORT: Hostname and port for the Solr database containing NameRes information.
  • SERVER_NAME: The name of this server (defaults to infores:sri-name-resolver)
  • SERVER_ROOT: The server root (defaults to /)
  • MATURITY_VALUE: The maturity of this NameRes instance, e.g. development (defaults to maturity)
  • LOCATION_VALUE: Where this NameRes instance is set up, e.g. RENCI (defaults to location)
  • OTEL_ENABLED: Turn on OpenTelemetry (default: 'false') -- only 'true' will turn this on.
    • JAEGER_HOST and JAEGER_PORT: Hostname and port for the Jaeger instance to provide telemetry to.
    • JAEGER_SERVICE_NAME: The name of this service (defaults to the value of SERVER_NAME)
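
To illustrate how these variables and their defaults fit together, here is a minimal sketch of reading them in Python. It is not the NameRes implementation: load_config is a hypothetical helper, the localhost:8983 fallback for Solr is an assumption borrowed from the restore script's expectations, and the exact, case-sensitive comparison with 'true' is one reading of the OTEL_ENABLED description.

```python
import os


def load_config(env=None):
    """Collect NameRes settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    server_name = env.get("SERVER_NAME", "infores:sri-name-resolver")
    return {
        # Assumed fallbacks: the list above does not state defaults for Solr.
        "solr_host": env.get("SOLR_HOST", "localhost"),
        "solr_port": int(env.get("SOLR_PORT", "8983")),
        "server_name": server_name,
        "server_root": env.get("SERVER_ROOT", "/"),
        "maturity": env.get("MATURITY_VALUE", "maturity"),
        "location": env.get("LOCATION_VALUE", "location"),
        # Only the exact string 'true' enables telemetry (assumed case-sensitive).
        "otel_enabled": env.get("OTEL_ENABLED", "false") == "true",
        # No defaults are documented for Jaeger, so these are None when unset.
        "jaeger_host": env.get("JAEGER_HOST"),
        "jaeger_port": env.get("JAEGER_PORT"),
        # Falls back to the server name, as described above.
        "jaeger_service_name": env.get("JAEGER_SERVICE_NAME", server_name),
    }
```

Passing a plain dictionary instead of os.environ makes it easy to check how a given combination of settings will be interpreted before starting the service.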
