Use Cases
The first and foremost use case is for data generators and curators who wish to deposit their data on a remote OMERO server, typically as part of a data sharing workflow. For simplicity, the instructions in this use case assume that the user is running the toolkit on a Windows PC, and therefore include the initial steps for running the relevant Docker container. On Windows 10 Pro 64-bit systems, Hyper-V virtualization must be enabled in the system BIOS [26] and Docker Desktop [27] installed; the Docker images can then be built from the Dockerfiles repository [28] using the PowerShell [29] Command Line Interface (CLI), or simply pulled from the DockerHub repository. Users running Linux can install the Docker Engine Community [30] version, and Mac OS users can install Docker Desktop for Mac [31]. If the user is running the toolkit on 64-bit Linux or Mac OS, the Windows-specific setup steps can be omitted.
- Pull the Docker images in Windows PowerShell or a terminal:
$ docker pull omero_uploader
- Run the omero-upload container with a read-only volume mounted to a host data directory in the ‘Uploader’ Windows user space:
$ docker run -t -d --name omero-uploader -v '/c/Users/Uploader/microscope_data:/home/jovyan/microscope_data:ro' omero_uploader
- Connect to the running Docker container:
$ docker exec -it omero-uploader /bin/bash
- Ensure the appropriate Conda environment is activated, where ‘{CONDA_ENV}’ is ‘base’ in the Docker images and ‘omero_uploader’ for Linux and Mac OS:
$ source activate {CONDA_ENV}
- Execute the PyOmeroUpload data transfer function to upload the contents of the specified data directory, using the following parameters, each of which is described in the table below:
$ python data_transfer_manager.py --data_path /home/jovyan/microscope_data --dataset_name {DATASET_NAME} --hypercube {true|false} --module_path {CUSTOM_MODULE_PATH} --parser_cls {PARSER_CLASSNAME} --image_processor_cls {IMAGE_PROCESSOR_CLASSNAME}
The user specifies the designated metadata parser(s), the target directory and, if desired, an alternative custom data transformation function with which to process collections of single images into n-dimensional images. If no data transformation function is specified but the hypercube option is still present, the uploader will attempt to transform the data into five-dimensional hypercubes according to the following rules:
- Target directory contains sub-directories named ‘pos{xxx}’, each of which corresponds to a microscope position, where ‘{xxx}’ is a unique numeric identifier for that position
- Within each sub-directory, there are multiple image files per z-section, time point and channel
- Each image file adheres to a naming convention of ‘{abc}{z-section}{channel}_{timepoint}’, where ‘{abc}’ can be any arbitrary string [TODO: confirm filename convention]
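Under these assumptions, the file-grouping step can be sketched in Python. This is illustrative only: the filename convention is marked as unconfirmed above, so the regular expressions, the `index_images` helper and the example filenames are assumptions introduced here, not the toolkit's actual implementation.

```python
import re
from pathlib import Path

# Assumed conventions (see TODO above): position directories 'pos{xxx}',
# filenames '{abc}{z-section}{channel}_{timepoint}.{ext}',
# e.g. 'brightfield001GFP_012.png'.
POS_RE = re.compile(r"^pos(\d+)$")
FILE_RE = re.compile(r"^(?P<prefix>.*?)(?P<z>\d+)(?P<channel>[A-Za-z]+)_(?P<t>\d+)\.\w+$")

def index_images(root):
    """Map (position, z-section, channel, timepoint) -> image file path."""
    index = {}
    for pos_dir in Path(root).iterdir():
        m = POS_RE.match(pos_dir.name)
        if not (pos_dir.is_dir() and m):
            continue  # skip anything that is not a 'pos{xxx}' directory
        position = int(m.group(1))
        for f in pos_dir.iterdir():
            fm = FILE_RE.match(f.name)
            if fm:
                key = (position, int(fm["z"]), fm["channel"], int(fm["t"]))
                index[key] = f
    return index
```

An index keyed this way maps directly onto the five dimensions of the hypercube (x and y coming from the pixels of each image).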
The second use case is aimed at researchers who wish to perform ad hoc deposits of their data, or those who might wish to browse, search, query and visualize (meta)data assets stored in an OMERO server with all the power and flexibility of a Jupyter notebook. A selection of sample Jupyter notebooks in the ‘omero_jupyter/notebooks’ directory of the OMEROConnect GitHub repository provides examples of key operations, as described in the table below.
These notebooks can be run on any Jupyter notebook server, provided the required PyOmeroUpload, OMERO Python and requests libraries are installed in the corresponding Conda environment. As with the first use case, the following instructions assume that the user is running the toolkit on a Windows PC, and therefore include the initial steps for running the relevant Docker container.
- Pull the Docker images in Windows PowerShell or a terminal:
$ docker pull omero_jupyter
- Run the omero-jupyter container with a volume mounted to a host data directory in the ‘Uploader’ Windows user space, a writable volume mounted to the sample notebooks directory and the guest port ‘8888’ exposed through the host port ‘8888’:
$ docker run --name omero-jupyter -p 8888:8888 -v '/c/Users/Uploader/microscope_data:/home/jovyan/microscope_data:ro' -v '/c/Users/Uploader/git_projects/OMEROConnect/omero_jupyter/notebooks:/home/jovyan/work:rw' omero_jupyter
- A Jupyter Notebook server is now running in the Docker container, and the port mapping permits access through an internet browser running on the host computer, i.e. via http://127.0.0.1:8888. An access token string is required, which is either available in the output from the command executed in step 2, or by executing the ‘jupyter notebook list’ command in the Docker container (or on the Jupyter Notebook server host)
- The sample notebooks are available in the ‘/work’ directory in Jupyter, and any saved modifications are persisted in the original files located on the host computer, due to the writable mounted volume
The PyOmeroUpload data transfer manager functions can be invoked directly, by importing the relevant packages. Assuming the omero-jupyter container has been initiated with the required volume mounts (see the OMEROConnect repository README.md), the sample Jupyter Notebook ‘test_omero_upload.ipynb’ in the work directory can be executed to demonstrate upload to the demonstration OME OMERO server. Users must provide their own login credentials for access to this server, which requires registration [34]. Alternatively, if users have access to another OMERO server, the relevant configuration parameters can be adjusted in the notebook’s global variables. Other notebooks in the image are ‘test_omero_query.ipynb’ and ‘test_omero_api.ipynb’, for data retrieval from an OMERO server via the service-level APIs and the JSON API, respectively. These notebooks demonstrate interactive querying, and can be adapted for deeper analysis and visualization using the included Python libraries such as Pandas [35], NumPy [36], Matplotlib [37] and seaborn [38].
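As a complement to the sample notebooks, the JSON API route can be sketched with the requests library. The server URL below is a placeholder, and the ‘/api/v0/…’ URL layout is an assumption based on the general OMERO JSON API conventions; check the target server’s ‘/api/’ endpoint for the versions it actually offers, and note that authenticated access additionally requires the CSRF-token login handshake described in the OMERO documentation.

```python
import requests

OMERO_WEB = "https://omero.example.org"  # placeholder: your OMERO.web server

def api_url(base, version="v0", resource=""):
    """Build a JSON API URL, assuming the /api/{version}/{resource} layout."""
    return f"{base}/api/{version}/{resource}"

def list_projects(base=OMERO_WEB):
    """Fetch the projects visible to the current (or anonymous) session."""
    response = requests.get(api_url(base, resource="m/projects/"))
    response.raise_for_status()
    # The JSON API wraps results in a 'data' list of project dictionaries.
    return response.json()["data"]
```

Within a notebook, a returned list of project dictionaries can be loaded straight into a Pandas DataFrame for tabular inspection and plotting.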
There are myriad file formats, structures and ontologies associated with metadata collected during experimental data generation. Researchers may also discover that standard metadata formats are not adequate for describing their data as richly as desired, and so laboratory groups often adapt default acquisition log and device output to support their (meta)data requirements. Consequently, situations arise where captured metadata are stored either in semi-structured text or according to customized schema. Naturally, it would be futile to attempt to develop software that could extract meaningful information from such a variety of potential inputs.
The PyOmeroUpload toolkit is designed with modularity in mind, allowing specification of a custom metadata parser at runtime. Invoking the data transfer manager with the command line arguments identified in Use Case 1 means that any Python class implementing the metadata parser interface can be substituted for the default parser. The interface is simple and mandates only one function, ‘extract_metadata’, which must return a Python dictionary containing any metadata tags, key-value pairs and table elements in the form of ‘{ “tags”: [], “kvps”: [], “tables”: [] }’. The tags list is simply a collection of string values, the KVPs list is an array of two-element [key, value] arrays, and the tables list is a collection of Pandas DataFrame [39] objects, complete with name attributes that have been assigned a value. The dictionary object returned from the metadata extraction is retrieved by the data transfer manager, then passed to the broker instance, which processes and uploads the metadata as child objects linked to the parent dataset in the OMERO server.
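A minimal parser satisfying this interface might look as follows. The ‘extract_metadata’ name and the shape of the returned dictionary come from the interface described above; the class name, the argument list and all example values are hypothetical, chosen only to show the expected types.

```python
import pandas as pd

class MinimalMetadataParser:
    """Sketch of a custom parser for the PyOmeroUpload parser interface."""

    def extract_metadata(self, data_path=None):
        # Tags: a plain list of string values.
        tags = ["example-tag", "60x-objective"]
        # KVPs: an array of two-element [key, value] arrays.
        kvps = [["temperature", "30C"], ["medium", "synthetic-complete"]]
        # Tables: Pandas DataFrames, each with a name attribute assigned.
        table = pd.DataFrame({"position": [1, 2], "z_sections": [5, 5]})
        table.name = "acquisition_summary"
        return {"tags": tags, "kvps": kvps, "tables": [table]}
```

Passing such a class via the ‘--module_path’ and ‘--parser_cls’ arguments substitutes it for the default parser, and the broker handles the upload of whatever the dictionary contains.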
For developers, the omero_ide Docker image provides a fully fledged IDE (Integrated Development Environment) featuring JetBrains’ PyCharm [40] Community edition. Like the Jupyter image, the IDE image comes with the same pre-built Conda environment and required libraries installed. In addition, the IDE container runs an OpenSSH [41] server that enables users to establish an X11 [42] SSH connection so that the IDE GUI can be displayed as if the IDE were running on the host system. For Linux and Mac OS users, the connection can be established simply by entering the standard ‘ssh -X jovyan@127.0.0.1 -p 2222’ in the command terminal. For Windows users, an X Server application such as MobaXTerm [43] or XMing [44] must be installed, followed by the appropriate instructions to create an X11-enabled SSH session with username ‘jovyan’, host ‘127.0.0.1’ and port ‘2222’.