88 changes: 58 additions & 30 deletions README.md
@@ -8,57 +8,72 @@

This is the only official Python client library developed and supported by the ChEMBL group.

The library helps you access ChEMBL data and cheminformatics tools from Python. You don't need to know how to write SQL. You don't need to know how to interact with REST APIs. You don't need to compile or install any cheminformatics frameworks. Results are cached.
## Key Features

The client handles interaction with the HTTPS protocol and caches all results in the local file system for faster retrieval. By abstracting away all network-related tasks, the client provides the end user with a convenient interface, giving the impression of working with a local resource. The design is based on the Django QuerySet interface. The client also implements lazy evaluation of results, which means it will only evaluate a request for data when a value is required. This approach reduces the number of network requests and increases performance.
* Easy access to ChEMBL data and tools from Python
* No need to know how to write SQL or interact with REST APIs
* No need to compile or install any cheminformatics frameworks
* Automatic caching of results
* Abstraction of all network-related tasks
* Lazy evaluation of results to improve performance and reduce network requests
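
Lazy evaluation combined with caching can be pictured with a small, generic sketch. The `LazyResult` class and the `fake_request` function below are hypothetical stand-ins written for illustration only; they are not part of the client and do not reflect its actual implementation.

```python
class LazyResult:
    """Illustrative stand-in: defers a fetch until the data is actually needed."""

    def __init__(self, fetch):
        self._fetch = fetch   # zero-argument callable standing in for a network request
        self._data = None     # filled in on first access, then reused

    def _load(self):
        # Run the fetch only once; later accesses reuse the stored result.
        if self._data is None:
            self._data = list(self._fetch())
        return self._data

    def __iter__(self):
        return iter(self._load())

    def __len__(self):
        return len(self._load())


calls = []  # records how many times the fake request was issued


def fake_request():
    # Hypothetical replacement for a real HTTPS call.
    calls.append("request")
    return ["CHEMBL25"]


result = LazyResult(fake_request)
requests_before_access = len(calls)  # creating the object triggers no request
first = list(result)                 # first access triggers the request
second = list(result)                # second access is served from the stored data
requests_after_access = len(calls)
```

In this sketch the request runs only when the data is first iterated, and repeated access does not issue a second request, which is the general idea behind the last two features listed above.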

## Installation

Requires Python version >= 3.7. It can be installed using pip:
```bash
pip install chembl_webresource_client
```

## Live Jupyter notebook with examples

[Click here](http://beta.mybinder.org/v2/gh/chembl/chembl_webresource_client/master?filepath=demo_wrc.ipynb)
Explore live Jupyter notebooks [here](http://beta.mybinder.org/v2/gh/chembl/chembl_webresource_client/master?filepath=demo_wrc.ipynb)

## Available filters

The design of the client is based on Django QuerySet (https://docs.djangoproject.com/en/1.11/ref/models/querysets) and the most important lookup types are supported. These are:
```bash
_exact
_iexact
_contains
_icontains
_in
_gt
_gte
_lt
_lte
_startswith
_istartswith
_endswith
_iendswith
_range
_isnull
_regex
_iregex
_search
```
## Example

- exact
- iexact
- contains
- icontains
- in
- gt
- gte
- lt
- lte
- startswith
- istartswith
- endswith
- iendswith
- range
- isnull
- regex
- iregex
- search
``` python
from chembl_webresource_client.new_client import new_client

molecule = new_client.molecule
mols = molecule.filter(pref_name__iexact='aspirin')
mols
```
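
In the example above, the lookup type is joined to the field name with a double underscore (`pref_name__iexact`). The sketch below shows how such keyword arguments could be assembled and checked against the supported lookup types. `build_lookup` and `SUPPORTED_LOOKUPS` are hypothetical names introduced for illustration; they are not provided by the client.

```python
# Lookup types listed in this README.
SUPPORTED_LOOKUPS = {
    "exact", "iexact", "contains", "icontains", "in", "gt", "gte", "lt", "lte",
    "startswith", "istartswith", "endswith", "iendswith", "range", "isnull",
    "regex", "iregex", "search",
}


def build_lookup(field, lookup, value):
    """Return filter keyword arguments such as {'pref_name__iexact': 'aspirin'}."""
    if lookup not in SUPPORTED_LOOKUPS:
        raise ValueError(f"Unsupported lookup type: {lookup!r}")
    # Field name and lookup type are joined with a double underscore.
    return {f"{field}__{lookup}": value}


kwargs = build_lookup("pref_name", "iexact", "aspirin")
print(kwargs)  # {'pref_name__iexact': 'aspirin'}
```

With the client installed, the resulting dictionary could be unpacked into the call, as in `new_client.molecule.filter(**kwargs)`, which is the same as writing `filter(pref_name__iexact='aspirin')`.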

## Only operator
## 'only' operator

`only` is a special method that limits the results to a selected set of fields. `only` should take a single argument: a list of fields that should be included in the result. The specified fields have to exist in the endpoint against which `only` is executed. Using `only` will usually make an API call faster because returning less information saves bandwidth. The API logic will also check whether any SQL joins are necessary to return the specified fields and exclude unnecessary joins, which greatly improves performance.
The `only` method allows you to limit the results to a selected set of fields. It should take a **single argument**: a list of fields that should be included in the result. The specified fields have to exist in the endpoint against which `only` is executed. Using `only` will usually make an API call faster because returning less information saves bandwidth. The API logic will also check whether any SQL joins are necessary to return the specified fields and exclude unnecessary joins, which greatly improves performance.

Please note that only has one limitation: a list of fields will ignore nested fields i.e. calling only(['molecule_properties__alogp']) is equivalent to only(['molecule_properties']).
### Limitations

For many-to-many relationships, only will not make any SQL join optimisation.
* A list of fields will ignore nested fields i.e. calling `only(['molecule_properties__alogp'])` is equivalent to `only(['molecule_properties'])`.

* For many-to-many relationships `only` will not make any SQL join optimisation.
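
The nested-field limitation can be modelled with a short sketch, assuming the effect is the same as cutting each field name at the first double underscore. `top_level_fields` is a hypothetical helper that only mimics the documented behaviour; it is not the library's code, and dropping repeated names is an assumption made here for readability.

```python
def top_level_fields(fields):
    """Collapse nested names like 'a__b' to 'a', keeping order and dropping repeats."""
    seen = []
    for name in fields:
        top = name.split("__", 1)[0]  # keep only the part before the first '__'
        if top not in seen:
            seen.append(top)
    return seen


requested = ["pref_name", "molecule_properties__alogp"]
effective = top_level_fields(requested)
print(effective)  # ['pref_name', 'molecule_properties']
```

With the client installed, a typical call would pass a single list, for example `molecule.filter(pref_name__iexact='aspirin').only(['pref_name', 'molecule_properties'])`.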

## Settings

In order to use settings you need to import them before using the client:
## Settings

To configure the client, import the settings object:
```python
from chembl_webresource_client.settings import Settings
```
@@ -69,7 +84,7 @@ Settings object is a singleton that exposes Instance method, for example:
Settings.Instance().TIMEOUT = 10
```

Most important options:
Key options include:

CACHING: should results be cached locally (default is True)
CACHE_EXPIRE: cache expiry time in seconds (default 24 hours)
@@ -78,7 +93,20 @@ Most important options:
CONCURRENT_SIZE: total number of concurrent requests (default is 50)
FAST_SAVE: speeds up cache saving by up to 50 times, but with a possibility of data loss (default is True)
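
The singleton access pattern used for these options can be sketched without the library installed. `DemoSettings` below is a hypothetical stand-in, not the client's `Settings` class; it copies only the defaults stated above and shows why a value set through `Instance()` is visible everywhere.

```python
class DemoSettings:
    """Hypothetical stand-in for a singleton settings object (not the library's class)."""

    _instance = None

    def __init__(self):
        # Defaults taken from the option list in this README.
        self.CACHING = True
        self.CACHE_EXPIRE = 60 * 60 * 24  # 24 hours, in seconds
        self.CONCURRENT_SIZE = 50
        self.FAST_SAVE = True

    @classmethod
    def Instance(cls):
        # Create the shared object on first use, then always return the same one.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance


# Change options once, before making any requests.
DemoSettings.Instance().CACHE_EXPIRE = 60 * 60  # one hour
DemoSettings.Instance().FAST_SAVE = False       # trade speed for safer cache writes

same_object = DemoSettings.Instance() is DemoSettings.Instance()
print(same_object, DemoSettings.Instance().CACHE_EXPIRE)
```

With the real client, the same style of assignment applies to `Settings.Instance()`, as in the `TIMEOUT` example above.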

## Contributions

If you would like to contribute to this project, please follow these steps:

1. Fork this repository
2. Clone the forked repository to your local device
3. Make your changes, commit, and push to your fork
4. Resolve any merge conflicts
5. Create a pull request to the main repository

## Citing

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4489243/
For citations, refer to: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4489243/

## License

The content of this repository is licensed under the Apache Software License.