
Data Set up

Minh Le edited this page Mar 21, 2019 · 2 revisions

Downloading the Data

The MIMIC3 dataset can be found at https://mimic.physionet.org/. To access it, you must first complete an online course. Details can be found at the link.

Data Path in Hoffman

The data can also be found on the Hoffman cluster. Here are the common directories used so far. Files preprocessed by Dat can be found in /u/flashscratch/d/datduong/MIMIC3database. Files preprocessed by Minh can be found in /u/flashscratch/m/minhle/MIMIC3database.

Soft Links

For my projects, I set up several soft links relative to the root directory of the project. For the code to work out of the box, you should set up the same soft links with ln -s. If a directory does not exist, create it first, for example with mkdir -p data/MIMIC3database.

  1. data/MIMIC3database/raw -> /u/flashscratch/d/datduong/MIMIC3database
  2. data/processed -> /MIMIC3database/processed
  3. data/MIMIC3EachPerson -> /u/flashscratch/d/datduong/MIMIC3EachPerson

The preprocessing details are found in the notebooks themselves.
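
The soft links listed above could be created with a short script like the following. This is a minimal sketch, not part of the repo: the variable names RAW_DIR, PROCESSED_DIR and PERSON_DIR are made up for illustration, and the default for PROCESSED_DIR is copied as written on this page, so you will likely need to override it with the full path on your system. It assumes GNU ln, as found on a Linux cluster.

```shell
#!/bin/sh
# Run from the root directory of the project.
set -eu

# Link targets as listed on this page (variable names are hypothetical).
# PROCESSED_DIR is an assumption: override it with the full path you use.
RAW_DIR="${RAW_DIR:-/u/flashscratch/d/datduong/MIMIC3database}"
PROCESSED_DIR="${PROCESSED_DIR:-/MIMIC3database/processed}"
PERSON_DIR="${PERSON_DIR:-/u/flashscratch/d/datduong/MIMIC3EachPerson}"

# Create the parent directory for the "raw" link if it is missing.
mkdir -p data/MIMIC3database

# -s: symbolic link; -f: replace an existing link;
# -n: do not follow an existing link to a directory, so re-running
#     replaces the link rather than creating a new one inside its target.
ln -sfn "$RAW_DIR" data/MIMIC3database/raw
ln -sfn "$PROCESSED_DIR" data/processed
ln -sfn "$PERSON_DIR" data/MIMIC3EachPerson
```

To point a link somewhere else, set the variable when calling the script, for example PROCESSED_DIR=/path/to/processed sh make_links.sh (the script name here is also just an example).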

Preprocessing Data

There are a couple of scripts/Jupyter notebooks that do preprocessing. In particular, they are MIMIC3_data_processing.ipynb and MIMIC3TimeseriesDataPreprocessing.ipynb. To run these Jupyter notebooks, refer to the experiment section of the wiki. They read the "raw" files (which are Dat's preprocessed files) and convert them into data files that are usable by this repo's code.
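
Since the notebooks read from the linked directories, it may help to confirm that each soft link resolves before running them. The check below is only a sketch under that assumption; check_link is a hypothetical helper, not something provided by the repo.

```shell
#!/bin/sh
# Run from the root directory of the project.

# Report whether a path resolves to an existing directory.
# [ -d ] follows soft links, so a dangling link counts as missing.
check_link() {
    if [ -d "$1" ]; then
        echo "ok       $1"
    else
        echo "missing  $1"
        return 1
    fi
}

status=0
for link in data/MIMIC3database/raw data/processed data/MIMIC3EachPerson; do
    check_link "$link" || status=1
done

# Warn on stderr if any link did not resolve.
[ "$status" -eq 0 ] || echo "Fix the links marked missing before running the notebooks." >&2
```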
