| 1 | +# Bioconductor Archive Sync |
| 2 | + |
| 3 | +This Ansible playbook automates the process of syncing a Bioconductor release to the Open Storage Network (OSN) archive. It handles the full workflow of: |
| 4 | + |
| 5 | +1. Retrieving the specified Bioconductor version from the master Bioconductor server |
| 6 | +2. Creating the necessary directory structure locally |
| 7 | +3. Transferring the retrieved data to OSN for archival storage |
| 8 | + |
| 9 | +## Prerequisites |
| 10 | + |
| 11 | +### On Your Local Machine (Running Ansible) |
| 12 | + |
| 13 | +- Ansible installed (`pip install ansible`) |
| 14 | +- SSH access to the target server |
| 15 | +- SSH key for connecting to the target server |
| 16 | +- Knowledge of the target server's IP address and SSH user credentials |
| 17 | + |
| 18 | +### On The Target Machine (Running the Sync) |
| 19 | + |
| 20 | +- SSH key `~/.ssh/rsync.pem` for connecting to the Bioconductor master server |
| 21 | +- Rclone configuration file at `~/.rclone.conf` with an `[osn]` remote defined
| 22 | + |
| 23 | +Note: The playbook checks for these prerequisites on the target machine and installs rclone automatically if it is missing.
| 24 | + |
| 25 | +## How to Use |
| 26 | + |
| 27 | +The simplest way to run this playbook is by using the provided `run.sh` script. |
| 28 | + |
| 29 | +### Using run.sh |
| 30 | + |
| 31 | +The `run.sh` script simplifies execution by handling all the necessary parameters:
| 32 | + |
| 33 | +```bash |
| 34 | +./run.sh <ssh_key_path> <ip_address> [ssh_user] [bioc_version] |
| 35 | +``` |
| 36 | + |
| 37 | +#### Parameters: |
| 38 | + |
| 39 | +- **ssh_key_path**: Path to your SSH private key for connecting to the target server |
| 40 | +- **ip_address**: IP address of the target server where the sync will run |
| 41 | +- **ssh_user**: (Optional) SSH username for connecting to the target server (default: ubuntu) |
| 42 | +- **bioc_version**: (Optional) The Bioconductor version to sync (default: 3.21) |
| 43 | + |
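The contents of `run.sh` are not reproduced here, so the following is only a minimal sketch of how the two required and two optional parameters above could be handled in shell. The function name `parse_run_args` is hypothetical; the defaults (`ubuntu`, `3.21`) are the ones documented in the parameter list.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: the real run.sh may be organized differently.
# Resolves the four documented parameters, applying the documented defaults.
parse_run_args() {
  if [ "$#" -lt 2 ]; then
    echo "usage: run.sh <ssh_key_path> <ip_address> [ssh_user] [bioc_version]" >&2
    return 1
  fi
  local ssh_key_path="$1"
  local ip_address="$2"
  local ssh_user="${3:-ubuntu}"     # documented default user
  local bioc_version="${4:-3.21}"   # documented default version
  printf '%s %s %s %s\n' "$ssh_key_path" "$ip_address" "$ssh_user" "$bioc_version"
}

# Only the required arguments are given, so both defaults apply.
parse_run_args my_key.pem 192.168.1.100
```

Under these assumptions, omitting the last two arguments is equivalent to passing `ubuntu 3.21` explicitly.
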
| 44 | +#### Example usage: |
| 45 | + |
| 46 | +```bash |
| 47 | +./run.sh ~/.ssh/my_key.pem 192.168.1.100 ubuntu 3.20 |
| 48 | +``` |
| 49 | + |
| 50 | +### Best Practice: Run Multiple Times |
| 51 | + |
| 52 | +It's recommended to run the script at least twice: |
| 53 | +- The first run will transfer all the data, which may take significant time depending on the size of the Bioconductor release |
| 54 | +- Subsequent runs will be much faster and ensure that all transfers were successful
| 55 | +- If the second run shows no additional files being transferred or updated, it confirms that the synchronization is complete and consistent |
| 56 | + |
| 57 | +This approach relies on the way rsync and rclone compare source and destination: they transfer only files that have changed or are missing, so a subsequent run both verifies the result and completes any interrupted transfers.
| 58 | + |
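The "no additional files transferred" check can also be made explicit. The sketch below assumes you capture the output of a dry run yourself, for example from `rsync --dry-run --itemize-changes`, which prints one line per item it would change. The playbook's actual rsync and rclone commands are not shown in this README, so the helper names `count_pending` and `report_sync_state` and the `SRC/` and `DEST/` placeholders are illustrative only.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of the playbook.

# Count the non-empty lines on stdin (one line per pending item).
count_pending() {
  grep -c . || true   # grep exits non-zero when the count is 0; ignore that
}

# Summarize whether a dry run reported anything left to transfer.
report_sync_state() {
  local pending
  pending="$(count_pending)"
  if [ "$pending" -eq 0 ]; then
    echo "in sync"
  else
    echo "$pending item(s) still pending"
  fi
}

# Intended usage (SRC/ and DEST/ are placeholders):
#   rsync -a --dry-run --itemize-changes SRC/ DEST/ | report_sync_state

# Demonstration with one captured itemized line:
printf '>f+++++++++ packages/a.tar.gz\n' | report_sync_state
```

For the OSN side, `rclone check` compares a source against a destination in a similar spirit. Depending on the options used, rsync may also itemize directory timestamp changes, so treat a small non-zero count as a prompt to inspect the output rather than as proof of a failed transfer.
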
| 59 | +### Best Practice: Use Screen for Persistent Sessions |
| 60 | + |
| 61 | +It's recommended to use `screen`, especially on a VM, to ensure the process continues even if your connection to the VM is interrupted:
| 62 | + |
| 63 | +```bash |
| 64 | +# Start a new screen session |
| 65 | +screen -S bioc-sync |
| 66 | + |
| 67 | +# Now run the script inside the screen session |
| 68 | +./run.sh ~/.ssh/my_key.pem 192.168.1.100 ubuntu 3.21
| 69 | + |
| 70 | +# You can detach from the screen session with: Ctrl+A, then D |
| 71 | +``` |
| 72 | + |
| 73 | +After starting a screen session, you can leave it unattended for a couple of hours while the transfers run.
| 74 | + |
| 75 | +When returning to the session: |
| 76 | + |
| 77 | +```bash |
| 78 | +# If disconnected, you can reconnect to the VM and resume the session with: |
| 79 | +screen -r bioc-sync |
| 80 | +``` |
| 81 | + |
| 82 | +This approach protects your sync process from: |
| 83 | +- Network connectivity issues between your computer and the VM |
| 84 | +- Local computer shutdowns or sleep mode |
| 85 | +- SSH session timeouts |
| 86 | +- Accidental terminal closing |
| 87 | + |
| 88 | +The transfer will continue running on the VM even if your connection drops, and you can easily reconnect to check progress when needed. |