Add DD KG instructions from email to Wiki #132

Open
jeet-vora opened this issue Mar 5, 2025 · 0 comments

Recommended development process:

  1. Revise the edge and node files for BIOMARKER.
  2. Optional: Upload the edge and node files to the Globus folder.
  3. Take a copy of the latest set of ontology CSVs of the Data Distillery minus the Biomarker data (DD-no-BIOMARKER) and add it to your ETL environment.
  4. Add your new edge and node files to the folder that corresponds to the download folder of your Globus Connect Personal setup. Your copy of edges_nodes.ini should point to this folder. For example, I download everything from Globus to a subfolder of my Documents folder on my macOS machine. My ini file looks like:

         [Paths]
         # Local paths containing ingestion files
         ...
         BIOMARKER=/Users/jas971/documents/globus/Import/BIOMARKER

  5. Run the ingestion script (./build_csv.sh -v BIOMARKER) to generate a new set of ontology CSVs. This integrates your version of BIOMARKER with DD-no-BIOMARKER.
  6. Using the ontology CSVs generated in step 5, execute the workflow described in ubkg-neo4j to build a Docker container. As you've probably experienced, the longest waits are in the import of the CSVs and the creation of the relationship indexes. (Pro [or maybe jaded amateur] tip: if you find the import taking forever, especially for relationships, you're probably running into memory issues. Reboot and do over.)
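
Before kicking off step 5, it may be worth confirming that edges_nodes.ini really points at a folder that exists, since a typo there is easy to miss. The following is a minimal sketch, not part of the ingestion scripts: the function name `resolve_ingest_path` is hypothetical, and it assumes the ini file is plain `key=value` under a `[Paths]` section as in the example above (the literal `...` placeholder would not parse).

```python
import configparser
from pathlib import Path


def resolve_ingest_path(ini_file, sab="BIOMARKER"):
    """Return the folder that the ini file maps to `sab` under [Paths], or raise.

    Hypothetical helper: the section and key names follow the example ini above.
    """
    # interpolation=None so a literal '%' in a path is not treated specially
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(ini_file):
        raise FileNotFoundError(f"could not read {ini_file}")
    if not parser.has_option("Paths", sab):
        raise KeyError(f"no {sab} entry under [Paths] in {ini_file}")
    folder = Path(parser.get("Paths", sab)).expanduser()
    if not folder.is_dir():
        raise NotADirectoryError(f"{folder} does not exist or is not a folder")
    return folder
```

Calling `resolve_ingest_path("edges_nodes.ini")` from the folder that holds your ini file should return the BIOMARKER folder, or fail with a message that says which part is wrong.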

The Zip of the CSVs for the Jan 3 Data Distillery except for BIOMARKER is available at https://ubkg-downloads.xconsortia.org/.

The file name is DD_no_BIOMARKER03Jan2025.zip.
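
For step 3, once the Zip is downloaded, something like the sketch below can unpack it into your ETL environment. This is only an illustration: the function name `unpack_ontology_csvs` and the destination folder are hypothetical, and I'm assuming nothing about how the files are laid out inside the Zip beyond the CSVs having a `.csv` extension.

```python
import zipfile
from pathlib import Path


def unpack_ontology_csvs(zip_path, dest_dir):
    """Extract every member of the zip into dest_dir and return the CSV paths found.

    Hypothetical helper: dest_dir is wherever your ETL environment expects the CSVs.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Search recursively in case the archive wraps the CSVs in a subfolder
    return sorted(dest.rglob("*.csv"))
```

For example, `unpack_ontology_csvs("DD_no_BIOMARKER03Jan2025.zip", "path/to/your/etl/csvs")`, with the second argument replaced by your own folder.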
