Added one table so far with unique codings for sex and race.
Column (variable) names are also unique.
The table so far is search_cloud.cshcodeathon.organoid_profiling_pc_subject_phenotypes_gru
Aiming for three or four such tables from dbGaP. The codings and column names will vary.
The question is: how can the machine-readable information (schema) provided about each table make things easier for a data scientist? We assume they are using tools such as Python or R and can transform the data in those tools quite easily, as long as they have the information to do so. /table/tablename/info provides that information.
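As a rough sketch of how that schema could be used from Python: the snippet below takes a hand-written stand-in for a /table/tablename/info response and builds code-to-label lookups from it. The layout is an assumption, not confirmed by these notes: it supposes the response carries a JSON Schema under `data_model`, with codings expressed as `oneOf` entries holding `const` and `title`. The table name, column names, codes, and the helpers `build_decoders` and `decode_row` are all hypothetical and only for illustration.

```python
from typing import Any

# Hypothetical stand-in for the JSON returned by /table/<tablename>/info.
# Assumed layout: a JSON Schema under "data_model", with each coded column
# listing its allowed values as oneOf entries of {"const": code, "title": label}.
table_info: dict[str, Any] = {
    "name": "example_subject_phenotypes",  # invented table name
    "data_model": {
        "properties": {
            "sex": {
                "type": "integer",
                "oneOf": [
                    {"const": 1, "title": "Male"},
                    {"const": 2, "title": "Female"},
                ],
            },
            "race": {
                "type": "string",
                "oneOf": [
                    {"const": "W", "title": "White"},
                    {"const": "B", "title": "Black or African American"},
                ],
            },
            "age": {"type": "integer"},  # no coding, passes through unchanged
        }
    },
}


def build_decoders(info: dict[str, Any]) -> dict[str, dict[Any, str]]:
    """Return {column: {code: label}} for every column that declares codings."""
    decoders: dict[str, dict[Any, str]] = {}
    for column, spec in info["data_model"]["properties"].items():
        mapping = {
            option["const"]: option["title"]
            for option in spec.get("oneOf", [])
            if "const" in option and "title" in option
        }
        if mapping:
            decoders[column] = mapping
    return decoders


def decode_row(row: dict[str, Any], decoders: dict[str, dict[Any, str]]) -> dict[str, Any]:
    """Replace coded values with labels; leave unknown codes and uncoded columns as-is."""
    return {
        column: decoders.get(column, {}).get(value, value)
        for column, value in row.items()
    }


decoders = build_decoders(table_info)
print(decode_row({"sex": 2, "race": "W", "age": 41}, decoders))
```

Because the lookups are derived from the schema rather than hard-coded, the same two helpers would apply to each additional table even when its codings and column names differ, provided the schema follows the assumed layout.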
Note that in dbGaP the data used in the table above are controlled access. The dataset available through the GA4GH Search API uses values from the dbGaP dataset, but each record (row) is a simulated example, not a real record.