A Spring Boot application for importing bioinformatics data into the GenPat platform and managing it.
BioInfoTools2 is a command-line tool designed to import various types of bioinformatics data:
- Sample uploads from sequencing facilities
- Pipeline analysis results
- NCBI reference sequences
The application integrates with the GenPat database (CMDBuild) and manages relationships between samples, analyses, and reference data.
Key features:
- Upload Import: Import sample data from the BIOINFONAS source
- Pipeline Results: Import analysis results with support for various file formats
- NCBI References: Import and manage reference sequences from NCBI
- Hashed Analysis: Support for hashed analysis types such as 0SQ_rawreads and 2AS_import, listed under hashed-analysis in the configuration
- Database Integration: PostgreSQL integration with CMDBuild
- Configuration: External configuration file support for flexible deployment
Prerequisites:
- Java 8 or higher
- Maven 3.x
- PostgreSQL database (CMDBuild instance)
- Access to bioinformatics data sources
Installation:
- Clone this repository:
git clone https://github.com/genpat-it/cohesive-bioinfotools2.git
cd cohesive-bioinfotools2
- Build the JAR file:
mvn clean package
The JAR file will be generated at:
target/bioinfotools2.jar
Create a configuration file /conf/bit2.yml with the following structure:
app:
  sample-folder: /path/to/samples
  hashed-analysis: [ '0SQ_rawreads', '2AS_import' ]
  file-server: 'server-address'
  reference:
    source: NCBI
    material: N1052G
    alias-type: REF
    data-provider: ALL
    sample-relation: REF_NCBI
  upload:
    source: BIOINFONAS
  config-file: /conf/bit2.json
db:
  address: localhost:5432
  password: your-database-password
  username: your-database-username
Create /conf/bit2.json for additional application-specific settings.
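As a rough illustration of how the db section might be consumed, the sketch below turns an address value such as localhost:5432 into a PostgreSQL JDBC URL. The class name DbUrlSketch, the jdbcUrl helper, and the database name cmdbuild are assumptions made for this example; they are not taken from the BioInfoTools2 source, and the application itself may build its data source differently.

```java
// Hypothetical helper: NOT part of BioInfoTools2, shown only to illustrate
// how a "host:port" address maps onto the standard PostgreSQL JDBC URL form.
final class DbUrlSketch {

    /** Builds jdbc:postgresql://host:port/database from its two parts. */
    static String jdbcUrl(String address, String database) {
        if (address == null || address.trim().isEmpty()) {
            throw new IllegalArgumentException("db.address must not be empty");
        }
        if (database == null || database.trim().isEmpty()) {
            throw new IllegalArgumentException("database name must not be empty");
        }
        return "jdbc:postgresql://" + address.trim() + "/" + database.trim();
    }

    public static void main(String[] args) {
        // "cmdbuild" is an assumed database name; use the one of your instance.
        System.out.println(jdbcUrl("localhost:5432", "cmdbuild"));
    }
}
```

Rejecting an empty address early gives a clearer error than a connection failure later in the import.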
Run the application with:
java -jar target/bioinfotools2.jar [options]
java -jar bioinfotools2.jar --import-uploads --source BIOINFONAS
java -jar bioinfotools2.jar --import-results --pipeline-path /path/to/results
java -jar bioinfotools2.jar --import-references --source NCBI
bioinfotools2/
├── src/
│   └── main/
│       ├── java/
│       │   └── it/izs/bioinfo/bit2/
│       │       ├── ApplicationConfiguration.java
│       │       ├── ImportApp.java
│       │       ├── Parameters.java
│       │       └── model/
│       │           ├── analysis/
│       │           ├── genpat/
│       │           └── qc/
│       └── resources/
│           └── application.yml
├── pom.xml
├── LICENSE
└── README.md
Main dependencies:
- Spring Boot 2.7.18
- Spring Data JDBC
- PostgreSQL Driver
- OpenCSV
- Lombok
- Hibernate Validator
- Commons Codec
- JSON Library
The application expects a CMDBuild PostgreSQL database with the following main entities:
- Samples
- Analyses
- References
- Uploads
- Quality Control data
Security considerations:
- Database credentials should be stored in external configuration files
- Never commit configuration files with credentials to version control
- Use environment variables or secure vaults for production deployments
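One possible way to follow the environment-variable recommendation is sketched below. The variable names BIT2_DB_USERNAME and BIT2_DB_PASSWORD, the class DbCredentialsSketch, and the resolve helper are hypothetical and chosen only for this example; BioInfoTools2 itself may read its credentials differently.

```java
import java.util.Map;

// Hypothetical sketch: NOT part of BioInfoTools2. It shows how credentials
// could be read from the environment instead of a committed file.
final class DbCredentialsSketch {

    /** Returns env[name] when it is set and non-empty, otherwise the fallback. */
    static String resolve(Map<String, String> env, String name, String fallback) {
        String value = env.get(name);
        return (value == null || value.isEmpty()) ? fallback : value;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        // Assumed variable names; adapt them to your deployment.
        String username = resolve(env, "BIT2_DB_USERNAME", "your-database-username");
        String password = resolve(env, "BIT2_DB_PASSWORD", null);
        if (password == null) {
            System.err.println("BIT2_DB_PASSWORD is not set; refusing to continue");
            return;
        }
        // Never print the password itself.
        System.out.println("Using database user: " + username);
    }
}
```

Passing the environment in as a map keeps the lookup logic easy to test without touching the real process environment.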
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.
For questions or support, please contact: cohesive@izs.it