include documentation for fusion mode #373

Merged
merged 7 commits into from
Oct 25, 2024
30 changes: 19 additions & 11 deletions .github/workflows/check-bioc.yml
@@ -107,16 +107,16 @@ jobs:
uses: actions/cache@v3
with:
path: ${{ env.R_LIBS_USER }}
key: ${{ env.cache-version }}-${{ runner.os }}-biocversion-devel-r-4.2-${{ hashFiles('.github/depends.Rds') }}
restore-keys: ${{ env.cache-version }}-${{ runner.os }}-biocversion-devel-r-4.2-
key: ${{ env.cache-version }}-${{ runner.os }}-biocversion-RELEASE-r-4.3-${{ hashFiles('.github/depends.Rds') }}
restore-keys: ${{ env.cache-version }}-${{ runner.os }}-biocversion-RELEASE-r-4.3-

- name: Cache R packages on Linux
if: "!contains(github.event.head_commit.message, '/nocache') && runner.os == 'Linux' "
uses: actions/cache@v3
with:
path: /home/runner/work/_temp/Library
key: ${{ env.cache-version }}-${{ runner.os }}-biocversion-devel-r-4.2-${{ hashFiles('.github/depends.Rds') }}
restore-keys: ${{ env.cache-version }}-${{ runner.os }}-biocversion-devel-r-4.2-
key: ${{ env.cache-version }}-${{ runner.os }}-biocversion-devel-r-4.3-${{ hashFiles('.github/depends.Rds') }}
restore-keys: ${{ env.cache-version }}-${{ runner.os }}-biocversion-devel-r-4.3-

- name: Install Linux system dependencies
if: runner.os == 'Linux'
@@ -315,13 +315,21 @@ jobs:
if: github.ref == 'refs/heads/devel' && env.run_pkgdown == 'true' && runner.os == 'Linux'
run: R CMD INSTALL .

- name: Build and deploy pkgdown site

- name: Deploy pkgdown site to GitHub pages 🚀
if: github.ref == 'refs/heads/devel' && env.run_pkgdown == 'true' && runner.os == 'Linux'
run: |
git config --local user.name "$GITHUB_ACTOR"
git config --local user.email "[email protected]"
Rscript -e "pkgdown::deploy_to_branch(new_process = FALSE)"
shell: bash {0}
uses: JamesIves/github-pages-deploy-action@releases/v4
with:
clean: false
branch: gh-pages
folder: docs
# - name: Build and deploy pkgdown site
# if: github.ref == 'refs/heads/devel' && env.run_pkgdown == 'true' && runner.os == 'Linux'
# run: |
# git config --local user.name "$GITHUB_ACTOR"
# git config --local user.email "[email protected]"
# Rscript -e "pkgdown::deploy_to_branch(new_process = FALSE)"
# shell: bash {0}
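## Note that you need to run pkgdown::deploy_to_branch(new_process = FALSE)
## at least once locally before this will work. This creates the gh-pages
## branch (erasing anything you haven't version controlled!) and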
@@ -331,7 +339,7 @@ jobs:
if: failure()
uses: actions/upload-artifact@v2
with:
name: ${{ runner.os }}-biocversion-devel-r-4.2-results
name: ${{ runner.os }}-biocversion-RELEASE-r-4.3-results
path: check

- uses: docker/build-push-action@v1
2 changes: 1 addition & 1 deletion R/bambu_utilityFunctions.R
@@ -170,7 +170,7 @@ checkInputSequence <- function(genomeSequence) {
},
error=function(cond) {
stop("Input genome file not readable.",
"Requires a FASTA or BSgenome name")
" Requires a FASTA or BSgenome name")
}
)}
return(genomeSequence)
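
The one-character change in this hunk matters because `stop()` pastes its arguments together with no separator. A minimal base R check, using `tryCatch` and `conditionMessage` only to capture the resulting text (the variable names are illustrative):

```rscript
# stop() concatenates its arguments without inserting spaces,
# so the second string needs its own leading space.
msgOld <- tryCatch(
    stop("Input genome file not readable.",
        "Requires a FASTA or BSgenome name"),
    error = function(e) conditionMessage(e))

msgNew <- tryCatch(
    stop("Input genome file not readable.",
        " Requires a FASTA or BSgenome name"),
    error = function(e) conditionMessage(e))

# The old message runs the two sentences together; the new one does not
print(msgOld)
print(msgNew)
```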
17 changes: 15 additions & 2 deletions README.md
@@ -34,6 +34,7 @@
- [Training a model on another species/dataset and applying it](#Training-a-model-on-another-speciesdataset-and-applying-it)
- [Quantification of gene expression](#Quantification-of-gene-expression)
- [Including single exons](#Including-single-exons)
- [Fusion gene/isoform detection](#Fusion-geneisoform-detection)
- [*bambu* Arguments](#Bambu-Arguments)
- [Output Description](#Output-Description)
- [Release History](#Release-History)
@@ -112,7 +113,7 @@ The bambuAnnotation object can be calculated from:

a) a .gtf file:
```rscript
annotations <- prepareAnnotation(gtf.file)
annotations <- prepareAnnotations(gtf.file)
```
b) a TxDb object
```rscript
@@ -424,14 +425,26 @@ By default *bambu* does not report single exon transcripts because they are know
se <- bambu(reads = sample1.bam, annotations = annotations, genome = fa.file, opt.discovery = list(min.txScore.singleExon = 0))
```



### Fusion gene/isoform detection

To facilitate fusion gene/isoform detection, *bambu* provides a fusion mode. When `fusionMode` is set to `TRUE`, *bambu* assigns multiple GENEIDs to each fusion transcript, separated by ":".
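
As a rough illustration of that ID format, a colon-separated GENEID can be split back into its constituent gene IDs with base R. The gene and transcript IDs below are made up for the example and are not actual *bambu* output.

```rscript
# Hypothetical GENEIDs: one ordinary transcript and one fusion transcript
geneIds <- c(tx1 = "ENSG00000000001",
    tx2 = "ENSG00000000002:ENSG00000000003")

# Split each GENEID on ":" to recover the constituent genes
genesPerTx <- strsplit(geneIds, ":", fixed = TRUE)

# Transcripts assigned to more than one gene are fusion candidates
isFusion <- lengths(genesPerTx) > 1
```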

Before using this feature, we recommend detecting the fusion gene breakpoints with a fusion detection tool such as [JAFFAL](https://github.com/Oshlack/JAFFA). A fusion chromosome FASTA file can then be created by concatenating the sequences of the two fusion genes. Similarly, a fusion annotation GTF file can be created by converting the coordinates of the transcripts from the relevant genes to fusion chromosome coordinates. Reads originating from the fusion region must then be re-aligned to the fusion chromosome FASTA file. Finally, run *bambu* on the re-aligned BAM files together with the fusion chromosome FASTA and GTF files.

```rscript
se <- bambu(reads = fusionAligned.bam, annotations = fusionAnnotations, genome = fusionFasta, fusionMode = TRUE)
```
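
The coordinate conversion needed for the fusion GTF file can be sketched in base R. This is only an illustration under simplifying assumptions: both genes are on the forward strand, and the fusion chromosome is the full sequence of gene A followed directly by the full sequence of gene B. All coordinates and the helper `toFusionCoords` are made up for the example and are not part of *bambu*.

```rscript
# Hypothetical 1-based genomic coordinates of the two fusion partner genes
geneA <- list(start = 1000, end = 1999)    # 1000 bp, placed first
geneB <- list(start = 50000, end = 50499)  # 500 bp, placed second

# Offset of each gene on the concatenated fusion chromosome
offsetA <- 0
offsetB <- geneA$end - geneA$start + 1     # length of gene A

# Hypothetical helper: shift a genomic position into fusion coordinates
toFusionCoords <- function(pos, gene, offset) {
    pos - gene$start + 1 + offset
}

# Exons of gene B (genomic coordinates) converted to fusion coordinates
exonsB <- data.frame(start = c(50000, 50300), end = c(50099, 50499))
exonsB$fusionStart <- toFusionCoords(exonsB$start, geneB, offsetB)
exonsB$fusionEnd <- toFusionCoords(exonsB$end, geneB, offsetB)
```

With these numbers, gene B's exons land at 1001-1100 and 1301-1500 on a 1500 bp fusion chromosome. Reverse-strand genes or partial gene sequences would need additional handling that this sketch does not cover.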

### *Bambu* Arguments

|argument|description|
|---|---|
|reads|A string or a vector of strings specifying the paths of bam files for genomic alignments, or a BamFile object or a BamFileList object (from Rsamtools).|
| rcOutDir | A string variable specifying the path to where read class files will be saved. |
| annotations | A TxDb object, a path to a .gtf file, or a GRangesList object obtained by prepareAnnotations. |
| genome | A fasta file or a BSGenome object. |
| genome | A fasta file or a BSGenome object. If a fa.gz file is provided, the corresponding .fai and .gzi files must also be present. |
| stranded | A boolean for strandedness, defaults to FALSE. |
| ncore | The number of cores to use for parallel processing, defaults to 1. |
| NDR | The maximum NDR at which novel transcripts are included in the output, defaults to 0.1. |