gjdury/Dm_Promoter_PopGen

INFO-I 529 Project

List of group members:

  • Krishna Bathina
  • Guillaume Dury
  • Xuan Wang
  • R. Taylor Raborn (co-advisor)

Agenda items from 9-12 meeting

  1. Read and become familiar with the background literature, especially Hoskins et al. (2011) and Lenhard et al. (2012).
  2. Decide which PopGen dataset(s) to use in our study and become familiar with them.
  3. Determine whether peaked/broad promoter classifications are available from Hoskins or whether they need to be generated via Taylor's pipeline.

Agenda items from 9-19 meeting

  1. Agreed on using the Clark population data from the Fly Nexus database.
  2. Plan of attack: create a figure that evaluates pi (nucleotide diversity) within the core promoter.
  3. Need to add: i) a pi calculator, which can then be incorporated into larger scripts, and ii) the promoter data (TR).
  4. Agreed on using GitHub for code and small files, and Box for very large (>50MB) files.
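
As an illustration of the pi calculator mentioned in item 3, here is a minimal sketch. It is not the group's script. It assumes the input is a list of equal-length, pre-aligned sequences, that `N` and `-` mark missing data, and that pi is the mean per-site pairwise difference over all pairs. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b, missing="N-"):
    """Count differing sites and comparable sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue  # assumption: sites with missing data in either sequence are skipped
        compared += 1
        if a != b:
            diffs += 1
    return diffs, compared


def nucleotide_diversity(seqs):
    """Mean per-site pairwise difference (pi) over all pairs of sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    per_pair = []
    for seq_a, seq_b in combinations(seqs, 2):
        diffs, compared = pairwise_differences(seq_a, seq_b)
        if compared:  # ignore pairs with no comparable sites
            per_pair.append(diffs / compared)
    if not per_pair:
        raise ValueError("no comparable sites")
    return sum(per_pair) / len(per_pair)


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
print(nucleotide_diversity(toy_alignment))
```

Averaging per-pair proportions is one of several reasonable ways to handle missing data; a real analysis might instead filter sites by coverage before computing pi.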

Misc updates 9-22

  1. Krishna added a pi calculator script: py.pi (thanks to Guillaume for adding his code on Monday). Taylor tweaked the code very slightly so it automatically outputs pi given the embedded test data.
  2. Our storage allocation is now available at /N/dc2/projects/PromoterPopGen. Please check whether you i) can navigate into the directory and ii) encounter any permissions issues when creating a test directory/file.

Misc updates 10-4

  1. Meeting scheduled for 10-11 at 5:30PM (location TBA).
  2. RTR added a relevant Population Genomics link to the reading list.
  3. RTR wrote and added 'importSeq', an R file that imports FASTA data derived from SEQ data.
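
For readers working on the Python side, the import step could be sketched roughly as below. This is not a port of 'importSeq', whose contents are not shown here. It is a plain FASTA reader that assumes standard `>`-prefixed headers and sequences that may span several lines. The function name `read_fasta` and the example records are hypothetical.

```python
import io


def read_fasta(handle):
    """Parse FASTA records from a text handle into a {header: sequence} dict."""
    records = {}
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)  # close the previous record
            header = line[1:].strip()
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)  # close the last record
    return records


# Hypothetical in-memory example; a real call would pass an open file instead.
example = io.StringIO(">line_A\nACGT\nACGA\n>line_B\nACGTACGT\n")
sequences = read_fasta(example)
print({name: len(seq) for name, seq in sequences.items()})
```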

Misc updates 10-5

  1. RTR added seqConvert.sh, which converts all SEQ files to FASTA format; the header is the filename without the .seq extension.
  2. RTR converted all SEQ files in our /N/dc2/projects/PromoterPopGen/ folder to FASTA format using seqConvert.sh.
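
The conversion described above could look roughly like the following Python sketch. It is not seqConvert.sh itself. It assumes each SEQ file holds raw sequence text with no header and that any whitespace or line breaks are not part of the sequence. The function names, the `.fasta` output extension, and the 60-character line width are all hypothetical choices.

```python
import tempfile
from pathlib import Path


def seq_to_fasta(seq_path, out_dir=None, width=60):
    """Write a FASTA file whose header is the SEQ filename without its extension."""
    seq_path = Path(seq_path)
    out_dir = Path(out_dir) if out_dir is not None else seq_path.parent
    # Assumption: whitespace and line breaks are not part of the sequence.
    sequence = "".join(seq_path.read_text().split())
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    fasta_path = out_dir / (seq_path.stem + ".fasta")
    fasta_path.write_text(">" + seq_path.stem + "\n" + "\n".join(lines) + "\n")
    return fasta_path


def convert_directory(directory):
    """Convert every *.seq file found directly inside `directory`."""
    return [seq_to_fasta(path) for path in sorted(Path(directory).glob("*.seq"))]


# Toy demonstration in a temporary directory (hypothetical file name).
with tempfile.TemporaryDirectory() as tmp:
    toy = Path(tmp) / "toy_line.seq"
    toy.write_text("ACGTACGT\nACGT\n")
    print(seq_to_fasta(toy, width=8).read_text())
```

Reading each file fully into memory keeps the sketch short; for chromosome-scale files a streaming approach may be preferable.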

Misc updates 10-11

  1. Our meeting was extremely productive. We discussed our current progress, the data structure (DNAstring object in R) and the things that need to be done.
  2. We agreed to focus on the following two issues: i) how to manipulate the DNAstring object to calculate pi and ii) subsetting the DNAstring object to retrieve only the intervals of interest (i.e. promoters).
  3. RTR added an example of the DNAstring object (as an R binary in .RData format) to our folder on UITS (see /data). Load the file into your R workspace using the following command: `load("DNAstrings_example.RData")`
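
The subsetting issue in item 2 can be illustrated outside R with a small Python sketch. It uses plain strings in place of the DNAstring object and assumes 1-based, inclusive interval coordinates, which is an assumption about how the promoter intervals are stored. The function name, sequence names, and the interval are hypothetical.

```python
def subset_intervals(sequences, intervals):
    """Return {interval_name: {seq_name: subsequence}} for 1-based inclusive intervals."""
    subsets = {}
    for name, start, end in intervals:
        if start < 1 or end < start:
            raise ValueError(f"invalid interval {name}: {start}-{end}")
        subsets[name] = {}
        for seq_name, seq in sequences.items():
            if end > len(seq):
                raise ValueError(f"interval {name} extends past the end of {seq_name}")
            # Convert 1-based inclusive coordinates to a Python slice.
            subsets[name][seq_name] = seq[start - 1:end]
    return subsets


# Hypothetical aligned sequences and one hypothetical promoter interval.
sequences = {"line_A": "AACCGGTTAACC", "line_B": "AACCGGTAAACC"}
intervals = [("promoter_1", 3, 8)]
subsets = subset_intervals(sequences, intervals)
print(subsets["promoter_1"])
```

Each per-interval dictionary holds equal-length subsequences, so it could be passed on to whatever pi calculation the group settles on.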
