ASAP Release History

| Version | Description | Date |
| 1.0 | Initial release | May 02, 2012 |
| 1.1.0 | Added annotation and a few other features | Jun 12, 2012 |
| 1.1.1 | Bug fixes and added support for flagstat BAM summary | Jun 26, 2012 |
| 1.1.2 | Bug fixes | Jun 27, 2012 |
| 1.1.3 | Bug fixes and support for single end reads | Nov 04, 2012 |
| 1.1.4 | Bug fixes and template listing | Dec 05, 2012 |
| 1.1.5 | Bug fixes for serial job execution when saving temporary data | Dec 12, 2012 |
| 1.1.6 | Bug fix | Dec 19, 2012 |

ASAP Issues to be Addressed

| Issue | Description | Status | Date Completed | Version Completed |
| Questionable jobs produced during DB rebuild | When rebuilding a database where some files already exist but files from early stages (such as alignment) are missing, ASAP will attempt to produce those aligned files, even though subsequent steps have already been recognized as complete. | Complete | May 23, 2012 | 1.1.1 |
| Add IMPORT function | Users should be able to import BAM files that have already been aligned. Once imported, those files will work like BAMs that were created by one of the ASAP alignment steps. | Complete | May 18, 2012 | 1.1.1 |
| Add verify for executors | This is necessary to catch invalid groups for PBS_ACCOUNT and GROUP_OWNER. | Complete | May 23, 2012 | 1.1.1 |
| Ignore partitioned chroms missing from user's chrom list | Currently, if the user chooses to process only a subset of chromosomes, ASAP will attempt to realign, recalibrate, etc. over all of the partitions, even if those aren't requested. | Complete | May 23, 2012 | 1.1.1 |
| Changing SUBREGION_SIZE can create issues with MOVE_BAM | If the user generated scripts for one SUBREGION_SIZE and reran ASAP later with a different size, ASAP would look for all VCFs, even if the old ones can never be created except by manual execution. | Complete | Dec 03, 2012 | 1.1.4 |
| Bam Summary | Users can now use QPLOT to collect some thorough statistics for BAMs. | Complete |  | 1.1.3 |
| Moved template info into files for greater flexibility | Template information resides in separate files which are dynamically loaded at runtime. This allows users to extend ASAP to some extent without needing to alter any pre-existing code. | Complete |  | 1.1.3 |
| Split steps into steps and tools | To differentiate commands like GENERATE_BATCH from CALL_SNPS, we have moved some functionality into tools. Tools must be specified on the command line and run immediately. | Complete |  | 1.1.3 |
| Support for GATK2.x | GATK 2.0 introduced a few changes in parameter names, so ASAP now allows users to switch default settings during configuration generation. | Complete |  | 1.1.3 |
| Provide an alternative to regular expressions | ASAP now allows users to provide a comma-separated file instead of a regular expression for finding FASTQ and BAM files for GENERATE_BATCH and IMPORT_BAM. | Complete |  | 1.1.3 |
| List tools beneath "available steps" during normal run | Previously, users had to consult the manual to know how to spell tool names. Now, ASAP lists all available tools in the banner under the list of available steps. | Complete | Dec 04, 2012 | 1.1.4 |
| List templates for all steps | Added a tool, LIST_TEMPLATES, to iterate over each step and list all available templates. This should make it easier to avoid misspelled template names. The search is live, so users who wish to create their own templates will be able to see those as well. | Complete | Dec 04, 2012 | 1.1.4 |
| List templates for active steps | Optional templates are listed beside each "Active Step" in the banner. | Complete | Dec 04, 2012 | 1.1.4 |
| WGA alignments run out of time unless TIME_CALIBRATION is set very high | Added a new setting to GENERATE_BATCH that allows the user to "calibrate" the alignment time. We recommend setting this to 6.0 for WGA alignments. | Complete | Dec 05, 2012 | 1.1.4 |
| SNP calls are generated even if no BAMs exist | An error has been fixed that allowed the variant caller steps to produce scripts even if there were no available BAMs connecting to them. ASAP now reports that no scripts were produced and attempts to explain the possible reasons for this (it's not always an error). | Complete | Dec 05, 2012 | 1.1.4 |
| BAMs can be imported when chromosome appears in dir name | Previously, the regular expressions ignored the directory structure. ASAP now allows regular expressions to capture sample/chromosome information from the directory structure as well as filenames (if written correctly). | Complete | Dec 05, 2012 | 1.1.4 |
| Samtools halts during recalibration in Serial Mode | If the variable KEEP_TEMPORARY_FILES is set to true, samtools will fail on a merge due to the presence of merged.bam. I have changed the samtools command to force the output of the file and added a line to delete the file when it is no longer used. | Complete | Dec 12, 2012 | 1.1.5 |
| Completion of Recalibration deleted file from realignment under certain circumstances | This bug was introduced by the fix for serial execution with KEEP_TEMPORARY_FILES. Recalibration was deleting files from the realignment directory upon completion if there was no need to merge multiple files. This didn't affect downstream progress; however, it is unintended behavior. | Complete | Dec 19, 2012 | 1.1.6 |
| Issues exist when the user deletes data from the "intermediate" steps before ASAP can identify they were completed | When run in Serial Mode, ASAP doesn't use tokens and job IDs, so it has no way to recognize that a job completed successfully except for the presence of output. If the user deletes intermediate data (say, for realignment) before rerunning ASAP to catalog its completion, ASAP will iterate down to realignment, see that no data is there, and generate scripts before stepping up to Recalibration and recognizing that those data are there. In this case, halting the serial execution and rerunning it will behave correctly (the recalibration will find its data is completed before pushing back to realignment). However, it is possible that unnecessary computation will be expended in these cases. | In Progress |  |  |
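
The Serial Mode behavior in the last entry above can be illustrated with a small sketch. This is not ASAP's code: the step names, the `present` set, and both functions are hypothetical, and a simple linear chain of steps is assumed. It contrasts walking the steps in order, which schedules realignment because its output is gone, with first checking whether a later step already has output.

```python
# Hypothetical linear pipeline; ASAP's real step list and internals may differ.
STEPS = ["ALIGNMENT", "REALIGNMENT", "RECALIBRATION", "CALL_SNPS"]


def naive_jobs(present):
    """Walk steps in order and schedule any step whose output is missing."""
    return [step for step in STEPS if step not in present]


def downstream_aware_jobs(present):
    """Skip a missing step when any later step already has output on disk."""
    jobs = []
    for i, step in enumerate(STEPS):
        if step in present:
            continue
        later_done = any(s in present for s in STEPS[i + 1:])
        if not later_done:
            jobs.append(step)
    return jobs


# The user deleted realignment output after recalibration had finished.
present = {"ALIGNMENT", "RECALIBRATION"}
print(naive_jobs(present))             # ['REALIGNMENT', 'CALL_SNPS']
print(downstream_aware_jobs(present))  # ['CALL_SNPS']
```

In this toy model the in-order walk schedules a realignment job that the downstream-aware check skips, which is the "unnecessary computation" the entry refers to.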

ASAP TODO List - Feature Wishlist

| Feature | Description | Status | Date Completed | Version Present |
| IMPORT_BAM | Currently, ASAP requires BAMs to have been produced by itself using alignment. In a feature update, we plan to allow users to import BAM files similarly to how they can add FASTQ data. Once imported, users can then perform any non-alignment step on the data as if it were produced originally by ASAP. | Complete | May 18, 2012 | 1.1.1 |
| INDEL Call | ASAP is set up to allow users to call SNPs, and we provide instructions to create an INDEL caller template. However, in the near future, we expect to provide a parallel INDEL caller that uses either GATK or samtools (and possibly others). | Complete | May 22, 2012 | 1.1.1 |
| Annotation | In a near-term update, we plan to add annotation as a step to be performed after VCF calls are done. | Complete | May 23, 2012 | 1.1.1 |
| Automatically download components | ASAP can now automatically download most of its component applications for the user. | Complete |  | 1.1.2 |
| Single End Reads | Add support for single end reads. I will need to get access to a few examples before I can make the necessary changes. | Complete | Aug, 2012 | 1.1.3 |
| Ability to ignore upstream data | There are several reasons a user might want to ignore upstream output. One obvious reason is to have one or more intermediate steps rerun to take advantage of some configuration changes. | Complete | May 24, 2012 | 1.1.1 |
| Make MarkDuplicates & Fixmate Optional | Currently, these two changes are made by default at alignment and realignment respectively, but we should allow the user to optionally turn them off. | Complete | May 23, 2012 | 1.1.1 |
| Provide RESTART Command | Provide a way for the user to simply relaunch pre-existing scripts and tie in existing dependencies. Requires proper dependency entries in the DB. | Incomplete |  |  |
| Incorporate Tokens into Serial Execution | When running serial data, there is no confirmation that data was completed, except for the final product at each step. If the user cleans out temporary data that is no longer required for completed subsequent steps prior to rerunning ASAP, ASAP will first identify that those files are missing and attempt to create jobs to create those files. Subsequent steps will correctly identify their completed files, but it will be too late to prevent the generation of the earlier steps. Tokens used by cluster-based jobs prevent this problem, since those tokens are parsed and the database is properly updated before ASAP begins walking the dependency tree. | Incomplete |  |  |
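
One way to picture the token idea for Serial Mode is sketched below. It is only an illustration under assumptions: the `.done` file naming, the `write_token` and `completed_steps` helpers, and the single token directory are all hypothetical and are not taken from ASAP. Each step leaves a small token file when it finishes, and the set of completed steps is read from those tokens before any dependency walk, so deleting intermediate data does not make a step look unfinished.

```python
import tempfile
from pathlib import Path


def write_token(token_dir, step):
    """Record that a step finished by creating an empty token file."""
    (Path(token_dir) / f"{step}.done").touch()


def completed_steps(token_dir):
    """Collect the steps that have a token, regardless of surviving output."""
    return {p.stem for p in Path(token_dir).glob("*.done")}


with tempfile.TemporaryDirectory() as token_dir:
    write_token(token_dir, "REALIGNMENT")
    write_token(token_dir, "RECALIBRATION")
    # Even if realignment output is deleted later, its token remains.
    done = completed_steps(token_dir)
    print(sorted(done))  # ['REALIGNMENT', 'RECALIBRATION']
```

A real implementation would also need to decide when a token becomes stale, for example after a configuration change that requires a step to be rerun; that is outside this sketch.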

ASAP Unlikely Feature List - Unlikely additions that have been considered

| Feature | Description | Reason |
| Add basic Analysis | Add some analysis tools to the pipeline to allow users to get some preliminary information about their data immediately upon completion of the processing. | Analysis is probably too different from one project to the next. |

-- EricTorstenson - 01 May 2012
Topic revision: r16 - 02 Jan 2013, EricTorstenson
 
