by Zhibiao Mai & Gong Zhang
Translatomics Lab
Jinan University, Guangzhou, China
FANSe2splice is a spliced-mapping tool for next-generation sequencing reads. It splits a read into two halves and maps them to the reference genome. This is particularly useful for RNA-seq, where RNA splicing may occur and a read can span a splice junction.
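As a rough illustration of the read-splitting idea, the Python sketch below cuts a read at its midpoint. It shows only the split itself. How FANSe2splice actually chooses the split point, maps the halves and locates the junction is not described here, and the function name split_read is hypothetical.

```python
def split_read(seq: str) -> tuple[str, str]:
    """Split a read sequence into a left half and a right half.

    Assumption (not from the FANSe2splice documentation): the split is at
    the midpoint, and the extra base of an odd-length read goes to the
    right half. The real tool may behave differently.
    """
    mid = len(seq) // 2
    return seq[:mid], seq[mid:]


left, right = split_read("ACGTACGTAC")
print(left, right)  # ACGTA CGTAC
```

In a spliced read, each half would then be mapped on its own, and the gap between the two mapped positions on the genome would correspond to the intron.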
Minimized false positives and false negatives
Spliced mapping algorithms like TopHat2, MapSplice2 and HISAT can perform the same task, but with much less robustness because of the weakness of BWT in the presence of mismatches. FANSe2splice is based on the hyper-accurate FANSe2 algorithm, which provides excellent experimental verifiability. Most of the reported splice junctions can be validated by experiments.
Detects junctions from single-end reads
FANSe2splice detects junctions from single-end reads, so paired-end mode is no longer necessary. This makes the cost-efficient single-end mode practical. It is also good news for non-Illumina sequencers such as Ion Torrent, which can only produce single-end reads.
Nice for low-throughput sequencing
FANSe2splice does not need to estimate junction borders by piling up non-spliced reads. It detects junctions directly from each single read, independently of all other reads. This is ideal for low-throughput RNA-seq, where piling up reads for low-expression RNAs is difficult.
Click here to download FANSe2splice (zipped).
Note:
- This version of FANSe2splice is compiled for 64-bit Windows. We tested it under Windows 7, Windows 8.1 and Windows Server 2008 R2. You can use WINE to run it under Linux.
- Prior to use, you need to install and configure MPICH2 on your computer. Please refer to FANSe2 for this information.
*********** If an annotation file (gtf/gff) is provided ************
Step 1: Create reference mRNA file from annotation
Example:
gtf_gff_refMrna_sort.exe -Tgtf -Ggenes.gtf -Fgenome.fa -Wd:\work
Parameter explanation (no space between the tag and the parameter):
-T <gtf/gff>. The format of your annotation file. Both gtf and gff are accepted.
-G <filename>. The annotation file.
-F <filename>. The reference genome (in fasta format).
-W <path>. Work and output folder.
Output:
File "refMrna_sort.fa" will be used in Step 2.
File "refFlat_type1.txt" will be used in Step 3.
Step 2: Map reads to the annotated reference mRNA sequences using FANSe2
Example:
fanse2.exe -RrefMrna_sort.fa -Ddata.fq -Odata.fanse2 -L80 -E5 -I0 -S14 -M0 -B1 -U0 -C4 -F35 -Wd:\work -Y1 -A0
Please refer to the FANSe2 manual for parameter explanations.
refMrna_sort.fa was generated in Step 1.
Step 3: Spliced mapping to the reference genome and reporting the junction list
Example:
FANSe2splice.py -GrefFlat_type1.txt -Rgenome.fa -Ddata.fq -Tdata.fanse2 -Odata -L80 -E5 -C4 -F377 -N50 -X500000 -Wwork
Parameter explanation:
-G <filename>. The "refFlat_type1.txt" file created in Step 1.
-R <filename>. The reference genome (in fasta format).
-D <filename>. The sequencing read file (in fastq format).
-T <filename>. The FANSe2 result from Step 2.
-O <string>. The basename of the result. If you set the basename to "data", the splice-site file will be "data.junction".
-L, -E, -C, -F, -W. Same as in FANSe2.
-N <int>. Minimum intron length (bp).
-X <int>. Maximum intron length (bp).
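To make the -N and -X bounds concrete, the Python sketch below checks whether a candidate intron falls inside the allowed length range, using the example values -N50 and -X500000 as defaults. The function name, the 1-based inclusive coordinate convention and the inclusive bounds are all assumptions made for illustration. They are not taken from FANSe2splice.

```python
def intron_length_ok(intron_start: int, intron_end: int,
                     min_len: int = 50, max_len: int = 500000) -> bool:
    """Return True if the intron length lies within [min_len, max_len].

    Assumptions (hypothetical, not from the tool): coordinates are 1-based
    and inclusive, and both length bounds are inclusive.
    """
    length = intron_end - intron_start + 1
    return min_len <= length <= max_len


print(intron_length_ok(1001, 1030))    # 30 bp intron, shorter than -N50
print(intron_length_ok(1001, 3000))    # 2000 bp intron, inside the range
print(intron_length_ok(1001, 700000))  # 699000 bp intron, longer than -X500000
```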
*********** If an annotation file (gtf/gff) is not provided ************
Simply run FANSe2splice.py directly, without the -G and -T parameters.
Example:
FANSe2splice.py -Rgenome.fa -Ddata.fq -Odata -L80 -E5 -C6 -F251 -N50 -X500000 -Wd:\work
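For scripted runs, the two modes can be assembled with a small helper. The Python sketch below only builds the argument list, writing each tag and its value as one token with no space, as in the examples. The helper name, its keyword names and the rule that -G and -T are given together are assumptions based on the examples above. Whether the argument order matters to FANSe2splice.py is not documented here, so the sketch simply follows the order shown.

```python
def build_fanse2splice_args(genome, reads, basename, workdir,
                            ref_flat=None, fanse2_result=None, **options):
    """Build a FANSe2splice.py argument list with fused tag+value tokens.

    Hypothetical helper: ref_flat maps to -G and fanse2_result to -T.
    Both are supplied in annotation mode and both omitted otherwise.
    """
    if (ref_flat is None) != (fanse2_result is None):
        raise ValueError("give both ref_flat (-G) and fanse2_result (-T), or neither")
    args = ["FANSe2splice.py"]
    if ref_flat is not None:
        args.append(f"-G{ref_flat}")
    args += [f"-R{genome}", f"-D{reads}"]
    if fanse2_result is not None:
        args.append(f"-T{fanse2_result}")
    args.append(f"-O{basename}")
    # Remaining single-letter options, passed as keywords such as L=80 or N=50.
    args += [f"-{tag}{value}" for tag, value in options.items()]
    args.append(f"-W{workdir}")
    return args


# Annotation-free mode, reproducing the example command above.
cmd = build_fanse2splice_args("genome.fa", "data.fq", "data", r"d:\work",
                              L=80, E=5, C=6, F=251, N=50, X=500000)
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run, although how the script is launched (for example through a Python interpreter) depends on your installation.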