FANSe3

Please note ...

FANSe3 is a commercial development project of Chi-Biotech Co. Ltd.

The public (free) version of FANSe3 provided here is intended for trials only and offers basic, limited features. For the full version without these limits (for both academic and commercial use), please contact Chi-Biotech to obtain a licence.

Feature Comparison of free/commercial versions
Feature	Free version	Commercial version
Parallel CPU cores	Limited to 2	Unlimited (>256)
Cross-node parallelization	No	Standard version: no; FANSe3s version: yes
Unique mapping	Supported	Supported
Fast indel detection	Yes	Yes
Masked genome	Supported	Supported
Export all multi-mapped locations	Up to 200	Up to 200
Max read length	1000	Unlimited
Unidirectional mapping	No	Yes, forward or reverse, for strand-specific applications
Sequencer artifact compensation	No	Yes, optimized for Illumina and BGISEQ/MGISEQ flowcells, also compatible with Ion Torrent
Format supported	FASTQ only	FASTQ, FASTA, FQC, one-line nucleotide, etc. (Use FQC format for the best performance)
Disk I/O saving	No	Yes
Batch mode	No	Yes: the reference is indexed once and multiple datasets are mapped sequentially, saving indexing time
Performance optimization for RNA-seq	No	Yes, up to 20x faster than the free version
Optimized for single-cell RNA-seq	No	Yes, produces far fewer intermediate files
Trim reads while mapping	No	Yes
Direct quantification for RNA-seq	No	Yes, no need to use other programs to obtain read counts and RPKM values

Command-line usage

FANSe3 -R<ref.fa> -D<reads.fq> [-O<out.fanse3>] [-E3] [-S14] [-C2] [-H1] [--indel] [--unique] [--mask]

Options
Option	Optional?	Explanation
-R	compulsory	Reference sequence file (FASTA format). Supports UNC names (like \\server1\myfolder\abc.fa). Supports Chinese characters. Sequence names in the FASTA file are unrestricted and may contain spaces and special characters.
-D	compulsory	FASTQ dataset file. Supports UNC names (like \\server1\myfolder\ionS5-1.fq). Supports Chinese characters.
-O	optional	Output file name. Generated automatically if omitted. Supports UNC names (like \\server1\myfolder\ionS5-1.fanse3). Supports Chinese characters.
-E	optional	Error allowance (Levenshtein distance). Default=3. Mismatches and indels are both counted as errors. It can be set as an integer or a percentage. Integer (like -E5): a fixed number of errors allowed in the alignment. This is preferred when the read length is fixed, e.g. Illumina and BGISEQ/MGISEQ sequencers. Percentage (like -E5%): the error allowance as a percentage of the read length. This is useful when the read length is variable, e.g. Ion Torrent and Helicos sequencers, or for the short fragments left after adapter trimming.
-S	optional	Seed length. Default=14. Can be set to any integer from 6 to 14. A larger seed length is faster but may lose more reads when the error rate exceeds 6%. For high-error-rate scenarios, please refer to the FANSe2 paper to choose a seed length with an estimated accuracy.
-C	optional	Parallel CPU cores. Default=2. In the free version, max=2.
-H	optional	Batch size: the number of reads (in millions) loaded per batch. -H2 means 2 million reads per batch. -H0.5 means 0.5 million reads per batch.
--indel	optional	Fast indel detection on. Equivalent to the "-I1" in FANSe2.
--unique	optional	Unique mapping. When this toggle is present, the uniquely mapped reads will be stored in the .fanse3 file, and the multi-mapped reads will be stored in a separate -multimap.fanse3 file.
--mask	optional	Masked genome. When this toggle is present, the lower-case letters in the reference sequences will not be considered. Equivalent to the "-M1" in FANSe2.
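
The -E option counts mismatches and indels alike as errors, measured as Levenshtein distance. As an illustration of that metric only, here is a textbook dynamic-programming implementation in Python. It is not FANSe3's internal algorithm, and the function name is chosen just for this sketch.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance where substitution, insertion and deletion each cost 1."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # match or mismatch
        prev = curr
    return prev[-1]
```

With this metric, a read with one mismatch and one single-base deletion has distance 2 from the reference segment, so it would fit within the default allowance of 3.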

Quick examples:

FANSe3 -Rref.fa -Dillumina.fq --unique

FANSe3 -Rref.fa -Diontorrent.fq -E5% --indel --unique
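
For scripted runs, the command line can be assembled programmatically. The following Python sketch uses only the options documented above and attaches each value directly to its flag, as in the quick examples. The function name, its parameters, and the executable name passed as exe are assumptions made for illustration.

```python
def build_fanse3_command(ref, reads, out=None, errors=None, seed=None,
                         cores=None, batch=None, indel=False, unique=False,
                         mask=False, exe="FANSe3"):
    """Return the argument list for one FANSe3 run (hypothetical helper)."""
    # Seed length is documented as an integer from 6 to 14.
    if seed is not None and not 6 <= seed <= 14:
        raise ValueError("seed length must be between 6 and 14")
    args = [exe, f"-R{ref}", f"-D{reads}"]
    if out is not None:
        args.append(f"-O{out}")
    if errors is not None:
        args.append(f"-E{errors}")   # integer like 5, or string like "5%"
    if seed is not None:
        args.append(f"-S{seed}")
    if cores is not None:
        args.append(f"-C{cores}")
    if batch is not None:
        args.append(f"-H{batch}")    # millions of reads per batch
    if indel:
        args.append("--indel")
    if unique:
        args.append("--unique")
    if mask:
        args.append("--mask")
    return args
```

The returned list can be passed to subprocess.run(args, check=True) on a machine where FANSe3 is installed and on the PATH.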

Result file format

There will be four result files:

*.fanse3: the uniquely mapped reads

*-multimap.fanse3: the multi-mapped reads

*.unmapped: the unmapped reads and the reads of highly repetitive sequences

*.log: log file for the parameters and the speed

Format description of the .fanse3 files

The .fanse3 files are very similar to the .fanse2 files, with some additional information.

Example of a uniquely mapped read:
42628 AGCAAGGACTAACCCCTATACC .................x....
F NM_001190470 1 142 1

Line 1:
42628 = read name (exactly as in FASTQ, could be a string)
AGCAAGGACTAACCCCTATACC = read nucleotide sequence
.................x.... = alignment. "x"=mismatch; "-"=deletion; nucleotide=insertion.
Line 2:
F = direction (F = forward, R = reverse)
NM_001190470 = mapped reference name
1 = error count (Levenshtein Distance)
142 = position (0-based)
1 = number of mapped locations with equal Levenshtein Distance (1 = uniquely mapped)
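
The alignment string can be tallied character by character. The Python sketch below assumes that "." marks a matching base (implied by the example but not stated explicitly) and that inserted nucleotides appear as A, C, G, T or N; the function name is hypothetical.

```python
def summarize_alignment(alignment: str) -> dict:
    """Count matches, mismatches, deletions and insertions in an alignment string."""
    counts = {"match": 0, "mismatch": 0, "deletion": 0, "insertion": 0}
    for ch in alignment:
        if ch == ".":                  # assumed: "." means the base matches
            counts["match"] += 1
        elif ch == "x":                # documented: mismatch
            counts["mismatch"] += 1
        elif ch == "-":                # documented: deletion
            counts["deletion"] += 1
        elif ch.upper() in "ACGTN":    # documented: a nucleotide marks an insertion
            counts["insertion"] += 1
        else:
            raise ValueError(f"unexpected alignment character: {ch!r}")
    return counts
```

For the example read above, the tally gives one mismatch and no indels, which agrees with the error count of 1 on line 2.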

Example of a multi-mapped read:
369061 AGCTGGTACAGAAAGCCAAATTCGCTG ....................x......,....................x......
F,F NM_003404,NM_139323 1 405,310 2

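For multi-mapped reads, the directions, reference names, and positions each hold multiple values separated by commas (","), one per mapped location. In the example above, the alignment strings are comma-separated in the same way.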
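
The two-line records described above can be read with a short parser. This Python sketch makes several assumptions that the format description does not confirm: fields are tab-separated (falling back to whitespace when no tab is present), reference names contain no commas, and blank lines can be ignored. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Fanse3Record:
    read_name: str
    sequence: str
    alignments: List[str]   # one alignment string per mapped location
    directions: List[str]   # "F" or "R" per mapped location
    references: List[str]
    errors: int             # Levenshtein distance
    positions: List[int]    # 0-based
    n_locations: int


def _fields(line: str, expected: int) -> List[str]:
    # Assumption: tab-separated; fall back to any whitespace otherwise.
    stripped = line.rstrip("\r\n")
    parts = stripped.split("\t") if "\t" in stripped else stripped.split()
    if len(parts) != expected:
        raise ValueError(f"expected {expected} fields, got {len(parts)}: {line!r}")
    return parts


def parse_fanse3(lines: Iterable[str]) -> Iterator[Fanse3Record]:
    """Yield one record per pair of non-blank lines."""
    it = (ln for ln in lines if ln.strip())
    for first in it:
        second = next(it, None)
        if second is None:
            raise ValueError("truncated record: missing second line")
        name, seq, aln = _fields(first, 3)
        dirs, refs, err, pos, nloc = _fields(second, 5)
        yield Fanse3Record(name, seq, aln.split(","), dirs.split(","),
                           refs.split(","), int(err),
                           [int(p) for p in pos.split(",")], int(nloc))
```

An open file handle can be passed directly, for example list(parse_fanse3(open("a.fanse3"))).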

Auxiliary programs

FANSe3toBED

This program converts the FANSe3 mapping result file (normally with .fanse3 extension) to .BED file format for visualization. Many visualization tools like UCSC Genome Browser and IGV can visualize .BED files.

Usage:
FANSe3toBED a.fanse3

a.fanse3: the fanse3 mapping result file.
Output: a.BED
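
To show how a mapped location relates to a BED interval, here is a minimal Python sketch. It is not the implementation of FANSe3toBED and may not match its output. It assumes the reported 0-based position is the leftmost reference coordinate on either strand, ignores indels when computing the end coordinate, and writes a constant score of 0; the function name is hypothetical.

```python
def to_bed_line(reference: str, position: int, read_length: int,
                name: str, direction: str) -> str:
    """Format one mapped location as a six-column BED line."""
    strand = {"F": "+", "R": "-"}[direction]   # F = forward, R = reverse
    start = position                           # BED starts are 0-based
    end = position + read_length               # BED ends are exclusive; indels ignored
    return "\t".join([reference, str(start), str(end), name, "0", strand])
```

For the uniquely mapped example read (22 nt at position 142), this yields the interval 142 to 164 on NM_001190470.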

Prerequisite: .NET Framework 4.7.2. This program runs on 32-bit and 64-bit Windows platforms.

Download: Click here to download
