Detailed description of steps performed by racoon

Quality filtering

Sequencing reads are filtered for a Phred score >= 10 inside the unique molecular identifier (UMI) at positions 1-10 of each read to ensure reliable sample and duplicate assignment. The cutoff can be changed by specifying another value with the racoon_clip minBaseQuality option.
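The filter described above can be illustrated with a minimal Python sketch. It is not racoon_clip's actual implementation: the function name umi_passes_quality is hypothetical, Phred+33 quality encoding is assumed, and the sketch assumes that every base in the UMI region must meet the cutoff (the text above does not say whether the check is per base or an average).

```python
def umi_passes_quality(quality_string, umi_length=10, min_base_quality=10,
                       phred_offset=33):
    """Return True if every base in the UMI region meets the Phred cutoff.

    Assumptions (not taken from racoon_clip): Phred+33 encoding and a
    per-base check over the first `umi_length` positions of the read.
    """
    umi_qualities = quality_string[:umi_length]
    # A read shorter than the UMI region cannot be assessed, so reject it.
    if len(umi_qualities) < umi_length:
        return False
    return all(ord(char) - phred_offset >= min_base_quality
               for char in umi_qualities)
```

Only the first ten quality characters are inspected, so low-quality bases further downstream do not cause the read to be dropped at this step.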

Demultiplexing, UMI & Adapter trimming

Demultiplexing and 3’ adapter trimming are performed with FLEXBAR (version 3.5.0). FLEXBAR also handles UMIs and trims barcodes.

If demultiplexing is turned on, it is done with FLEXBAR using the provided barcode_fasta and the FLEXBAR parameters --barcodes {input.barcodes} --barcode-unassigned --barcode-error-rate 0.

By default, 3’ adapters are trimmed using the FLEXBAR options --adapter-trim-end RIGHT --adapter-error-rate 0.1 --adapter-min-overlap 1 --adapter-cycles <as specified>. Adapter trimming can also be turned off.

At the same time, UMIs (and barcodes, if present) are trimmed from the 5’ end of the reads and stored in the read names using FLEXBAR options --umi-tags --barcode-trim-end LTAIL.

Reads that are shorter than 15 nt after trimming are discarded using FLEXBAR option --min-read-length 15. The cutoff can be changed by specifying another value with the racoon_clip flexbar_minReadLength option.
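To show how the FLEXBAR options listed in this section fit together, here is a rough Python sketch that assembles them into one option list. The function flexbar_options and its arguments are hypothetical and are not part of racoon_clip. Only flags named in this section are used; input, output, and adapter-sequence arguments are deliberately left out.

```python
def flexbar_options(adapter_cycles, barcodes=None, demultiplex=True,
                    trim_adapters=True, min_read_length=15):
    """Collect the FLEXBAR options described in this section (a sketch).

    `adapter_cycles` has no default here because the text only says
    "<as specified>".
    """
    options = []
    if demultiplex and barcodes:
        # Demultiplexing with the provided barcode FASTA, no mismatches.
        options += ["--barcodes", barcodes,
                    "--barcode-unassigned",
                    "--barcode-error-rate", "0"]
    if trim_adapters:
        # 3' adapter trimming; can be switched off entirely.
        options += ["--adapter-trim-end", "RIGHT",
                    "--adapter-error-rate", "0.1",
                    "--adapter-min-overlap", "1",
                    "--adapter-cycles", str(adapter_cycles)]
    # UMIs (and barcodes, if present) are trimmed from the 5' end and
    # stored in the read names; short reads are discarded.
    options += ["--umi-tags",
                "--barcode-trim-end", "LTAIL",
                "--min-read-length", str(min_read_length)]
    return options
```

For example, flexbar_options(1, barcodes="barcodes.fasta") returns the full default option set, while trim_adapters=False drops the four adapter-related options.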

See also: FLEXBAR—Flexible Barcode and Adapter Processing for Next-Generation Sequencing Platforms.

Genome alignment

Reads are aligned to the specified genome with STAR (version 2.7.10). In short, the genome is indexed with STAR --runMode genomeGenerate. Then, the reads of each sample are individually aligned to the genome with STAR --runMode alignReads --sjdbOverhang 139 --outFilterMismatchNoverReadLmax 0.04 --outFilterMismatchNmax 999 --outFilterMultimapNmax 1 --alignEndsType "Extend5pOfRead1" --outReadsUnmapped "Fastx" --outSJfilterReads "Unique". The resulting BAM files are indexed with samtools index (version 1.11). All parameters except --alignEndsType "Extend5pOfRead1" can be changed via racoon options.
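The split between adjustable defaults and the single fixed parameter can be sketched in Python as follows. This is an illustration only: star_align_args, STAR_DEFAULTS, and STAR_FIXED are hypothetical names, and the way racoon maps its own options onto STAR flags is not shown here. Genome, input, and output path arguments are omitted.

```python
# Defaults taken from the alignment command described above.
STAR_DEFAULTS = {
    "--sjdbOverhang": "139",
    "--outFilterMismatchNoverReadLmax": "0.04",
    "--outFilterMismatchNmax": "999",
    "--outFilterMultimapNmax": "1",
    "--outReadsUnmapped": "Fastx",
    "--outSJfilterReads": "Unique",
}

# The one parameter that cannot be changed.
STAR_FIXED = {"--alignEndsType": "Extend5pOfRead1"}


def star_align_args(overrides=None):
    """Build the STAR alignment argument list (sketch, paths omitted)."""
    params = dict(STAR_DEFAULTS)
    for flag, value in (overrides or {}).items():
        if flag in STAR_FIXED:
            raise ValueError(f"{flag} is fixed and cannot be overridden")
        params[flag] = str(value)
    params.update(STAR_FIXED)

    args = ["STAR", "--runMode", "alignReads"]
    for flag, value in params.items():
        args += [flag, value]
    return args
```

Calling star_align_args({"--outFilterMultimapNmax": 2}) would allow reads mapping to up to two loci, whereas an attempt to override --alignEndsType raises an error.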

See also:

Deduplication

Aligned reads are deduplicated with umi_tools dedup --extract-umi-method read_id --method unique (UMI-tools version 1.1.1).
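The idea behind this step can be shown with a simplified Python sketch. It is not the UMI-tools algorithm itself: the function dedup_unique is hypothetical, reads are represented as plain tuples instead of BAM records, the UMI is assumed to be the part of the read name after the last underscore, and the first read seen per group is kept, whereas UMI-tools applies its own rules for choosing a representative read.

```python
def dedup_unique(reads):
    """Keep one read per (chromosome, position, strand, UMI) combination.

    `reads` is an iterable of (read_name, chrom, pos, strand) tuples.
    Assumption: the UMI is the suffix of the read name after the final
    underscore. Only exactly identical UMIs are treated as duplicates.
    """
    seen = set()
    kept = []
    for name, chrom, pos, strand in reads:
        umi = name.rsplit("_", 1)[-1]
        key = (chrom, pos, strand, umi)
        if key not in seen:
            seen.add(key)
            kept.append(name)
    return kept
```

In this sketch, two reads at the same position whose UMIs differ by a single base are both kept, because no error correction between similar UMIs is attempted.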

See also: UMI-tools: modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy.