NGS Bacterial Genome Bioinformatics Tools

Congrats! You have successfully run your sequencer and now have brand-new FASTQ files. Now what?

Here are the steps:

  • Trim the FASTQ files (tool: Trimmomatic)
module load Trimmomatic/0.36-Java-1.8.0_92
java -jar $EBROOTTRIMMOMATIC/trimmomatic-0.36.jar PE -phred33 -basein path/test.fastq.gz -baseout path/trimmed/trimmed_test.fastq.gz ILLUMINACLIP:NexteraPE-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
  • Serotype (tool: SeqSero for Salmonella spp.)
module load SeqSero/f3bd721-intel-2015B-Python-2.7.10
SeqSero.py -m 2 -b sam -i path/trimmed/trimmed_test_1P.fastq.gz path/trimmed/trimmed_test_2P.fastq.gz &
  • Assembly (tool: SPAdes)
module load SPAdes/3.11.1-GCCcore-6.3.0
spades.py --careful --memory 5 --threads 2 -1 path/forward.fastq.gz -2 path/reverse.fastq.gz -o test_spades
  • ResFinder (SRST2)
module load SRST2/0.2.0-intel-2015B-Python-2.7.10
srst2 --output resfinder_test --input_pe test_1.fastq.gz test_2.fastq.gz --log --gene_db ResFinder.fasta &
Link to ResFinder.fasta --->
  • PlasmidFinder (SRST2)
module load SRST2/0.2.0-intel-2015B-Python-2.7.10
srst2 --output plasmidfinder_test --input_pe test_1.fastq.gz test_2.fastq.gz --log --gene_db PlasmidFinder.fasta &
Link to get PlasmidFinder.fasta --->
  • MLST (SRST2)

Salmonella enterica:

module load SRST2/0.2.0-intel-2015B-Python-2.7.10
getmlst.py --species "Salmonella enterica"
srst2 --output test_mlst --input_pe test.fastq.gz --mlst_db Salmonella_enterica.fasta --mlst_definitions senterica.txt --mlst_delimiter '_'

Escherichia coli:

getmlst.py --species "Escherichia coli#1"
srst2 --output test_mlst --input_pe test.fastq.gz --mlst_db Escherichia_coli#1.fasta --mlst_definitions ecoli.txt --mlst_delimiter '_'
  • Annotate (tool: PATRIC or RAST)
  • Run SNP analysis (tool: Parsnp and Gingr)
module load Parsnp/1.2-Linux64 
parsnp -d /path_to_the_file -c -r /path_to_the_reference/reference.fasta -o /path_to_the_output
  • Create phylogenetic trees (tool: FigTree)
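
The steps above feed into one another: the name given to Trimmomatic's -baseout decides which files the serotyping and assembly steps should read. Here is a minimal shell sketch of that hand-off. It assumes Trimmomatic splits the -baseout name at its first dot and inserts _1P, _1U, _2P and _2U there, so check the result against your own output directory. The function names trimmed_names and spades_command are hypothetical helpers, not part of any tool, and the script only prints the SPAdes command instead of running it.

```shell
#!/bin/sh
# Sketch only: nothing here runs Trimmomatic or SPAdes.
# Function bodies use ( ... ) so their variables stay local to a subshell.

# Print the four file names Trimmomatic PE is assumed to derive from -baseout:
# <core>_1P<ext>, <core>_1U<ext>, <core>_2P<ext>, <core>_2U<ext>
trimmed_names() (
    baseout=$1
    dir=$(dirname "$baseout")
    file=$(basename "$baseout")
    core=${file%%.*}        # text before the first dot
    ext=${file#"$core"}     # everything from the first dot onwards
    for tag in 1P 1U 2P 2U; do
        printf '%s/%s_%s%s\n' "$dir" "$core" "$tag" "$ext"
    done
)

# Print (do not run) a SPAdes command that reads the paired survivors:
# line 1 of trimmed_names is the forward file, line 3 the reverse file.
spades_command() (
    fwd=$(trimmed_names "$1" | sed -n '1p')
    rev=$(trimmed_names "$1" | sed -n '3p')
    printf 'spades.py --careful --memory 5 --threads 2 -1 %s -2 %s -o %s\n' \
        "$fwd" "$rev" "$2"
)

# Example: same -baseout value as in the trimming step above.
spades_command path/trimmed/trimmed_test.fastq.gz test_spades
```

Only the _1P and _2P files (pairs where both reads survived trimming) are passed on here; the _1U and _2U files hold reads whose mate was dropped.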

P.S. I will share the command-line details and links to the job scripts soon.
