GATK errors and how to resolve them (continuously updated)
1, MESSAGE: Input files reads and reference have incompatible contigs: Relative ordering of overlapping contigs differs, which is unsafe.
##### ERROR reads contigs = [Chr1, Chr10, Chr11, Chr12, Chr2, Chr3, Chr4, Chr5, Chr6, Chr7, Chr8, Chr9, ChrSy, ChrUn]
##### ERROR reference contigs = [Chr1, Chr2, Chr3, Chr4, Chr5, Chr6, Chr7, Chr8, Chr9, Chr10, Chr11, Chr12, ChrUn, ChrSy]
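RESOLVE: This error occurs because the contigs in your BAM file and in the reference sequence are not listed in the same order. Reorder the contigs in the BAM file so that they match the reference. The ReorderSam tool from Picard tools can do this: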
java -jar /share/Public/cmiao/picard-tools-1.112/ReorderSam.jar I=L1-2_ATCACG_L003_R_tophat_accepted_hits.sorted.rmp.rg.bam O=order.bam REFERENCE=Osativa_204.fa
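To confirm that the reordered BAM and the reference now agree before rerunning GATK, the two contig lists can be compared directly. The following is a minimal sketch and not part of the original notes: it assumes samtools is installed, and the function names (header_contigs, fai_contigs, same_contig_order) and the intermediate file order.header.sam are hypothetical. It tests for identical names in identical order, which is stricter than GATK's own check on overlapping contigs.

```shell
# Print contig names, in order, from SAM header text
# (as written by `samtools view -H`).
header_contigs() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }' "$1"
}

# Print contig names, in order, from a FASTA index
# (the .fai file written by `samtools faidx`).
fai_contigs() {
    cut -f1 "$1"
}

# Succeed only if both files list the same contigs in the same order.
same_contig_order() {
    [ "$(header_contigs "$1")" = "$(fai_contigs "$2")" ]
}

# Usage (file names follow the ReorderSam command above):
#   samtools view -H order.bam > order.header.sam
#   samtools faidx Osativa_204.fa
#   same_contig_order order.header.sam Osativa_204.fa.fai \
#       && echo "contig order matches" \
#       || echo "contig order still differs"
```

Working on the header text rather than on the BAM itself keeps the comparison fast, since no alignment records are read.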
2, MESSAGE: Unsupported CIGAR operator N in read HWI-D00258:28:D2EU3ACXX:3:1106:20678:47827 at Chr1:3160. Perhaps you are trying to use RNA-Seq data? While we are currently actively working to support this data type unfortunately the GATK cannot be used with this data in its current form. You have the option of either filtering out all reads with operator N in their CIGAR string (please add --filter_reads_with_N_cigar to your command line) or assume the risk of processing those reads as they are including the pertinent unsafe flag (please add -U ALLOW_N_CIGAR_READS to your command line). Notice however that if you were to choose the latter, an unspecified subset of the analytical outputs of an unspecified subset of the tools will become unpredictable. Consequently the GATK team might well not be able to provide you with the usual support with any issue regarding any output.
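RESOLVE: If you are using RNA-seq data and the reference is the whole genome, different parts of a single read may align to different regions of the reference, for example on either side of an intron. The CIGAR string of such a read then contains an N operator (see the SAM format specification for its meaning). To continue calling SNPs, these reads have to be filtered out, which only requires adding --filter_reads_with_N_cigar: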
java -jar /share/Public/cmiao/GATK_tools/GenomeAnalysisTK.jar -nct 30 -T HaplotypeCaller -R Osativa_204.fa -I order.bam --filter_reads_with_N_cigar -o gatk.order.vcf
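Because the filter silently drops every spliced read, it can be useful to know beforehand how many reads are affected. The following is a minimal sketch and not part of the original notes: it assumes samtools is installed, and the function name count_n_cigar_reads is hypothetical. It counts alignment records whose CIGAR field (column 6 of the SAM format) contains N.

```shell
# Read SAM text on stdin and print the number of alignment records
# whose CIGAR string (column 6) contains the N operator.
# Header lines (starting with "@") are skipped.
count_n_cigar_reads() {
    awk -F'\t' '!/^@/ && $6 ~ /N/ { n++ } END { print n + 0 }'
}

# Usage (file name follows the HaplotypeCaller command above):
#   samtools view order.bam | count_n_cigar_reads
#   samtools view -c order.bam    # total records, for comparison
```

If the count is a large share of the total, most of the RNA-seq alignments will be excluded from SNP calling.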