NGS Analysis Pipeline (Part 3)
2024-03-02 21:59:46
I. Annotation
1. SnpEff annotation
The code is as follows (example):
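all_marked_vcf=$1     # input VCF to annotate
prefix_anno_vcf=$2    # prefix shared by all output files
# Shell variable names cannot contain dots, so the original name isoform_v1.0.txt is replaced by isoform_list
isoform_list="isoform_v1.0.txt"
# Note: -haplotype-ouput is kept exactly as written in the original. As far as I can tell it is not a
# standard SnpEff option and presumably comes from this locally built jar; check the spelling against your build.
java -Xmx4g -jar /home/sunxj/software/SnpEff/target/SnpEff-4.4-jar-with-dependencies.jar \
-c /software/SnpEff/conf/snpEff.config \
-s "${prefix_anno_vcf}.html" \
-onlyTr "/ref/human/b37/snpeff/${isoform_list}" \
-haplotype-ouput "${prefix_anno_vcf}.haplotype.anno" GRCh37.p13.RefSeq "${all_marked_vcf}" > "${prefix_anno_vcf}_hgvs.vcf"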
Link: https://pcingola.github.io/SnpEff/
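After annotation it is worth checking that the output actually carries annotations. The following is a minimal sketch, assuming the annotated VCF is tab-separated and that SnpEff wrote its results into the `ANN` key of the INFO column (the default for recent SnpEff versions). The function name `count_ann` is hypothetical and not part of the original pipeline.

```shell
# count_ann FILE: print "<total records> <records with an ANN annotation>".
# Hypothetical helper for a quick sanity check; not part of SnpEff.
count_ann() {
    awk -F'\t' '
        /^#/ { next }                                 # skip VCF header lines
        { total++ }                                   # every other line is a record
        index(";" $8, ";ANN=") > 0 { annotated++ }    # INFO is the 8th VCF column
        END { printf "%d %d\n", total, annotated }
    ' "$1"
}
```

Usage, for example: `count_ann "${prefix_anno_vcf}_hgvs.vcf"`. If the second number is much smaller than the first, the genome name or the transcript list passed to `-onlyTr` may be worth a second look.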
2. bcftools filtering
The code is as follows (example):
marked_vcf=$1     # input: marked.vcf
trimed_vcf=$2     # output: filtered VCF
prefix_conf_dir="/yunying/codes/product/module"
trim_in_conf="trim_include_standard.conf"
trim_ex_conf="trim_exclude_standard.conf"
# The original assigned both variables below from ${vcf_trim_in_conf} and never used the three variables above;
# building the paths from them is an assumed fix.
vcf_trim_in_conf="${prefix_conf_dir}/${trim_in_conf}" # file content: -i '(FILTER == "PASS" | FILTER == "germline" | FILTER == "haplotype" | FILTER == "slippage" | FILTER == "multiallelic" | FILTER == "clustered_events" | FILTER == "orientation" | FILTER == "base_qual" | FILTER == "contamination" | FILTER == "panel_of_normals") & FORMAT/DP >= 10 & FORMAT/AF > 0.0009'
vcf_trim_ex_conf="${prefix_conf_dir}/${trim_ex_conf}" # file content: -e '(FILTER == "strand_bias" | FILTER == "normal_artifact" | FILTER == "weak_evidence")'
### First, keep sites whose FILTER value is on the permissive include list and that pass the depth and allele-frequency thresholds
cat "${vcf_trim_in_conf}" | xargs \
bcftools filter \
-o "${trimed_vcf%.*}_tmp.vcf" "${marked_vcf}"
### Last, conservatively exclude sites that are likely false positives
cat "${vcf_trim_ex_conf}" | xargs \
bcftools filter -o "${trimed_vcf}" "${trimed_vcf%.*}_tmp.vcf"
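Two shell details carry this script: `xargs` turns the contents of the config file into command-line arguments, and `${trimed_vcf%.*}` derives the name of the intermediate file. The sketch below shows both without calling bcftools. The config file is a temporary stand-in with a shortened expression, and the file name `sample.trimed.vcf` is made up for illustration.

```shell
# A throwaway config file shaped like the include config
# (shortened expression; the temporary path is purely illustrative).
conf=$(mktemp)
cat > "$conf" <<'EOF'
-i '(FILTER == "PASS" | FILTER == "germline") & FORMAT/DP >= 10'
EOF

# xargs honours the single quotes, so the option and the whole
# expression arrive as exactly two arguments.
split_args=$(xargs printf '[%s]\n' < "$conf")
printf '%s\n' "$split_args"
rm -f "$conf"

# ${var%.*} removes the shortest trailing ".suffix"; the script uses it
# to build the intermediate file name from the final output name.
trimed_vcf="sample.trimed.vcf"
tmp_vcf="${trimed_vcf%.*}_tmp.vcf"
printf '%s\n' "$tmp_vcf"
```

Because the expression survives as a single argument, the `|`, `&` and `>=` characters inside it never reach the shell. Keeping the expressions in config files also lets the thresholds change without editing the script, at the cost of one extra layer of quoting to keep in mind.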