欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页

生信脚本练习(7)求fastq文件质量值分布

程序员文章站 2022-05-11 13:23:54
...

这里是求fastq文件的质量类型和比例
注意数几种类型用了set() 的方法

with open("Test1.fastq","r") as f:
    lines = f.readlines()
qual = []
for line in lines:
    if lines.index(line)% 4 == 3:
        line= line.strip('\n')
        qual.append(line)
qual = "".join(qual)
unique = list(set(qual))
number = []
for u in unique:
    for q in qual:
        number.append(qual.count(u))
number = list(set(number))
with open("1-2.txt","w") as f:
    f.write("QUAL"+"\t" + "TOTAL"+ '\n')
    for i in range(len(number)):
        f.write(str(unique[i])+"\t"+str(number[i])+"\n")

然后画个饼图

data <- read.table("c:\\Users\\1-2.txt")
#print(data)
jpeg("c:\\Users\\pie.jpeg",width=2000,height=2000,res=300)
pct <- round(val/sum(val)*100)
lbls2 <- paste(key, "  ", pct,"%",sep = "")
pie(val,labels = lbls2,main = "Pie Chart of Qual")#,col = rainbow(length(lbls2)))
dev.off()

生信脚本练习(7)求fastq文件质量值分布

8.14更新更好的解法:
就是用字典嘛

from collections import Counter
with open("c:/Test1.fastq","r") as f:
    count = 0
    qual = {}
    while True:
        line = f.readline().strip()
        count += 1
        if count % 4 == 0:
            q = ((Counter(line)))
            for k,v in q.items():
                if k not in qual:
                    qual[k] = 0
                else:
                    qual[k] += v
        if len(line) == 0:
            break
print(qual)
with open("c:/1-2.txt","w") as f:
    f.write("QUAL"+"\t" + "TOTAL"+ '\n')
    for k,v in qual.items():
        f.write(k + '\t'+ str(v) + '\n')
相关标签: 脚本