码迷,mamicode.com
首页 > 其他好文 > 详细

pysam 模块

时间:2014-07-22 22:43:12      阅读:1418      评论:0      收藏:0      [点我收藏+]

标签:http   使用   os   文件   io   art   

pysam is a python module that makes it easy to read and manipulate mapped short read sequence data stored in SAM/BAM files.

 

要读bam文件, 首先实例化一个Samfile对象

import pysam

samfile = pysam.Samfile(‘ex1.bam‘, ‘rb‘)                  

#r 是reading。 w是writing。b是指bam文件, 不加的话默认是sam. 如果有b, 必须有r或者w。

#如果bai文件存在,会自动打开。没有的话 fetch() 和 pileup() 不能用。

文件打开以后, 可以用fetch对指定的region进行循环。

for alignedread in samfile.fetch(‘Chr1‘, 100, 200):         #不指定的话就是全部

    print alignedread

读完以后关闭文件

samfile.close()

 

pysam.Samfile.fetch() returns all reads overlapping a region sorted by the first aligned base in the reference sequence.

Note that it will also return reads that are only partially overlapping with the region. Thus the reads returned might span a region that is larger than the one queried.

 

统计参考序列某个区域内,call beases in each site

The pileup engine of csamtools iterates over all reads that are aligned to a region. In contrast to fetching, the pileup engine returns for each base in the reference sequence the reads that map to that particular position.

eg:

iter = samfile.pileup( ‘seq1‘, 10, 20 )
for x in iter: print str(x)
Aligned reads are returned as a pysam.PileupColumn.

eg:

import pysam

samfile = pysam.Samfile(‘ex1.bam‘, ‘rb‘)

for pileupcolum in samfile.pileup(‘Chr1‘, 100, 200):

    print ‘coverage at base %s = %s‘ % (pileupcolum.pos, pileupcolumn.n)    #pos返回的是位置,n返回有多少个碱基比对到这个位置

    for pileupread in pileupcolumn.pileups:

        print ‘\tbase in read %s = %s‘ %(pileupread.alignment.qname, pileupread.alignment.seq[pileupread.qpos])

samfile.close()

#pileupColumn.tid  指 chromesome ID as is defined in the header

  pileupColumn.pos  指 the target base coordinate(0-based)

  pileupCloumn.n   指number of reads mapping to this column

  pileupCloumn.pileups  指 list of reads aligned to this column

# pileupread.alignment.qname 返回列中read的名字,

   pileupread.alignment.seq[pileupread.qpos] 返回read在某个位点的碱基

 

NOTE:

Coordinates in pysam are always 0-based(following the python convention). SAM text fils use 1-based coordinates.

 

pysam.Samfile, pysam.Fastafile, pysam.Fastqfile各个类的详细使用说明见:http://pysam.readthedocs.org/en/latest/api.html

 

freemao

FAFU

free_mao@qq.com

 

  

pysam 模块,布布扣,bubuko.com

pysam 模块

标签:http   使用   os   文件   io   art   

原文地址:http://www.cnblogs.com/freemao/p/3861043.html

(0)
(0)
   
举报
评论 一句话评论(0
登录后才能评论!
© 2014 mamicode.com 版权所有  联系我们:gaon5@hotmail.com
迷上了代码!