
NET format

The net file format is used to describe the axtNet data that underlie the net alignment annotations in the Genome Browser. For a detailed description of the methods used to generate these data, refer to the Genome Browser description pages that accompany the Net alignment tracks.


File format

The net file consists of 7 fixed fields and a set of optional name/value pairs. In the descriptions below, target refers to the reference species and query refers to the aligning species.

 

  • Fixed fields:
    • Class. Either fill or gap.  Fill refers to a portion of a chain.
    • Start in chromosome (target species)
    • Size (target species)
    • Chromosome name (query species)
    • Relative orientation between target and query species.
    • Start in chromosome (query species)
    • Size (query species)

     

  • Name/value pairs (optional):
    • id -- ID of associated chain (gapped alignment), if any.
    • score -- Score of associated chain.
    • ali -- Number of bases in alignments in chain.
    • qFar -- For fill that is on the same chromosome as parent, how far fill is from position predicted by parent. This helps determine if a rearrangement is local or if a duplication is tandem.
    • qOver -- Number of bases overlapping with parent gap on query side.  Generally, this will be near zero, except for inverts.
    • qDup -- Number of bases in query region that are used twice or more in net. This helps distinguish between a rearrangement and a duplication.
    • type -- One of the following values:   
      • top -- Chain is top-level, not a gap filler   
      • syn -- Chain is on same chromosome and in same direction as parent   
      • inv -- Chain is on same chromosome and in opposite direction from parent
      • nonSyn -- Chain is on a different chromosome from parent
    • tN -- Number of unsequenced bases (Ns) on target side
    • qN -- Number of unsequenced bases on query side
    • tR -- Number of bases in RepeatMasker masked repeats on target.
    • qR -- Number of bases in RepeatMasker masked repeats on query.
    • tNewR -- Bases in lineage-specific repeats on target.
    • qNewR -- Bases in lineage-specific repeats on query.
    • tOldR -- Bases in repeats predating split on target.
    • qOldR -- Bases in repeats predating split on query.
    • tTrf -- Bases in trf (Tandem Repeat Finder) repeats on target.
    • qTrf -- Bases in trf repeats on query.
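
The field layout above can be illustrated with a small Python sketch that splits one fill or gap line into the 7 fixed fields and the optional name/value pairs. It is only a sketch under stated assumptions: it assumes the fields are whitespace-separated and appear in the order listed, that the optional pairs follow as alternating name and value tokens, and that orientation is written as + or -. The names NetRecord and parse_net_line are hypothetical, chosen for illustration, and are not part of any Genome Browser tool.

```python
from dataclasses import dataclass, field


@dataclass
class NetRecord:
    """One fill or gap line (hypothetical container, for illustration only)."""
    kind: str          # class: "fill" or "gap"
    t_start: int       # start in chromosome (target species)
    t_size: int        # size (target species)
    q_name: str        # chromosome name (query species)
    orientation: str   # relative orientation (assumed to be "+" or "-")
    q_start: int       # start in chromosome (query species)
    q_size: int        # size (query species)
    attrs: dict = field(default_factory=dict)  # optional name/value pairs


def _to_number(text):
    """Convert a value to int or float when possible, else keep the string."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


def parse_net_line(line):
    """Parse a single fill/gap line, assuming whitespace-separated fields."""
    tokens = line.split()
    if len(tokens) < 7:
        raise ValueError("expected at least 7 fixed fields")
    if tokens[0] not in ("fill", "gap"):
        raise ValueError(f"unknown class: {tokens[0]!r}")

    extras = tokens[7:]
    if len(extras) % 2 != 0:
        raise ValueError("name/value pairs must come in twos")

    attrs = {}
    for name, value in zip(extras[0::2], extras[1::2]):
        # "type" holds a word (top, syn, inv, nonSyn); other values are
        # treated as numbers in this sketch.
        attrs[name] = value if name == "type" else _to_number(value)

    return NetRecord(
        kind=tokens[0],
        t_start=int(tokens[1]),
        t_size=int(tokens[2]),
        q_name=tokens[3],
        orientation=tokens[4],
        q_start=int(tokens[5]),
        q_size=int(tokens[6]),
        attrs=attrs,
    )


# Made-up example line, not taken from a real net file.
record = parse_net_line("fill 100 50 chr2 + 200 48 id 7 score 1234 ali 45 type syn")
print(record.kind, record.q_name, record.attrs["type"])
```

Keeping the optional pairs in a dictionary rather than as fixed attributes reflects that any of them may be absent on a given line; a gap line with only the 7 fixed fields simply yields an empty dictionary.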
   
 

    
