[asm_release] Xenopus tropicalis version 7.0

Assembly Stats:
---------------
Main genome scaffold total: 7730
Main genome contig total:   55234
Main genome scaffold sequence total: 1437.6 MB
Main genome contig sequence total:   1366.0 MB (->  5.0% gap)
Main genome scaffold N/L50: 5/124.1 MB
Main genome contig N/L50:   5479/72.2 KB
Number of scaffolds > 50 KB: 553
% main genome in scaffolds > 50 KB: 96.6%
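
The N/L50 figures above are listed as a count followed by a length (for example, 5 scaffolds / 124.1 MB). The sketch below shows one common way to compute such a pair from a list of sequence lengths. The function name and the toy lengths are illustrative assumptions; this is not the script used to produce the release.

```python
def n_l50(lengths):
    """Return (count, length): the smallest number of longest sequences
    whose combined length reaches half of the total, and the length of
    the shortest sequence in that set."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return count, length
    return 0, 0  # empty input


# Toy lengths (hypothetical, not taken from this assembly).
toy_lengths = [50, 40, 30, 20, 10]
print(n_l50(toy_lengths))  # (2, 40)
```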

 Minimum    Number    Number     Total        Total     Scaffold
Scaffold      of        of      Scaffold      Contig     Contig
 Length   Scaffolds  Contigs     Length       Length    Coverage
--------  ---------  -------  -----------  -----------  --------
    All     7,730     55,234  1,437,594,934  1,365,995,008    95.02%
   1 kb     7,730     55,234  1,437,594,934  1,365,995,008    95.02%
 2.5 kb     7,177     54,627  1,436,680,707  1,365,101,391    95.02%
   5 kb     3,772     49,478  1,423,507,169  1,352,583,703    95.02%
  10 kb     1,691     45,805  1,409,692,503  1,339,406,747    95.01%
  25 kb       771     43,423  1,396,249,822  1,326,922,921    95.03%
  50 kb       553     42,499  1,388,151,794  1,320,882,810    95.15%
 100 kb       320     40,895  1,372,189,873  1,308,976,908    95.39%
 250 kb       138     38,749  1,343,049,004  1,284,937,801    95.67%
 500 kb        58     37,393  1,315,432,664  1,260,206,616    95.80%
   1 mb        23     36,403  1,290,693,462  1,237,049,603    95.84%
 2.5 mb        13     35,784  1,277,440,684  1,224,920,878    95.89%
   5 mb        12     35,665  1,273,302,690  1,221,086,034    95.90%
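
The last column of the table appears to be total contig length divided by total scaffold length, so the remainder is gap sequence. A minimal check using three rows copied from the table (the helper name is an assumption):

```python
# (total scaffold length, total contig length) in bp, copied from the table.
rows = {
    "All":   (1_437_594_934, 1_365_995_008),
    "50 kb": (1_388_151_794, 1_320_882_810),
    "5 mb":  (1_273_302_690, 1_221_086_034),
}


def contig_coverage(scaffold_bp, contig_bp):
    """Percentage of scaffold sequence that is contig (non-gap) sequence."""
    return 100.0 * contig_bp / scaffold_bp


for label, (scaffold_bp, contig_bp) in rows.items():
    cov = contig_coverage(scaffold_bp, contig_bp)
    print(f"{label:>6}  coverage {cov:.2f}%  gap {100.0 - cov:.2f}%")
```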

Assembly Notes:
---------------
Main assembly was performed using ARACHNE.  The available genetic map was used to identify
misjoins.  A misjoin was characterized by an abrupt change in linkage group that 
coincided with a region of low BAC/Fosmid support.  A total of 45 breaks 
were identified.
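
As a rough illustration of the misjoin criterion described above, the sketch below flags intervals on a scaffold where adjacent markers change linkage group and few clones span the interval. The data layout, the function name, and the `min_support` threshold are all assumptions made for this example; the actual procedure and cutoffs used for the release are not specified here.

```python
def candidate_breaks(markers, clones, min_support=2):
    """markers: (position, linkage_group) pairs on one scaffold.
    clones: (start, end) placements of BAC/Fosmid clones on that scaffold.
    Returns intervals where the linkage group changes between adjacent
    markers and fewer than `min_support` clones span the whole interval."""
    breaks = []
    ordered = sorted(markers)
    for (pos_a, lg_a), (pos_b, lg_b) in zip(ordered, ordered[1:]):
        if lg_a == lg_b:
            continue  # no change in linkage group
        support = sum(1 for start, end in clones
                      if start <= pos_a and end >= pos_b)
        if support < min_support:
            breaks.append((pos_a, pos_b, lg_a, lg_b, support))
    return breaks


# Hypothetical marker and clone placements.
markers = [(1000, "LG_1"), (5000, "LG_1"), (9000, "LG_4"), (12000, "LG_4")]
clones = [(500, 6000), (8000, 13000), (4000, 9500)]
print(candidate_breaks(markers, clones))
```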

Scaffolds were then ordered, oriented, and joined together using a combination of the
genetic map and Gallus gallus (chicken) synteny.  Initial scaffold order/orientation was 
generated using the genetic map.  Scaffolds that did not contain marker placements, but 
were syntenic, were placed using (a) BAC/Fosmid support, and (b) associated synteny, 
where syntenic scaffolds were placed in areas where they could fill a gap in synteny 
between 2 consecutive syntenic scaffolds that hit a marker.  Finally, synteny was used to 
orient scaffolds having insufficient marker placements to determine orientation.

A total of 1,200 joins were made to form a final assembly containing 10 chromosomes.  Significant
telomeric sequence was properly oriented in the assembly.  Chromosomes 3, 5, and 8 are fragmented into 
3, 3b, 5, 5b, 8, 8b, and 8c.  The lettered pieces of each chromosome are small linkage groups that are 
cytogenetically placed (via FISH) on the corresponding chromosomes, but show no linkage with each other 
in genotyping/linkage analysis.  The final set of chromosomes is ordered in the release as follows:

scaffold_1   LG_1
scaffold_2   LG_2
scaffold_3   LG_3
scaffold_4   LG_3b
scaffold_5   LG_4
scaffold_6   LG_5
scaffold_7   LG_5b
scaffold_8   LG_6
scaffold_9   LG_7
scaffold_10  LG_8
scaffold_11  LG_8b
scaffold_12  LG_8c
scaffold_13  LG_9
scaffold_14  LG_10
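
For downstream scripts it can be convenient to hold this mapping in a lookup table. The sketch below parses the list above and collapses the lettered linkage groups (3b, 5b, 8b, 8c) onto their parent chromosomes. The variable names are illustrative assumptions.

```python
MAPPING_TEXT = """\
scaffold_1   LG_1
scaffold_2   LG_2
scaffold_3   LG_3
scaffold_4   LG_3b
scaffold_5   LG_4
scaffold_6   LG_5
scaffold_7   LG_5b
scaffold_8   LG_6
scaffold_9   LG_7
scaffold_10  LG_8
scaffold_11  LG_8b
scaffold_12  LG_8c
scaffold_13  LG_9
scaffold_14  LG_10
"""

# scaffold name -> linkage group, as listed in the release.
scaffold_to_lg = dict(line.split() for line in MAPPING_TEXT.splitlines()
                      if line.strip())

# Drop the trailing b/c suffix to get the parent chromosome.
scaffold_to_chrom = {s: lg.rstrip("bc") for s, lg in scaffold_to_lg.items()}

print(len(scaffold_to_lg), "scaffolds on",
      len(set(scaffold_to_chrom.values())), "chromosomes")
```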

Coverage Stats:
---------------

 LIB     COV.    INSERT   STDDEV
-----   ------   ---------------
OTHER    0.40x      UNCLASSIFIED
UJS      0.32x     2928 +/-  606
TKS      0.38x     2954 +/-  547
ZWK      0.40x     2955 +/-  572
ZWL      0.26x     2961 +/-  558
AASO     0.64x     3041 +/-  616
AFHY     0.75x     3070 +/-  937
FROG     0.40x     3142 +/-  755
AKAT     0.67x     8302 +/-  950
ANUT     0.54x     8306 +/-  968
AHIN     0.65x     8313 +/-  956
AIUY     0.52x     8316 +/-  967
AKKX     0.59x     8316 +/-  956
ANHX     0.33x     8320 +/-  959
APPS     0.08x    37810 +/- 3925
AOPZ     0.11x    37816 +/- 3995
AHSY     0.12x    38611 +/- 3339
AHSX     0.19x    38954 +/- 3291
AHIO     0.02x    39157 +/- 3307
ISB1     0.03x    55694 +/- 20848
XEO      0.01x    69566 +/- 27557
CH216    0.03x   134898 +/- 40190
Total    7.44x
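
The total is the sum of the per-library coverage values. A small check, with the values copied from the table above and decimal arithmetic used to avoid floating-point rounding noise:

```python
from decimal import Decimal

# Per-library coverage (x), copied from the table above.
coverage = {
    "OTHER": "0.40", "UJS": "0.32", "TKS": "0.38", "ZWK": "0.40",
    "ZWL": "0.26", "AASO": "0.64", "AFHY": "0.75", "FROG": "0.40",
    "AKAT": "0.67", "ANUT": "0.54", "AHIN": "0.65", "AIUY": "0.52",
    "AKKX": "0.59", "ANHX": "0.33", "APPS": "0.08", "AOPZ": "0.11",
    "AHSY": "0.12", "AHSX": "0.19", "AHIO": "0.02", "ISB1": "0.03",
    "XEO": "0.01", "CH216": "0.03",
}

total = sum(Decimal(value) for value in coverage.values())
print(f"Total {total}x")  # Total 7.44x
```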

Completeness Notes:
-------------------
We placed full-length cDNAs at 90% identity and 85% coverage. This is a test 
to determine whether we are missing large portions of the genome.

9098 total sequences.
7414 sequences (87.38% of the 8485 non-artifact sequences) placed at 90% identity and 85% coverage.
613 library artifacts (6.74% of all 9098 sequences).
7078/7414 (95.46%) of the placements are in chromosomes.
522 sequences are not found (6.15% of the 8485 non-artifact sequences).
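
The percentages above use two different denominators: the artifact rate is relative to all sequences, while the placed and not-found rates are relative to the sequences remaining after library artifacts are removed. A short sketch that makes the arithmetic explicit (the variable names are assumptions):

```python
total_sequences = 9098
library_artifacts = 613
placed = 7414
not_found = 522

# Sequences remaining once library artifacts are set aside.
usable = total_sequences - library_artifacts


def pct(count, denominator):
    """Percentage rounded to two decimals."""
    return round(100.0 * count / denominator, 2)


print("usable sequences:", usable)
print("artifacts % of all:", pct(library_artifacts, total_sequences))
print("placed % of usable:", pct(placed, usable))
print("not found % of usable:", pct(not_found, usable))
```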

The "not found" set of cDNAs was analyzed, and many of them were found to be
partially repetitive or fungal contaminants, or to fall in areas of the genome that
contain a collapsed repeat and do not place well.  

The assembly release directory can be found at:
-----------------------------------------------
/home/jazz_analysis/assembly_releases/2006/Xenopus_tropicalis/20100930

The scaffolds are divided into the following sets:
--------------------------------------------------
1) Main genome
2) Mitochondrion
3) Unanchored rDNA
4) Repeat scaffolds ( < 50kb scaffolds composed of >=95% 24mers >4x in >=50kb scaffolds )
5) Excluded ( < 1kb )
6) Prokaryote
7) Eukaryote
8) Alternative Haplotype (scaffolds placing at >95% identity and >95% coverage in assembled chromosomes)
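
The numeric thresholds quoted in the list above can be written as simple predicates. The sketch below assumes the 24mer repeat fraction and the identity/coverage of each scaffold's chromosome placement have already been computed elsewhere. The function names, and the choice to test the length cutoff before the repeat rule, are assumptions rather than a description of the actual pipeline.

```python
def length_based_set(length_bp, repeat_24mer_fraction=0.0):
    """Apply the length and repeat thresholds from the list above.
    Returns "Excluded", "Repeat", or None if neither rule applies."""
    if length_bp < 1_000:
        return "Excluded"
    if length_bp < 50_000 and repeat_24mer_fraction >= 0.95:
        return "Repeat"
    return None


def is_alternative_haplotype(identity, coverage):
    """True when a scaffold places in the assembled chromosomes at
    >95% identity and >95% coverage (both given as fractions)."""
    return identity > 0.95 and coverage > 0.95


print(length_based_set(800))             # Excluded
print(length_based_set(12_000, 0.97))    # Repeat
print(is_alternative_haplotype(0.98, 0.99))  # True
```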

Prokaryotic contamination consists primarily of Acidovorax, Pseudomonas, and Delftia.
Eukaryotic contamination consists of Homo sapiens and Gallus gallus.

Final Production Stats:
-----------------------

 LIB    INSERT      READS       FAILED(%)       VECTOR(%)
-----   ------   --------   -------------   -------------
TKS       3500    1491279    249316(16.7)     25881( 1.7)
AFHY      3500    1888479    112261( 5.9)     35631( 1.9)
ZWL       3500     747601     49888( 6.7)     17235( 2.3)
ZWK       3500    1244657    112700( 9.1)     27015( 2.2)
UJS       3500    1030209    110076(10.7)     20620( 2.0)
FROG      3500    1041061     58016( 5.6)     19531( 1.9)
AASO      3500    1872127    124601( 6.7)     29058( 1.6)
AKAT      8100    1890190    188718(10.0)     35660( 1.9)
AIUY      8100    1834931    333354(18.2)     41842( 2.3)
AHIN      8200    1854651    196877(10.6)     33779( 1.8)
ANHX      8200    1010537    142430(14.1)     19986( 2.0)
AKKX      8200    1870149    313055(16.7)     37095( 2.0)
ANUT      8200    1807412    343400(19.0)     39306( 2.2)
AOPZ     37700     485758    142515(29.3)      9111( 1.9)
APPS     37700     293472     59818(20.4)      5653( 1.9)
AHSY     38500     497372    117222(23.6)      7672( 1.5)
AHSX     38600     652537    137138(21.0)      9183( 1.4)
AHIO     38800      84096     17273(20.5)      1426( 1.7)
XEO     100000      38496      1766( 4.6)       803( 2.1)
Total      N/A   21635014   2810424(13.0)    416487( 1.9)

 LIB      UNPAIRED(%)       PAIRED(%)      GOODP20 STDDEV
-----   -------------   -------------    ----------------
TKS      148728(10.0)   1067354(71.6)    545.48 +/-126.53
AFHY      94571( 5.0)   1646016(87.2)    719.49 +/-131.15
ZWL       32118( 4.3)    648360(86.7)    638.65 +/-117.58
ZWK       72010( 5.8)   1032932(83.0)    633.76 +/-129.18
UJS       75251( 7.3)    824262(80.0)    628.90 +/-129.97
FROG      55256( 5.3)    908258(87.2)    703.99 +/-131.12
AASO      93910( 5.0)   1624558(86.8)    650.04 +/-123.81
AKAT     110938( 5.9)   1554874(82.3)    671.32 +/-135.96
AIUY     153269( 8.4)   1306466(71.2)    599.41 +/-156.99
AHIN     111801( 6.0)   1512194(81.5)    671.52 +/-137.86
ANHX      65501( 6.5)    782620(77.4)    659.87 +/-152.31
AKKX     144439( 7.7)   1375560(73.6)    647.80 +/-146.94
ANUT     163032( 9.0)   1261674(69.8)    638.16 +/-156.98
AOPZ      46290( 9.5)    287842(59.3)    595.69 +/-169.75
APPS      29239(10.0)    198762(67.7)    648.17 +/-170.85
AHSY      43304( 8.7)    329174(66.2)    612.04 +/-161.54
AHSX      43402( 6.7)    462814(70.9)    641.11 +/-155.39
AHIO       6015( 7.2)     59382(70.6)    665.24 +/-152.93
XEO        1353( 3.5)     34574(89.8)    727.76 +/-130.39
Total   1490427( 6.9)   16917676(78.2)    647.38 +/-146.22
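
The percentages in the two production tables are each category's read count divided by the library's total reads. For the two rows checked below (TKS and Total, values copied from the tables), the failed, vector, unpaired, and paired counts also add up exactly to the read count. The helper name is an assumption.

```python
# Read counts copied from the TKS and Total rows of the tables above.
libs = {
    "TKS": dict(reads=1491279, failed=249316, vector=25881,
                unpaired=148728, paired=1067354),
    "Total": dict(reads=21635014, failed=2810424, vector=416487,
                  unpaired=1490427, paired=16917676),
}


def pct(count, reads):
    """Percentage of reads, rounded to one decimal as in the tables."""
    return round(100.0 * count / reads, 1)


for name, row in libs.items():
    categories = ("failed", "vector", "unpaired", "paired")
    summary = ", ".join(f"{c} {pct(row[c], row['reads'])}%" for c in categories)
    accounted = sum(row[c] for c in categories)
    print(f"{name}: {summary} (accounted for {accounted} of {row['reads']} reads)")
```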

Please contact us if you require any additional information concerning this assembly.

Best regards,

Jeremy Schmutz and Jerry Jenkins

