The following scaffolds from the initial BLAST results were looked at in further detail and re-classified to a tentative contamination category:

Unknown Scaffolds
=================

scaffold_3457 PROKARYOTIC
The scaffold was about 10 kb in length and had a GC content of 0.62, the range in which the other prokaryotic-only scaffolds cluster. This scaffold had a 50-base hit to the bacterium Burkholderia pseudomallei K96243.

scaffold_3618 PROKARYOTIC
The scaffold had 1 hit to Symbiobacterium thermophilum. The scaffold itself had a GC content of 0.64.

scaffold_8655 PROKARYOTIC
The scaffold had 2 alignments to Nocardia farcinica IFM 10152 DNA (a bacterial and plant plastid); lineage derived from bacteria.

scaffold_13086 PROKARYOTIC
The scaffold was at high GC (0.67) and had 2 hits to Burkholderia.

scaffold_15371 PROKARYOTIC
The scaffold was small (4-5 kb) and had a GC content of 0.57. There were two hits to Burkholderia.

scaffold_15925 PROKARYOTIC
The 3-4 kb scaffold had 1 hit to Bacteroides fragilis YCH46.

scaffold_16211 PROKARYOTIC
The 3-4 kb scaffold had 1 hit to Bacteroides fragilis YCH46. The alignment, however, was at the start of the scaffold. No file was available to view the read layout of the scaffold (is it chimeric?). This scaffold will be tentatively labeled as prokaryotic.

scaffold_19834 PROKARYOTIC
The small scaffold had 1 hit to Bacteroides fragilis YCH46 in the last 225 bases of the scaffold. No files were available to view the read layout of the scaffold; this scaffold will be tentatively labeled as prokaryotic.

MITOCHONDRION SCAFFOLDS:
------------------------

scaffold_284 EUKARYOTIC
The scaffold had a spurious 6-base hit to the X.tropicalis mitochondrion.

scaffold_18302 MITOCHONDRION
The scaffold had full-length alignments to the mitochondrion sequence.

scaffold_96 EUKARYOTIC
The large scaffold had a 478-base alignment with low percent ID to the mitochondrion sequence.

scaffold_18005 MITOCHONDRION
The first half of the scaffold blat-aligned to the mitochondrion sequence with 99%-100% identity.

scaffold_19023 MITOCHONDRION
The scaffold had full-length alignments to the mitochondrion sequence.

scaffold_18215 MITOCHONDRION
The scaffold had full-length alignments to the mitochondrion sequence.

scaffold_43 EUKARYOTIC
The large scaffold had an 858-base alignment to the mitochondrion sequence.

Eukaryotic/Prokaryotic Overlap:
===============================

scaffold_119 EUKARYOTIC
The scaffold had various hits to eukaryotic sequences throughout its length.

scaffold_168 EUKARYOTIC
The scaffold had eukaryotic hits to Xenopus and other eukaryotic sequences across the entire scaffold.

scaffold_333 EUKARYOTIC
The scaffold had eukaryotic hits to Xenopus and other eukaryotic sequences across the entire scaffold.

scaffold_809 EUKARYOTIC
The scaffold had stacks of hits to eukaryotic sequences on various parts of the scaffold.

scaffold_970 EUKARYOTIC
The scaffold had hits to eukaryotic genes. The prokaryotic contamination on the scaffold was small and possibly spurious.

scaffold_1577 EUKARYOTIC
The scaffold had hits to Xenopus with spurious hits to prokaryotic contamination.

scaffold_1732 EUKARYOTIC
The scaffold was at high GC and had hits to Bordetella, but it also had hits to Xenopus finished cDNA sequence.

scaffold_1781 EUKARYOTIC
The scaffold had short spurious hits to Bordetella, Ralstonia, and Pseudomonas, but had hits to Xenopus tropicalis sequences, including genes.

scaffold_1788 EUKARYOTIC
The scaffold was at high GC and had mostly hits to the Bordetella genus. Most of the eukaryotic hits were to Xenopus tropicalis BAC sequences, so this scaffold will be tentatively labeled as eukaryotic.

scaffold_1865 PROKARYOTIC
The scaffold had mostly prokaryotic hits within the last 6 kb of the scaffold. This could be either a misassembly or contamination.

scaffold_2104 EUKARYOTIC
The majority of the prokaryotic hits had low percent ID, and there were hits to Xenopus protein, cDNA, and BAC sequences.

scaffold_2257 EUKARYOTIC
There were hits to an Oncorhynchus keta Tc-1 like transposable element and hits to Bordetella. However, there were also hits to a Xenopus tropicalis BAC clone containing the gene for egr, Xenopus tropicalis chromosome, and other X.tropicalis clones.

scaffold_2705 EUKARYOTIC
The scaffold had hits to Xenopus tropicalis, including hits to an X.tropicalis BAC clone containing the gene for BMP7 and hypothetical protein MGC76232, but it also had hits to Ralstonia with high percent ID.

scaffold_2769 PROKARYOTIC
The scaffold had mostly hits to prokaryotic contamination and was at high GC.

scaffold_4104 PROKARYOTIC
The scaffold had few hits, which were located within 1400 bases of the scaffold. The scaffold was at high GC. Because there was no evidence that this scaffold is eukaryotic, it will be tentatively labeled prokaryotic.

scaffold_4717 PROKARYOTIC
The hits were mostly to prokaryotes, and the eukaryotic hit looked like a spurious hit to Medicago sativa.

scaffold_6978 PROKARYOTIC
The scaffold had full-length alignments to Escherichia.

scaffold_7379 PROKARYOTIC
The scaffold had hits to Ralstonia and Mycobacterium. The scaffold was at very high GC (0.66).

scaffold_8420 EUKARYOTIC
The scaffold had significant hits to Ralstonia but also had hits to Xenopus clones. Most of the eukaryotic hits were to Xenopus tropicalis BAC clones; this scaffold will be tentatively labeled as eukaryotic.

scaffold_8452 PROKARYOTIC
The scaffold had various hits to prokaryotic contamination throughout the scaffold. The eukaryotic hit was a small hit to C.reinhardtii, which was 60 bases but had a high percent ID. To err on the side of caution, this scaffold will be tentatively labeled as prokaryotic.

scaffold_8689 PROKARYOTIC
The scaffold only had 2 hits, which seemed spurious.
The eukaryotic hit was to a strain of Anopheles gambiae and covered about 50 bases near the end of the scaffold, while the prokaryotic hit was a 70-base hit to Ralstonia. This scaffold will be tentatively classified as prokaryotic.

scaffold_8816 EUKARYOTIC
The scaffold had hits to Xenopus BAC and chromosome sequences. The prokaryotic hits were spurious.

scaffold_8828 EUKARYOTIC
The scaffold had spurious prokaryotic hits but had very high percent identity to Xenopus tropicalis MGC69309 protein and finished cDNA clones.

scaffold_9248 PROKARYOTIC
The scaffold had two hits to Bacteria and a strain of Anopheles gambiae. This scaffold will be tentatively classified as prokaryotic.

scaffold_9860 PROKARYOTIC
The scaffold had hits to Bacteria and a strain of Anopheles gambiae. This scaffold will be tentatively classified as prokaryotic.

scaffold_9987 EUKARYOTIC
The scaffold had hits to Xenopus tropicalis BAC clones. There was only one hit to Bacteria, which had low percent ID and seemed spurious.

scaffold_10508 PROKARYOTIC
The scaffold was at high GC and had hits to various prokaryotic contamination. The eukaryotic hits were in a stack with the prokaryotic hits and were less significant.

scaffold_13369 PROKARYOTIC
There were few hits in general to this scaffold. The eukaryotic hits were to the Oryzias latipes HSP70-1 protein gene and had a high percent ID but only covered 44 bases of the entire scaffold. The prokaryotic hits also looked somewhat spurious, but the scaffold will be tentatively classified as prokaryotic due to the high GC.

scaffold_14916 PROKARYOTIC
The scaffold had two hits to Bacteria and a strain of Anopheles gambiae. This scaffold will be tentatively classified as prokaryotic.

scaffold_15072 PROKARYOTIC
The scaffold had one stack of hits, mostly to prokaryotic contamination.

scaffold_16608 PROKARYOTIC
The scaffold had hits to Corynebacterium, Rhodococcus, and other prokaryotes. The eukaryotic hit was in a stack with the prokaryotic hits but had low percent ID.
This scaffold will be tentatively labeled as prokaryotic.

scaffold_1919 EUKARYOTIC
The scaffold had only one hit to bacteria, but had hits to X.tropicalis BAC clones, chromosome, and finished cDNA.

Noncellular Scaffolds:
======================

scaffold_1099 EUKARYOTIC but with noncellular contamination
This is a large eukaryotic scaffold with cloning vector hits spliced into the scaffold. Coordinates of the alignments are in the noncellular_hits.summary file.

- The scaffolds initially flagged as noncellular were sorted by the fraction of the scaffold covered by noncellular hits. Those with greater than 5.0% coverage were looked at in closer detail:

scaffold_7163 NONCELLULAR
The scaffold had about 50% coverage to plasmid sequence and full-length coverage to Pseudomonas putida plasmid pWW0.

scaffold_7287 NONCELLULAR
The scaffold had full-length read coverage on both ends of the scaffold to synthetic construct.

scaffold_2659 NONCELLULAR
The scaffold had full coverage to noncellular junk, including hits to Bacteriophage lambda, cloning vector, and phage lambda lac Z.

scaffold_6653 NONCELLULAR
The scaffold had hits to cloning vector and bacteriophage lambda.

scaffold_7238 PROKARYOTIC
The scaffold had one hit to cloning vector that covered 20% of the scaffold, but most of the hits were to E.coli and Shigella with high coverage and high percent ID.

scaffold_5596 NONCELLULAR
The scaffold had hits to cloning vector and bacteriophage lambda for about 700-800 bases on each end of the scaffold.

scaffold_6567 EUKARYOTIC
The hit was to Expression vector pET3-H2A H2A gene for Xenopus laevis-like histone H2A.

scaffold_6447 EUKARYOTIC
The hit was to an expression vector gene for Xenopus.

scaffold_17079 EUKARYOTIC
The hits were to Expression vector pET3-H2A H2A gene for Xenopus laevis-like histones.

scaffold_5393 POPLAR CHLOROPLAST
The scaffold had a spurious noncellular hit, but had other eukaryotic hits.
There were a number of hits to chloroplast, so the scaffolds were aligned with BLAT against the Poplar chloroplast sequence. This was the only scaffold with significant alignments to that sequence.

scaffold_6020 EUKARYOTIC
The hit was to an expression vector gene for Xenopus.

scaffold_2356 PROKARYOTIC
The hits aligned a 1500-base section of the scaffold to various prokaryotic contamination, which was about 8% coverage of the scaffold. Because there were no hits that would lead this scaffold to be classified as eukaryotic, and the mean GC content of the scaffold is 0.62, this scaffold will be tentatively flagged as prokaryotic.

scaffold_15607 EUKARYOTIC
The hit was to an expression vector gene for Xenopus histone.

scaffold_14682 EUKARYOTIC
The noncellular hit was to Expression vector pET3-H3 H3 gene for Xenopus laevis-like histone H3.

scaffold_13958 EUKARYOTIC
The noncellular hit was to Expression vector pET3-H3 H3 gene for Xenopus laevis-like histone H3.

- There were a handful of scaffolds between 1.0 and 5.0% coverage that were looked at in more detail. Most of these were eukaryotic, hitting sequences such as "Expression vector for Xenopus histones", or the hits were spurious hits to human viruses. The following scaffolds were looked at in more detail due to the types of hits:

scaffold_5826 PROKARYOTIC
This scaffold had a stack of small hits (about 60 bases) to various prokaryotic contamination. This scaffold will be tentatively labeled as prokaryotic contamination.

scaffold_5991 PROKARYOTIC
This scaffold had 100-300 bases with hits to prokaryotic contamination and no hits to any eukaryotic sequences. To err on the side of caution, this scaffold will be tentatively labeled as prokaryotic.

scaffold_13922 EUKARYOTIC
The scaffold had a stack of 70-base hits to cloning vector at the end of the scaffold. Because we cannot look at the read layout of this scaffold in closer detail, it will be left in as eukaryotic.

- Noncellular scaffolds with less than 1.0% coverage were not looked at in further detail and were unflagged as noncellular.

Prokaryotic-Only Scaffolds:
===========================

The subset of scaffolds that were flagged as prokaryotic were not looked at in further detail, since they had no eukaryotic hits that would make them candidates for main genome scaffolds. These are tentatively kept as prokaryotic-only scaffolds.
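
The coverage-based triage of noncellular hits described in the Noncellular Scaffolds section can be sketched roughly as follows. This is only an illustration, not the procedure actually used: the function names (covered_fraction, triage) are hypothetical, hit coordinates are assumed to be 0-based half-open (start, end) pairs, and overlapping hits are assumed to be merged before computing coverage. The notes do not say how boundary values of exactly 1.0% or 5.0% were handled; here both fall in the middle tier.

```python
def covered_fraction(hits, scaffold_length):
    """Fraction of a scaffold covered by the union of hit intervals.

    hits: iterable of (start, end) pairs, assumed 0-based and half-open.
    Overlapping or adjacent hits are merged so bases are not counted twice.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(hits):
        if cur_end is None or start > cur_end:
            # Close the previous merged interval, if any, and start a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlaps (or touches) the current interval: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / scaffold_length


def triage(fraction):
    """Map a noncellular coverage fraction to the three tiers in the notes."""
    if fraction > 0.05:
        return "review closely"        # greater than 5.0% coverage
    if fraction >= 0.01:
        return "review by hit type"    # between 1.0 and 5.0% (assumed inclusive)
    return "unflag"                    # less than 1.0% coverage


# Hypothetical example: two overlapping hits on a 1000-base scaffold.
example_fraction = covered_fraction([(0, 100), (50, 150)], 1000)
example_tier = triage(example_fraction)
```

Merging intervals first matters for scaffolds with stacked hits, where summing hit lengths directly would overstate coverage.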