Databases available for ViroBLAST

Nucleotide sequence databases

Barley all CDS Morex v2.0 (2019)
Number of Sequences: 63,658; Longest sequence: 16,071 bp; total sequence size: 57,082,667 bp; L50: 13,735 bp (see publication https://doi.org/10.1101/631648 and dataset for download at eDal repository).

Barley Pseudomolecules Morex v2.0 2019
Number of Pseudomolecules: 8; Largest Scaffold: 675,310,294 bp; avg. sequence size: 542,842,368 bp; L50: 624,247,919 bp (see publication https://doi.org/10.1101/631648 and dataset for download at eDal repository).

Barley CDS HC Morex v1.0, IBSC_v2 (May 2016)
Number of Contigs: 39,734; Total contig size: 38,486,703 bp; Largest Contig: 15,048 bp; avg. Contig size: 969 bp; L50: 1,404 bp
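
The summary values listed for each database (number of sequences, total size, largest sequence, average size) can be recomputed from a list of sequence lengths. The following is a minimal Python sketch, not the procedure used to build this page. The function name `assembly_stats` is illustrative, and the sketch assumes that a half-coverage value given in bp is the N50-style length, i.e. the length of the shortest sequence among the largest sequences that together cover at least half of the total size.

```python
def assembly_stats(lengths):
    """Return summary statistics for a list of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("no sequences given")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)

    # Walk from the largest sequence down until half of the total is covered.
    running = 0
    n50_length = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            n50_length = length
            break

    return {
        "count": len(ordered),
        "total": total,
        "largest": ordered[0],
        "average": round(total / len(ordered)),
        "n50_length": n50_length,
    }


# Toy input (made-up lengths, not taken from any database listed here).
stats = assembly_stats([100, 200, 300, 400, 1000])
print(stats)
```

As a check against the entry above, 38,486,703 bp divided by 39,734 contigs is about 969 bp, which matches the listed average.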

Barley CDS LC Morex v1.0, IBSC_v2 (May 2016)
Number of Contigs: 41,545; Total contig size: 16,162,413 bp; Largest Contig: 7,137 bp; avg. Contig size: 389 bp; L50: 453 bp

Barley representative Transcripts HC (including introns) Morex v1.0, IBSC_v2 (May 2016)
Number of Contigs: 39,734; Total contig size: 59,562,803 bp; Largest Contig: 19,746 bp; avg. Contig size: 1,499 bp; L50: 2,068 bp

Barley representative Transcripts LC (including introns) Morex v1.0, IBSC_v2 (May 2016)
Number of Contigs: 41,949; Total contig size: 39,770,363 bp; Largest Contig: 26,603 bp; avg. Contig size: 948 bp; L50: 1,362 bp

Barley Genomic (from start of first exon to end of last exon) HC Genes Morex v1.0, IBSC_v2 (May 2016)
Number of Contigs: 39,734; Total contig size: 238,841,754 bp; Largest Contig: 820,598 bp; avg. Contig size: 6,011 bp; L50: 18,143 bp

Barley Genomic (from start of first exon to end of last exon) LC Genes Morex v1.0, IBSC_v2 (May 2016)
Number of Contigs: 40,819; Total contig size: 92,975,688 bp; Largest Contig: 1,059,687 bp; avg. Contig size: 2,278 bp; L50: 7,865 bp
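
The two "Genomic" databases hold, for each gene, the genomic sequence running from the start of its first exon to the end of its last exon, introns included. A minimal Python sketch of that extraction follows. It assumes 1-based inclusive exon coordinates on the forward strand, and the function name `gene_span` and the toy sequence are hypothetical. It is not the pipeline used to build these databases.

```python
def gene_span(chrom_seq, exons):
    """Return the sequence from the first exon start to the last exon end.

    exons: list of (start, end) tuples, 1-based inclusive, forward strand.
    Introns between the exons are kept in the returned sequence.
    """
    if not exons:
        raise ValueError("gene has no exons")
    start = min(s for s, _ in exons)
    end = max(e for _, e in exons)
    # Convert the 1-based inclusive range to a Python slice.
    return chrom_seq[start - 1:end]


# Toy example: two exons (4-6 and 10-12) separated by a 3 bp intron.
seq = "AAACCCGGGTTTAAACCC"
span = gene_span(seq, [(4, 6), (10, 12)])
print(span)
```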

Barley Pseudomolecules Masked Morex v1.0, IBSC_v2 (April 2016)
Number of Contigs: 8; Total contig size: 4,833,791,107 bp

Barley Pseudomolecules Aug2015
Number of Contigs: 8; Total contig size: 4,833,791,107 bp

Barley Pseudomolecule Contigs Masked Morex v1.0, IBSC_v2 (April 2016)
Number of Contigs: 464,895; Total contig size: 4,787,302,407 bp; Largest Contig: 297,092 bp; avg. Contig size: 10,298 bp; L50: 79,239 bp

Barley Pseudomolecule Contigs Aug2015
Number of Contigs: 464,895; Total contig size: 4,787,302,407 bp; Largest Contig: 297,092 bp; avg. Contig size: 10,298 bp; L50: 79,239 bp

Barley BAC Assemblies Aug2015
Number of Contigs: 850,266; Total contig size: 11,303,595,359 bp; Largest Contig: 467,463 bp; avg. Contig size: 13,294 bp; L50: 60,140 bp

assembly_WGSMorex (IBSC_v1 - 2012)
Assembly from Whole Genome Shotgun sequencing of barley, cultivar Morex. Illumina Paired End 500 and Mate Pair 2500, total sequencing depth: ~55x (see http://dx.doi.org/10.1038/nature11543).
Number of Contigs: 2,670,738; Largest Contig: 36,084 bp; avg. Contig size: 700 bp; L50: 1,425 bp

assembly_WGSBarke
Assembly from Whole Genome Shotgun sequencing of barley, cultivar Barke. Illumina Paired End 500, sequencing depth: ~30x (see http://dx.doi.org/10.1038/nature11543).
Number of Contigs: 2,742,077; Largest Contig: 38,386 bp; avg. Contig size: 736 bp; L50: 1,419 bp

assembly_WGSBowman
Assembly from Whole Genome Shotgun sequencing of barley, cultivar Bowman. Illumina Paired End 500, sequencing depth: ~35x (see http://dx.doi.org/10.1038/nature11543).
Number of Contigs: 2,077,901; Largest Contig: 37,442 bp; avg. Contig size: 856 bp; L50: 1,986 bp

HC_genes_CDS_Seq_2012 and LC_genes_CDS_Seq_2012 Morex IBSC_v1 (2012)
Barley gene annotations derived from the Morex 55x WGS sequence.
In general, this gene set combines predictions derived from RNA-seq data and from barley full-length cDNAs (flcDNAs).
Where a gene locus has alternative transcript structures, only one representative gene model was selected.
Gene models were classified as high-confidence (HC) or low-confidence (LC) predictions based on sequence homology to other angiosperm proteins (using OrthoMCL and BLAST).
Both sets contain POPSEQ anchoring information (see http://dx.doi.org/10.1111/tpj.12319) and Blast2GO annotations (see http://www.blast2go.com).
WARNING: This gene set is likely to be incomplete with respect to the full barley gene complement, and many genes may be truncated (not full length) due to the structure of the WGS assembly (see http://dx.doi.org/10.1038/nature11543).

454BacContigs Morex IBSC_v1 (2012)
MIRA assemblies for 4,095 barley BAC clones. BACs were barcoded and sequenced in pools using either the Roche 454 GS FLX or the FLX Titanium platform (see http://dx.doi.org/10.1038/nature11543).
Number of Contigs: 86,251; Largest Contig: 181,550 bp; avg. Contig size: 5,944 bp; L50: 34,519 bp

IlluminaBacContigs
Velvet assemblies for 2,183 barley BAC Clones.
For further details see http://www.harvest-web.org/utilmenu.wc?job=RTRVFORM&db=MOREX_HV3_9

BacEndSequences (IBSC_v1 - 2012)
Sanger-sequenced ends of 304,523 barley BAC clones (see http://dx.doi.org/10.1038/nature11543).
Number of sequences: 571,814; Largest sequence: 1,004 bp; avg. size: 653 bp; L50: 723 bp

sortedChromosomes
Sequences from flow-sorted barley chromosome arms, generated on the 454 platform. Sequencing depth per arm: >1x.

full length cDNA
Barley full-length cDNA sequences (see http://www.ncbi.nlm.nih.gov/pubmed/21415278).

ipk 206633 barley ESTs
EST sequences of IPK barley cDNA clones (see http://pgrc.ipk-gatersleben.de/cr-est)

Exome Capture Regions 10x
Sequence fragments of at least 20 bp from Morex WGS contigs that had at least 10x coverage in captured samples from 13 barley cultivars (see http://dx.doi.org/10.1111/tpj.12294).

Exome Capture Regions 5x
Sequence fragments of at least 20 bp from Morex WGS contigs that had at least 5x coverage in captured samples from 13 barley cultivars (see http://dx.doi.org/10.1111/tpj.12294).
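
The two Exome Capture Regions databases differ only in the coverage threshold (10x or 5x) applied before fragments shorter than 20 bp are discarded. The Python sketch below illustrates that kind of thresholding on a single per-base coverage list. How coverage from the 13 cultivars was combined is not described here, so the single list, the function name `covered_regions` and the toy values are all assumptions for illustration, not the published method.

```python
def covered_regions(coverage, min_cov, min_len=20):
    """Find runs of positions with coverage >= min_cov that span >= min_len bases.

    coverage: per-base coverage values for one contig.
    Returns a list of (start, end) intervals, 0-based, end exclusive.
    """
    regions = []
    start = None
    for pos, depth in enumerate(coverage):
        if depth >= min_cov:
            if start is None:
                start = pos          # a new run begins here
        elif start is not None:
            if pos - start >= min_len:
                regions.append((start, pos))
            start = None             # the run ended at the previous position
    # Close a run that extends to the end of the contig.
    if start is not None and len(coverage) - start >= min_len:
        regions.append((start, len(coverage)))
    return regions


# Toy contig: 5 uncovered bases, 25 at 12x, 10 at 7x, 8 at 11x.
coverage = [0] * 5 + [12] * 25 + [7] * 10 + [11] * 8
ten_x = covered_regions(coverage, min_cov=10)
five_x = covered_regions(coverage, min_cov=5)
print(ten_x, five_x)
```

In this toy case the 10x threshold keeps only the 25-base run, because the final 8-base run is shorter than 20 bp, while the 5x threshold merges everything after the first five bases into one longer region.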

Barley Agilent Array
Transcript data from HarvEST assembly 35 (www.harvest.ucr.edu), two RNA-seq experiments (Kohl et al., 2012; Thiel et al., 2012), and a full-length cDNA collection (Matsumoto et al., 2011) were assembled into 46,114 unique barley contigs. Unambiguous 60 bp oligomer probes were derived using eArray (Agilent Technologies, Santa Clara, USA). For a detailed description see Kohl et al., 2015. The microarray design is available at EMBL-EBI ArrayExpress, accession A-MTAB-530. Best hits from BLASTx similarity searches against UniRef90 (www.uniprot.org) and against the high-confidence (HC) genes of the barley genome (2012) (IPK Barley Blast Server) are included in the header.

Amino acid sequence databases

Barley AA (HC and LC) Morex v2.0 2019
Number of Sequences: 63,658; Longest sequence: 65,357 aa; total sequence size: 19,027,355 aa; L50: 13,734 aa (see publication https://doi.org/10.1101/631648 and dataset for download at eDal repository).

Barley HC Proteins Mai2016
Number of Proteins: 39,734; Largest Protein: 5,016 aa; avg. Protein size: 323 aa

Barley LC Proteins Mai2016
Number of Proteins: 41,545; Largest Protein: 2,379 aa; avg. Protein size: 130 aa

HC_genes_AA_Seq_2012 and LC_genes_AA_Seq_2012
Barley gene annotations derived from the Morex 55x WGS sequence.
In general, this gene set combines predictions derived from RNA-seq data and from barley full-length cDNAs (flcDNAs).
Where a gene locus has alternative transcript structures, only one representative gene model was selected.
Gene models were classified as high-confidence (HC) or low-confidence (LC) predictions based on sequence homology to other angiosperm proteins (using OrthoMCL and BLAST).
Both sets contain POPSEQ anchoring information (see http://dx.doi.org/10.1111/tpj.12319) and Blast2GO annotations (see http://www.blast2go.com).
WARNING: This gene set is likely to be incomplete with respect to the full barley gene complement, and many genes may be truncated (not full length) due to the structure of the WGS assembly (see http://dx.doi.org/10.1038/nature11543).