A curated 23S rRNA database for quick species identification
MIMt-23S is composed of sequences taken directly from the manually curated RefSeq Targeted Loci repository for the large subunit of the prokaryotic ribosome, together with sequences extracted with RNAmmer 1.2 from fully sequenced genomes deposited in GenBank and RefSeq. For every genome, all 23S sequences were extracted, and each was kept only if it was not identical to a sequence already present in the database.
For the curated version (M2c), 23S sequences from Targeted Loci were joined with sequences extracted from RefSeq genomes with RNAmmer. The sequences from both sources were clustered at 100% identity, so that completely identical sequences are represented by only one of them, and an accessory file was created listing all species represented by each such sequence in the database. This file is named MIMt-23S_XX_XX_redundancy.txt.
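The 100% clustering step and the accessory redundancy file can be illustrated with a minimal Python sketch. It is not the tool used to build MIMt-23S: it treats "100% identity" as an exact, case-insensitive string match, and the function names, the header layout (accession followed by species name) and the tab-separated output format are all assumptions made for the example.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def collapse_identical(records):
    """Group headers by exact sequence (assumption: 100% identity = same string).

    Returns a dict mapping each distinct sequence to the list of headers
    that carry it, in first-seen order.
    """
    clusters = {}
    for header, seq in records:
        clusters.setdefault(seq.upper(), []).append(header)
    return clusters


def representatives(clusters):
    """Keep the first header seen for each distinct sequence."""
    return [(headers[0], seq) for seq, headers in clusters.items()]


def redundancy_lines(clusters):
    """One line per shared sequence: representative first, then the others.

    The tab-separated layout is hypothetical, not the real file format.
    """
    return ["\t".join(headers) for headers in clusters.values() if len(headers) > 1]
```

Running `collapse_identical(parse_fasta(text))` on the combined Targeted Loci and RNAmmer sequences would give the clusters; `representatives` then yields the non-redundant set, and `redundancy_lines` the content of a redundancy file in this illustrative format.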
The full version of the MIMt-23S database is composed of all sequences in the curated version plus sequences from genomes of new species available only in GenBank.
In total, MIMt-23S_M2c contains 31,825 sequences belonging to 18,835 different species.
The full version of MIMt-23S contains a total of 32,675 sequences belonging to 19,590 species.