Announcement
Codon Usage Database is an extended WWW version of CUTG (Codon Usage Tabulated from GenBank). The frequency of codon use in each organism is made searchable through this World Wide Web site.
CUTG was originally developed by Prof. Toshimichi Ikemura at the Laboratory of Evolutionary Genetics, National Institute of Genetics.
Codon Usage Database is developed and maintained by Yasukazu Nakamura at The First Laboratory for Plant Gene Research, Kazusa DNA Research Institute.
The files pri (primate sequence entries), rod (rodent sequence entries), mam (other mammalian sequence entries), vrt (other vertebrate sequence entries), inv (invertebrate sequence entries), pln (plant sequence entries), bct (bacterial sequence entries), vrl (viral sequence entries) and phg (phage sequence entries) were used.
The files est (EST: expressed sequence tag sequence entries), pat (patent sequence entries), rna (structural RNA sequence entries), sts (STS: sequence tagged site sequence entries), syn (synthetic and chimeric sequence entries) and una (unannotated sequence entries) were not used.
All of the completely sequenced protein coding genes (CDSs) are used. Codons containing an ambiguous base were excluded from the count.
The name of an organism, shown in the answer list for a query or in an alphabetical list, is followed by the name of the GenBank division [gbbct, gbinv, etc.], a colon, and the number of compiled CDSs, like this:
Arabidopsis thaliana [gbpln]: 23188
If you select the link for an organism, the codon usage table for that organism is shown. The table shows the frequency (per thousand) and the count for each codon as a sum over all CDSs of the organism. Tables that include amino acids, or that are formatted in GCG style (under construction), are also shown when a genetic code system is selected. A backtranslation program, useful for designing PCR primers in protein coding regions, is also available (under construction).
By selecting the link "Codon usage of each CDS" under the table, you can browse or download the codon usage tables of all CDSs in the organism. The table format is CUTG style (see the link below for the "CODON_LABEL" file in CUTG).
The README document contains the latest information on the database in plain text format. The CODON_LABEL and SPSUM_LABEL files show the file formats.
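The per-organism table described above (a count and a frequency per thousand for each codon, summed over all CDSs, with codons containing an ambiguous base left out) can be illustrated with a minimal Python sketch. This is not the database's own program. The function names `count_codons` and `usage_table` are hypothetical, and the sketch assumes that skipped codons are also left out of the total used for the per-thousand frequency and that any trailing partial codon is ignored.

```python
from collections import Counter

# Assumption: only A, C, G and T count as unambiguous bases (U is mapped to T).
UNAMBIGUOUS = set("ACGT")


def count_codons(cds):
    """Count codons in one CDS, skipping codons that contain an ambiguous base."""
    seq = cds.upper().replace("U", "T")
    counts = Counter()
    # Walk the sequence in steps of three; a trailing partial codon is ignored.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        if set(codon) <= UNAMBIGUOUS:
            counts[codon] += 1
    return counts


def usage_table(cds_list):
    """Sum codon counts over all CDSs and return {codon: (count, per_thousand)}."""
    total = Counter()
    for cds in cds_list:
        total.update(count_codons(cds))
    n = sum(total.values())
    if n == 0:
        return {}
    return {codon: (count, 1000.0 * count / n) for codon, count in total.items()}


# Two toy CDSs; the codon GCN in the first one is skipped because of the N.
table = usage_table(["ATGGCNTAA", "atgaaauaa"])
for codon in sorted(table):
    count, per_thousand = table[codon]
    print(f"{codon} {per_thousand:6.1f} ({count})")
```

In this toy input five codons are counted, so ATG (seen twice) has a frequency of 400.0 per thousand.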
[* Present address of Dr. Ugawa is Environmental Education Center, Miyagi University of Education]
This work was supported by a Grant-in-Aid for Scientific Research (Grant-in-Aid for Publication of Scientific Research Results) from the Japan Society for the Promotion of Science.
(NAR Database Issue page) This article gives references to earlier papers.
QUERY Box for search with Latin name of organism
Alphabetical lists of all organisms
CUTG: Codon Usage Tabulated from GenBank (ftp distribution)
Additional Service
Countcodon program: compilation of a sequence into a codon usage table
Data source
NCBI-GenBank Flat File Release 141.0 [May 11 2004].
Data amount
21,733 organisms; 1,125,820 complete protein coding genes (CDSs)
Usage
A query box is presented for searching for the codon usage table of an organism. A search can be done with the Latin name of the organism or a substring of it. The default search is case sensitive; a case-insensitive option can be selected. An ambiguous query that hits over 100 organisms returns no answer.
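The search behaviour described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the site's actual implementation: the names `parse_entry`, `search` and `MAX_HITS` are hypothetical, the entry lines are assumed to follow the "name [division]: number" form shown in the announcement, and "returns no answer" is modelled as an empty list.

```python
import re

# Assumption: more than 100 hits is treated as an ambiguous query.
MAX_HITS = 100

# Assumed entry format, e.g. "Arabidopsis thaliana [gbpln]: 23188".
ENTRY_RE = re.compile(
    r"^(?P<name>.+?)\s*\[(?P<division>gb\w+)\]\s*:\s*(?P<n_cds>\d+)$"
)


def parse_entry(line):
    """Split an entry line into (organism name, GenBank division, number of CDSs)."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised entry line: {line!r}")
    return m["name"], m["division"], int(m["n_cds"])


def search(entries, query, case_sensitive=True):
    """Return entries whose organism name contains the query as a substring.

    The search is case sensitive by default. A query that hits more than
    MAX_HITS organisms is treated as ambiguous and returns an empty list.
    """
    needle = query if case_sensitive else query.lower()
    hits = []
    for line in entries:
        name, _division, _n_cds = parse_entry(line)
        haystack = name if case_sensitive else name.lower()
        if needle in haystack:
            hits.append(line)
    return [] if len(hits) > MAX_HITS else hits


# Hypothetical entry list; only the first line is taken from the text above.
entries = [
    "Arabidopsis thaliana [gbpln]: 23188",
    "Example organism [gbbct]: 10",
]
print(search(entries, "thaliana"))
print(search(entries, "arabidopsis", case_sensitive=False))
```

Matching on the parsed organism name rather than on the whole line keeps a query such as "gbpln" from hitting every plant entry; whether the real site behaves this way is not stated in the text.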
CUTG: Codon Usage Tabulated from GenBank
Codon usage tables for all CDSs in each GenBank division (pri, rod, mam, vrt, inv, pln, bct, vrl and phg) can be downloaded from the "FTP links for CUTG files" link on the top page.
Acknowledgment
We wish to thank Dr. Ugawa* at The DNA Information and Stock Center, National Institute of Agrobiological Resources, for help in constructing and distributing the database from 1996 to 1999.
Please cite
Codon usage tabulated from the international DNA sequence databases: status for the year 2000. Nakamura, Y., Gojobori, T. and Ikemura, T. (2000) Nucl. Acids Res. 28, 292.
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Chloroplast Mitochondrion Others (initials are not capital)