
Welcome to SUBA5

SUBA5 provides a central resource for exploring Arabidopsis protein subcellular location data. Proteins have specific functions and locations within the plant cell. They generate or are themselves products important for plant growth and response. Protein subcellular location and the proximity relationship of proteins are important clues to function within the metabolic household. Subcellular location can be determined by fluorescent protein tagging or mass spectrometry detection in subcellular purifications and by prediction using protein sequence features. SUBA5 contains a subcellular data query platform, protein sequence BLAST alignment, a high confidence subcellular locations reference standard and analytic tools.

Most recent SUBA description published

Hooper CM, Castleden I, Tanz SK, Aryamanesh N, and Millar AH (2017) SUBA4: the interactive data analysis centre for Arabidopsis subcellular protein locations Nucleic Acids Res. Jan 4;45(D1):D1064-D1074. doi: 10.1093/nar/gkw1041 (PubMed)

SUBA5 bulk data: Hooper CM, Castleden I, Tanz SK, Grasso SV, Aryamanesh N, and Millar AH (2022). Subcellular Localisation database for Arabidopsis proteins version 5. doi:10.26182/8dht-4017

SUBA5 citation guide

Data Citation

If you find our SUBA5 resources useful please help us grow and cite us.

What is data citation? It is the practice of providing a reference to the data source or description when you use data generated by another author in a manuscript. For more information on citing data please visit this resource.

Since SUBA5 is a mixed resource derived from collated data sets and generated data, it is not always clear how to cite it. All data sets retrieved from other published studies are linked directly to their record in PubMed. This is how we acknowledge every published study we collate into SUBA. If you query SUBA5 and retrieve a mixed data set drawn from a number of studies, it is more appropriate to cite the latest SUBA publication, which identifies the version used to retrieve the data.

Citing SUBA

SUBA interface
Used SUBA for refining, retrieving and interpreting your data? Please cite SUBA:
Hooper CM, Castleden I, Tanz SK, Aryamanesh N, and Millar AH (2017) SUBA4: the interactive data analysis centre for Arabidopsis subcellular protein locations Nucleic Acids Res. Jan 4;45(D1):D1064-D1074. doi: 10.1093/nar/gkw1041 (PubMed)

You used lists of SUBAcon calls? Please cite SUBAcon in your methods:
Hooper CM, Tanz SK, Castleden I, Vacher M, Small I and Millar AH (2014) SUBAcon: a consensus algorithm for unifying the subcellular localization data of the Arabidopsis proteome. Bioinformatics. 30(23): 3356-64. (PubMed)


SUBA bulk data

The SUBA versions are archived and available for bulk download in SQL and CSV format in the UWA repository. If you choose to cite the data set please cite as follows:

SUBA5:
Hooper CM, Castleden I, Tanz SK, Grasso SV, Aryamanesh N, and Millar AH (2022). Subcellular Localisation database for Arabidopsis proteins version 5. doi:10.26182/8dht-4017

SUBA4:
Hooper CM, Castleden I, Tanz SK, Aryamanesh N and Millar AH (2017). Subcellular Localisation database for Arabidopsis proteins version 4. doi:10.4225/23/581055ddcb1ce

SUBA3:
Hooper CM, Castleden I, Tanz SK, Small ID, and Millar AH (2012). Subcellular Localisation database for Arabidopsis proteins version 3. doi:10.4225/23/59151525d122a



SUBA publication list

Hooper CM, Castleden I, Tanz SK, Aryamanesh N, and Millar AH (2017) SUBA4: the interactive data analysis centre for Arabidopsis subcellular protein locations Nucleic Acids Res. Jan 4;45(D1):D1064-D1074. doi: 10.1093/nar/gkw1041 (PubMed)

Hooper CM, Tanz SK, Castleden I, Vacher M, Small I and Millar AH (2014) SUBAcon: a consensus algorithm for unifying the subcellular localization data of the Arabidopsis proteome. Bioinformatics. 30(23): 3356-64. (PubMed)

Tanz SK, Castleden I, Hooper CM, Small I and Millar AH (2014) Using the SUBcellular database for Arabidopsis proteins to localize the Deg protease family. Front Plant Sci. 5:396 (PubMed)

Tanz SK, Castleden I, Hooper CM, Vacher M, Small I and Millar AH (2013) SUBA3: a database for integrating experimentation and prediction to define the SUBcellular location of proteins in Arabidopsis. Nucleic Acids Res. 41: D1185-91 (PubMed)

Heazlewood JL, Verboom RE, Tonti-Filippini J, Small I, Millar AH (2007) SUBA: the Arabidopsis Subcellular Database. Nucleic Acids Res. 35:D213-8. (PubMed)

Heazlewood JL, Tonti-Filippini J, Verboom RE, Millar AH (2005) Combining experimental and predicted datasets for determination of the subcellular location of proteins in Arabidopsis. Plant Physiol. 139(2):598-609. (PubMed)

SUBA5 Notice board

SUBA5 contains more experimental data, increasing proteome coverage (TAIR10) to 44% (53% incl. PPI). For a data summary see the AboutSUBA5 page.
Bibliographic references up to 31st October 2020
SUBAcon was last retrained using data up to 30th June 2022

Have a non-Arabidopsis protein? Use our protein BLAST to see the closest Arabidopsis match and retrieve results at the same time!

Wish there was SUBA for crops? GO TO cropPAL now for 12 crops including wheat, barley, rice, maize, canola, tomato and more.

Are the subcellular locations of metabolic processes less conserved across species than thought? Have a read: CropPAL for discovering divergence in protein subcellular location in crops to support strategies for molecular crop breeding.

Can't find what you are looking for? Email directly to Cornelia Hooper (cornelia.hooper [at] uwa.edu.au)

Quick Search

... and we will search for them in TAIR descriptions and experimental abstracts.

Quick BLAST to retrieve SUBA data from the closest Arabidopsis match to your sequence
... below protein sequence fragments have BLAST similarity bit-scores

Bit Score is log2(Neff) - log2(E-value) where E-value = pval × Neff is the p-value times the effective search space size. The larger the bit-score the better since pval = P(random seq having a better score) = 2^(-bit-score). The p-value measures the statistical significance of the match but since we tried Neff times to find a match we need to make a correction. Multiplying by the number of possible matches gives the E-value or the expected number of hits with a better match just by random chance. (See here).
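
The relationship between bit score, p-value and E-value described above can be sketched in a few lines of Python. This is only an illustration of the formulas as stated on this page; the function names are hypothetical and are not part of SUBA5 or BLAST, and the effective search space size Neff is assumed to be known.

```python
import math


def bit_score(e_value, n_eff):
    """Bit score = log2(Neff) - log2(E-value), as defined in the text above."""
    return math.log2(n_eff) - math.log2(e_value)


def p_value(bits):
    """pval = 2^(-bit score): chance that a random sequence scores better."""
    return 2.0 ** (-bits)


def e_value(bits, n_eff):
    """E-value = pval x Neff: expected number of better hits by chance."""
    return p_value(bits) * n_eff


if __name__ == "__main__":
    # Hypothetical numbers chosen only to make the arithmetic easy to follow.
    bits = bit_score(e_value=1.0, n_eff=1024)
    print(bits, p_value(bits), e_value(bits, 1024))
```

Because the Neff terms cancel, the bit score depends only on the p-value, which is why a larger bit score always means a more significant match regardless of the search space size.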

How to submit new localisation data to SUBA

SUBA5 is updated annually with the latest update date shown on the home page notice board. If you have published or found data that is not in SUBA5 please submit this subcellular location data to us. Currently we accept data in the format of PubMedID;location;AGI (e.g. 25900983;golgi;AT5G16280.1). The location categories must be cytoskeleton, cytosol, endoplasmic reticulum, extracellular, golgi, mitochondrion, nucleus, peroxisome, plasma membrane, plastid or vacuole. If you have data fitting these criteria please click the following:

We will assess your data and add it to our next scheduled update. If you have suborganellar data for any other location categories or Protein-Protein interaction data, please contact:

  • Cornelia Hooper (cornelia.hooper [at] uwa.edu.au)
Send us published subcellular localization data

How to submit your data

  • We require the data in the form:
    "PubMedID{sep}Location{sep}AGI" where {sep} is a comma or a tab or a space.
  • One AGI per line.
  • Location must be one that SUBA5 recognises: cytoskeleton, cytosol, endoplasmic reticulum, extracellular, golgi, mitochondrion, nucleus, peroxisome, plasma membrane, plastid or vacuole.
  • AGIs that are not in TAIR10 will be ignored.
  • Duplicates will be ignored: that means we have already added your data to SUBA5.
  • AGIs without splice isoforms will automatically have the .1 added.
  • PubMed IDs are 5-12 digits
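
If you would like to check a file before submitting it, the rules above can be approximated with a short Python sketch. This is not the validation code SUBA5 itself runs. The function name and regular expression are hypothetical, the AGI pattern (AT, then chromosome 1-5, C or M, then G and five digits) is an assumption about identifier shape, and a semicolon is accepted as a separator alongside comma, tab and space only as an assumption. The TAIR10 membership and duplicate checks are not reproduced because they need the SUBA5 database.

```python
import re

# Location categories listed above as recognised by SUBA5.
LOCATIONS = {
    "cytoskeleton", "cytosol", "endoplasmic reticulum", "extracellular",
    "golgi", "mitochondrion", "nucleus", "peroxisome", "plasma membrane",
    "plastid", "vacuole",
}

# Assumed line shape: PubMedID {sep} Location {sep} AGI.
# The location is matched lazily so two-word locations still work
# when a space is used as the separator.
LINE_RE = re.compile(
    r"^\s*(\d{5,12})\s*[,;\t ]\s*(.+?)\s*[,;\t ]\s*"
    r"(AT[1-5CM]G\d{5})(\.\d+)?\s*$",
    re.IGNORECASE,
)


def parse_submission_line(line):
    """Return (pubmed_id, location, agi) or None if the line is invalid."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    pubmed_id, location, gene, isoform = match.groups()
    location = " ".join(location.lower().split())  # normalise case and spacing
    if location not in LOCATIONS:
        return None
    # AGIs without a splice isoform get ".1" appended, as described above.
    agi = gene.upper() + (isoform or ".1")
    return pubmed_id, location, agi


if __name__ == "__main__":
    for raw in ["25900983;golgi;AT5G16280.1", "25900983,chloroplast,AT5G16280"]:
        print(raw, "->", parse_submission_line(raw))
```

Lines for which the function returns None would need correcting before submission, for example an unrecognised location name or a PubMed ID with fewer than 5 digits.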
If you have any problems please contact:
Cornelia Hooper (cornelia.hooper [at] uwa.edu.au) or
Ian Castleden (ian.castleden [at] uwa.edu.au)