Welcome to SUBA5
SUBA5 provides a central resource for exploring Arabidopsis protein subcellular location data. Proteins have specific functions and locations within the plant cell, and they generate, or are themselves, products important for plant growth and response. The subcellular location of a protein and its proximity to other proteins are important clues to its function within the cell's metabolism. Subcellular location can be determined experimentally, by fluorescent protein tagging or by mass spectrometry detection in subcellular purifications, or predicted from protein sequence features. SUBA5 contains a subcellular data query platform, protein sequence BLAST alignment, a high-confidence reference standard of subcellular locations, and analytic tools.
Most recent SUBA description published
Hooper CM, Castleden I, Tanz SK, Aryamanesh N, and Millar AH (2017) SUBA4: the interactive data analysis centre for Arabidopsis subcellular protein locations Nucleic Acids Res. Jan 4;45(D1):D1064-D1074. doi: 10.1093/nar/gkw1041 (PubMed)
SUBA5 bulk data: Hooper CM, Castleden I, Tanz SK, Grasso SV, Aryamanesh N, and Millar AH (2022). Subcellular Localisation database for Arabidopsis proteins version 5. doi:10.26182/8dht-4017
- Find out all about SUBA5
- Find the SUBA5 Tutorials Page
- Cite SUBA5 using the citation guide
- Try out an instant query for
SUBA5 citation guide
If you find our SUBA5 resources useful please help us grow and cite us.
What is data citation? It is the practice of referencing the data source or its description when a manuscript uses data generated by another author. For more information on citing data, please visit this resource.
Since SUBA5 is a mixed resource derived from collated data sets and generated data, it is not always clear how to cite it. All data sets retrieved from other published studies are linked directly to their PubMed record; this is how we acknowledge every published study we collate into SUBA. If you query SUBA5 and retrieve a mixed data set drawn from a number of studies, it is more appropriate to cite the latest SUBA publication, which identifies the version used to retrieve the data.
Citing SUBA
SUBA interface
Used SUBA for refining, retrieving and interpreting your data? Please cite SUBA:
Hooper CM, Castleden I, Tanz SK, Aryamanesh N, and Millar AH (2017) SUBA4: the interactive data analysis centre for Arabidopsis subcellular protein locations. Nucleic Acids Res. Jan 4;45(D1):D1064-D1074. doi: 10.1093/nar/gkw1041 (PubMed)
You used lists of SUBAcon calls? Please cite SUBAcon in your methods:
Hooper CM, Tanz SK, Castleden I, Vacher M, Small I and Millar AH (2014) SUBAcon: a consensus algorithm for unifying the subcellular localization data of the Arabidopsis proteome. Bioinformatics. 30(23): 3356-64. (PubMed)
SUBA bulk data
The SUBA versions are archived and available for bulk download in SQL and CSV format in the UWA repository. If you choose to cite the data set, please cite as follows:
SUBA5: Hooper CM, Castleden I, Tanz SK, Grasso SV, Aryamanesh N, and Millar AH (2022). Subcellular Localisation database for Arabidopsis proteins version 5. doi:10.26182/8dht-4017
SUBA4: Hooper CM, Castleden I, Tanz SK, Aryamanesh N and Millar AH (2017). Subcellular Localisation database for Arabidopsis proteins version 4. doi:10.4225/23/581055ddcb1ce
SUBA3: Hooper CM, Castleden I, Tanz SK, Small ID, and Millar AH (2012). Subcellular Localisation database for Arabidopsis proteins version 3. doi:10.4225/23/59151525d122a
SUBA publication list
Hooper CM, Castleden I, Tanz SK, Aryamanesh N, and Millar AH (2017) SUBA4: the interactive data analysis centre for Arabidopsis subcellular protein locations. Nucleic Acids Res. Jan 4;45(D1):D1064-D1074. doi: 10.1093/nar/gkw1041 (PubMed)
Hooper CM, Tanz SK, Castleden I, Vacher M, Small I and Millar AH (2014) SUBAcon: a consensus algorithm for unifying the subcellular localization data of the Arabidopsis proteome. Bioinformatics. 30(23): 3356-64. (PubMed)
Tanz SK, Castleden I, Hooper CM, Small I and Millar AH (2014) Using the SUBcellular database for Arabidopsis proteins to localize the Deg protease family. Front Plant Sci. 5:396 (PubMed)
Tanz SK, Castleden I, Hooper CM, Vacher M, Small I and Millar AH (2013) SUBA3: a database for integrating experimentation and prediction to define the SUBcellular location of proteins in Arabidopsis. Nucleic Acids Res. 41: D1185-91 (PubMed)
Heazlewood JL, Verboom RE, Tonti-Filippini J, Small I, Millar AH (2007) SUBA: the Arabidopsis Subcellular Database. Nucleic Acids Res. 35:D213-8. (PubMed)
Heazlewood JL, Tonti-Filippini J, Verboom RE, Millar AH (2005) Combining experimental and predicted datasets for determination of the subcellular location of proteins in Arabidopsis. Plant Physiol. 139(2):598-609. (PubMed)
SUBA5 Notice board
SUBA5 contains more experimental data, increasing proteome coverage (TAIR10) to 44% (53% incl. PPI). For a data summary, see the AboutSUBA5 page.
Bibliographic references up to: 31st October 2020
SUBAcon was last retrained using data up to: 30th June 2022
Have a non-Arabidopsis protein? Use our protein BLAST to see the closest Arabidopsis match and retrieve results at the same time!
Wish there was SUBA for crops? GO TO cropPAL now for 12 crops including wheat, barley, rice, maize, canola, tomato and more.
Is the subcellular location of metabolic processes less conserved across species than thought? Have a read: CropPAL for discovering divergence in protein subcellular location in crops to support strategies for molecular crop breeding.
Can't find what you are looking for? Email Cornelia Hooper directly (cornelia.hooper [at] uwa.edu.au)
Quick Search
... and we will search for them in TAIR descriptions and experimental abstracts.
Bit score is $\log_2 N_{\mathrm{eff}} - \log_2(\text{E-value})$, where $\text{E-value} = p_{\mathrm{val}} \times N_{\mathrm{eff}}$ is the p-value times the effective search space size. The larger the bit score the better, since $p_{\mathrm{val}} = P(\text{random seq having a better score}) = 2^{-(\text{bit score})}$. The p-value measures the statistical significance of the match, but since we tried $N_{\mathrm{eff}}$ times to find a match we need to make a correction. Multiplying by the number of possible matches gives the E-value, or the expected number of hits with a better match just by random chance. (See here).
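As a rough illustration of the relationship described above, the following Python sketch converts between bit score and E-value. The function names and the value chosen for the effective search space size are hypothetical, picked only for this example; they are not taken from SUBA or from any BLAST implementation.

```python
import math


def evalue_from_bit_score(bit_score: float, n_eff: float) -> float:
    """E-value = pval * Neff, with pval = 2^-(bit score)."""
    p_val = 2.0 ** (-bit_score)
    return p_val * n_eff


def bit_score_from_evalue(evalue: float, n_eff: float) -> float:
    """Bit score = log2(Neff) - log2(E-value)."""
    return math.log2(n_eff) - math.log2(evalue)


if __name__ == "__main__":
    # Hypothetical effective search space size, chosen as a power of two
    # so that the arithmetic is exact.
    n_eff = 2.0 ** 30
    e = evalue_from_bit_score(40.0, n_eff)
    print(e)                                # 0.0009765625, i.e. 2^-10
    print(bit_score_from_evalue(e, n_eff))  # 40.0
```

For a fixed search space, every additional bit of score halves the expected number of chance hits.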
How to submit new localisation data to SUBA
SUBA5 is updated annually with the latest update date shown on the home page notice board. If you have published or found data that is not in SUBA5, please submit this subcellular location data to us. Currently we accept data in the format PubMedID;location;AGI (e.g. 25900983;golgi;AT5G16280.1). The location category must be one of cytoskeleton, cytosol, endoplasmic reticulum, extracellular, golgi, mitochondrion, nucleus, peroxisome, plasma membrane, plastid or vacuole. If you have data fitting these criteria please click the following:
We will assess your data and add it to our next scheduled update. If you have suborganellar data, data for any other location categories, or protein-protein interaction data, please contact:
- Cornelia Hooper (cornelia.hooper(at)uwa.edu.au)
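Before submitting, it may help to check each record against the PubMedID;location;AGI format and the accepted location categories. The sketch below is one way to do that locally. It is not SUBA's own validation: the function name is hypothetical, and the AGI pattern is an assumption inferred from the single example above (an optional gene model suffix such as ".1" is tolerated).

```python
import re

# Location categories listed in the submission instructions.
ALLOWED_LOCATIONS = {
    "cytoskeleton", "cytosol", "endoplasmic reticulum", "extracellular",
    "golgi", "mitochondrion", "nucleus", "peroxisome", "plasma membrane",
    "plastid", "vacuole",
}

# Assumed AGI shape (e.g. AT5G16280.1): chromosome 1-5, C or M, then five
# digits and an optional gene model suffix. This is a guess, not SUBA's rule.
AGI_PATTERN = re.compile(r"^AT[1-5CM]G\d{5}(\.\d+)?$")


def check_submission_line(line: str) -> list[str]:
    """Return a list of problems found in one record (empty if none)."""
    fields = [field.strip() for field in line.strip().split(";")]
    if len(fields) != 3:
        return [f"expected 3 ';'-separated fields, found {len(fields)}"]

    pubmed_id, location, agi = fields
    problems = []
    if not pubmed_id.isdigit():
        problems.append(f"PubMed ID is not numeric: {pubmed_id!r}")
    if location not in ALLOWED_LOCATIONS:
        problems.append(f"unknown location category: {location!r}")
    if not AGI_PATTERN.match(agi):
        problems.append(f"AGI does not match the assumed pattern: {agi!r}")
    return problems


if __name__ == "__main__":
    print(check_submission_line("25900983;golgi;AT5G16280.1"))
    print(check_submission_line("25900983;chloroplast;AT5G16280.1"))
```

The first call returns an empty list for the example record from the instructions; the second reports the unlisted category, since "plastid" is the accepted term.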