PNAC: a protein nucleolar association classifier

Michelle S. Scott, Francois-Michel Boisvert, Angus I. Lamond, Geoffrey J. Barton

    Research output: Contribution to journal › Article

    7 Citations (Scopus)

    Abstract

    Background: Although primarily known as the site of ribosome subunit production, the nucleolus is involved in numerous and diverse cellular processes. Recent large-scale proteomics projects have identified thousands of human proteins that associate with the nucleolus. However, in most cases, we know neither the fraction of each protein pool that is nucleolus-associated nor whether their association is permanent or conditional.

    Results: To describe the dynamic localisation of proteins in the nucleolus, we investigated the extent of nucleolar association of proteins by first collating an extensively curated literature-derived dataset. This dataset then served to train a probabilistic predictor which integrates gene and protein characteristics. Unlike most previous experimental and computational studies of the nucleolar proteome that produce large static lists of nucleolar proteins regardless of their extent of nucleolar association, our predictor models the fluidity of the nucleolus by considering different classes of nucleolar-associated proteins. The new method predicts all human proteins as either nucleolar-enriched, nucleolar-nucleoplasmic, nucleolar-cytoplasmic or non-nucleolar. Leave-one-out cross validation tests reveal sensitivity values for these four classes ranging from 0.72 to 0.90 and positive predictive values ranging from 0.63 to 0.94. The overall accuracy of the classifier was measured to be 0.85 on an independent literature-based test set and 0.74 using a large independent quantitative proteomics dataset. While the three nucleolar-association groups display vastly different Gene Ontology biological process signatures and evolutionary characteristics, they collectively represent the most well characterised nucleolar functions.

    Conclusions: Our proteome-wide classification of nucleolar association provides a novel representation of the dynamic content of the nucleolus. This model of nucleolar localisation thus increases the coverage while providing accurate and specific annotations of the nucleolar proteome. It will be instrumental in better understanding the central role of the nucleolus in the cell and its interaction with other subcellular compartments.
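    The per-class sensitivity and positive predictive value figures quoted in the Results can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy labels below are hypothetical and are used only to show how such metrics are conventionally computed for a four-class predictor. Sensitivity for a class is TP / (TP + FN), positive predictive value is TP / (TP + FP), and overall accuracy is the fraction of proteins assigned to their true class.

```python
from typing import Dict, List, Sequence

# The four classes named in the abstract.
CLASSES: List[str] = [
    "nucleolar-enriched",
    "nucleolar-nucleoplasmic",
    "nucleolar-cytoplasmic",
    "non-nucleolar",
]


def per_class_metrics(
    true_labels: Sequence[str],
    predicted_labels: Sequence[str],
    classes: Sequence[str] = CLASSES,
) -> Dict[str, Dict[str, float]]:
    """Return sensitivity and positive predictive value for each class."""
    metrics: Dict[str, Dict[str, float]] = {}
    pairs = list(zip(true_labels, predicted_labels))
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        # NaN marks a class with no true members (or no predictions).
        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        ppv = tp / (tp + fp) if (tp + fp) else float("nan")
        metrics[c] = {"sensitivity": sensitivity, "ppv": ppv}
    return metrics


def overall_accuracy(
    true_labels: Sequence[str], predicted_labels: Sequence[str]
) -> float:
    """Fraction of items whose predicted class equals the true class."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


# Hypothetical toy labels, invented for illustration only (not data from the paper).
TOY_TRUE = ["nucleolar-enriched"] * 4 + ["non-nucleolar"] * 4
TOY_PRED = (
    ["nucleolar-enriched"] * 3
    + ["non-nucleolar"]
    + ["non-nucleolar"] * 3
    + ["nucleolar-enriched"]
)

if __name__ == "__main__":
    for name, values in per_class_metrics(TOY_TRUE, TOY_PRED).items():
        print(name, values)
    print("accuracy", overall_accuracy(TOY_TRUE, TOY_PRED))
```

    In a leave-one-out setting, each protein's predicted label would come from a model trained on all other proteins; the metric calculation itself is unchanged.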

    Original language: English
    Article number: 74
    Number of pages: 14
    Journal: BMC Genomics
    Volume: 12
    DOI: 10.1186/1471-2164-12-74
    Publication status: Published - 27 Jan 2011

    Keywords

    • PROTEOMIC ANALYSIS
    • BUDDING YEAST
    • LOCALIZATION
    • DATABASE
    • PREDICTION
    • SEQUENCE
    • SIGNALS
    • ATLAS
    • CELL
    • SVM

    Cite this

    Scott, Michelle S.; Boisvert, Francois-Michel; Lamond, Angus I.; Barton, Geoffrey J. / PNAC: a protein nucleolar association classifier. In: BMC Genomics. 2011; Vol. 12, Article 74.
    @article{65c8757b1b664546b35e8de74803d6b6,
    title = "PNAC: a protein nucleolar association classifier",
    keywords = "PROTEOMIC ANALYSIS, BUDDING YEAST, LOCALIZATION, DATABASE, PREDICTION, SEQUENCE, SIGNALS, ATLAS, CELL, SVM",
    author = "Scott, {Michelle S.} and Francois-Michel Boisvert and Lamond, {Angus I.} and Barton, {Geoffrey J.}",
    year = "2011",
    month = "1",
    day = "27",
    doi = "10.1186/1471-2164-12-74",
    language = "English",
    volume = "12",
    pages = "--",
    journal = "BMC Genomics",
    issn = "1471-2164",
    publisher = "Springer Verlag",

    }

    TY - JOUR

    T1 - PNAC

    T2 - a protein nucleolar association classifier

    AU - Scott, Michelle S.

    AU - Boisvert, Francois-Michel

    AU - Lamond, Angus I.

    AU - Barton, Geoffrey J.

    PY - 2011/1/27

    Y1 - 2011/1/27

    KW - PROTEOMIC ANALYSIS

    KW - BUDDING YEAST

    KW - LOCALIZATION

    KW - DATABASE

    KW - PREDICTION

    KW - SEQUENCE

    KW - SIGNALS

    KW - ATLAS

    KW - CELL

    KW - SVM

    U2 - 10.1186/1471-2164-12-74

    DO - 10.1186/1471-2164-12-74

    M3 - Article

    VL - 12

    SP - -

    JO - BMC Genomics

    JF - BMC Genomics

    SN - 1471-2164

    M1 - 74

    ER -