Evaluating language environment analysis system performance for Chinese: a pilot study in Shanghai

Jill Gilkerson, Yiwen Zhang, Dongxin Xu, Jeffrey A. Richards, Xiaojuan Xu, Fan Jiang, James Harnsberger, Keith Topping (Lead / Corresponding author)

    Research output: Contribution to journal › Article

    17 Citations (Scopus)

    Abstract

    Purpose: The purpose of this study was to evaluate performance of the Language Environment Analysis (LENA) automated language-analysis system for the Chinese Shanghai dialect and Mandarin (SDM) languages.

    Method: Volunteer parents of 22 children aged 3–23 months were recruited in Shanghai. Families provided daylong in-home audio recordings using LENA. A native speaker listened to 15 min of randomly selected audio samples per family to label speaker regions and provide Chinese character and SDM word counts for adult speakers. LENA segment labeling and counts were compared with rater-based values.

    Results: LENA demonstrated good sensitivity in identifying adult and child; this sensitivity was comparable to that of American English validation samples. Precision was strong for adults but less so for children. LENA adult word count correlated strongly with both Chinese characters and SDM word counts. LENA conversational turn counts correlated similarly with rater-based counts after the exclusion of three unusual samples. Performance related to some degree to child age.

    Conclusions: LENA adult word count and conversational turn provided reasonably accurate estimates for SDM over the age range tested. Theoretical and practical considerations regarding LENA performance in non-English languages are discussed. Despite the pilot nature and other limitations of the study, results are promising for broader cross-linguistic applications.

    Original language: English
    Pages (from-to): 445-452
    Number of pages: 8
    Journal: Journal of Speech, Language, and Hearing Research
    Volume: 58
    Issue number: 2
    DOI: 10.1044/2015_JSLHR-L-14-0014
    Publication status: Published - Apr 2015

    Fingerprint

    systems analysis
    Language
    performance
    dialect
    Shanghai
    language analysis
    Linguistics
    Population Groups
    recording
    parents
    exclusion
    Volunteers

    Cite this

    Gilkerson, Jill; Zhang, Yiwen; Xu, Dongxin; Richards, Jeffrey A.; Xu, Xiaojuan; Jiang, Fan; Harnsberger, James; Topping, Keith. / Evaluating language environment analysis system performance for Chinese: a pilot study in Shanghai. In: Journal of Speech, Language, and Hearing Research. 2015; Vol. 58, No. 2. pp. 445-452.
    @article{b14498d0e8fc436f806ed39f3472c5b1,
    title = "Evaluating language environment analysis system performance for Chinese: a pilot study in Shanghai",
    abstract = "Purpose: The purpose of this study was to evaluate performance of the Language Environment Analysis (LENA) automated language-analysis system for the Chinese Shanghai dialect and Mandarin (SDM) languages. Method: Volunteer parents of 22 children aged 3–23 months were recruited in Shanghai. Families provided daylong in-home audio recordings using LENA. A native speaker listened to 15 min of randomly selected audio samples per family to label speaker regions and provide Chinese character and SDM word counts for adult speakers. LENA segment labeling and counts were compared with rater-based values. Results: LENA demonstrated good sensitivity in identifying adult and child; this sensitivity was comparable to that of American English validation samples. Precision was strong for adults but less so for children. LENA adult word count correlated strongly with both Chinese characters and SDM word counts. LENA conversational turn counts correlated similarly with rater-based counts after the exclusion of three unusual samples. Performance related to some degree to child age. Conclusions: LENA adult word count and conversational turn provided reasonably accurate estimates for SDM over the age range tested. Theoretical and practical considerations regarding LENA performance in non-English languages are discussed. Despite the pilot nature and other limitations of the study, results are promising for broader cross-linguistic applications.",
    author = "Jill Gilkerson and Yiwen Zhang and Dongxin Xu and Richards, {Jeffrey A.} and Xiaojuan Xu and Fan Jiang and James Harnsberger and Keith Topping",
    year = "2015",
    month = "4",
    doi = "10.1044/2015_JSLHR-L-14-0014",
    language = "English",
    volume = "58",
    pages = "445--452",
    journal = "Journal of Speech, Language, and Hearing Research",
    issn = "1092-4388",
    publisher = "American Speech-Language-Hearing Association",
    number = "2",
    }

    Evaluating language environment analysis system performance for Chinese: a pilot study in Shanghai. / Gilkerson, Jill; Zhang, Yiwen; Xu, Dongxin; Richards, Jeffrey A.; Xu, Xiaojuan; Jiang, Fan; Harnsberger, James; Topping, Keith (Lead / Corresponding author).

    In: Journal of Speech, Language, and Hearing Research, Vol. 58, No. 2, 04.2015, p. 445-452.

    TY - JOUR

    T1 - Evaluating language environment analysis system performance for Chinese

    T2 - a pilot study in Shanghai

    AU - Gilkerson, Jill

    AU - Zhang, Yiwen

    AU - Xu, Dongxin

    AU - Richards, Jeffrey A.

    AU - Xu, Xiaojuan

    AU - Jiang, Fan

    AU - Harnsberger, James

    AU - Topping, Keith

    PY - 2015/4

    Y1 - 2015/4

    AB - Purpose: The purpose of this study was to evaluate performance of the Language Environment Analysis (LENA) automated language-analysis system for the Chinese Shanghai dialect and Mandarin (SDM) languages. Method: Volunteer parents of 22 children aged 3–23 months were recruited in Shanghai. Families provided daylong in-home audio recordings using LENA. A native speaker listened to 15 min of randomly selected audio samples per family to label speaker regions and provide Chinese character and SDM word counts for adult speakers. LENA segment labeling and counts were compared with rater-based values. Results: LENA demonstrated good sensitivity in identifying adult and child; this sensitivity was comparable to that of American English validation samples. Precision was strong for adults but less so for children. LENA adult word count correlated strongly with both Chinese characters and SDM word counts. LENA conversational turn counts correlated similarly with rater-based counts after the exclusion of three unusual samples. Performance related to some degree to child age. Conclusions: LENA adult word count and conversational turn provided reasonably accurate estimates for SDM over the age range tested. Theoretical and practical considerations regarding LENA performance in non-English languages are discussed. Despite the pilot nature and other limitations of the study, results are promising for broader cross-linguistic applications.

    UR - http://www.scopus.com/inward/record.url?scp=84927665052&partnerID=8YFLogxK

    U2 - 10.1044/2015_JSLHR-L-14-0014

    DO - 10.1044/2015_JSLHR-L-14-0014

    M3 - Article

    C2 - 25614978

    AN - SCOPUS:84927665052

    VL - 58

    SP - 445

    EP - 452

    JO - Journal of Speech, Language, and Hearing Research

    JF - Journal of Speech, Language, and Hearing Research

    SN - 1092-4388

    IS - 2

    ER -