• Title of article

    The statistical signature of morphosyntax: A study of Hungarian and Italian infant-directed speech

  • Author/Authors

    Gervain, Judit and Guevara Erra, Ramón

  • Issue Information
    Journal with serial number, year 2012
  • Pages
    25
  • From page
    263
  • To page
    287
  • Abstract
    Does statistical learning (Saffran, Aslin, & Newport, 1996) offer a universal segmentation strategy for young language learners? Previous studies on large corpora of English and structurally similar languages have shown that statistical segmentation can be an effective strategy. However, many of the world’s languages have richer morphological systems, with sometimes several affixes attached to a stem (e.g. Hungarian: iskoláinkban: iskolá-i-nk-ban school.pl.poss1pl.inessive ‘in our schools’). In these languages, word boundaries and morpheme boundaries do not coincide. Does the internal structure of words affect segmentation? What word forms does segmentation yield in morphologically rich languages: complex word forms or separate stems and affixes? The present paper answers these questions by exploring different segmentation algorithms in infant-directed speech corpora from two typologically and structurally different languages, Hungarian and Italian. The results suggest that the morphological and syntactic type of a language has an impact on statistical segmentation, with different strategies working best in different languages. Specifically, the direction of segmentation seems to be sensitive to the affixation order of a language. Thus, backward probabilities are more effective in Hungarian, a heavily suffixing language, whereas forward probabilities are more informative in Italian, which has fewer suffixes and a large number of phrase-initial function words. The consequences of these findings for potential segmentation and word learning strategies are discussed.
  • Keywords
    Statistical segmentation , Cross-linguistic variation , Morphological complexity , Infant-directed speech corpus
  • Journal title
    Cognition
  • Serial Year
    2012
  • Record number

    2077543