Criteria and Steps in Building a Language Corpus: Written and Spoken Varieties

English title

Steps to be followed in corpus construction: written and spoken language corpora

Authors

Alaei Abouzar, Elham (علايي ابوذر, الهام); Iranian Research Institute for Information Science and Technology (IranDoc), Tehran

From page

267

To page

290

Keywords

key criteria, corpus and corpus construction, corpus construction process, written and spoken varieties

Persian abstract

This study aims to help researchers build various kinds of language corpora by gathering information on the criteria and steps of corpus construction. To that end, after reviewing the views of researchers who have built corpora in different languages, the article addresses the general criteria for building language corpora. These criteria apply to both the written and the spoken varieties of a corpus and comprise sampling, representativeness, balance, size, corpus type, and homogeneity. Next, the process of building a text corpus is presented, covering text selection, text preprocessing, and annotation, and each step is explained in detail. Finally, the process of building a spoken corpus is described, covering data collection, transcription, representation and annotation, and access. Each of these steps is likewise explained in detail.

English abstract

This paper takes readers through the basic steps involved in building a corpus of language data for different purposes, drawing on information about corpus construction gathered from related sources. After a review of the literature on corpus construction and the use of corpora in different fields, the article offers advice in a non-technical style to help researchers make sure that their corpus is well designed and fit for its intended purpose. Key points to consider in constructing any corpus, written or spoken, are sampling, size, representativeness, balance, general versus specialized corpus type, and homogeneity. The steps involved in constructing a text corpus are text selection, text normalization, and different kinds of annotation. The steps in constructing a spoken-language (speech-based) corpus are data gathering, transcription, representation, annotation, and access. Each of these steps is explained in detail.
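
As a rough illustration of the three text-corpus steps named in the abstract (text selection, text normalization, annotation), the following Python sketch chains them together. It is not taken from the article: the `Document` class, the `genre` field, the character map, and the use of whitespace tokenization as the only annotation layer are all assumptions made for the example.

```python
import unicodedata
from dataclasses import dataclass, field

# Assumed character map (not from the article): Arabic yeh and kaf,
# which often appear in Persian text, mapped to their Persian forms.
CHAR_MAP = str.maketrans({"\u064A": "\u06CC", "\u0643": "\u06A9"})


@dataclass
class Document:
    text: str
    genre: str  # illustrative metadata used for selection
    tokens: list = field(default_factory=list)


def select_texts(candidates, genres):
    """Text selection: keep only documents from the genres being sampled."""
    return [doc for doc in candidates if doc.genre in genres]


def normalize(text):
    """Text normalization: fold presentation forms, unify letters, trim spaces."""
    text = unicodedata.normalize("NFKC", text)
    text = text.translate(CHAR_MAP)
    return " ".join(text.split())


def annotate(doc):
    """Annotation: whitespace tokenization only, as the simplest possible layer."""
    doc.tokens = doc.text.split()
    return doc


def build_corpus(candidates, genres):
    """Run selection, normalization, and annotation in that order."""
    corpus = []
    for doc in select_texts(candidates, genres):
        doc.text = normalize(doc.text)
        corpus.append(annotate(doc))
    return corpus
```

A real project would likely replace the tokenization step with richer layers such as part-of-speech tags or lemmas, and would choose the selection criteria to reflect the sampling, balance, and representativeness goals listed above.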

Publication year

1398 (Solar Hijri calendar, 2019/2020 CE)

Journal title

زبان شناسي گويش هاي ايراني (Linguistics of Iranian Dialects)

PDF file

8173521

Link to this record

https://search.isc.ac/dl/search/defaultta.aspx?DTC=8&DC=1156500