نخستين پيكرة جامع زبان فارسي: از آغاز تا امروز

Title in another language

The first comprehensive corpus for the Persian language: From beginning to the present

Authors

Assi, Mostafa; s_m_assi@ihcs.ac.ir; Professor of Linguistics, Institute for Humanities and Cultural Studies

Number of pages

Keywords

Persian language corpus, database, historical corpus, corpus annotation

Year of publication

1395

Conference title

Second National Conference on Corpus Linguistics

Document language

Persian

Persian abstract

Scholars of language and linguists have long sought to ground their studies in real linguistic data, and they used the term corpus for data of any kind and any size. Language corpora as defined today, however, have existed for no more than three or four decades. The same history is seen in Iran, where it begins with the efforts of the Iranian Academy of Language to compile computer-assisted frequency word lists in the first half of the 1350s, the first step toward compiling language corpora. The first large corpus of contemporary Persian with processing facilities was built at the Institute for Humanities and Cultural Studies in 1372 to 1374 and, by 1384, had been made freely available to the public over the Internet. The next step was to add a historical corpus of Persian spanning the fourth century A.H. to the contemporary period; this work began in 1393, and a substantial part of it has been completed. The time has now come to take a further step in upgrading the Persian language database.

English abstract

Language scholars have long tried to base their investigations on real linguistic data, which, whatever its size or type, was called a corpus. Only in the last three or four decades, however, have corpora in their modern sense come into use. A similar history can be seen in Iran. The modern era begins with the efforts of linguists at the Iranian Academy of Language to compile manual and computerized concordances in the first half of the 1350s (1970s), the first step towards the compilation of a Persian corpus. The first comprehensive corpus of the Persian language with processing facilities was established at the Institute for Humanities and Cultural Studies during 1372-1374 (1993-1995) and became available to the public through the Internet by 1384 (2005). The next step was to add a historical corpus to the database, covering the 4th century A.H. to the present. This stage started in 1393 (2014) and is still in progress. It is now time to take steps towards the enhancement and development of the Persian Linguistic Database.

Country

Iran

Link to this document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=36&DC=200218