Regional Information Center for Science and Technology (RICeST) - Annotated Corpora beneath Application Programming Interface

Conference record number:

3220

Article title:

Annotated Corpora beneath Application Programming Interface

Authors:

Tavassolinia, Amir H. (University, Shiraz Branch, Department of English)

Keywords:

Python, Machine Learning, Corpus Linguistics, Computational Linguistics, Natural Language Processing, Data Mining

Publication year:

Esfand 1396 (Solar Hijri calendar; February-March 2018)

Conference title:

First National Conference on Applied Research in Computational Linguistics (with a focus on Persian script and language)

Document language:

English

Abstract (English):

Evidence of the language mechanisms of the human brain is being collected in the form of stored electronic collections of naturally occurring texts, called corpora. Corpus linguistics aims to reveal the rules of languages so that they can be implemented by machines running corpus-dependent Natural Language Processing (NLP) systems. Persian NLP systems function well; however, few unlimited Persian corpora have been compiled to test and improve them. In the technology era, Persian speakers produce texts on websites and in applications of every kind to express meaningful sentiments. In this study, the applied linguist used Python to obtain authorized access to a social media API and examined the amount and lexical density of the natural Persian text streaming through it. The resulting corpus contained half a million words, tagged with information about real-time users. By accessing such APIs and scraping websites, compilation of the unlimited Persian Today Corpus will begin, for use in NLP technologies.
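The abstract does not say how the amount and lexical density of the streamed text were measured. As a rough illustration only, the sketch below counts tokens and approximates lexical density as the share of tokens that are not function words. The function-word list, the sample posts, and the names `FUNCTION_WORDS`, `tokenize`, and `lexical_density` are all assumptions made for this example and are not taken from the paper; a real study would more likely rely on a part-of-speech tagger.

```python
import re

# Hypothetical, deliberately tiny list of Persian function words.
# This is an assumption for illustration, not the paper's resource.
FUNCTION_WORDS = {"و", "در", "به", "از", "که", "را", "با", "این", "آن", "برای", "است"}


def tokenize(text):
    """Split text into word tokens using Unicode-aware word characters.

    Note: the zero-width non-joiner used inside some Persian words is not
    a word character, so such words would be split by this simple rule.
    """
    return re.findall(r"\w+", text)


def lexical_density(tokens):
    """Return the share of tokens not found in FUNCTION_WORDS (0.0 if empty)."""
    if not tokens:
        return 0.0
    content_words = [tok for tok in tokens if tok not in FUNCTION_WORDS]
    return len(content_words) / len(tokens)


# Made-up sample posts standing in for text collected from an API.
posts = ["این کتاب را از کتابخانه گرفتم", "هوا در شیراز خوب است"]
tokens = [tok for post in posts for tok in tokenize(post)]

print("tokens:", len(tokens))
print("lexical density:", round(lexical_density(tokens), 2))
```

The same two functions could be applied to each batch of collected posts to track how the corpus size and density change as it grows.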

Country:

Iran

Number of pages: 2

From page:

To page:

Link to this document:

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=36&DC=156836