Title of article :
A Semantic Analysis and Community Detection-Based Artificial Intelligence Model for Core Herb Discovery from the Literature: Taking Chronic Glomerulonephritis Treatment as a Case Study
Author/Authors :
Zhang, Yun (School of Information and Software Engineering - University of Electronic Science and Technology of China - Chengdu, China) ; Liu, Yongguo (School of Information and Software Engineering - University of Electronic Science and Technology of China - Chengdu, China) ; Zhu, Jiajing (School of Information and Software Engineering - University of Electronic Science and Technology of China - Chengdu, China) ; Zhai, Shuangqing (School of Basic Medical Science - Beijing University of Chinese Medicine - Beijing, China) ; Jin, Rongjiang (Chengdu University of Traditional Chinese Medicine - Chengdu, China) ; Wen, Chuanbiao (Chengdu University of Traditional Chinese Medicine - Chengdu, China)
Abstract :
The formula is the main treatment method in Traditional Chinese Medicine (TCM). A formula often contains multiple herbs,
among which core herbs play a critical therapeutic role in treating diseases. Identifying the core herbs in
formulae is important because it provides evidence and references for the clinical application of Chinese herbs and formulae. In this paper, we
propose CHDSC, a core herb discovery model based on semantic analysis and community detection, to discover the core herbs
for treating a given disease from large-scale literature. The model comprises three stages: corpus construction, herb network
establishment, and core herb discovery. CHDSC uses two artificial intelligence modules: the Chinese word
embedding algorithm ESSP2VEC is designed to analyse the semantics of herbs in Chinese literature based on the stroke,
structure, and pinyin features of Chinese characters, and the label propagation-based algorithm LILPA is adopted to detect herb
communities and core herbs in the herbal semantic network constructed from large-scale literature. To validate the proposed
model, we take chronic glomerulonephritis (CGN) as a case study, retrieve 1126 articles on the TCM treatment of CGN from
the China National Knowledge Infrastructure (CNKI), and apply CHDSC to analyse the collected literature. Experimental
results reveal that CHDSC discovers three major herb communities and eighteen core herbs for treating different CGN
syndromes with high accuracy. The community size, degree, and closeness centrality distributions of the herb network are
analysed to characterise the patterns that core herbs follow. We observe that core herbs mainly occur in communities with more
than 25 herbs, and that the degree and closeness centrality of core herb nodes are concentrated in the ranges [15, 40] and [0.25, 0.45],
respectively. Thus, semantic analysis and community detection are helpful for mining effective core herbs for treating a given
disease from large-scale literature.
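
The degree and closeness centrality measures mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the CHDSC, ESSP2VEC, or LILPA implementation: the toy graph, the node names A to E, the helper functions, and the threshold ranges are all hypothetical and chosen only to show how such a range filter on an undirected network might look. Closeness is taken here as (number of reachable nodes) / (sum of shortest-path distances to them), which is one common definition and may differ from the one used in the paper.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def closeness_centrality(graph, source):
    """Closeness of `source`: reachable nodes / sum of BFS distances."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def filter_by_ranges(graph, degree_range, closeness_range):
    """Return nodes whose degree and closeness both fall in the given ranges."""
    selected = []
    for node in sorted(graph):
        deg = len(graph[node])
        clo = closeness_centrality(graph, node)
        if (degree_range[0] <= deg <= degree_range[1]
                and closeness_range[0] <= clo <= closeness_range[1]):
            selected.append(node)
    return selected

# Hypothetical toy network; nodes stand in for herbs, edges for semantic links.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
toy_graph = build_graph(toy_edges)

degree = {node: len(nbrs) for node, nbrs in toy_graph.items()}
closeness = {node: closeness_centrality(toy_graph, node) for node in toy_graph}

# Illustrative thresholds only; the abstract reports [15, 40] and [0.25, 0.45]
# for the real herb network, which is far larger than this toy graph.
candidates = filter_by_ranges(toy_graph, (2, 3), (0.6, 0.9))
```

On this toy graph the filter keeps the two best-connected nodes. On a real herb network the ranges would be read off the observed distributions rather than fixed in advance.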
Keywords :
Detection-Based , Glomerulonephritis , TCM , LILPA
Journal title :
Computational and Mathematical Methods in Medicine