DocumentCode :
2922037
Title :
Clustering patent document in the field of ICT (Information & Communication Technology)
Author :
Widodo, Agus ; Budi, Indra
Author_Institution :
Fac. of Comput. Sci., Univ. of Indonesia, Jakarta, Indonesia
fYear :
2011
fDate :
28-29 June 2011
Firstpage :
203
Lastpage :
208
Abstract :
The current classification of patent data, which follows the IPC (International Patent Classification) of the WIPO (World Intellectual Property Organization), is deemed not to reflect the field of ICT (Information & Communication Technology). ICT applications are usually included in sections G (Physics) and H (Electricity). This paper evaluates eight groupings of patents based on the IPC classes (G01, G06, G09, G11, H01, H03, H04, and H06) for patents registered at the Directorate General of Intellectual Property Rights in Indonesia from 1991 to 2000. The algorithms used for grouping are KMeans, KMeans++, and Hierarchical Clustering, as well as the combination of each of these three algorithms with SVD (Singular Value Decomposition). Purity and F-Measure are used for external validation, whereas Silhouette is used for internal validation. The experimental results show that SVD improves the clustering results. In addition, using the abstract does not necessarily improve clustering performance, and using phrases as index terms does not always yield better clusters than using words. Moreover, no cluster has a purity greater than 50%, which suggests that the existing IPC classification does not yet accommodate the field of ICT appropriately.
Keywords :
document handling; information technology; patents; pattern classification; pattern clustering; singular value decomposition; F-Measure validation; ICT field; KMeans++ algorithm; SVD; Silhouette validation; WIPO; hierarchical clustering; information and communication technology; international patent classification; patent data classification; patent document clustering; patent registration; singular value decomposition; world intellectual property organization; Abstracts; Clustering algorithms; Indexing; Matrix decomposition; Patents; Singular value decomposition; Clustering; Information & Communication Technology; Kmeans; Patent; Singular Value Decomposition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Semantic Technology and Information Retrieval (STAIR), 2011 International Conference on
Conference_Location :
Putrajaya
Print_ISBN :
978-1-61284-354-4
Electronic_ISBN :
978-1-61284-353-7
Type :
conf
DOI :
10.1109/STAIR.2011.5995789
Filename :
5995789