DocumentCode :
393076
Title :
Extracting fixed information from miscellaneous documents on net auction
Author :
Kusumura, Yukitaka ; Hijikata, Yoshinori ; Nishida, Shogo
Author_Institution :
Graduate Sch. of Eng. Sci., Osaka Univ., Japan
fYear :
2003
fDate :
27-29 March 2003
Firstpage :
446
Lastpage :
453
Abstract :
Net auctions have become widely used with the recent growth of the Internet. However, they list too many items for bidders to easily select the most suitable one. We aim to support bidders by automatically extracting item features from net auction Web pages and generating a table of the features of several items for comparison. Because item descriptions in net auctions are not uniform, feature extraction faces two problems. First, descriptions come in several formats. Second, the keywords that label features are sometimes omitted. We propose solutions to both problems. For the first, the system distinguishes the format type (tables, items, or sentences) and extracts the feature values in the way best suited to that format. For the second, the system learns keywords while extracting from descriptions that contain them, and then uses those keywords when extracting from descriptions that lack them. We constructed a system that collects item information, extracts features from the text using text mining methods, and generates a table containing the extracted features.
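The two steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' rules, so everything here is an assumption: the function names, the "|" or tab test for tables, the "keyword: value" pattern for item lists, and the reading of keyword learning as collecting the values seen under each keyword and matching them in free text.

```python
import re
from collections import defaultdict

# Hypothetical pattern for "keyword: value" lines; the paper's actual rules are not given.
ITEM_LINE = re.compile(r"^\s*([^:|\t]+?)\s*:\s*(.+?)\s*$")


def detect_format(description):
    """Guess whether a description is laid out as a table, an item list, or sentences."""
    lines = [ln for ln in description.splitlines() if ln.strip()]
    if lines and all("|" in ln or "\t" in ln for ln in lines):
        return "table"
    if lines and all(ITEM_LINE.match(ln) for ln in lines):
        return "items"
    return "sentences"


def extract_with_keywords(description):
    """Extract {keyword: value} pairs from table or item-list descriptions."""
    fmt = detect_format(description)
    pairs = {}
    for ln in description.splitlines():
        if not ln.strip():
            continue
        if fmt == "table":
            # Assumed layout: first cell is the keyword, second cell is the value.
            cells = [c.strip() for c in re.split(r"[|\t]", ln) if c.strip()]
            if len(cells) >= 2:
                pairs[cells[0].lower()] = cells[1]
        elif fmt == "items":
            m = ITEM_LINE.match(ln)
            pairs[m.group(1).lower()] = m.group(2)
    return pairs


def learn_values(descriptions):
    """Collect, per keyword, the values seen in descriptions that carry keywords."""
    learned = defaultdict(set)
    for d in descriptions:
        for key, value in extract_with_keywords(d).items():
            learned[key].add(value.lower())
    return learned


def extract_without_keywords(text, learned):
    """Assign a learned value to its keyword when the value appears in free text."""
    lowered = text.lower()
    found = {}
    for key, values in learned.items():
        for v in values:
            if v in lowered:
                found[key] = v
    return found


labeled = ["CPU: Pentium III\nMemory: 256MB", "CPU | Celeron\nMemory | 128MB"]
learned = learn_values(labeled)
result = extract_without_keywords("Selling a laptop with Celeron and 128MB, barely used.", learned)
```

In this sketch the unlabeled sentence is matched against values learned from the two labeled descriptions, so `result` maps `cpu` and `memory` to the matching values. Plain substring matching is a deliberate simplification and would need stricter matching on real auction text.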
Keywords :
Internet; Web design; data mining; feature extraction; financial data processing; text analysis; Internet; Web pages; automatic information extraction; bidder support; format type; item features; keyword learning; net auctions; table generation; text mining methods; Application software; Concrete; Data mining; Electronic commerce; Feature extraction; Internet; Microcomputers; Text mining; Web pages;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Information Networking and Applications, 2003. AINA 2003. 17th International Conference on
Print_ISBN :
0-7695-1906-7
Type :
conf
DOI :
10.1109/AINA.2003.1192919
Filename :
1192919