• DocumentCode
    3474990
  • Title

    Development of a Synthetic Data Set Generator for Building and Testing Information Discovery Systems

  • Author

    Lin, Pengyue J. ; Samadi, Behrokh ; Cipolone, Alan ; Jeske, Daniel R. ; Cox, Sean ; Rendón, Carlos ; Holt, Douglas ; Xiao, Rui

  • Author_Institution
    California Univ., Riverside, CA
  • fYear
    2006
  • fDate
    10-12 April 2006
  • Firstpage
    707
  • Lastpage
    712
  • Abstract
    Data mining research has yielded many significant and useful results such as discovering consumer-spending habits, detecting credit card fraud, and identifying anomalous social behavior. Information discovery and analysis systems (IDAS) extract information from multiple sources of data and use data mining methodologies to identify potentially significant events and relationships. This research designed and developed a tool called IDAS data and scenario generator (IDSG) to facilitate the creation, testing and training of IDAS. IDSG focuses on building a synthetic data generation engine powerful and flexible enough to generate synthetic data based on complex semantic graphs.
  • Keywords
    data mining; complex semantic graph; data mining methodology; data mining research; information discovery analysis system; information extraction; multiple data source; scenario generator; synthetic data generation engine; synthetic data set generator; Buildings; Credit cards; Data mining; Engines; Information analysis; Java; Medical diagnostic imaging; Performance analysis; Power generation; System testing; Client-Server; Data Generation; Data Mining; Java; Semantic Graph
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology: New Generations, 2006. ITNG 2006. Third International Conference on
  • Conference_Location
    Las Vegas, NV
  • Print_ISBN
    0-7695-2497-4
  • Type

    conf

  • DOI
    10.1109/ITNG.2006.51
  • Filename
    1611688