• DocumentCode
    1365621
  • Title
    Synthetic Generation of High-Dimensional Datasets
  • Author
    Albuquerque, Georgia ; Löwe, Thomas ; Magnor, Marcus
  • Author_Institution
    Comput. Graphics Lab., TU Braunschweig, Germany
  • Volume
    17
  • Issue
    12
  • fYear
    2011
  • Firstpage
    2317
  • Lastpage
    2324
  • Abstract
    Generating synthetic datasets is common practice in many research areas. Such data is often generated to meet specific needs or conditions that are not easily found in the original, real data. The nature of the data varies with the application area and includes text, graphs, and social or weather data, among many others. Such datasets are commonly created with small scripts or programs that are restricted to small problems or to a specific application. In this paper we propose a framework designed to generate high-dimensional datasets. Users can interactively create and navigate through multidimensional datasets using a suitable graphical user interface. Data creation is driven by statistical distributions based on a few user-defined parameters. First, a grounding dataset is created according to given inputs; then structures and trends are added to selected dimensions and orthogonal projection planes. Furthermore, our framework supports the creation of complex non-orthogonal trends and classified datasets. It can successfully be used to create synthetic datasets that simulate important trends such as multidimensional clusters, correlations, and outliers.
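    The two-stage process described in the abstract (a grounding dataset first, then structures added to selected dimensions and projection planes) can be illustrated with a minimal sketch. This is not the authors' framework: the function names, the choice of Gaussian noise for the grounding data, and all parameter values are assumptions made only for illustration.

```python
import random


def make_grounding(n_rows, n_dims, seed=0):
    """Grounding dataset: independent standard-normal noise in every dimension.
    (Assumed distribution; the paper lets the user choose the parameters.)"""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_dims)] for _ in range(n_rows)]


def add_cluster(data, rows, dims, center, spread, seed=1):
    """Overwrite the selected dimensions of the selected rows with a tight
    Gaussian cluster around `center` (one coordinate per entry in `dims`)."""
    rng = random.Random(seed)
    for r in rows:
        for d, c in zip(dims, center):
            data[r][d] = rng.gauss(c, spread)


def add_correlation(data, rows, x_dim, y_dim, slope, noise, seed=2):
    """Make y_dim a noisy linear function of x_dim, i.e. a linear trend
    in the orthogonal projection plane (x_dim, y_dim)."""
    rng = random.Random(seed)
    for r in rows:
        data[r][y_dim] = slope * data[r][x_dim] + rng.gauss(0.0, noise)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


# Stage 1: grounding data with 500 rows and 6 dimensions.
data = make_grounding(500, 6)
# Stage 2: a cluster in plane (0, 1) and a correlation in plane (2, 3).
add_cluster(data, rows=range(0, 100), dims=(0, 1), center=(5.0, 5.0), spread=0.2)
add_correlation(data, rows=range(500), x_dim=2, y_dim=3, slope=2.0, noise=0.1)
```

    Dimensions 4 and 5 are left untouched as pure grounding noise, so the injected structures are confined to the selected planes. The `pearson` helper is included only to check that the trend in plane (2, 3) is present.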
  • Keywords
    data handling; graphical user interfaces; pattern classification; pattern clustering; statistical distributions; classified datasets; complex nonorthogonal trends; graphical user interface; multidimensional clusters; multidimensional correlations; multidimensional outliers; orthogonal projection planes; statistical distributions; synthetic high dimensional datasets generation; user defined parameters; Correlation; Data processing; Probability density function; Scattering parameters; Synthetic data generation; high-dimensional data; interaction; multivariate data
  • fLanguage
    English
  • Journal_Title
    Visualization and Computer Graphics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1077-2626
  • Type
    jour
  • DOI
    10.1109/TVCG.2011.237
  • Filename
    6064998