• DocumentCode
    3686932
  • Title
    A Practical Approach to Scalable Big Data Computing for the Personalization of Services at Samsung
  • Author
    Jong Hoon Ahnn
  • fYear
    2014
  • Firstpage
    64
  • Lastpage
    73
  • Abstract
    We observe that recent advances in big data computing have empowered the personalization of services, including model-based services such as speech recognition, face recognition, and context-aware services. Various sources of user logs can be utilized in remodeling, adapting, and personalizing pretrained models to improve the quality of service. We propose a system that can store, retrieve, and process data in a scalable manner on top of Samsung's big data infrastructure. An automatic speech recognition (ASR) service, such as Samsung's S-Voice or Apple's Siri, is one of the representative examples. Recent advances in ASR, married with big data technologies, drive more personalized services in many areas. Speaker adaptation is now a well-accepted technology that requires huge computation cost in creating a personalized acoustic model and corresponding language model for several billions of Samsung product users. We implement a personalized and scalable ASR system powered by the big data infrastructure, which brings data-driven personalization opportunities to voice-enabled services such as voice-to-text transcription and voice-enabled web search at petabyte scale. We verify the feasibility of speaker adaptation based on 107 testers' recordings and obtain about 10% of recognition accuracy. We suggest a set of performance optimizations for best performance, including workflow compaction, file compression, and selection of the best file system among several distributed file systems.
  • Keywords
    "Adaptation models","Speech","Acoustics","Computational modeling","Engines","Big data","Speech recognition"
  • Publisher
    IEEE
  • Conference_Title
    Big Data Computing (BDC), 2014 IEEE/ACM International Symposium on
  • Type
    conf
  • DOI
    10.1109/BDC.2014.11
  • Filename
    7321730