• DocumentCode
    2646466
  • Title
    A Simple Approach to Unsupervised Speaker Indexing
  • Author
    Ofoegbu, Uchechukwu O. ; Iyer, Ananth N. ; Yantorno, Robert E. ; Smolenski, Brett Y.
  • Author_Institution
    Lab. of Signal Process., Temple Univ., Philadelphia, PA
  • fYear
    2006
  • fDate
    12-15 Dec. 2006
  • Firstpage
    339
  • Lastpage
    342
  • Abstract
    Unsupervised speaker indexing is a rapidly developing field in speech processing, which involves determining who is speaking when, without having prior knowledge about the speakers being observed. In this research, a distance-based technique for indexing telephone conversations is presented. Sub-models are formed (using data of approximately equal sizes) from the conversations, from which two reference models are judiciously chosen such that they represent the two different speakers in the conversation. Models are then matched to the reference speakers based on a technique referred to as the restrained-relative minimum distance (RRMD) approach. Some models, which fail to meet the RRMD criteria, are considered "undecided" and left unmatched with either of the reference speakers. Analysis is made to determine the appropriate size (or length of data to be used) for these models, which are formed using cepstral coefficients of the speech data. The T-square statistic is used for speaker differentiation. Evaluation is performed based on the indexing accuracy as well as the amount of undecided speech obtained. The proposed system was able to yield a minimum indexing error of about 9% with a maximum undecided error of 18.5%, and an equal error rate of 11% on 245 files (with an average length of about 400 seconds each) from the SWITCHBOARD database.
  • Keywords
    speaker recognition; speech processing; T-square statistics; distance-based technique; restrained-relative minimum distance approach; speaker differentiation; speech data cepstral coefficients; telephone conversation indexing; unsupervised speaker indexing; Cepstral analysis; Indexing; Laboratories; Predictive models; Signal processing; Speech analysis; Speech processing; Statistics; Telephony; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Signal Processing and Communications, 2006. ISPACS '06. International Symposium on
  • Conference_Location
    Yonago
  • Print_ISBN
    0-7803-9732-0
  • Electronic_ISBN
    0-7803-9733-9
  • Type
    conf
  • DOI
    10.1109/ISPACS.2006.364901
  • Filename
    4212288
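
The abstract states that the T-square statistic is used to differentiate speakers, with models formed from cepstral coefficients. The record does not give the exact formulation, so the following is only a minimal sketch that assumes the classical two-sample Hotelling T-square with a pooled covariance. The function names (`mean_vector`, `scatter_matrix`, `solve`, `t_square`) and the toy feature vectors are hypothetical and do not come from the paper; the RRMD matching rule is not sketched because the record does not define its criteria.

```python
# Sketch only: classical two-sample Hotelling T-square, assumed (not confirmed)
# to correspond to the "T-square statistic" named in the abstract.

def mean_vector(frames):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(frames)
    d = len(frames[0])
    return [sum(f[j] for f in frames) / n for j in range(d)]


def scatter_matrix(frames, mean):
    """Sum of outer products of deviations from the mean (unnormalised)."""
    d = len(mean)
    s = [[0.0] * d for _ in range(d)]
    for f in frames:
        dev = [f[j] - mean[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                s[a][b] += dev[a] * dev[b]
    return s


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("pooled covariance is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def t_square(frames_a, frames_b):
    """Two-sample Hotelling T-square between two sets of feature vectors."""
    n_a, n_b = len(frames_a), len(frames_b)
    mean_a = mean_vector(frames_a)
    mean_b = mean_vector(frames_b)
    s_a = scatter_matrix(frames_a, mean_a)
    s_b = scatter_matrix(frames_b, mean_b)
    d = len(mean_a)
    # Pooled covariance estimate shared by both segments.
    pooled = [[(s_a[i][j] + s_b[i][j]) / (n_a + n_b - 2) for j in range(d)]
              for i in range(d)]
    diff = [mean_a[j] - mean_b[j] for j in range(d)]
    weighted = solve(pooled, diff)  # pooled^{-1} * diff without an explicit inverse
    return (n_a * n_b / (n_a + n_b)) * sum(diff[j] * weighted[j] for j in range(d))


# Toy 2-D "cepstral" vectors, purely illustrative.
segment_a = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
segment_b = [[3.0, 0.0], [5.0, 0.0], [3.0, 2.0], [5.0, 2.0]]
distance = t_square(segment_a, segment_b)
```

Under this assumption, a larger value suggests that two segments are less likely to come from the same speaker, and a segment compared with itself gives zero. Solving the linear system avoids forming the inverse of the pooled covariance explicitly.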