• DocumentCode
    140887
  • Title
    Effective location identification from microblogs
  • Author
    Guoliang Li ; Jun Hu ; Jianhua Feng ; Kian-Lee Tan
  • Author_Institution
    Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
  • fYear
    2014
  • fDate
    March 31 2014-April 4 2014
  • Firstpage
    880
  • Lastpage
    891
  • Abstract
    The rapid development of social networks has resulted in a proliferation of user-generated content (UGC). The UGC data, when properly analyzed, can be beneficial to many applications. For example, identifying a user's locations from microblogs is very important for effective location-based advertisement and recommendation. In this paper, we study the problem of identifying a user's locations from microblogs. This problem is rather challenging because the location information in a microblog is incomplete and we cannot get an accurate location from a local microblog. To address this challenge, we propose a global location identification method, called Glitter. Glitter combines multiple microblogs of a user and utilizes them to identify the user's locations. Glitter not only improves the quality of identifying a user's location but also supplements the location of a microblog so as to obtain an accurate location of a microblog. To facilitate location identification, Glitter organizes points of interest (POIs) into a tree structure where leaf nodes are POIs and non-leaf nodes are segments of POIs, e.g., countries, states, cities, districts, and streets. Using the tree structure, Glitter first extracts candidate locations from each microblog of a user which correspond to some tree nodes. Then Glitter aggregates these candidate locations and identifies top-k locations of the user. Using the identified top-k user locations, Glitter refines the candidate locations and computes top-k locations of each microblog. To achieve high recall, we enable fuzzy matching between locations and microblogs. We propose an incremental algorithm to support dynamic updates of microblogs. Experimental results on real-world datasets show that our method achieves high quality and good performance, and scales very well.
  • Keywords
    fuzzy set theory; pattern matching; social networking (online); trees (mathematics); POIs; UGC; candidate location extraction; fuzzy matching; glitter; global location identification method; incremental algorithm; location-based advertisement; location-based recommendation; microblogs; nonleaf nodes; points of interest; social networks; top-k user location identification; tree nodes; tree structure; user-generated content; Aggregates; Cities and towns; Educational institutions; Films; Heuristic algorithms; Indexes; Twitter;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering (ICDE), 2014 IEEE 30th International Conference on
  • Conference_Location
    Chicago, IL
  • Type
    conf
  • DOI
    10.1109/ICDE.2014.6816708
  • Filename
    6816708
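
The abstract's first two steps (extracting candidate tree nodes from each microblog, then aggregating them into top-k user locations) can be illustrated with a small sketch. This is not the paper's algorithm: the toy tree `PARENT`, the function names, the one-vote-per-microblog scoring, and the exact substring matching (in place of the fuzzy matching the abstract mentions) are all assumptions made for illustration. The refinement of per-microblog locations and the incremental update are not covered.

```python
from collections import Counter

# Hypothetical toy POI tree stored as child -> parent (None marks the root).
# Leaves are POIs; inner nodes are coarser segments such as cities.
PARENT = {
    "USA": None,
    "Illinois": "USA",
    "Chicago": "Illinois",
    "Millennium Park": "Chicago",
    "Navy Pier": "Chicago",
    "New York": "USA",
    "Central Park": "New York",
}


def extract_candidates(text):
    """Return tree nodes whose name occurs in the text.

    Assumption: plain case-insensitive substring matching, a crude
    stand-in for the fuzzy matching described in the abstract.
    """
    lowered = text.lower()
    return [name for name in PARENT if name.lower() in lowered]


def ancestors_and_self(node):
    """Yield the node and every ancestor up to the root."""
    while node is not None:
        yield node
        node = PARENT[node]


def top_k_user_locations(microblogs, k, level_nodes):
    """Rank the nodes in level_nodes by how many microblogs support them.

    A microblog supports a node if one of its candidates is that node
    or lies below it in the tree (an assumed, simplified scoring rule).
    """
    votes = Counter()
    for text in microblogs:
        supported = set()
        for candidate in extract_candidates(text):
            supported.update(ancestors_and_self(candidate))
        votes.update(supported & level_nodes)
    return [name for name, _ in votes.most_common(k)]


if __name__ == "__main__":
    blogs = [
        "Sunset at Navy Pier",
        "Lunch near Millennium Park",
        "Back in Chicago!",
        "Missing Central Park today",
    ]
    print(top_k_user_locations(blogs, 1, {"Chicago", "New York"}))
```

In this sketch no single microblog about a POI names the city, yet three of the four support "Chicago" once their candidates are rolled up the tree, which is the intuition behind combining a user's microblogs rather than treating each one in isolation.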