• DocumentCode
    2509666
  • Title
    Some remarks on vector representations of legal documents
  • Author
    Schweighofer, Erich; Rauber, Andreas; Merkl, Dieter
  • Author_Institution
    Inst. of Public Int. Law, Wien Univ., Austria
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    1087
  • Lastpage
    1091
  • Abstract
    Vector representation of legal documents is still the best way to compute classification clusters and label their contents. This paper deals with the diversity of legal documents, which makes vector representation a difficult task. Extensive experiments with three text corpora of about 580 documents in three languages have shown that binary or weighted vector representation may not be sufficient. Even quite successful approaches to similarity computation have problems identifying the best context of classification. The LabelSOM method can be seen as a very efficient tool for verifying similarity because common elements are explicitly identified. Finally, some proposals for the “best” vector representation are discussed: weighted vectors, feature vectors and hierarchies of vectors using XML information for identifying similar contexts.
  • Keywords
    classification; document handling; knowledge representation; law administration; self-organising feature maps; LabelSOM method; XML information; classification clusters; content labelling; feature vectors; legal documents; similarity verification; text corpora; vector hierarchies; vector representation; HTML; Information retrieval; Internet; Labeling; Law; Legal factors; Neural networks; Proposals; World Wide Web; XML
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Database and Expert Systems Applications, 2000. Proceedings. 11th International Workshop on
  • Conference_Location
    London
  • ISSN
    1529-4188
  • Print_ISBN
    0-7695-0680-1
  • Type
    conf
  • DOI
    10.1109/DEXA.2000.875162
  • Filename
    875162