• DocumentCode
    1227039
  • Title

    Deciphering human language [information extraction]

  • Author

    Taylor, Sarah M.

  • Author_Institution
    Lockheed Martin, Cleveland, OH, USA
  • Volume
    6
  • Issue
    6
  • fYear
    2004
  • Firstpage
    28
  • Lastpage
    34
  • Abstract
    Most readily available tools - basic search engines, possibly a news or information service, and perhaps agents and Web crawlers - are inadequate for many information retrieval tasks and downright dangerous for others. These tools either return too much useless material or miss important material. Even when such tools find useful information, the data is still in a text form that makes it difficult to build displays or diagrams. Employing the data in data mining or standard database operations, such as sorting and counting, can also be difficult. An emerging technology called information extraction (IE) is beginning to change all that, and you might already be using some very basic IE tools without even knowing it. Companies are increasingly applying IE behind the scenes to improve information and knowledge management applications such as text search, text categorization, data mining, and visualization (Rao, 2003). IE has also begun playing a key role in fields such as national security, law enforcement, insurance, and biomedical research, which have highly critical information and knowledge needs. In these fields, IE's powerful capabilities are necessary to save lives or substantial investments of time and money. IE views language up close, considering grammar and vocabulary, and tries to determine the details of "who did what to whom" from a piece of text. In its most in-depth applications, IE is domain focused; it does not try to define all the events or relationships present in a piece of text, but focuses only on items of particular interest to the user organization.
  • Keywords
    data mining; information management; information retrieval; knowledge management; natural languages; information extraction; natural language processing; Crawlers; Databases; Displays; Humans; Layout; Search engines; Sorting
  • fLanguage
    English
  • Journal_Title
    IT Professional
  • Publisher
    IEEE
  • ISSN
    1520-9202
  • Type

    jour

  • DOI
    10.1109/MITP.2004.82
  • Filename
    1390870