Title :
Overview: The Design, Adoption, and Analysis of a Visual Document Mining Tool for Investigative Journalists
Author :
Brehmer, Matthew ; Ingram, Stephen ; Stray, Jonathan ; Munzner, Tamara
Author_Institution :
Univ. of British Columbia, Vancouver, BC, Canada
Abstract :
For an investigative journalist, a large collection of documents obtained from a Freedom of Information Act request or a leak is both a blessing and a curse: such material may contain multiple newsworthy stories, but it can be difficult and time-consuming to find relevant documents. Standard text search is useful, but even if the search target is known it may not be possible to formulate an effective query. In addition, summarization is an important non-search task. We present Overview, an application for the systematic analysis of large document collections based on document clustering, visualization, and tagging. This work contributes to the small set of design studies which evaluate a visualization system “in the wild”, and we report on six case studies where Overview was voluntarily used by self-initiated journalists to produce published stories. We find that the frequently-used language of “exploring” a document collection is both too vague and too narrow to capture how journalists actually used our application. Our iterative process, including multiple rounds of deployment and observations of real-world usage, led to a much more specific characterization of tasks. We analyze and justify the visual encoding and interaction techniques used in Overview's design with respect to our final task abstractions, and propose generalizable lessons for visualization design methodology.
Keywords :
data mining; data visualisation; graphical user interfaces; pattern clustering; text analysis; Freedom of Information Act request; Overview; data summarization; document clustering; document collection analysis; document tagging; document visualization; frequently-used language; in-the-wild visualization system; interaction techniques; investigative journalists; iterative process; multiple newsworthy stories; published story production; query processing; self-initiated journalists; standard text search; task abstractions; visual document mining tool adoption; visual document mining tool analysis; visual document mining tool design; visual encoding; visualization design methodology; Data visualization; Document handling; Encoding; Text analysis; Text mining; Design study; investigative journalism; task and requirements analysis; text and document data
Journal_Title :
Visualization and Computer Graphics, IEEE Transactions on
DOI :
10.1109/TVCG.2014.2346431