Author/Authors :
Vetle I. Torvik, Marc Weeber, Don R. Swanson, Neil R. Smalheiser
Abstract :
We present a model for estimating the probability that a
pair of author names (sharing last name and first initial),
appearing on two different Medline articles, refer to the
same individual. The model uses a simple yet powerful
similarity profile between a pair of articles, based on title,
journal name, coauthor names, medical subject headings
(MeSH), language, affiliation, and name attributes (prevalence
in the literature, middle initial, and suffix). The similarity
profile distribution is computed from reference sets
consisting of pairs of articles containing almost exclusively
author matches versus nonmatches, generated in
an unbiased manner. Although the match set is generated
automatically and might contain a small proportion of
nonmatches, the model is quite robust against contamination
with nonmatches. We have created a free, public
service (“Author-ity”: http://arrowsmith.psych.uic.edu)
that takes as input an author’s name given on a specific
article, and gives as output a list of all articles with that
(last name, first initial) ranked by decreasing similarity,
with match probability indicated.
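
As an illustration of how a match probability might be derived from a similarity profile, the following minimal sketch applies Bayes' rule to profile frequencies counted in a match reference set and a nonmatch reference set, then ranks candidate articles by decreasing probability. It is not the authors' implementation: the two-feature profile (shared coauthor, same journal), the toy reference sets, the prior of 0.5, the add-one smoothing, and the names `match_probability` and `rank_candidates` are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical similarity profile: (shares_a_coauthor, same_journal), each 0 or 1.
# The reference sets below are toy data invented for illustration only.
MATCH_SET = [(1, 1)] * 6 + [(1, 0)] * 3 + [(0, 0)] * 1
NONMATCH_SET = [(0, 0)] * 7 + [(0, 1)] * 2 + [(1, 0)] * 1


def match_probability(profile, match_counts, nonmatch_counts,
                      prior=0.5, alpha=1.0):
    """Estimate P(match | profile) with Bayes' rule.

    Profile frequencies are smoothed by adding `alpha` to every count so
    that a profile unseen in one reference set does not yield a zero.
    Both `prior` and `alpha` are assumed values, not taken from the paper.
    """
    support = set(match_counts) | set(nonmatch_counts) | {profile}
    k = len(support)
    n_match = sum(match_counts.values())
    n_nonmatch = sum(nonmatch_counts.values())
    # Smoothed likelihoods of the profile under each hypothesis.
    p_given_match = (match_counts[profile] + alpha) / (n_match + alpha * k)
    p_given_nonmatch = (nonmatch_counts[profile] + alpha) / (n_nonmatch + alpha * k)
    numerator = prior * p_given_match
    return numerator / (numerator + (1.0 - prior) * p_given_nonmatch)


def rank_candidates(candidates, match_counts, nonmatch_counts, prior=0.5):
    """Return (article_id, probability) pairs, highest probability first."""
    scored = [
        (article_id,
         match_probability(profile, match_counts, nonmatch_counts, prior))
        for article_id, profile in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


match_counts = Counter(MATCH_SET)
nonmatch_counts = Counter(NONMATCH_SET)

# Hypothetical candidate articles sharing the query's last name and first initial.
candidates = {"A": (0, 0), "B": (1, 1), "C": (1, 0)}
ranking = rank_candidates(candidates, match_counts, nonmatch_counts)

for article_id, probability in ranking:
    print(f"{article_id}: {probability:.3f}")
```

In this toy setting, a candidate that shares both a coauthor and a journal with the query article ranks above one that shares neither. A realistic profile would cover the attributes listed in the abstract (title, MeSH, affiliation, name attributes, and so on), and the prior would presumably depend on how common the name is.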