Science Citation: Indexing
Source: Price D. J. D. Little science, big science...and beyond. New York: Columbia University
Press, 1986.
Paper R
R contains a reference to C,
“…good technical term by using the words citation and reference interchangeably.
I therefore propose and adopt the convention that if Paper R contains a
bibliographic footnote using and describing Paper C, then R contains…”
Paper C
1) Paper X
2) Paper Y
3) Paper R
4) Paper Q
Citation Indexing
Adapted from ‘Citation Indexing - Its Theory and Application in Science, Technology, and
Humanities’ by Eugene Garfield
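The distinction above (R contains a reference to C) and the index entry listing the papers under Paper C can be illustrated with a minimal sketch. It assumes each paper's reference list is already known; the function name `build_citation_index` and the sample reference lists are hypothetical, and only the paper labels come from the slide.

```python
from collections import defaultdict


def build_citation_index(references):
    """Invert {citing paper: [referenced papers]} into {cited paper: [citing papers]}."""
    index = defaultdict(list)
    for citing, cited_list in references.items():
        for cited in cited_list:
            index[cited].append(citing)
    # Sort the citing papers so each entry has a stable order.
    return {cited: sorted(citing) for cited, citing in index.items()}


# Hypothetical reference lists: each of these papers contains a reference to Paper C.
references = {
    "Paper X": ["Paper C"],
    "Paper Y": ["Paper C"],
    "Paper R": ["Paper C"],
    "Paper Q": ["Paper C"],
}

citation_index = build_citation_index(references)
print(citation_index["Paper C"])  # the papers from which Paper C has citations
```

The reference lists are what authors write; the citation index is the same data inverted, so that a lookup starts from the cited paper.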
Bias in Citation Databases
Bibliometric indicators do not represent all publishing.
Although these databases have international coverage, they
have a certain amount of bias:
They contain more minor US journals than minor European journals
Non-English language journals are not as comprehensively indexed
From a non-English speaking world perspective, bibliometric
indicators represent only international level, predominantly English
language, higher impact, peer-reviewed, publicly available research
output.
Source: Bibliometric Indicators and the Social Sciences, prepared for ESRC, J. Sylvan Katz
SPRU, University of Sussex UK, December 1999
Bias in Citation Databases
A recurrent criticism: journal selection is biased by
the internal management decisions of ISI.
Only journals are indexed; monographs are left out.
A lack of correlation between the most highly cited authors
based on the journal sample and those based on the
monograph sample suggests that there may be two distinct
populations of highly cited authors.
Source: Blaise Cronin and Herbert W. Snyder. Comparative citation rankings of authors in
monographic and journal literature: a study of sociology. Journal of Documentation, 53(3):263–273,
1997.
ResearchIndex/CiteSeer
ResearchIndex: A scientific literature digital library that
incorporates:
Autonomous citation indexing
Citation context
Full-text indexing
Related document identification
Query sensitive summaries
Awareness and tracking
Citation graph analysis
Source: Presentation on “Searching the World Wide Web General and Scientific Information
Access”, Steve Lawrence
CiteSeer – How does it work?
1. Downloads papers from the Web
2. Converts them to text and parses them
3. Obtains citations and does full-text indexing
4. Stores them in a database
5. Supports queries by citations or keywords
Source: CiteSeer: An Automatic Citation Indexing System (1998), C. Lee Giles, Kurt D.
Bollacker, Steve Lawrence, Digital Libraries 98 - The Third ACM Conference on Digital
Libraries
CiteSeer - Document Acquisition
Web search engines used for crawling
Heuristics used to locate papers
(pages containing words such as “publications”, “papers”,
“postscript”, etc.).
Locates and downloads PostScript files identified by
“.ps”, “.ps.Z”, or “.ps.gz” extensions.
URLs and Postscript files that are duplicates of those
already found are detected and skipped.
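The acquisition heuristics above can be sketched roughly as follows. This is an illustration under stated assumptions, not CiteSeer's actual code: the function names, the keyword tuple, and the example URLs are hypothetical, and duplicates are detected only by exact URL (detecting duplicate files would need an extra step, such as comparing content hashes).

```python
# File extensions and hint words taken from the slide.
PS_EXTENSIONS = (".ps", ".ps.Z", ".ps.gz")
HINT_WORDS = ("publications", "papers", "postscript")


def looks_like_paper_listing(page_text):
    """Heuristic: does the page mention any of the hint words?"""
    text = page_text.lower()
    return any(word in text for word in HINT_WORDS)


def select_postscript_urls(urls, seen=None):
    """Keep URLs ending in a PostScript extension, skipping ones already seen."""
    seen = set() if seen is None else seen
    selected = []
    for url in urls:
        # Ignore any query string or fragment when checking the extension.
        path = url.split("?", 1)[0].split("#", 1)[0]
        if not path.endswith(PS_EXTENSIONS):
            continue
        if url in seen:
            continue  # duplicate URL, skip it
        seen.add(url)
        selected.append(url)
    return selected


# Hypothetical links found on a crawled page.
links = [
    "http://example.org/a.ps",
    "http://example.org/b.pdf",
    "http://example.org/c.ps.gz",
    "http://example.org/a.ps",
    "http://example.org/d.ps.Z",
]
to_download = select_postscript_urls(links)
```

Passing the same `seen` set across pages lets the duplicate check span the whole crawl rather than a single page.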
Document Parsing
The downloaded Postscript files are first
converted into text
Information extracted includes: URL, header,
abstract, introduction, citations, citation context,
and full text
Issues in Citation Parsing include:
Natural language citations
Source: Autonomous Citation Matching (1999) Steve Lawrence, C. Lee Giles, Kurt Bollacker
Proceedings of the Third International Conference on Autonomous Agents
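As a rough illustration of what extracting "citation context" can mean, the sketch below collects the text surrounding each in-text citation marker. It assumes numeric bracket markers such as [1], which is only one of many citation styles (natural language citations, noted above as an issue, would not be caught). The function name `citation_contexts` and the `window` parameter are hypothetical and are not taken from the CiteSeer papers.

```python
import re


def citation_contexts(body_text, window=60):
    """Map each numeric citation marker to the text snippets surrounding it."""
    contexts = {}
    for match in re.finditer(r"\[(\d+)\]", body_text):
        start = max(0, match.start() - window)
        end = min(len(body_text), match.end() + window)
        # Collapse whitespace so line breaks in the converted text do not matter.
        snippet = " ".join(body_text[start:end].split())
        contexts.setdefault(int(match.group(1)), []).append(snippet)
    return contexts


# Hypothetical body text of a converted paper.
sample_text = (
    "Citation indexing was proposed early [1]. "
    "Later systems automated it [2], building on [1]."
)
sample_contexts = citation_contexts(sample_text, window=20)
```

Each key is a citation number and each value is a list of snippets, one per place in the text where that citation appears.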
Areas of Improvement
1. Does not cover the significant journals comprehensively.
(might be less of a disadvantage over time as more journals become
available online)
Source: DIVA: A Visualization System for Exploring Document Databases For Technology
Forecasting by Steven Morris, Zheng Wu, Camille DeYong, Sinan Salman, Dagmawi Yemenu
Computers and Industrial Engineering, Vol. 43, No. 4
Clustering of documents
Document Maps
Document timelines
‘Polymers’ cluster report showing a plot of links to all other clusters by year
‘Polymers’ cluster report showing a plot of links to each other cluster by year.
A comment on bibliometric analysis
Compared to “a drunk who is looking for his keys under a street lamp”.