Science Citation Indexing

– By: Saif Kazem Radhi
– Scientific Supervisor: Dr. Amir Hossein Sari
Outline
 Introduction to Citation Indexing
 What is Citation Indexing
 Concept
 Web of Science
 Bias
 Autonomous Citation Indexing
 Future Application
 Technology Forecasting
 Summary
Concept of Citations

 Citations symbolize the conceptual association of scientific ideas as recognized by publishing research authors.
 By the references they cite in their papers, authors make explicit linkages between their current research and prior work in the archive of scientific literature.
Distinction between "citation" and "reference"
 "If Paper R contains a bibliographic footnote using and describing Paper C, then
 R contains a reference to C,
 C has a citation from R.
 The number of references a paper has is measured by the number of items in its bibliography as endnotes, footnotes, etc.
 The number of citations a paper has is found by looking it up [in a] citation index and seeing how many other papers mention it."

Source: Price D. J. D. Little science, big science...and beyond. New York: Columbia University
Press, 1986.
[Figure: Paper R, an excerpt discussing the terminological distinction between "citation" [6] and "reference" and quoting Derek Price's Little Science, Big Science, points to Paper C. R contains a reference to C; C has a citation from R.]

[6] The concept of citation indexing: A unique and innovative tool for navigating the research literature. Current Contents, January 3, 1994.
Citation Index

Paper C

1) Paper X
2) Paper Y
3) Paper R
4) Paper Q
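
A citation index entry like the one above is, in effect, the inverse of the papers' reference lists. The following is a minimal sketch, assuming a toy collection of papers; the reference lists and the `build_citation_index` helper are hypothetical and do not describe how any ISI product is implemented.

```python
from collections import defaultdict

# Hypothetical reference lists: each paper maps to the papers in its bibliography.
references = {
    "Paper X": ["Paper C"],
    "Paper Y": ["Paper C", "Paper X"],
    "Paper R": ["Paper C"],
    "Paper Q": ["Paper C", "Paper R"],
}


def build_citation_index(references):
    """Invert reference lists: cited paper -> list of papers that cite it."""
    index = defaultdict(list)
    for citing, cited_list in references.items():
        for cited in cited_list:
            index[cited].append(citing)
    return dict(index)


citation_index = build_citation_index(references)

# Number of references Paper Q makes (length of its bibliography).
reference_count = len(references["Paper Q"])
# Number of citations Paper C receives (length of its citation index entry).
citation_count = len(citation_index["Paper C"])
```

Here the entry for Paper C lists Papers X, Y, R and Q, matching the slide: the reference count is read from a paper's own bibliography, while the citation count can only be read from the index.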
Citation Indexing

 A citation index indexes the citations an article makes, linking the article with cited works.
 Originally designed mainly for literature search, so that researchers can find subsequent articles that cite a given article.
 Invented by Dr. Eugene Garfield
 Example of a citation indexing firm: Institute for Scientific Information® (ISI)
Institute for Scientific Information® (ISI)
 The ISI databases differ from traditional indexing and abstracting services in several other ways as well.
 From the outset, the Science Citation Index (SCI), Social Sciences Citation Index (SSCI), and Arts & Humanities Citation Index (A&HCI) have been multidisciplinary.
 They cover virtually all disciplines, whereas traditional services are limited to a single field.
Web of Knowledge
 ISI Web of Knowledge: a dynamic, integrated, Web-based environment
 ISI Web of Science provides access to
 Science Citation Index (over 3,200 journals)
 Social Sciences Citation Index (1,400 journals)
 Arts & Humanities Citation Index
 Updated weekly.
 Journals from 1986 are available to Penn State users.
 Previous years of each index are available in print at the Libraries.
Web of Science

 Search current and retrospective multidisciplinary information from nearly 8,500 research journals worldwide.
 Users can navigate forward, backward, and through the literature, searching all disciplines and time spans to uncover a lot of information relevant to their research.
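
Navigating backward (to the earlier work a paper references) and forward (to the later work that cites it) can be sketched as a walk over a citation graph. This is a minimal sketch, assuming a hypothetical four-paper graph; the `invert` and `walk` helpers are illustrative and are not the Web of Science interface.

```python
from collections import deque

# Hypothetical toy graph: paper -> papers it references (backward links).
references = {
    "D": ["C"],
    "C": ["A", "B"],
    "B": ["A"],
    "A": [],
}


def invert(graph):
    """Build forward links: paper -> papers that cite it."""
    cited_by = {paper: [] for paper in graph}
    for citing, cited_list in graph.items():
        for cited in cited_list:
            cited_by.setdefault(cited, []).append(citing)
    return cited_by


def walk(graph, start, max_hops):
    """Breadth-first walk; returns papers reachable within max_hops links."""
    seen = {start}
    queue = deque([(start, 0)])
    reached = []
    while queue:
        paper, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(paper, []):
            if neighbour not in seen:
                seen.add(neighbour)
                reached.append(neighbour)
                queue.append((neighbour, hops + 1))
    return reached


cited_by = invert(references)
backward = walk(references, "C", max_hops=2)  # earlier work that C builds on
forward = walk(cited_by, "A", max_hops=2)     # later work that builds on A
```

The same walk serves both directions; only the graph passed in changes, which is why a citation index (the inverted graph) is what makes forward navigation possible.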
Advantages

 Compared to traditional indexing:
 No subjective judgments to be made about relevant descriptors.
 Faster.
 No limit to index terms: all cited references are indexed.
Problems with ISI Databases

 Require manual effort during indexing
 Expensive
 Bias issues
 One possible solution: Autonomous Citation Indexing

Adapted from ‘Citation Indexing - Its Theory and Application in Science, Technology, and
Humanities’ by Eugene Garfield
Bias in Citation Databases
 Bibliometric indicators do not represent all publishing. Though these databases have international coverage, they have a certain amount of bias:
 They contain more minor US journals than minor European journals.
 Non-English language journals are not as comprehensively indexed.
 From a non-English speaking world perspective, bibliometric indicators represent only international-level, predominantly English-language, higher-impact, peer-reviewed, publicly available research output.

Source: Bibliometric Indicators and the Social Sciences, prepared for ESRC, J. Sylvan Katz
SPRU, University of Sussex UK, December 1999
Bias in Citation Databases
 One of the recurrent criticisms: journal selection is biased by the internal management decisions of ISI.
 Only journals are indexed; monographs are left out.
 A lack of correlation between the most highly cited authors based on the journal sample and those based on the monograph sample suggests that there may be two distinct populations of highly cited authors.

Source: Blaise Cronin and Herbert W. Snyder. Comparative citation rankings of authors in monographic and journal literature: a study of sociology. Journal of Documentation, 53(3):263–273, 1997.
ResearchIndex/CiteSeer
 ResearchIndex: A scientific literature digital library that
incorporates
 Autonomous citation indexing
 Citation context
 Full-text indexing
 Related document identification
 Query sensitive summaries
 Awareness and tracking
 Citation graph analysis
 https://citeseerx.

Source: Presentation on “Searching the World Wide Web General and Scientific Information
Access”, Steve Lawrence
CiteSeer – How does it work?
Downloads papers from the Web → Convert to text and parse → Obtain citations and do full-text indexing → Store them in database

Query by citations or key words

Source: CiteSeer: An Automatic Citation Indexing System (1998), C. Lee Giles, Kurt D. Bollacker, Steve Lawrence, Digital Libraries 98 - The Third ACM Conference on Digital Libraries
CiteSeer - Document Acquisition
 Web search engines used for crawling
 Heuristics used to locate papers
 Pages containing words such as “publications”, “papers”, “postscript”, etc.
 Locates and downloads PostScript files identified by “.ps”, “.ps.Z”, or “.ps.gz” extensions.
 URLs and PostScript files that are duplicates of those already found are detected and skipped.
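
The acquisition heuristics above can be sketched as two small filters. This is a minimal sketch, assuming the hint words and file extensions listed on the slide; the function names and the example.org URLs are hypothetical, and only duplicate URLs are handled here (detecting duplicate files would need an additional step, such as comparing file contents).

```python
POSTSCRIPT_SUFFIXES = (".ps", ".ps.Z", ".ps.gz")
HINT_WORDS = ("publications", "papers", "postscript")


def looks_like_paper_page(page_text):
    """Heuristic: the page mentions at least one of the hint words."""
    text = page_text.lower()
    return any(word in text for word in HINT_WORDS)


def select_postscript_urls(urls, already_seen):
    """Keep URLs with a PostScript suffix, skipping ones seen before."""
    selected = []
    for url in urls:
        if url.endswith(POSTSCRIPT_SUFFIXES) and url not in already_seen:
            already_seen.add(url)
            selected.append(url)
    return selected


# Hypothetical crawl results.
seen = set()
candidates = [
    "http://example.org/a.ps",
    "http://example.org/b.ps.gz",
    "http://example.org/a.ps",   # duplicate URL, skipped
    "http://example.org/c.pdf",  # not a PostScript file, skipped
    "http://example.org/d.ps.Z",
]
to_download = select_postscript_urls(candidates, seen)
```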
Document Parsing
 The downloaded PostScript files are first converted into text.
 Information extracted includes: URL, header, abstract, introduction, citations, citation context, and full text.
 Issues in citation parsing include:
 Natural language citations
 Citations to the same article in different forms (affects citation statistics)
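
Why variant citations matter for statistics can be illustrated with a deliberately crude grouping step. This is a minimal sketch, assuming that variants differ only in case and punctuation; the `normalize` key is a hypothetical simplification and is not the matching method CiteSeer uses.

```python
import re
from collections import defaultdict


def normalize(citation):
    """Crude matching key: lowercase, drop punctuation, collapse whitespace."""
    text = citation.lower()
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    return " ".join(text.split())


def group_variants(citations):
    """Group citation strings that share the same normalized key."""
    groups = defaultdict(list)
    for citation in citations:
        groups[normalize(citation)].append(citation)
    return list(groups.values())


# Two variants of one citation plus one unrelated citation.
raw_citations = [
    "Price, D. J. D. Little Science, Big Science. 1986.",
    "PRICE D J D - Little science big science (1986)",
    "Garfield, E. Citation Indexing. 1979.",
]
groups = group_variants(raw_citations)
citation_counts = [len(group) for group in groups]
```

Without grouping, the first work would be counted as two separate cited items with one citation each; with grouping, it is counted once with two citations. Real citations vary far more than this (abbreviations, reordered fields, typos), so a practical system needs fuzzier matching.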
Querying and Browsing

 First query: key word search used to return a list of citations matching the query or a list of articles.
 Finds related documents: a combination of weighted similarity measures is used.
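
A weighted combination of similarity measures can be sketched as follows. This is a minimal sketch under stated assumptions: the two measures (word overlap and shared citations, both as Jaccard similarity), the equal weights, and the document fields are hypothetical choices for illustration, not the measures the slide's source specifies.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def combined_similarity(doc_a, doc_b, w_words=0.5, w_cites=0.5):
    """Weighted sum of word-overlap and shared-citation similarity."""
    word_sim = jaccard(doc_a["words"], doc_b["words"])
    cite_sim = jaccard(doc_a["citations"], doc_b["citations"])
    return w_words * word_sim + w_cites * cite_sim


# Hypothetical documents described by their words and the works they cite.
doc_a = {"words": {"citation", "index", "web"}, "citations": {"C1", "C2"}}
doc_b = {"words": {"citation", "index", "library"}, "citations": {"C2", "C3"}}

score = combined_similarity(doc_a, doc_b)
```

Raising `w_cites` relative to `w_words` would favour documents that cite the same works over documents that merely share vocabulary.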
Advantages of CiteSeer
 Completely autonomous: cheaper and more widely available
 More up-to-date databases: not limited by a pre-selected set of journals or by publication delays
 Literature search based on the context of citations
 Ability to recognize variant forms of citations
 No bias, since there is no subjective selection of journals
 Not restricted to papers: preprints, technical reports, and conference proceedings are also indexed.
 User feedback on each article

Source: Autonomous Citation Matching (1999) Steve Lawrence, C. Lee Giles, Kurt Bollacker
Proceedings of the Third International Conference on Autonomous Agents
Areas of Improvement
1. Does not cover the significant journals comprehensively (might be less of a disadvantage over time as more journals become available online).

2. Cannot distinguish subfields as accurately (e.g. CiteSeer will not disambiguate two authors with the same name).

3. The similar-document retrieval system could be enhanced and improved.

4. Heuristics used to locate articles could be improved.


Future prospects – Technology Forecasting

 DIVA (Database Information Visualization and Analysis system): bibliometric analysis of collections of scientific literature and patents for technology forecasting.
 Documents, drawn from the technological field of interest, are visualized as clusters on a two-dimensional map, permitting exploration of the relationships among the documents and document clusters.
 Can yield insight into trends in the technological field of interest.

Source: DIVA: A Visualization System for Exploring Document Databases For Technology
Forecasting by Steven Morris, Zheng Wu, Camille DeYong, Sinan Salman, Dagmawi Yemenu
Computers and Industrial Engineering, Vol. 43, No. 4
Clustering of documents
Document Maps
Document timelines
‘Polymers’ cluster report showing a plot of links to all other clusters by year.
‘Polymers’ cluster report showing a plot of links to each other cluster by year.
A comment on bibliometric analysis

Compared to “a drunk who is looking for his keys under a street lamp”.

When asked by a passer-by why he is looking there, the reply was “This is where the lamp is”.
A comment on bibliometric analysis
Critics say that publications (and citations) just provide “easy data” and that the assessment of “real quality” needs more “qualitative considerations”.
Summary

 Citation Indexing: more than 40 years old.
 Simple concept: far-reaching influences and applications
 Many possibilities for
 Improvement of existing systems
 Developing new uses in the networked world
