Citation Analysis

Scientists are drowning in a flood of information. Every day, thousands of scientific studies are published. Citation analysis, a method of evaluating how often a research paper is footnoted by other scientists, is one way scientists can sort through the torrent of reports to find the most important and influential research.

By Christopher King

Modern science is a sprawling, bustling enterprise. Throughout the world each day, thousands of scientists report the results of experiments, studies, clinical trials, exploration, and invention in thousands of publications. How can anyone hope to keep track of all this research? How can one follow the progression of ideas and advances within a given field? And how can one determine which individuals, institutions, and nations are performing the most noteworthy and influential research? One method for tracking and evaluating research is known as citation analysis. It works because scientists leave an unmistakable trail behind them as they report their work—a trail of footnotes.

Science usually proceeds by steps or increments, as scientists advance and build on work that has come before. When scientists prepare papers to be published in specialized journals, they must follow the strict custom of acknowledging previous work, noting in detail the ideas, findings, or advancements from which their new work proceeds. They do this by including footnotes—also known as references or citations—in their papers. Each footnote gives credit to a previous researcher for a prior achievement and indicates where the achievement was reported. And each footnote constitutes a judgment as to the previous research that scientists themselves view as the most significant and useful.

Examining footnotes—seeing which names are acknowledged again and again in scientific papers—provides a means of identifying particularly influential and important research. It is one method for cutting through the vast thicket of worldwide scientific literature. By some estimates, upwards of 4 million scientific papers and reports are published annually in about 60,000 scientific and technical publications around the world. Specialists who study the scientific literature have determined that the most significant research in a given discipline actually appears in a relatively small number of journals—perhaps 10 to 20 percent of the journals in a field. These elite journals are usually peer-reviewed or refereed—that is, all articles are evaluated and cleared by experts before publication. Even a small selection of the most important journals representing all fields of science, however, produces a large volume of research: on the order of 750,000 papers a year—or roughly 2000 papers a day.

Much of this voluminous research, even among the most select journals, slips quietly into oblivion. Almost half of the papers published each year are never footnoted—that is, referred to by other scientists—and most of the rest are only footnoted once or twice after being published. Some papers, however, appear in the footnotes of many subsequent publications. By tracking papers that are footnoted numerous times, one can obtain a concrete and objective view of the specific scientists, as well as the universities or research institutions, that are the most highly regarded by other scientists.

This knowledge is useful, for example, for agencies that provide funding for research. Citation analysis is one means of pointing the way to worthy recipients of financial support. Prospective students seeking influential universities, and newspapers and magazines seeking to publish rankings of the 'best' universities, may use citation analysis. Scientific institutions themselves often follow such information closely to monitor the influence of their own research in the scientific community. It also provides one measure (although certainly not the only measure) by which the performance of faculty or staff can be evaluated.

How Citation Analysis Works

Imagine a hypothetical scientist—say, Professor Greene, a molecular biologist at the small (and equally hypothetical) State Technical University. In 1992 Greene publishes a paper detailing the molecular structure of a protein that seems to play an important role in controlling the activity within individual cells. During the following year, another scientist—Professor Redd—reads Greene's paper and performs an experiment following up on some of its findings. In preparing the new paper and describing the background to the experiment, Redd includes a footnote listing Greene's 1992 report. Thus, Redd has 'cited' Greene—or, to put it another way, Greene has received a 'citation' for the earlier paper.

This process may be repeated many times, as other scientists, each pursuing some aspect of cell biology, read Greene's 1992 paper and cite it in their own work. By using modern, computerized scanning and indexing methods, it is possible to examine newly published journals and to track every citation to Greene's paper over a period of, say, five years. The paper may be cited hundreds or even thousands of times, and the figures compiled may reveal a good deal about the paper and its influence.

Citation analysis can also provide statistics about the influence of Greene himself. For example, citation analysts could examine all the papers that Greene, either alone or with colleagues, published during a five-year period and determine how many citations, in total, these papers earned. When all of Greene's citations are added, and that sum is divided by the number of papers published, the average number of citations is calculated. This per-paper average is known as “citation impact.”
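
The per-paper average described above can be sketched in a few lines of Python. This is only an illustration: the function name `citation_impact` and the citation counts for Greene's papers are invented for the example and do not come from any real database.

```python
def citation_impact(citation_counts):
    """Return the average number of citations per paper.

    citation_counts holds one number per published paper.
    An empty list yields 0.0 to avoid dividing by zero.
    """
    if not citation_counts:
        return 0.0
    return sum(citation_counts) / len(citation_counts)


# Hypothetical citation counts for five papers by Professor Greene.
greene_papers = [120, 45, 30, 4, 1]

# 200 total citations spread over 5 papers.
greene_impact = citation_impact(greene_papers)
print(greene_impact)
```

Note that one heavily cited paper can dominate the average, which is one reason citation impact is usually read alongside other measures.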

This kind of analysis can be performed not only for Greene, but for the other molecular biologists at State Technical University. One can learn, for example, how many papers were published in molecular biology journals by State Tech researchers during the five-year period. The total citations can be tallied, and a per-paper 'impact' average ascertained. When applied to universities and other institutions, this impact measure often produces surprises. A simple tally of total citations tends to favor large facilities that produce a high volume of papers. The impact average, however, levels the playing field. In a ranking of universities by total citations in molecular biology journals, State Tech might fall behind such large, powerhouse universities as Harvard or Stanford. A citations-per-paper measure, however, might tell a different story. The highly cited work of Greene and colleagues may demonstrate that their work carries just as much influence as that of larger institutions, if not more. Thus, even a relatively less prestigious school can rank among the giants in citation impact.
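
To see how the two rankings can diverge, consider the following sketch. Every institution name and figure in it is hypothetical, chosen only so that a small school has fewer total citations but a higher per-paper average.

```python
# Hypothetical figures: (institution, papers published, total citations).
institutions = [
    ("Large University A", 400, 6000),
    ("Large University B", 350, 4900),
    ("State Tech", 40, 1000),
]


def rank_by_total(rows):
    """Order institutions by total citations, highest first."""
    ordered = sorted(rows, key=lambda row: row[2], reverse=True)
    return [name for name, _, _ in ordered]


def rank_by_impact(rows):
    """Order institutions by citations per paper, highest first."""
    ordered = sorted(rows, key=lambda row: row[2] / row[1], reverse=True)
    return [name for name, _, _ in ordered]


total_ranking = rank_by_total(institutions)
impact_ranking = rank_by_impact(institutions)
print(total_ranking)
print(impact_ranking)
```

With these invented numbers, State Tech is last by total citations (1,000) but first by impact (25 citations per paper, against 15 and 14 for the two larger schools).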

In addition to providing a basis for ranking and evaluating individuals and institutions, citation analysis is a means of tracking science itself. The great number of citations to Greene's 1992 paper, for example, would demonstrate the scientific community's keen and active interest in the protein that Greene described. Scientists or students interested in the topic would want to know which papers Greene cited when preparing the 1992 paper. And they would want to know every paper that had subsequently cited the 1992 work. By using a citation index—and following the trail of footnotes—it is possible to follow the development of a given topic or idea over a span of several years.
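
A citation index of this kind can be pictured as a two-way lookup table: one direction answers "what did this paper cite?" and the other answers "who has cited this paper?" The sketch below assumes the data arrives as simple (citing paper, cited paper) pairs. The paper labels other than Greene's and Redd's are invented, and a real index would be far larger and more elaborate.

```python
from collections import defaultdict

# Hypothetical (citing paper, cited paper) pairs.
citations = [
    ("Redd 1993", "Greene 1992"),
    ("White 1994", "Greene 1992"),
    ("White 1994", "Redd 1993"),
    ("Greene 1992", "Grey 1988"),
]


def build_index(pairs):
    """Build backward and forward lookups from citation pairs."""
    references = defaultdict(set)  # paper -> papers it cites
    cited_by = defaultdict(set)    # paper -> papers that cite it
    for citing, cited in pairs:
        references[citing].add(cited)
        cited_by[cited].add(citing)
    return references, cited_by


references, cited_by = build_index(citations)

# Backward: the earlier work Greene drew upon.
print(sorted(references["Greene 1992"]))
# Forward: the later work that built on Greene.
print(sorted(cited_by["Greene 1992"]))
```

Following the forward lookup repeatedly, from Greene to Redd to whoever cites Redd, is one way to trace how a topic develops over several years.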

Citations can also indicate links between different scientific fields, including fields that, at first glance, might not seem related. To add to our roster of imaginary scientists, think of a microbiologist, Professor Blue, who studies bacteria and other microorganisms, and imagine Professor Brown, who studies spoilage in food products. Although the two scientists would probably consider themselves to be in quite different fields of science, citation analysis might demonstrate otherwise. Analysis might show that many scientists who cite Brown's papers on food spoilage frequently cite Blue's papers on bacteria at the same time. Analysts would say that Brown and Blue are frequently 'co-cited.' In fact, citation analysis could show that the papers of Blue and Brown constitute the core of a specialized field devoted to the role of microorganisms in food spoilage. Studying the citation links between different fields, analysts have actually been able to produce maps of different scientific disciplines and their interrelationships. Instead of countries, such maps show clusters of papers and scientists, linked by citations.
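
Co-citation counting can be sketched as follows: for each citing paper, every pair of authors appearing together in its footnotes scores one point. The reference lists here are invented for illustration, and this simple pair count is only one plausible way to measure co-citation, not necessarily the method any particular analyst uses.

```python
from collections import Counter
from itertools import combinations

# Hypothetical footnote lists: citing paper -> authors it cites.
reference_lists = {
    "Paper 1": ["Blue", "Brown", "Greene"],
    "Paper 2": ["Blue", "Brown"],
    "Paper 3": ["Brown", "Greene"],
    "Paper 4": ["Blue", "Brown"],
}


def cocitation_counts(ref_lists):
    """Count how often each pair of authors is cited by the same paper."""
    counts = Counter()
    for refs in ref_lists.values():
        # Sort and deduplicate so each pair has one canonical ordering.
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


counts = cocitation_counts(reference_lists)
for pair, n in counts.most_common():
    print(pair, n)
```

Pairs with high counts, such as Blue and Brown in this made-up data, are candidates for the core of a shared specialty. Linking many such pairs together gives the clusters from which maps of science are drawn.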

It should be noted that a high number of citations does not always indicate ground-breaking, original science. So-called 'methods' papers—those describing tools and techniques for performing scientific tasks—are often highly cited. Such papers, although not reporting new discoveries or ideas, are highly useful to large numbers of scientists. The most-cited paper of all time, in fact, is a methods paper, a 1951 report describing a means for isolating and measuring proteins. This paper, still collecting upwards of 5000 citations a year, has now been cited more than 250,000 times! Another kind of paper, instead of reporting new work, provides a summary of notable research in a given field. These reviews, since they provide a useful reference for scientists writing their own papers, also typically rack up numerous citations.

Similarly, citations do not always indicate support or approval of previous work. Sometimes scientific work attracts attention—and citations—because it is controversial or mistaken. A spate of 'negative' citations, for example, surrounded a 1989 paper by two Utah scientists that purported to show a fusion reaction at room temperature in a simple electrolytic cell. This report, and its hint of a cheap and limitless source of energy, launched a feverish 'cold fusion' gold rush, as scientists around the world dropped everything to try to duplicate the results. From 1989 into early 1990, the paper ranked among the most-cited of any recent report in the physical sciences, attracting dozens of citations every month. Most of the scientists who cited the paper, however, were actually reporting results that discredited the cold fusion claims. Despite occasional episodes of research attracting negative citations, however, the vast majority of citations are positive.

Merely adding up citations is not a useful or accurate way to compare scientists in all disciplines, however. Some fields, such as molecular biology and genetics, are very active and crowded, producing a large volume of papers and citations. Other fields, such as radio astronomy, are smaller, with comparatively fewer journals and fewer opportunities to collect citations.

Citation Analysis at Work

The largest citation database in the world is maintained by the Institute for Scientific Information (ISI) in Philadelphia, Pennsylvania. Each year, in tracking the contents of roughly three-quarters of a million newly published papers, the database collects upwards of 15 million footnotes in the sciences, social sciences, and humanities—some citing articles as far back as 1945.

Recently, ISI undertook a study to identify the most-cited research, and researchers, in biomedicine during the 1990s. To locate the 'best of the best' as judged by citations, analysts used a special computer program to automatically extract biomedical papers published between 1990 and 1996 that had each been cited at least 300 times by the end of 1996. Examining this file of 1381 'high-impact' papers, analysts identified and ranked the institutions that produced at least 10 of the highly cited papers.
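
The selection just described amounts to two threshold filters: keep papers with at least 300 citations, then keep institutions with at least 10 such papers. The sketch below is not ISI's program, whose details are not given here. The function name, record layout, and sample data are all assumptions made for illustration.

```python
from collections import Counter


def high_impact_institutions(papers, min_citations=300, min_papers=10):
    """Select highly cited papers, then institutions with enough of them.

    papers is a list of dicts with "institution" and "citations" keys
    (a hypothetical record layout). Defaults mirror the thresholds
    described in the text.
    """
    high_impact = [p for p in papers if p["citations"] >= min_citations]
    per_institution = Counter(p["institution"] for p in high_impact)
    qualifying = {
        inst: n for inst, n in per_institution.items() if n >= min_papers
    }
    return high_impact, qualifying


# Tiny hypothetical data set; min_papers is lowered to 2 so the
# second filter has a visible effect on so few records.
papers = [
    {"institution": "Institute A", "citations": 450},
    {"institution": "Institute A", "citations": 310},
    {"institution": "Institute A", "citations": 120},
    {"institution": "Institute B", "citations": 900},
    {"institution": "Institute B", "citations": 299},
]

selected, qualifying = high_impact_institutions(papers, min_papers=2)
print(len(selected), qualifying)
```

Here Institute B has the single most-cited paper but drops out, because only one of its papers clears the citation threshold.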

By the measure of total citations, Harvard University in Cambridge, Massachusetts, outranked all other institutions, with over 60,500 citations to its 128 high-impact biomedical papers published between 1990 and 1996. This citation total was nearly twice that of the next-ranked school, Johns Hopkins University in Baltimore, Maryland, which amassed over 35,500 citations to its 56 high-impact papers.

The ranking of institutions by the citations-per-paper average, or impact, demonstrated that size is not everything. Wellcome Research Labs (now part of the combined pharmaceutical firm of Glaxo Wellcome) in Beckenham, England, produced only 12 of the highly cited reports between 1990 and 1996. Each of these papers, however, was cited, on average, over 735 times—an impact figure that outscored any other institution.
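
The figures quoted above allow a rough check of the per-paper averages. Because the article gives the totals only as "over" a certain number, the results below are approximate lower bounds, not exact values.

```python
# Figures as quoted in the article (totals are stated as "over" these values).
figures = {
    "Harvard": {"papers": 128, "citations": 60_500},
    "Johns Hopkins": {"papers": 56, "citations": 35_500},
}

# Citations per high-impact paper for each institution.
impact = {
    name: f["citations"] / f["papers"] for name, f in figures.items()
}

# Wellcome: 12 papers averaging over 735 citations each implies
# a total of at least 12 * 735 citations.
wellcome_total = 12 * 735

for name, value in impact.items():
    print(name, round(value))
print("Wellcome total (approx.)", wellcome_total)
```

On these numbers Harvard averages roughly 473 citations per high-impact paper and Johns Hopkins roughly 634, both below Wellcome's 735, even though Wellcome's implied total of about 8,800 citations is a fraction of Harvard's.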

The high-impact papers produced at the Wellcome Research Labs concerned nitric oxide. This substance, isolated by researchers early in the decade, is produced by enzymatic synthesis in humans and other mammals. During the 1990s, scientists learned that nitric oxide plays a central role in a wide variety of bodily processes and diseases, including memory, learning disorders, stroke, cancer, and tuberculosis. Citation analysis demonstrates that nitric oxide was one of the hottest research topics of the decade, accounting for some of the most highly cited work in biomedicine. For example, one 1991 paper on nitric oxide by researchers at Wellcome was cited nearly 4000 times by the end of 1996. Only one other paper published during the decade received more citations—a methods paper from 1990, describing a family of computer programs used in automated gene sequencing. This report was cited more than 4100 times—demonstrating the high level of interest during the decade in using computerized methods to build a detailed biochemical picture of the genetic makeup of humans and other organisms.

Among scientists in the ISI survey of highly cited biomedicine, one name stood out sharply. Bert Vogelstein, a cancer specialist at Johns Hopkins University, was an author of 22 high-impact reports. In total, these papers collected more than 17,000 citations—roughly 10,000 more citations than the number received by the second-ranked researcher. Vogelstein is an expert on tumor-suppressor genes. These genes ordinarily keep cells from turning cancerous. When they are absent, however, or are altered through mutation, such genes actually contribute to the uncontrolled cellular growth known as cancer. Vogelstein and his colleagues have identified several tumor-suppressor genes that play a role in cancers of the colon, breast, and other tissue. They have also developed blood tests to identify individuals whose genes place them at risk. As citation analysis shows, research into tumor-suppressor genes and other genetic and biochemical aspects of cancer has been extremely active during the 1990s.

Who Uses Citation Analysis?

Information scientists are not the only ones who engage in citation analysis. Scientists consult citation indexes to locate background sources and information for their own work, and to see how discoveries and advancements have developed over the years. For scientists, students, or anyone seeking information on a given topic, citations can lead to the most significant research and the most influential researchers. Historians of science examine citations to see not only how ideas developed, but also how the scientific literature itself has grown, branched, and changed over the course of many decades. Patent attorneys make use of citations to follow the thread of invention and innovation over time. University librarians, who must manage large collections of journals within limited storage space and limited budgets, can use citation-impact measures to identify the most important journals in a given field.

In short, anyone who needs to access, search, and evaluate the world of science can find what they seek by following the trail of footnotes.

About the author: Christopher King is the editor of ScienceWatch, a newsletter that tracks the trends and performance in basic research, published by the Institute for Scientific Information in Philadelphia, Pennsylvania.


© 2008 Microsoft