Ways To Measure Research



Although young computer scientists are told that they need to produce research results, no exact requirements are specified. Instead, researchers are usually given vague encouragement to achieve something ``significant'' or to have ``high quality publications'', without any precise explanation of what that means.

To a junior researcher, it may seem that there is a conspiracy among senior people -- that they have some secret way of evaluating research but are unwilling to reveal it. After all, one is likely to hear a vague statement such as, ``research simply means accumulating knowledge'' or the less clever: ``It's difficult to define, but I know good research when I see it.''

The reason a junior staff member cannot obtain a more precise explanation of how to measure research is that no single explanation exists. Instead, there are a variety of measures -- each group tends to use the measure that best serves its goals. Indeed, when someone wants to make a point in favor of or against a person, they choose a measure that helps.

If you are a junior researcher, this guide is for you. It lists measures, explains each, and gives the actual facts. Knowing the list will help you impress others when you talk about research and will help you avoid pitfalls.


Journal Paper Approach

(preferred by journal publishers)


Measure: N, the total number of papers published.
Reasoning: A researcher who generates a new idea writes a paper which is then reviewed by peers and eventually published in an archive journal. Thus, number of papers is a measure of productivity.
Actual Facts: Publication standards vary widely, with some conferences and journals accepting all submissions, and others accepting only a few. More important, almost all research is useless; no one really reads the papers. (One study estimates that, on average, a given research paper is read by 0.5 people, with the number being skewed upward by the few papers that are read by thousands.)
Warning: Although they claim otherwise, tenure committees use this measure because it's much easier to count papers than to assess their merit. Note that people with gray hair are especially fond of this measure because they win by citing it -- their personal value of N is much higher than a young researcher's. When using this measure, don't brag about co-authors because credit is reduced when a paper has multiple authors.


Rate Of Publication Approach

(preferred by young researchers)

Measure: N/T, the ratio of total papers published to the time in which they were published.
Reasoning: Paper count is insufficient because it doesn't measure productivity -- if a researcher publishes 10 papers in one year, they are extremely productive, but if they publish 10 papers in a lifetime, they are extremely unproductive.
Actual Facts: A researcher's publication rate varies over time; the real peaks occur just before an individual is considered for promotion, and the rate usually tapers off dramatically in the years before retirement. Thus, as a researcher ages, they stop talking about N/T, and revert to measuring N.
Warning: Tenure committees are wary of anyone who cites this measure. Besides, be realistic -- a bunch of gray-haired, old guys are not going to reward you for a high rate when they themselves are facing a rate that has fallen off.
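
To see why the two camps prefer different measures, here is a minimal Python sketch comparing N with N/T. The two researchers and their numbers are invented purely for illustration; nothing here comes from real data.

```python
# Hypothetical researchers: (name, total papers N, years of career T).
# These figures are assumptions made up for the example.
researchers = [
    ("senior", 60, 30),
    ("junior", 12, 4),
]

def paper_count(n, t):
    """The Journal Paper measure: N, ignoring time entirely."""
    return n

def publication_rate(n, t):
    """The Rate Of Publication measure: N/T, papers per year."""
    return n / t

# Pick the "best" researcher under each measure.
best_by_count = max(researchers, key=lambda r: paper_count(r[1], r[2]))[0]
best_by_rate = max(researchers, key=lambda r: publication_rate(r[1], r[2]))[0]

print("best by N:  ", best_by_count)
print("best by N/T:", best_by_rate)
```

With these made-up numbers the senior researcher wins on N and the junior researcher wins on N/T, which is the disagreement described above.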


Weighted Publication Approach

(preferred by accreditation agencies)

Measure: W, the sum of the weights assigned to published papers.
Reasoning: Because some papers represent more intellectual achievement than others, each paper should be assigned a weight proportional to its quality. Instead of counting papers, the sum of the weights should be used. For example, non-research papers can be assigned a zero or near-zero weight.
Actual Facts: Instead of assessing each individual paper, people who use this method merely assign each journal a weight according to its prestige, and then use the value for any paper that appears in the journal. Of course, the prestige of a journal varies over time, and there is no such thing as a journal in which all papers are of uniform quality, but that doesn't seem to matter. The beauty of the measure is that given a set of publications, one can choose weights to make the list look good or bad.
Warning: When discussing this measure, remember that the choice of weights is arbitrary, and that although an individual may present evidence to justify their choice, in the end everyone seems to favor a set of weights that gives their personal publication list a high ranking.
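
The arbitrariness of the weights is easy to demonstrate. The following Python sketch computes W for two hypothetical publication lists under two hypothetical sets of journal weights; the journal names, paper counts, and weights are all assumptions invented for the example.

```python
# Hypothetical publication lists: journal name -> number of papers.
alice = {"Journal A": 5, "Journal B": 1}
bob = {"Journal A": 1, "Journal B": 4}

def weighted_sum(papers, journal_weights):
    """W: each paper counts for the weight of the journal it appeared in."""
    return sum(count * journal_weights.get(journal, 0.0)
               for journal, count in papers.items())

# Two equally arbitrary weightings (both invented for illustration).
weights_favoring_a = {"Journal A": 1.0, "Journal B": 0.2}
weights_favoring_b = {"Journal A": 0.2, "Journal B": 1.0}

for label, weights in [("favoring A", weights_favoring_a),
                       ("favoring B", weights_favoring_b)]:
    print(label,
          "alice:", round(weighted_sum(alice, weights), 2),
          "bob:", round(weighted_sum(bob, weights), 2))
```

Under the first weighting the first list scores higher; under the second, the ranking reverses, even though neither publication list has changed.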


Millions Of Monkeys Approach

(preferred by government granting agencies)

Measure: G, the total amount of taxpayer money distributed for research.
Reasoning: Given a long enough time, a random set of researchers banging away on keyboards will eventually write a paper about something that will benefit the country. To stimulate more researchers to produce more papers, the government collects proposals, and gives money to the ``best''. Obviously, giving more money will, therefore, stimulate more papers, which will increase the benefit to the country.
Actual Facts: The grant system is closer to a lottery than a national benefit. To ensure that everything is ``equal'', government agencies often follow a political agenda, meaning that the probability of obtaining a grant can depend on such factors as the size and geographic location of one's institution, and the applicant's race and gender. In the extreme case, an applicant will receive a letter informing them that they have been selected for a grant, but need to revise their proposal because the scientific content is unacceptable.
Warning: Don't take it personally one way or the other -- being awarded a government grant does not necessarily mean you have a great idea, nor does the denial of a government grant mean the idea is worthless.


Direct Funding Approach

(preferred by department heads)

Measure: D, the total dollars of grant funds acquired by a researcher.
Reasoning: Researchers who are awarded grants for their research must have good ideas (or the granting agency would not have awarded the money). Thus, more money must mean more ideas.
Actual Facts: Department Heads are only interested in impressing Deans and Department Heads at other institutions -- they love to brag about the total dollars of grant funds brought in by all members of their department. Unfortunately, the amount of grant funds that can be collected depends more on the amount available than on the quality of the proposed research. Governments hand out more when their coffers are overflowing (or when doing so has some political advantage); industry gives out much more when profits are high (or when they can get a tax write-off).
Warning: Again, don't read too much into grants -- no matter what anyone says, the amount of money you receive (little or much) is not always proportional to the quality of your ideas.


Indirect Funding Approach

(preferred by university administrators)

Measure: O, the total overhead dollars generated.
Reasoning: When a researcher is awarded N dollars of government grant money, 1/3 of it is designated as ``indirect cost'' or ``overhead'' that pays for such things as office space, electricity, and accountants who keep track of expenditures on the grant. Overhead is a measure of how much the researcher has brought to the institution.
Actual Facts: Office space is needed with or without a grant, and large research institutions already have accounting procedures and systems in place. Thus, indirect costs are merely a way for the institution to rake off money from research grants.
Warning: Equipment grants are exempt from indirect cost, so don't brag to an administrator about a big equipment grant -- they will not be impressed. Also, remember that indirect cost is generated when money is spent, not when it is awarded. Thus, if you spend grant money in January instead of December, the overhead will count toward the new year and not the old.
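
The bookkeeping described in the warning can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1/3 fraction is the figure quoted above (actual rates vary), and the expenditure records, the is_equipment flag, and the function name overhead_by_year are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Assumption: the 1/3 figure quoted in the text; real rates differ.
OVERHEAD_FRACTION = 1 / 3

def overhead_by_year(expenditures):
    """Total overhead per calendar year, keyed by when money was SPENT.

    Each expenditure is a tuple (spent_on, amount, is_equipment).
    """
    totals = defaultdict(float)
    for spent_on, amount, is_equipment in expenditures:
        if is_equipment:
            continue  # equipment money generates no indirect cost
        totals[spent_on.year] += amount * OVERHEAD_FRACTION
    return dict(totals)

# Invented spending records for one grant.
spending = [
    (date(2023, 12, 15), 30_000.0, False),  # counts toward 2023
    (date(2024, 1, 10), 30_000.0, False),   # counts toward 2024
    (date(2024, 2, 1), 90_000.0, True),     # equipment: no overhead
]

for year, total in sorted(overhead_by_year(spending).items()):
    print(year, round(total, 2))
```

Moving the January expenditure back to December would shift its overhead into the earlier year, and the large equipment purchase contributes nothing to O in either year.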


Bottom Line Approach

(preferred by industrial research labs)

Measure: P, the profit generated by patents or products that result from the research.
Reasoning: An industry creates a research lab to benefit the business units, not simply as a way to spend excessive profits. Thus, in the industrial world, it makes sense to measure research by how it helps the bottom line.
Actual Facts: Almost no research has any real effect on the company profits. Even if a research idea eventually makes its way into a product, the revenue generated depends much more on marketing than on the quality of the underlying idea (there is even some evidence of an inverse relationship between product quality and profit).
Warning: Revenue is a terrible measure of research quality because stupid or trivial ideas often generate the most profit; don't assume an idea has any scientific merit just because it makes money, and don't assume otherwise if it does not.


Assessment Of Impact Approach

(preferred by the handful of researchers who actually achieve something)

Measure: I/R, the ratio of the impact of the work to the amount of resources used to generate it.
Reasoning: The ``impact'' of research on the field provides a good overall measure of value. One might ask questions such as: Did the work affect others? or Is the work cited and used? However, to make the comparison fair, one cannot compare research performed by a team of twenty-four researchers working at a large industrial lab using equipment that costs ten million dollars to the research performed by an individual working on weekends with no staff. Thus, to make a fair assessment, compute the ratio of impact to resources.
Actual Facts: Both impact and resources are difficult to measure. More important, it is unfortunate that ``big science'' often appears to have more impact simply because it generates more publicity.
Warning: Note that although this measure is the fairest, it is unpopular. Administrators dislike the measure because the amount of funding -- the item they wish to emphasize -- appears in the denominator, meaning that a researcher who achieves a given impact with fewer grant funds receives a higher evaluation under this measure! Most researchers dislike the measure as well because it emphasizes output over input -- it is much easier to obtain funding than to produce results that have any real impact.
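
As a rough illustration of why funding in the denominator matters, the Python sketch below ranks two hypothetical projects by I/R. The impact scores and dollar amounts are invented, and since impact and resources are both difficult to measure, the numbers should be read as placeholders rather than as a real scoring method.

```python
# Hypothetical projects: (name, impact score, resources in dollars).
# Both the scores and the costs are assumptions made up for the example.
projects = [
    ("industrial lab team", 50.0, 10_000_000),
    ("weekend individual", 2.0, 20_000),
]

def impact_per_resource(impact, resources):
    """I/R: impact achieved per unit of resources consumed."""
    return impact / resources

# Rank by I/R, highest first.
ranked = sorted(projects,
                key=lambda p: impact_per_resource(p[1], p[2]),
                reverse=True)

for name, impact, resources in ranked:
    print(name, impact_per_resource(impact, resources))
```

Even though the large team has far more raw impact in this made-up example, the individual ranks first once resources are taken into account.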


Conclusion

If your research doesn't look good under the measure in use, maybe it's time to change the measure!

