Search for Evergreens in Science: A Functional Data Analysis

Ruizhi Zhang, Jian Wang & Yajun Mei. (2017). Search for evergreens in science: A functional data analysis. Journal of Informetrics, 11(3), 629–644. http://dx.doi.org/10.1016/j.joi.2017.05.007 ©2017 Elsevier Ltd.

The authors thank the editor and three anonymous referees for their constructive comments which have substantially improved this paper. R. Zhang and Y. Mei were supported in part by the NSF grant CMMI-1362876, and J. Wang by a postdoctoral fellowship from the Research Foundation – Flanders (FWO). Data used in this paper are from a bibliometric database developed by the Competence Center for Bibliometrics for the German Science System (KB) and derived from the 1980 to 2012 Science Citation Index Expanded (SCI-E), Social Sciences Citation Index (SSCI), Arts and Humanities Citation Index (AHCI), Conference Proceedings Citation Index–Science (CPCI-S), and Conference Proceedings Citation Index–Social Science & Humanities (CPCI-SSH) prepared by Thomson Reuters (Scientific) Inc. (TR®), Philadelphia, Pennsylvania, USA: ©Copyright Thomson Reuters (Scientific) 2013. KB is funded by the German Federal Ministry of Education and Research (BMBF, project number: 01PQ08004A).

Ruizhi Zhang, Jian Wang & Yajun Mei
H. Milton Stewart School of Industrial & Systems Engineering, Georgia Institute of Technology
Center for R&D Monitoring and Department of Managerial Economics, Strategy & Innovation, KU Leuven
German Center for Higher Education Research and Science Studies, DZHW Berlin
Emails: zrz123@gatech.edu, jian.wang@kuleuven.be, ymei@isye.gatech.edu
May 17, 2017
Abstract

Evergreens in science are papers that display a continual rise in annual citations without decline, at least within a sufficiently long time period. Aiming to better understand evergreens in particular and patterns of citation trajectory in general, this paper develops a functional data analysis method to cluster citation trajectories of a sample of 1699 research papers published in 1980 in the American Physical Society (APS) journals. We propose a functional Poisson regression model for individual papers’ citation trajectories, and fit the model to the observed 30-year citations of individual papers by functional principal component analysis and maximum likelihood estimation. Based on the estimated paper-specific coefficients, we apply the K-means clustering algorithm to group the papers and uncover general types of citation trajectories. The results demonstrate the existence of an evergreen cluster of papers that do not exhibit any decline in annual citations over 30 years.

Keywords: citation trajectory; evergreen; functional Poisson regression; functional principal component analysis; K-means clustering
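
The three-step pipeline summarized in the abstract (basis estimation, per-paper Poisson fitting, K-means on the coefficients) can be illustrated with a rough sketch. This is not the authors' implementation: the log-linear rate form, the SVD of log(1 + counts) as a stand-in for functional principal component analysis, the small ridge term, the step cap, and the synthetic data are all assumptions made for illustration, and every function name below is hypothetical.

```python
import numpy as np

def fpca_basis(counts, n_components):
    """Crude FPCA stand-in (assumption): SVD of centered log(1 + counts) curves."""
    logged = np.log1p(counts)
    mean = logged.mean(axis=0)
    _, _, vt = np.linalg.svd(logged - mean, full_matrices=False)
    return mean, vt[:n_components].T              # shapes (T,), (T, K)

def poisson_mle(y, offset, basis, n_iter=100, ridge=1e-6, max_step=1.0):
    """Capped Newton iterations for y_t ~ Poisson(exp(offset_t + basis_t @ beta))."""
    beta = np.zeros(basis.shape[1])
    for _ in range(n_iter):
        eta = np.clip(offset + basis @ beta, -30.0, 30.0)   # guard against overflow
        lam = np.exp(eta)
        grad = basis.T @ (y - lam) - ridge * beta
        hess = basis.T @ (basis * lam[:, None]) + ridge * np.eye(len(beta))
        step = np.clip(np.linalg.solve(hess, grad), -max_step, max_step)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centers."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1), centers

def demo(seed=0):
    """Synthetic example: 20 ever-rising and 20 peak-then-decay trajectories."""
    rng = np.random.default_rng(seed)
    years = np.arange(30)
    evergreen = 2.0 + 0.4 * years                                   # keeps rising
    normal = 12.0 * np.exp(-0.5 * ((years - 4) / 3.0) ** 2) + 0.5   # early peak
    rates = np.vstack([np.tile(evergreen, (20, 1)), np.tile(normal, (20, 1))])
    counts = rng.poisson(rates)
    # The log(1 + count) mean is used only as a rough log-rate baseline.
    mean, basis = fpca_basis(counts, n_components=2)
    scores = np.array([poisson_mle(y, mean, basis) for y in counts])
    labels, _ = kmeans(scores, k=2)
    return counts, scores, labels
```

Calling `demo()` returns the simulated counts, the paper-specific coefficients, and a cluster label per paper. Clustering on the low-dimensional coefficients rather than on the raw 30-year count vectors is the design choice the abstract describes; the number of components and clusters here (2 and 2) are arbitrary choices for the toy data.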

Figure 1: Annual citations of the top ten cited APS papers. One curve represents one paper, and details about these ten papers are reported in Appendix A.
Figure 2: Annual and cumulative citations of four selected papers. One curve represents one selected paper. The red, blue, purple, and green curves correspond to flash-in-the-pan, normal document, delayed document, and evergreen respectively.
Figure 3: Mean function and its first derivative.
Figure 4: The first four eigenfunctions.
Figure 5: Determining the number of eigenfunctions.
Figure 6: Goodness of fit. The left panel plots kernel densities of log MSEs. The right panel is a scatterplot, where one point represents one paper, and its X- and Y-axes are the log MSEs for the WSB model and our functional Poisson regression model respectively.
Figure 7: Clustering results: Four general types of citation trajectories. The red, blue, purple, and green curves represent normal-low, normal-high, delayed, and evergreen papers respectively.
Figure 8: Clustering results: K = 2-6, three methods.
Figure 9: Clustering results: Alternative citation thresholds.