R

R is a free and open source programming language and environment for statistical computing and data visualisation. It is an interpreted language that makes use of a range of data structures, including vectors and data frames. R and its associated software packages can be installed from the Comprehensive R Archive Network (CRAN). The language is commonly used with RStudio, a free development environment that includes its own console, code editor, and tools for plotting, history, debugging, and workspace management.

Personal Experience

N/A

Tools & Software Integrations

  • Bibliometrix - An open-source R package for quantitative research in scientometrics and bibliometrics. The package includes a Shiny-based web app called biblioshiny, which provides a no-code GUI.
  • Rtools - A set of programs required on Windows to build R packages from source.
  • RStudio - An IDE (Integrated Development Environment) that makes R easier to use. It includes a code editor, debugging and visualization tools.
  • Tidyverse - A collection of R packages for data science.

Resources

N/A

Notes and Troubleshooting

Merge Bibtex Files for Bibliometrix

Method 1 - Processing in R

After downloading separate BibTeX files from Scopus and Web of Science, you can use the following code in RStudio to generate an Excel .xlsx file without duplicates:

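# Install the packages once, if you don't have them yet
install.packages("bibliometrix")
install.packages("openxlsx")
# setwd("ENTER PATH TO WORKING DIRECTORY")
library(bibliometrix)
library(openxlsx)
# Convert each BibTeX export into a bibliometrix data frame
S <- convert2df("scopus.bib", dbsource = "scopus", format = "bibtex")
W <- convert2df("wos.bib", dbsource = "isi", format = "bibtex")
# Merge both sources and drop the duplicates that bibliometrix detects
Database <- mergeDbSources(S, W, remove.duplicated = TRUE)
dim(Database) # number of records (rows) and fields (columns)
# Write the merged data frame to an Excel file in the working directory
write.xlsx(Database, file = "database.xlsx")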

When using Bibliometrix, some graphics may fail to generate with the error "duplicate row.names are not allowed". This occurs when the last column (SR) of the xlsx file contains non-unique values. It can happen when the mergeDbSources function in the R code misses some duplicates: the titles of the same work sometimes differ subtly between databases, so the two records are not recognised as the same work and both are kept.
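To check whether a merged data frame is affected before plotting, the SR column can be inspected for repeated or missing values. The following is a minimal sketch in base R, assuming only that the data frame has an SR column; `toy` and `sr_problems` are hypothetical names used for illustration and are not part of bibliometrix.

```r
# Toy stand-in for the merged data frame; only the SR column matters here.
toy <- data.frame(
  SR = c("SMITH J, 2020, J A", "LEE K, 2019, J B", "SMITH J, 2020, J A", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: report the SR values that appear more than once
# and count how many rows have no SR value at all.
sr_problems <- function(db) {
  list(
    duplicated_sr = unique(db$SR[!is.na(db$SR) & duplicated(db$SR)]),
    missing_sr    = sum(is.na(db$SR))
  )
}

problems <- sr_problems(toy)
print(problems)
```

On real data, the same helper could be called on the merged data frame (for example `sr_problems(Database)`) to see which records need attention.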

To prevent this issue, you should edit the xlsx file after it's created by following these steps:

  1. Remove any duplicate values in the DOI column (DI).
  2. Remove any duplicate values in the SR column.
  3. Ensure that the SR column has no missing values.
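The same clean-up can also be done in R before the file is written, instead of editing the spreadsheet by hand. This is a minimal sketch in base R, not a bibliometrix function: `clean_merged_db` and `merged` are hypothetical names, the DOIs are made up, and it rests on three assumptions. Rows with a missing SR are dropped rather than filled in, DOIs are compared case-insensitively, and records without a DOI are never treated as duplicates of each other.

```r
# Hypothetical helper (not part of bibliometrix): cleans a data frame that
# has the columns DI (DOI) and SR (short reference).
clean_merged_db <- function(db) {
  # Step 3: drop rows whose SR is missing or empty (assumption: drop, not fill).
  has_sr <- !is.na(db$SR) & nzchar(trimws(db$SR))
  db <- db[has_sr, , drop = FALSE]

  # Step 1: drop repeated DOIs, ignoring rows that have no DOI.
  has_doi <- !is.na(db$DI) & nzchar(trimws(db$DI))
  doi <- tolower(trimws(db$DI))
  db <- db[!(has_doi & duplicated(doi)), , drop = FALSE]

  # Step 2: drop repeated SR values, keeping the first occurrence.
  db <- db[!duplicated(db$SR), , drop = FALSE]
  rownames(db) <- NULL
  db
}

# Toy data with made-up DOIs, standing in for the merged data frame.
merged <- data.frame(
  DI = c("10.1/abc", "10.1/ABC", NA, NA, "10.1/xyz"),
  SR = c("SMITH J, 2020, J A", "SMITH J, 2020, J A-a",
         "LEE K, 2019, J B", "LEE K, 2019, J B", NA),
  stringsAsFactors = FALSE
)

cleaned <- clean_merged_db(merged)
print(cleaned)
```

With real data, the helper would be applied to the merged data frame before calling `write.xlsx`. Because dropping rows discards records, it is worth reviewing which rows were removed.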

Method 2 - Using Bibtex-Tidy Online

Alternatively, BibTeX files from Scopus and Web of Science can be combined using the online tool Bibtex-Tidy: https://flamingtempura.github.io/bibtex-tidy/

When merging the bib files with Bibtex-Tidy, choose to merge only based on "Matching DOIs". Avoid using "Matching Keys" for merging, because this can delete distinct works by the same authors.
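The difference between the two matching rules can be illustrated with a small base R sketch. It does not reproduce how Bibtex-Tidy works internally: the entries, citation keys, and DOIs below are made up, and it assumes that citation keys are built from author and year and can therefore collide for different works.

```r
# Made-up entries: rows 1 and 2 are different works that share a citation key,
# rows 1 and 3 are the same work exported under two different keys.
entries <- data.frame(
  key = c("smith2020", "smith2020", "smith2020a"),
  doi = c("10.1000/aaa", "10.1000/bbb", "10.1000/aaa"),
  stringsAsFactors = FALSE
)

# Deduplicating on the key drops the distinct work 10.1000/bbb
# and still keeps both copies of 10.1000/aaa.
by_key <- entries[!duplicated(entries$key), ]

# Deduplicating on the DOI keeps exactly one copy of each work.
by_doi <- entries[!duplicated(entries$doi), ]

print(by_key)
print(by_doi)
```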

Source: https://youtu.be/chaDruiPs4U