Thursday, 1 August 2013

Automatically creating citations for your R packages within knitr

Health warning: the following post is about solving a fairly small problem when using R, LaTeX and knitr. If you don't already use at least two of these three, you might want to skip this post!

Workflow issues are some of the most irritating issues in academia. At one extreme you can use Stata and Microsoft Word to analyse your data and write your documents. However, a whole host of formatting and replicability issues go along with this setup. At the other extreme you can use an ever-increasing cocktail of software to finely hone every aspect of creating a paper, with the consequent problems of updates, conflicts and hunting for missing parentheses in five different scripts.




Kieran Healy has a good overview of some of these trade-offs so I won't go into that more here. Suffice to say I tend to come down more towards the latter end of the spectrum, making use of a wide array of tools.

One of these tools is knitr. knitr allows you to write LaTeX documents with embedded R code. I won't go into a full tutorial here, but it's a neat way of integrating your code directly into the document. Most importantly from my perspective, it means you avoid having to recreate all your analysis from scratch every time you adjust your data cleaning or similar. Instead, the document just recompiles from the updated dataset and updates any numbers in the document.

One issue I've had, however, is including citations to R packages in the document. R packages are fiddly to add to most reference managers, and it seems odd to export the citation out of R into a citation manager and then back into knitr.

R provides a citation() function which, passed through toBibtex(), gives a BibTeX entry for the package. However, the entry doesn't include a reference key, so it can't be cited as it stands.
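To see the problem for yourself, here is a minimal sketch that pulls out the first line of an entry and checks what sits where the key should be. It uses the base package stats so nothing needs installing; the variable names header and key.part are just mine for illustration.

```r
# citation() and toBibtex() both come with R (in the utils package).
bib <- toBibtex(citation("stats"))  # stats is a base package, always available
header <- as.character(bib)[1]      # first line of the entry, e.g. "@Type{<key>,"
cat(header, "\n")

# Whatever sits between "{" and the trailing "," is the reference key.
key.part <- sub("^@[A-Za-z]+\\{(.*),$", "\\1", header)
cat("Characters in key:", nchar(key.part), "\n")
```

If the second line reports zero characters, the entry has no key and needs one before LaTeX can cite it.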

A quick solution I've come up with uses the LaTeX filecontents package and an extra wrapper for the citation function to automatically create a .bib file with citations for all the packages used in your analysis.

The wrapper function takes the package name and formats the BibTeX citation ready to be dropped straight into a .bib file.

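# Return the BibTeX entry for a package with a citation key inserted
getBibTex <- function(pkg.name) {
 bib.cite <- toBibtex(citation(pkg.name))
 key <- paste0("rPkg_", pkg.name) # key has the form rPkg_<package name>
 # The first line is expected to end in "{," (no key): drop the comma...
 temp.top <- substr(bib.cite[1], start = 1, 
   stop = (nchar(bib.cite[1]) - 1))
 # ...then append the key and put the comma back
 bib.cite[1]  <- paste0(temp.top, key, ",")
 return(bib.cite) 
}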

This means that every package has the key rPkg_[package name].
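One caveat: the wrapper only touches the first line, so it assumes citation() returns a single entry with an empty key. Some packages ship a CITATION file with several entries, or with keys already filled in. A rough sketch of a more defensive variant is below. The name getBibTexAll and the _2, _3 suffix scheme are my own inventions for illustration, not anything R provides.

```r
# Hypothetical variant: give every entry returned by citation() its own key.
getBibTexAll <- function(pkg.name) {
  bib.cite <- toBibtex(citation(pkg.name))
  heads <- grep("^@", bib.cite)          # first line of each entry
  keys <- paste0("rPkg_", pkg.name)      # first entry keeps the plain key
  if (length(heads) > 1) {
    # later entries get rPkg_<pkg>_2, rPkg_<pkg>_3, ...
    keys <- c(keys, paste0(keys, "_", seq_along(heads)[-1]))
  }
  # Keep "@Type{", discard any existing key, then add ours plus the comma.
  bib.cite[heads] <- paste0(sub("\\{.*$", "{", bib.cite[heads]), keys, ",")
  bib.cite
}

out <- getBibTexAll("stats")  # base package, so nothing needs installing
cat(as.character(out)[1], "\n")
```

For a package with a single entry this should give the same key as the original wrapper, so existing \autocite calls would not need to change.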

So here's the example document. It loads two packages then drops their citations into packages.bib. The \addbibresource{packages.bib} line then loads this file for use in citations. Note that you can also call \addbibresource with other .bib files at the same time, so you can still integrate the citations from your normal citation manager.

Finally, I make use of the ggplot2 package to plot some data and cite Hadley Wickham's excellent package.


\documentclass[a4paper,12pt]{article} 
\usepackage{filecontents}
\usepackage[backend=bibtex8]{biblatex} 
\begin{filecontents}{packages.bib}
<<definingPackageCitations, cache = FALSE, results='asis', echo = FALSE, warning=FALSE, message=FALSE>>=
package.vector <- c( "ggplot2", "xtable" ) # a vector of the packages used
loaded <- lapply(package.vector, require, quietly = TRUE, 
character.only = TRUE) # loading in the packages 

getBibTex <- function(pkg.name) {
 bib.cite <- toBibtex(citation(pkg.name))
 key <- paste0("rPkg_", pkg.name)
 temp.top <- substr(bib.cite[1], start = 1, 
  stop = (nchar(bib.cite[1]) - 1))
 bib.cite[1]  <- paste0(temp.top, key, ",")
 return(bib.cite) 
}

for(i in package.vector) {
print(getBibTex(i))
}
@
\end{filecontents}

\addbibresource{packages.bib} % Specifying the packages.bib file 

\begin{document}
<<aBeautifulPlot, cache = FALSE, results='asis', echo = FALSE, warning=FALSE, message=FALSE>>=
data(mtcars)
qplot(mpg, wt, data=mtcars)
@

This plot was created using the qplot function in the ggplot2 package \autocite{rPkg_ggplot2}.
\printbibliography
\end{document}
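If the filecontents route gives you trouble, the same .bib file could in principle be written straight from R instead. The sketch below is an alternative to the document above, not what it does: it repeats the wrapper so it runs on its own, uses base packages in place of ggplot2 and xtable, and writes to a temporary directory. The bib.file path is a placeholder you would swap for a location next to your .Rnw file.

```r
# Same wrapper as in the post, repeated so this sketch is self-contained.
getBibTex <- function(pkg.name) {
  bib.cite <- toBibtex(citation(pkg.name))
  key <- paste0("rPkg_", pkg.name)
  temp.top <- substr(bib.cite[1], start = 1, stop = nchar(bib.cite[1]) - 1)
  bib.cite[1] <- paste0(temp.top, key, ",")
  bib.cite
}

package.vector <- c("stats", "utils")             # base packages, for illustration
bib.file <- file.path(tempdir(), "packages.bib")  # placeholder output location

# One character vector holding every entry, with a blank line after each.
bib.lines <- unlist(lapply(package.vector, function(p) {
  c(as.character(getBibTex(p)), "")
}))
writeLines(bib.lines, bib.file)
```

The \addbibresource line would then point at wherever that file ends up.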



