Processing an XL grading sheet with R
This tutorial is dedicated to
Vincent and to Erwan, the former for proposing me an interesting challenge to solve with R and the latter for his suspicious remarks about the language not being suited for this kind of task.
Disclaimer: of course, this script can not be used on a computer with Windows OS… 😉
Problem description
When correcting students’ essays, some teachers prefer filling a worksheet with detailed grading for every question in the essay (I must confess that you must be a bit mental to do it but actually… I do it). Well, this kind of detailed grading of an essay can be directly stored in a XL (yeah…) file and the teacher does not write anything on the students’ essay. The file looks like this file NotesPartiel.xls, which is provided as an example (the marks are real but the names are not). It looks like this:
with the first row containing the questions’ numbers, the second, the questions’ full mark and then, the detailed grades for every student. The first two columns contain the students’ first and last names.
The challenge was to code an R script for creating a PDF file that contains a page with the name and detailed marks for every student. To do so, I used:
- a simple R script to manage the overall process: the script loads the XL file, loops over the rows and calls bash commands that creates separated PDF files and merges then into a single PDF file;
- a Sweave file which is used to insert expression evaluated by R and coming for the XL file into a LaTeX document.
Solution part 1: Rscript
The R script first contains a customizable part in which the user can define:
-
the file name
myfile
; -
the questions’ names, which is the character vector
questions
################################# Vincent : to be customized # nom du fichier myfile = "NotesPartiel.xls" # questions questions = c("exo 1", "exo 1, question a", "exo 1, question b", "exo 2", "exo 3, question a", "exo 3, question b", "exo 3, question c", "exo 4", "exo 5", "exo 3 hypothèse", "exo 3") ###############################################################################
To load the data, the R package gdata is then used:
# load data library(gdata) notes = read.xls(myfile, sheet=1, header=TRUE)
Then, a loop is performed over the file’s rows (starting from the second one up to the last one) which:
- check that the current student has a mark (in the other case, nothing happens);
-
use the R package knitr to compile the Sweave document which creates a LaTeX (
.tex
) file; -
using the function
system
run bash commands to compile the LaTeX file with pdflatex and rename it with a name that contains the number of the current row;
library(knitr) # run... ! first line is the total toBeMerged = NULL for (ind in 2:nrow(notes)) { if (sum(!is.na(notes[ind,3:(ncol(notes)-3)]))!=0) { # Sweave the document... knit("TraiterNotes.Rnw") # pdflatex system("pdflatex TraiterNotes.tex") # Move the output system(paste0("mv TraiterNotes.pdf TraiterNotes-",ind,".pdf")) if (is.null(toBeMerged)) { toBeMerged = paste0("TraiterNotes-",ind,".pdf") } else toBeMerged = paste0(toBeMerged, " TraiterNotes-", ind, ".pdf") } }
Finally, all the PDF files that have been created (and those names have been merged during the processing of the previous loop into a string variable called
toBeMerged
) are merged with ghostscript and all PDF files except the merged file are deleted.# Merge system(paste0("gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=finalNotes.pdf ", toBeMerged)) # Clean system(paste0("rm ", toBeMerged))
Solution part 2: Sweave file
The Sweave file
TraiterNotes.Rnw
is very simple. It contains a standard LaTeX header, which can be customized by the user:\documentclass{article} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Vincent : to be customized \title{Notes du devoir trucmuche} \author{Vincent X} \date{} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \usepackage[utf8]{inputenc} \begin{document} \pagestyle{empty}
The remaining of the script contains standard R code inserted in chunks which allows the script to display a table with the detailed marks (using the R package xtable)
and simple texts evaluated by R thanks to the functionSexpr
. The global knitr optionecho=FALSE
is used to prevent the R script to the displayed in the PDF file.Finally, running the R script (from the directory in which the
</div>.Rmd
and the.xls
files are saved) leads to the following PDF file finalNotes.pdf. Not very elegant programming but very handy… Vincent owes me 75 beers for that work!