Skip to content

Count models in JAGS

Looks like I’ll be diving into some Bayesian analyses using JAGS. This post is primarily intended as a collection of links to [potentially] useful information, but also includes a few initial thoughts (I might update it occasionally with new links).

In terms of R packages, a very brief play suggests that R2jags is more user friendly than rjags (seems to be easier to get hold of the output and chains etc). Both packages seem to be missing good vignettes. The latter misses good examples too (although there are quite a few examples for both on the internet).

A tip from Zuur & Ieno 2016: use a model matrix and assign priors based on the number of columns of the model matrix. Saves having to change the model description when adding/removing variables (how to include different priors for different betas though…?). E.g. (modified from Zuur & Ieno 2016)

X <- model.matrix(~x1+x2+x3)
k <- ncol(X)
mod <- "
model {
 # priors
 for (i in 1:k) { 
      b[i] <- dnorm(0, 0.1) }
 # likelihood
 for (i in 1:N) {
      Y[i] ~ dpois(mu[i])      
      log(mu[i]) <- inprod(b[], X[i,])
 } 
...
"

General links:

Guide to JAGS through R, goes into quite a bit of detail on installing, diagnostics, even running JAGS from the terminal (as well as 2 or 3 three of the R packages for interfacing with JAGS).

Bayesian regression example (wiki style).

Advanced BUGS topics (includes the “zero-trick” to fit any distribution in JAGS)

Newest R2JAGS questions on StackOverflow

 

Count models:

A post with a couple of examples for JAGS count models (including offsets)

A post with a question regarding autocorrelation of parameters in negative binomial models and the answer on how to get around it.

A report using some data on owl feeding behaviour using 3 or 4 methods for ZIP/ZINBs. (This is not JAGS stuff, but interesting for ZIPs/ZINBs)

 

Books:

A book website with lots of code (Bayesian methods in health economics by Gianluca Baio). But see this R-bloggers post which mentions that an update broke some of the code, and details how to fix it. No idea if the book is any good, but the code should/could be useful

As with their other books*, HighStat’s “beginners guide” to ZIP models is very well written and easy to follow. Because mixed effects zero inflated methods are not implemented in standard software, Zuur and Ieno use JAGS quite a lot. Chapter 12 gives two methods (and code) for assessing overdispersion.

Zuur & Ieno (2016) A beginners guide to zero inflated models using R.

 

*well, the only other one I’ve read (the fantastic “penguin book” – Mixed Effects Models and Extensions in Ecology with R)

Advertisements

Emails from R

There are a few packages for sending email directly from R, but I work in a place where none of these work due to strict network settings. To at least partially circumvent this, here’s some code to produce a PowerShell script to send email(s) via Outlook. The PowerShell script can then be run either by a shell call (again, not possible in my workplace) or by right clicking the file and selecting run with PowerShell.

 

# Addresses 
add <- c("xxx@yyy.cc", "aaa@bbb.cc")
subject <- "Testing"

# construct message
# opening
start <- 'Hi, 
how are you?
'

# main content
body <- '
sent almost exclusively from R 
'

# signature
sig <- '
And this is my signature
'

# paste components together
Body <- paste(start, body, sig)

# construct PowerShell code (*.ps1)
email <- function(recip, subject, Body, filename, attachment = NULL, append){
  
  file <- paste0(filename, ".ps1")
  
  write('$Outlook = New-Object -ComObject Outlook.Application', file, append = append)
  write('$Mail = $Outlook.CreateItem(0)', file, append = TRUE)
  write(paste0('$Mail.To = "', recip, '"'), file, append = TRUE)
  write(paste0('$Mail.Subject = "', subject, '"'), file, append = TRUE)
  write(paste0('$Mail.Body = "', Body, '"'), file, append = TRUE)
  if(!is.null(attachment)){
    write(paste0('$File = "', attachment, '"'), file, append = TRUE)
    write('$Mail.Attachments.Add($File)', file, append = TRUE)
  }
  write('$Mail.Send()', file, append = TRUE)
  if(append) write('', file, append = TRUE)
}


for(i in 1:length(add)){
  file <- paste0("email", i, ".ps1")
  att <- file.path(getwd(), "blabla.txt")
  email(add[i], subject, Body, file, attachment = att) # with attachment
  # email(add[i], subject, Body, file)                 # without attachment
  # email(add[i], subject, Body, file, append = TRUE)  # multiple emails in a single PS file 
}

 

Now you can go and run the PowerShell script from within windows explorer.

 

<UPDATE> The clue to this solving this came from here

“Pretty” table columns 2

I recently wrote a few functions for making some common column formats and posted them here. I’ve recently added them to github gist so that you can download them straight into R using

source("https://gist.githubusercontent.com/aghaynes/7fd9d1c52b1cc4cb566a/raw/")

I changed a couple of the function names though and added an argument to specify_decimal so that you can provide a minimum value. The function arguments are now:

specify_decimal(x, k, min)
npropconv(n, perc, dp.n, dp.perc)
valciconv(val, CIlower, CIupper, dp.val, dp.CI)
minval(values, min)

minval is the new name for minp.

Hope thats useful…

 

 

“Pretty” table columns

Every now and then you might want to make a nice table to include directly in your documents without having to faff about with columns later in excel or word. Typical issues might be the specification of decimal places, converting a value and proportion/SE column into one to take the form of n (x) or a value and CIs into x (x_min – x_max). I needed to do these recently and wrote the following functions to make doing it a bit faster and more productive…

specify_decimal <- function(x, dp){
 out <- format(round(x, dp), nsmall=dp)
 out <- gsub(" ", "", out)
 out
 }

npropconv <- function(n, perc, dp.n=0, dp.perc=1){
 n <- specify_decimal(n, dp.n)
 perc <- specify_decimal(perc, dp.perc)
 OUT <- paste(n, " (", perc, ")", sep="")
 OUT
 }
valciconv <- function(val, CIlower, CIupper, dp.val=2, dp.CI=2){
 val <- specify_decimal(val, dp.val)
 CIlower <- specify_decimal(CIlower, dp.CI)
 CIupper <- specify_decimal(CIupper, dp.CI)
 OUT <- paste(val, " (", CIlower, " - ", CIupper, ")", sep="")
 OUT
 }
minp <- function(pvalues, min=0.001){
  pv <- as.numeric(pvalues)
  for(i in 1:length(pv)){
    if(pv[i] < min) pvalues[i] <- paste("<", min)
  }
  pvalues
}

And heres what they do…

> specify_decimal(x=c(0.01, 0.000001), dp=3)
[1] "0.010" "0.000"
> npropconv(n=7, perc=5, dp.n=0, dp.perc=1)
[1] "7 (5.0)"
> valciconv(val=7, CIlower=3, CIupper=9, dp.val=2, dp.CI=2)
[1] "7.00 (3.00 - 9.00)"
> minp(0.00002, min=0.05)
[1] "< 0.05"
> minp(0.00002, min=0.001)
[1] "< 0.001"

Any arguments with beginning with dp specify the number of decimal places for the relevant parameter (i.e. dp.n sets the decimal places for the n parameter). It would make sence to follow specify_decimal with a call to minp to deal with the 0s:

> decim <- specify_decimal(c(0.01, 0.000001),3)
> minp(decim, 0.001)
[1] "0.010"   "< 0.001"

 

Incidently, although I had p values in mind for minp, it can of course be used with any value at all!

Hope someone finds them handy!!!

Calculating confidence intervals for proportions

Heres a couple of functions for calculating the confidence intervals for proportions.

UPDATE: These confidence intervals, together with many more, have actually been programmed in the binom package (binom.confint). Use them instead. For Stata users, CIs from proportions are available with the ci program.

Firstly I give you the Simple Asymtotic Method:

simpasym <- function(n, p, z=1.96, cc=TRUE){
  out <- list()
  if(cc){
    out$lb <- p - z*sqrt((p*(1-p))/n) - 0.5/n
    out$ub <- p + z*sqrt((p*(1-p))/n) + 0.5/n
  } else {
    out$lb <- p - z*sqrt((p*(1-p))/n)
    out$ub <- p + z*sqrt((p*(1-p))/n)
  }
  out
}

which can be used thusly….

simpasym(n=30, p=0.3, z=1.96, cc=TRUE)
$lb
[1] 0.119348

$ub
[1] 0.480652

 

Where n is the sample size, p is the proportion, z is the z value for the % interval (i.e. 1.96 provides the 95% CI) and cc is whether a continuity correction should be applied. The returned results are the lower boundary ($lb) and the upper boundary ($ub).

The second method is the Score method and is define as follows:

scoreint <- function(n, p, z=1.96, cc=TRUE){
  out <- list()
  q <- 1-p
  zsq <- z^2
  denom <- (2*(n+zsq))
  if(cc){ 
    numl <- (2*n*p)+zsq-1-(z*sqrt(zsq-2-(1/n)+4*p*((n*q)+1)))
    numu <- (2*n*p)+zsq+1+(z*sqrt(zsq+2-(1/n)+4*p*((n*q)-1)))
    out$lb <- numl/denom
    out$ub <- numu/denom
    if(p==1) out$ub <- 1
    if(p==0) out$lb <- 0
  } else {
    out$lb <- ((2*n*p)+zsq-(z*sqrt(zsq+(4*n*p*q))))/denom
    out$ub <- ((2*n*p)+zsq+(z*sqrt(zsq+(4*n*p*q))))/denom
  }
  out
}

and is used in the same manner as simpasym…

scoreint(n=30, p=0.3, z=1.96, cc=TRUE)
$lb
[1] 0.1541262

$ub
[1] 0.4955824

These formulae (and a couple of others) are discussed in Newcombe, R. G. (1998) who suggests that the score method should be more frequently available in statistical software packages.

Hope that help someone!!!

Reference:

Newcombe, R. G. (1998) Two-sided confidence intervals for the single proportion: comparison of seven methods. Statist. Med., 17: 857-872. doi: 10.1002/(SICI)1097-0258(19980430)17:8<857::AID-SIM777>3.0.C

 

Cropping figures in LaTeX

Sometimes when you include a figure, you dont want to have to edit the picture file itself. In MS Word for instance you can use the crop command. LaTeX being LaTeX doesnt work quite the same way, but it is still possible. The standard way is to use the trim and clip options to \includegraphics

\begin{figure}
\includegraphics[trim= left bottom right top, clip=true]{file.jpeg}
\end{figure}

This method doesnt always work however. For instance, if you use xelatex, like me.

There is a way around this though – the adjustbox package. It includes, among other handy functions, the function \adjincludegraphics which does the trick. One of the best things about it is that it even seems to use the same coding, so no need to change all of the argument names…just add adj to the function call.

\usepackage{adjustbox}
...
\begin{figure}
\adjincludegraphics[trim= left bottom right top, clip=true]{file.jpeg}
\end{figure}

I think that its even possible to have it export its settings so that \includegraphics is recoded to do the same as \adjincludegraphics. I’ve not yet tried that though. (It should be just a case of using \usepackage[Export]{adjustbox} instead of \usepackage{adjustbox})

Changing figure or table number styles in LaTeX

Changing how LaTeX prints table or figure numbers is surprisingly easy to do. Say you have a figure and rather that the figure being labelled as Figure 4 you want it to be S1 because its a supplementary figure.

\setcounter{figure}{0}
\makeatletter 
\renewcommand{\thefigure}{S\arabic{figure}}

So we have reset the figure counter to zero and have told LaTeX to redefine \thefigure to have an S before the figure number. Easy huh!?!

For documents with chapters it is almost as easy to do:

\setcounter{figure}{0}
\makeatletter 
\renewcommand{\thefigure}{\arabic{chapter}.S\arabic{figure}}

To change tables instead of figures, just swap any instances of figure for table. Piece of cake.

Do remember to reset it though, otherwise your next chapter continues to use the changes you’ve made!!