\documentclass{article}

\usepackage{amsmath}
\usepackage{fullpage}
\usepackage{url}
\usepackage{xspace}
\newcommand{\R}{\texttt{R}\xspace}

\title{Nonparametric statistics}
\author{Patrick Breheny}
\date{November 10, 2016}

<<include=FALSE, purl=FALSE>>=
require(knitr)
opts_chunk$set(prompt = TRUE)
opts_chunk$set(comment = NA)
opts_template$set(fig = list(fig.height=4, fig.width=5, out.width='.5\\linewidth', fig.align='center'))
@

\begin{document}
\maketitle

Today's lab is about nonparametric statistics.  We will discuss the use of \R's built-in function, {\tt wilcox.test}, for carrying out the Wilcoxon rank sum test, then do some programming and create our own permutation test and construct a bootstrap confidence interval.

\section{{\tt wilcox.test}}

Let's begin by using a built-in function in \R for nonparametric testing.  To illustrate its use, let's work with the lead smelter data we've discussed in class.  The primary syntax of the function is essentially identical to that of {\tt t.test}:

<<wilcox.test___Lead_IQ_data>>=
lead <- read.delim("http://myweb.uiowa.edu/pbreheny/data/lead-iq.txt")
t.test(IQ~Smelter, lead)
wilcox.test(IQ~Smelter, lead)
x <- with(lead, IQ[Smelter=="Near"])
y <- with(lead, IQ[Smelter=="Far"])
wilcox.test(x, y) ## Same thing
@

In this particular case, as you may recall, the data is quite normal-looking, and the agreement between the parametric and nonparametric test is not surprising.  However, in the presence of outliers, the two might give very different results.  Let's see what happens if we add a super-genius to the ``near smelter'' group:

<<Adding_an_outlier>>=
x <- c(x, 210)
t.test(x, y)$p.value
wilcox.test(x, y)$p.value
@

As we mentioned in class, the Wilcoxon rank sum test is a permutation test.  In general, exact solutions to permutation tests are quite time-consuming to calculate, although there are some very clever algorithms for speeding things up when you're considering sums of permuted consecutive positive integers (i.e., what you're doing in a rank sum test).  Now, this won't actually work for the lead smelter data because there are ties -- eight kids, for example, have an IQ of 96 -- so the ranks aren't consecutive integers anymore.  So unfortunately, even if you ask for an exact test, {\tt wilcox.test} can't give you one:

<<Exact_method___Problem_with_ties>>=
wilcox.test(IQ~Smelter, lead, exact=TRUE)
@

For the sake of illustration, though, let's just remove the ties by adding a small amount of random noise to all the observations (this is known as ``jittering''):

<<Exact_vs_approximate>>=
lead2 <- lead
lead2$IQ <- lead2$IQ + runif(nrow(lead2), -0.1, 0.1)
wilcox.test(IQ~Smelter, lead2, exact=TRUE)     ## Exact result
wilcox.test(IQ~Smelter, lead2)                 ## Approximation 1
wilcox.test(IQ~Smelter, lead2, correct=FALSE)  ## Approximation 2
@

In this case, with $n=124$, both approximations are very accurate.  At the same time, it didn't take very long to compute the exact result, so there is little reason in this case to prefer the appoximation.  The computational burden increases dramatically, however, with larger sample sizes.  For example, with $n=400$, the exact result takes thousands of times longer to compute than the approximation:

<<Exact_vs_approximate___Time>>=
x <- runif(200)
y <- runif(200)
system.time(wilcox.test(x, y, exact=TRUE))
system.time(wilcox.test(x, y))
@

Finally, as we discussed in class, one can invert the Wilcoxon rank sum test to obtain a semiparametric confidence interval for the difference in location for two groups:

<<Semiparametric_interval>>=
x <- with(lead, IQ[Smelter=="Near"])
y <- with(lead, IQ[Smelter=="Far"])
wilcox.test(x, y, mu=3) ## This is the test
wilcox.test(x, y+3)     ## we are inverting
wilcox.test(x, y, conf.int=TRUE)
@

It is again worth noting that this interval is quite similar to the confidence interval we get from the $t$ test for the difference in mean IQ between the two groups.

\section{The Wilcoxon signed rank test}

In class, we discussed the idea of nonparametric testing for two-sample studies, but what about one-sample studies like our cystic fibrosis crossover experiment?  The idea of conditioning on the observed responses and putting them into the same urn doesn't apply here.  How can we construct a nonparametric test in the one-sample case?

Frank Wilcoxon had an idea for this situation as well, and his idea was this: consider instead conditioning on the {\em absolute values} of the observations.  Under the null, these values are just as likely to be positive as negative (i.e., a patient is just as likely to do better on the drug as on the placebo), so we can randomly assign each observation a $+/-$ sign under the null.  Let's implement this idea in \R:

<<Do-it-yourself_signed_rank_test, opts.label="fig">>=
cf <- read.delim("http://myweb.uiowa.edu/pbreheny/data/cystic-fibrosis.txt")
Diff <- apply(cf, 1, diff)
r <- rank(abs(Diff))  ## Absolute ranks
Obs <- sum(r[Diff>0]) ## Sum of positive ranks
n <- nrow(cf)
N <- 10000
ts <- numeric(N)
for (i in 1:N) {
  s <- rbinom(n, 1, 0.5) ## Random signs
  ts[i] <- sum(r[s==1])
}
hist(ts, col="gray", border="white", xlim=c(0,105), breaks=0:105,
     main="", xlab="Test statistic")
v <- c(mean(ts) - (Obs-mean(ts)), Obs)
abline(v=v, col="slateblue", lwd=3)
p <- sum(ts <= v[1] | ts >= v[2])/N
mtext(paste("p =",p), line=1)
@

In this particular case, the sample size is small enough where we can work out the exact test ourselves:

<<Do-it-yourself_signed_rank_test__exact, results='hide'>>=
l <- rep(list(0:1), n)
X <- as.matrix(expand.grid(l))
N <- nrow(X)
ts <- numeric(N)
pb <- txtProgressBar(1, N, style=3)
for (i in 1:N) {
  ts[i] <- sum(r[X[i,]==1])
  setTxtProgressBar(pb, i)
}
@

<<Exact_signed_rank_test__Figure, opts.label="fig">>=
tab <- table(ts)
obs <- as.character(Obs)
v <- as.numeric(names(tab)[tab==tab[obs]])
hist(ts, col="gray", border="white", xlim=c(0,105), breaks=0:105,
     main="", xlab="Test statistic")
abline(v=v, col="slateblue", lwd=3)
p <- sum(tab[tab <= tab[obs]])/N
mtext(paste("p =", round(p, 5)), line=1)
@

The purpose of the above programming was mainly pedagogical, since the {\tt wilcox.test} function can be used for Wilcoxon signed rank tests as well:

<<wilcox.test___Signed>>=
wilcox.test(Diff)
@

Note that we obtain exactly the same $p$-value from {\tt wilcox.test} that we got from our do-it-yourself version, which is reassuring.

As with the rank sum test, the signed rank test can be inverted to form a semiparametric confidence interval:

<<wilcox.test___Confidence_interval>>=
wilcox.test(Diff, mu=4) ## Test of location
wilcox.test(Diff-4)     ## Same thing
wilcox.test(Diff, conf.int=TRUE)
@

What is this a confidence interval for?  As the output indicates, it's for something called the ``pseudomedian''.  OK, but what if we want a confidence interval for the actual median?  This is a perfect time to use the bootstrap:

<<Bootstrapping_the_median>>=
B <- 10000
b <- numeric(B)
for (i in 1:B) {
  x <- sample(Diff, replace=TRUE)
  b[i] <- median(x)
}
quantile(b, c(.025, .975))                ## Percentile interval
median(Diff) + qnorm(c(.025, .975))*sd(b) ## "z" interval
@

Generally, the bootstrap percentile interval is superior to the bootstrap $z$ interval.  It is possible to construct other, even more accurate bootstrap confidence intervals as well, although such intervals lie beyond the scope of this course.  If you are interested, however, there is a nice \R package, {\tt boot}, that will construct such intervals (known as $BC_a$ intervals) for you.

\end{document}