This is an initial attempt to enable easy calculation and visualization of study designs from R/gap, which has been benchmarked against relevant publications; eventually the app may produce more generic results.
One can run the app with R/gap installation as follows,
Alternatively, one can run the app from source using gap/inst/shinygap. In fact, these are conveniently wrapped up as the runshinygap() function.
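As a hedged illustration of the two routes described above, the sketch below locates the app directory in an installed copy of gap and launches it with Shiny. It assumes the installed package keeps the app in a top-level shinygap directory (the installed counterpart of gap/inst/shinygap); the name app_dir is introduced here for illustration only.

```r
# Sketch only: locate the Shiny app shipped with an installed copy of gap.
# Assumption: the app sits in the "shinygap" directory of the installed
# package, mirroring gap/inst/shinygap in the source tree.
app_dir <- system.file("shinygap", package = "gap")  # "" if gap is not installed

# Launch only in an interactive session, and only if the app was found.
if (interactive() && nzchar(app_dir) && requireNamespace("shiny", quietly = TRUE)) {
  shiny::runApp(app_dir)
  # The convenience wrapper named in the text would be called as:
  # gap::runshinygap()
}
```

When working from a source checkout instead, the same runApp() call can be pointed at the gap/inst/shinygap directory.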
To set the default parameters, some compromises need to be made, e.g., Kp=[1e-5, 0.4], MAF=[1e-3, 0.8], alpha=[1e-8, 0.05], beta=[0.01, 0.4]. The slider inputs provide upper bounds of parameters.
This is a call to fbsize().
This is a call to pbsize().
This is a call to ccsize(), whose power argument indicates power (TRUE) or sample size (FALSE) calculation.
We implement it in a function whose format is
tscc(model, GRR, p1, n1, n2, M, alpha.genome, pi.samples, pi.markers, K)
which requires specification of the disease model (multiplicative, additive, dominant, recessive), the genotypic relative risk (GRR), the estimated risk allele frequency in cases (p1), the total number of cases (n1), the total number of controls (n2), the total number of markers (M), the false positive rate at genome level (αgenome), the proportion of samples to be genotyped at stage 1 (πsamples), the proportion of markers to be selected (πmarkers, also used as the false positive rate at stage 1) and the population prevalence (K).
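A call following the format above might look like the sketch below. The numeric values are illustrative assumptions rather than package defaults or recommendations, and tscc_args is a name introduced here only to keep the arguments readable. The call is skipped when gap is not installed.

```r
# Illustrative inputs only (assumed values, not defaults from gap).
tscc_args <- list(
  model        = "multiplicative",  # disease model
  GRR          = 1.4,               # genotypic relative risk
  p1           = 0.4,               # risk allele frequency in cases
  n1           = 1000,              # total number of cases
  n2           = 1000,              # total number of controls
  M            = 300000,            # total number of markers
  alpha.genome = 0.05,              # false positive rate at genome level
  pi.samples   = 0.2,               # proportion of samples genotyped at stage 1
  pi.markers   = 0.1,               # proportion of markers selected at stage 1
  K            = 0.1                # population prevalence
)

# Run the calculation only if gap is available.
if (requireNamespace("gap", quietly = TRUE)) {
  result <- do.call(gap::tscc, tscc_args)
  str(result)
}
```

Passing the arguments by name keeps the call aligned with the signature even if the reader reorders the list.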
This is detailed in the package vignettes gap, https://cran.r-project.org/package=gap, or jss¹.
Our implementation covers two aspects², power and sample size.
The power is $$\Phi\left(Z_\alpha+\tilde{n}^\frac{1}{2}\theta\sqrt{\frac{p_1p_2p_D}{q+(1-q)p_D}}\right)$$ where $\alpha$ is the significance level, $\theta$ is the log-hazard ratio for two groups, $p_j$, $j = 1, 2$, are the proportions of the two groups in the population ($p_1 + p_2 = 1$), $\tilde{n}$ is the total number of subjects in the subcohort, $p_D$ is the proportion of failures in the full cohort, and $q$ is the sampling fraction of the subcohort.
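The power expression above can be evaluated directly in base R. The following is a minimal sketch, not the ccsize() implementation; the function name cc_power and its argument names are assumptions made for illustration, and $Z_\alpha$ is read as the lower-tail $\alpha$ quantile of the standard normal distribution (so the test is one-sided with $\theta > 0$).

```r
# Sketch of the case-cohort power expression (hypothetical helper, base R only).
# n_sub: subcohort size, q: sampling fraction, pD: failure proportion,
# p1: proportion in group 1, theta: log-hazard ratio, alpha: significance level.
cc_power <- function(n_sub, q, pD, p1, theta, alpha) {
  p2 <- 1 - p1
  z_alpha <- qnorm(alpha)  # lower-tail quantile, negative for small alpha
  pnorm(z_alpha + sqrt(n_sub) * theta *
          sqrt(p1 * p2 * pD / (q + (1 - q) * pD)))
}

# Illustrative values (assumed): power grows with the subcohort size.
cc_power(n_sub = c(50, 100, 200), q = 0.1, pD = 0.1, p1 = 0.3,
         theta = log(2), alpha = 0.05)
```

With $\theta = 0$ the expression reduces to $\Phi(Z_\alpha) = \alpha$, which gives a quick sanity check on any implementation.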
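The subcohort sample size is $$\tilde{n}=\frac{nBp_D}{n-B(1-p_D)}$$ where $B=\frac{(Z_{1-\alpha}+Z_{1-\beta})^2}{\theta^2p_1p_2p_D}$, $\beta$ is the type II error rate, and $n$ is the whole cohort size. This follows from setting the power expression equal to $1-\beta$ with $q=\tilde{n}/n$ and solving for $\tilde{n}$.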
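The sample size expression can be sketched in base R in the same way. This is not the ccsize() implementation. It assumes that $B$ carries the squared sum of the upper $\alpha$ and upper $\beta$ normal quantiles, which is what solving the power expression for $\tilde{n}$ with $q=\tilde{n}/n$ requires. The function name cc_subcohort_size is hypothetical.

```r
# Sketch of the subcohort size for a target power of 1 - beta (base R only).
# n: whole cohort size, pD: failure proportion, p1: proportion in group 1,
# theta: log-hazard ratio (> 0), alpha: significance level, beta: type II error.
cc_subcohort_size <- function(n, pD, p1, theta, alpha, beta) {
  p2 <- 1 - p1
  # Assumption: squared sum of upper-tail quantiles in the numerator of B.
  B <- (qnorm(1 - alpha) + qnorm(1 - beta))^2 / (theta^2 * p1 * p2 * pD)
  if (B > n) return(NA_real_)  # even the full cohort cannot reach the target
  n * B * pD / (n - B * (1 - pD))
}

# Illustrative values (assumed).
n <- 5000; pD <- 0.1; p1 <- 0.3; theta <- log(2); alpha <- 0.05; beta <- 0.2
n_sub <- cc_subcohort_size(n, pD, p1, theta, alpha, beta)

# Round trip: plugging n_sub and q = n_sub / n into the power expression
# should recover 1 - beta.
q <- n_sub / n
achieved <- pnorm(qnorm(alpha) + sqrt(n_sub) * theta *
                    sqrt(p1 * (1 - p1) * pD / (q + (1 - q) * pD)))
```

The helper returns NA when $B > n$, since the formula would then ask for a subcohort larger than the cohort itself.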
Tests of allele frequency differences between cases and controls in a two-stage design are described here³. The usual test of proportions can be written as $$z(p_1,p_2,n_1,n_2,\pi_{samples})=\frac{p_1-p_2}{\sqrt{\frac{p_1(1-p_1)}{2n_1\pi_{samples}}+\frac{p_2(1-p_2)}{2n_2\pi_{samples}}}}$$ where $p_1$ and $p_2$ are the allele frequencies and $n_1$ and $n_2$ the sample sizes in cases and controls, and $\pi_{samples}$ is the proportion of samples to be genotyped at stage 1. The test statistics for stage 1, for stage 2 as replication, and for stages 1 and 2 in a joint analysis are then $z_1 = z(\hat{p}_1, \hat{p}_2, n_1, n_2, \pi_{samples})$, $z_2 = z(\hat{p}_1, \hat{p}_2, n_1, n_2, 1 - \pi_{samples})$, and $z_j = \sqrt{\pi_{samples}}z_1+\sqrt{1-\pi_{samples}}z_2$, respectively. Let $C_1$, $C_2$, and $C_j$ be the thresholds for these statistics; the false positive rates can then be obtained according to $P(|z_1| > C_1)P(|z_2| > C_2, \mathrm{sign}(z_1) = \mathrm{sign}(z_2))$ and $P(|z_1| > C_1)P(|z_j| > C_j \mid |z_1| > C_1)$ for replication-based and joint analyses, respectively.
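The three statistics can be computed with a few lines of base R. The sketch below is illustrative only and is not taken from tscc(). The function name z_prop and all numeric inputs are assumptions, and the stage 1 and stage 2 allele frequency estimates are taken to come from the respective subsets of samples.

```r
# Test of proportions on a fraction pi_samples of the cases and controls
# (hypothetical helper; 2 * n alleles per group of n individuals).
z_prop <- function(p1, p2, n1, n2, pi_samples) {
  se <- sqrt(p1 * (1 - p1) / (2 * n1 * pi_samples) +
             p2 * (1 - p2) / (2 * n2 * pi_samples))
  (p1 - p2) / se
}

# Illustrative inputs (assumed values).
n1 <- 1000; n2 <- 1000; pi_samples <- 0.3

z1 <- z_prop(0.32, 0.28, n1, n2, pi_samples)      # stage 1
z2 <- z_prop(0.31, 0.28, n1, n2, 1 - pi_samples)  # stage 2 (replication)
zj <- sqrt(pi_samples) * z1 + sqrt(1 - pi_samples) * z2  # joint analysis

# Consistency check: with identical estimates in both stages, the joint
# statistic equals the single-stage statistic on all samples.
z_same <- sqrt(pi_samples) * z_prop(0.30, 0.25, n1, n2, pi_samples) +
  sqrt(1 - pi_samples) * z_prop(0.30, 0.25, n1, n2, 1 - pi_samples)
z_full <- z_prop(0.30, 0.25, n1, n2, 1)
```

The weights $\sqrt{\pi_{samples}}$ and $\sqrt{1-\pi_{samples}}$ make the joint statistic reduce to the full-sample test when both stages yield the same estimates, which is what the last two lines check.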