| Field | Value |
|---|---|
| Title | Penalised Regression for Dichotomised Outcomes |
| Description | Implements lasso and ridge regression for dichotomised outcomes (<doi:10.1080/02664763.2023.2233057>), i.e., numerical outcomes that were transformed to binary outcomes. Such artificial binary outcomes indicate whether an underlying measurement is greater than a threshold. |
| Authors | Armin Rauschenberger [aut, cre] |
| Maintainer | Armin Rauschenberger <[email protected]> |
| License | GPL-3 |
| Version | 1.0.0 |
| Built | 2024-10-26 05:57:37 UTC |
| Source | https://github.com/rauschenberger/cornet |
Verifies whether an argument matches formal requirements.
.check( x, type, dim = NULL, miss = FALSE, min = NULL, max = NULL, values = NULL, inf = FALSE, null = FALSE )
| Argument | Description |
|---|---|
| x | argument |
| type | character |
| dim | vector/matrix dimensionality: integer scalar/vector |
| miss | accept missing values: logical |
| min | lower limit: numeric |
| max | upper limit: numeric |
| values | only accept specific values: vector |
| inf | accept infinite values (-Inf/Inf): logical |
| null | accept NULL: logical |
cornet:::.check(0.5,type="scalar",min=0,max=1)
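For contrast, a hedged sketch of a failing check; it assumes, as the name suggests, that .check signals an error when a requirement is violated, and uses tryCatch to capture the message:
# Sketch: a value outside the allowed range should be rejected;
# tryCatch captures the error that .check presumably raises.
tryCatch(cornet:::.check(1.5,type="scalar",min=0,max=1),
         error=function(e) conditionMessage(e))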
Verifies whether two or more arguments are identical.
.equal(..., na.rm = FALSE)
| Argument | Description |
|---|---|
| ... | scalars, vectors, or matrices of equal dimensions |
| na.rm | remove missing values: logical |
cornet:::.equal(1,1,1)
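A hedged illustration with missing values; it assumes .equal returns a logical and that na.rm=TRUE drops the missing values before the comparison:
# Sketch: compare vectors that contain missing values;
# na.rm=TRUE presumably removes the NAs before checking equality.
a <- c(1,2,NA)
b <- c(1,2,NA)
cornet:::.equal(a,b,na.rm=TRUE)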
Simulates data for unit tests.
.simulate(n, p, cor = 0, prob = 0.1, sd = 1, exp = 1, frac = 1)
| Argument | Description |
|---|---|
| n | sample size: positive integer |
| p | covariate space: positive integer |
| cor | correlation coefficient: numeric between 0 and 1 |
| prob | effect proportion: numeric between 0 and 1 |
| sd | standard deviation: positive numeric |
| exp | exponent: positive numeric |
| frac | class proportion: numeric between 0 and 1 |
For simulating correlated features (cor greater than 0), this function requires the R package MASS (see mvrnorm).
Returns an invisible list with the elements y and X.
data <- cornet:::.simulate(n=10,p=20)
names(data)
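As noted above, simulating correlated features requires MASS; a brief sketch (the value cor=0.5 is an arbitrary illustration):
# Sketch: simulate correlated features (requires the MASS package).
if(requireNamespace("MASS",quietly=TRUE)){
  data <- cornet:::.simulate(n=10,p=20,cor=0.5)
  cor(data$X[,1],data$X[,2])
}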
Compares models for a continuous response with a cut-off value.
.test(y, cutoff, X, alpha = 1, type.measure = "deviance")
| Argument | Description |
|---|---|
| y | continuous outcome: vector of length n |
| cutoff | cut-off point for dichotomising the outcome into classes: meaningful value between min(y) and max(y) |
| X | features: numeric matrix with n rows (samples) and p columns (variables) |
| alpha | elastic net mixing parameter: numeric between 0 (ridge) and 1 (lasso) |
| type.measure | loss function for binary classification: character "deviance", "class", "mse" or "mae" (see cv.glmnet) |
Splits the samples into a training set and a testing set, calculates the squared deviance residuals of logistic and combined regression, conducts the paired one-sided Wilcoxon signed-rank test, and returns the p-value. For the multi-split test, use the median p-value from multiple single-split tests (van de Wiel 2009).
n <- 100; p <- 200
y <- rnorm(n)
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
cornet:::.test(y=y,cutoff=0,X=X)
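A hedged sketch of the multi-split test mentioned above, reusing y and X from the example; the number of splits (25) is an illustrative choice, and the sketch assumes .test returns a single numeric p-value and performs a fresh random split on each call:
# Sketch: repeat the single-split test and take the median p-value
# (multi-split test, van de Wiel 2009).
set.seed(1)
pvals <- replicate(25,cornet:::.test(y=y,cutoff=0,X=X))
median(pvals)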
Extracts estimated coefficients from linear and logistic regression, under the penalty parameter that minimises the cross-validated loss.
## S3 method for class 'cornet' coef(object, ...)
| Argument | Description |
|---|---|
| object | cornet object |
| ... | further arguments (not applicable) |
This function returns a matrix with two columns and one row per coefficient. It includes the estimated coefficients from linear regression (1st column: "beta") and logistic regression (2nd column: "gamma").
n <- 100; p <- 200
y <- rnorm(n)
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
net <- cornet(y=y,cutoff=0,X=X)
coef(net)
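Building on the example above, a short sketch that compares which coefficients are non-zero in the linear fit (1st column, "beta") and the logistic fit (2nd column, "gamma"):
# Sketch: compare the selected (non-zero) coefficients of both models.
cf <- coef(net)
colSums(cf!=0)      # number of non-zero coefficients per column
which(cf[,1]!=0)    # coefficients selected by the linear model ("beta")
which(cf[,2]!=0)    # coefficients selected by the logistic model ("gamma")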
Implements lasso and ridge regression for dichotomised outcomes. Such outcomes are not naturally but artificially binary. They indicate whether an underlying measurement is greater than a threshold.
cornet( y, cutoff, X, alpha = 1, npi = 101, pi = NULL, nsigma = 99, sigma = NULL, nfolds = 10, foldid = NULL, type.measure = "deviance", ... )
| Argument | Description |
|---|---|
| y | continuous outcome: vector of length n |
| cutoff | cut-off point for dichotomising the outcome into classes: meaningful value between min(y) and max(y) |
| X | features: numeric matrix with n rows (samples) and p columns (variables) |
| alpha | elastic net mixing parameter: numeric between 0 (ridge) and 1 (lasso) |
| npi | number of pi values |
| pi | pi sequence: vector of increasing values in the unit interval; or NULL |
| nsigma | number of sigma values |
| sigma | sigma sequence: vector of increasing positive values; or NULL |
| nfolds | number of folds: integer between 3 and n |
| foldid | fold identifiers: vector with entries between 1 and nfolds; or NULL |
| type.measure | loss function for binary classification: character "deviance", "class", "mse" or "mae" (see cv.glmnet) |
| ... | further arguments passed to glmnet |
The argument family is unavailable, because this function fits a gaussian model for the numeric response and a binomial model for the binary response. Linear regression uses the loss function "deviance" (or "mse"), but the loss is not comparable between linear and logistic regression. The loss function "auc" is unavailable for internal cross-validation. If at all, use "auc" for external cross-validation only.
Returns an object of class cornet, a list with multiple slots:
- gaussian: fitted linear model, class glmnet
- binomial: fitted logistic model, class glmnet
- sigma: scaling parameters sigma, vector of length nsigma
- pi: weighting parameters pi, vector of length npi
- cvm: evaluation loss, matrix with nsigma rows and npi columns
- sigma.min: optimal scaling parameter, positive scalar
- pi.min: optimal weighting parameter, scalar in the unit interval
- cutoff: threshold for dichotomisation
Armin Rauschenberger and Enrico Glaab (2024). "Predicting dichotomised outcomes from high-dimensional data in biomedicine". Journal of Applied Statistics 51(9):1756-1771. doi:10.1080/02664763.2023.2233057.
Methods for objects of class cornet include coef and predict.
n <- 100; p <- 200
y <- rnorm(n)
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
net <- cornet(y=y,cutoff=0,X=X)
net
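The slots listed above can be inspected directly; reusing the net object from the example:
# Inspect the tuned parameters and the cross-validated loss surface.
net$sigma.min   # optimal scaling parameter
net$pi.min      # optimal weighting parameter
dim(net$cvm)    # loss matrix with nsigma rows and npi columns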
Compares models for a continuous response with a cut-off value.
cv.cornet( y, cutoff, X, alpha = 1, nfolds.ext = 5, nfolds.int = 10, foldid.ext = NULL, foldid.int = NULL, type.measure = "deviance", rf = FALSE, xgboost = FALSE, ... )
| Argument | Description |
|---|---|
| y | continuous outcome: vector of length n |
| cutoff | cut-off point for dichotomising the outcome into classes: meaningful value between min(y) and max(y) |
| X | features: numeric matrix with n rows (samples) and p columns (variables) |
| alpha | elastic net mixing parameter: numeric between 0 (ridge) and 1 (lasso) |
| nfolds.ext | number of external folds |
| nfolds.int | number of internal folds |
| foldid.ext | external fold identifiers: vector of length n with entries between 1 and nfolds.ext; or NULL |
| foldid.int | internal fold identifiers: vector of length n with entries between 1 and nfolds.int; or NULL |
| type.measure | loss function for binary classification: character "deviance", "class", "mse" or "mae" (see cv.glmnet) |
| rf | comparison with random forest: logical |
| xgboost | comparison with extreme gradient boosting: logical |
| ... | further arguments passed to cornet |
Computes the cross-validated loss of logistic and combined regression.
## Not run:
n <- 100; p <- 200
y <- rnorm(n)
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
start <- Sys.time()
loss <- cv.cornet(y=y,cutoff=0,X=X)
end <- Sys.time()
end - start
loss
## End(Not run)
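To also benchmark against the tree ensembles mentioned in the argument table, a hedged variation of the example above (it assumes the random forest and gradient boosting backends are installed):
## Not run:
# Sketch: additionally compare with random forest and
# extreme gradient boosting (requires the respective packages).
loss <- cv.cornet(y=y,cutoff=0,X=X,nfolds.ext=5,rf=TRUE,xgboost=TRUE)
loss
## End(Not run)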
Plots the loss for different combinations of scaling (sigma) and weighting (pi) parameters.
## S3 method for class 'cornet' plot(x, ...)
| Argument | Description |
|---|---|
| x | cornet object |
| ... | further arguments (not applicable) |
This function plots the evaluation loss (cvm). Whereas the matrix has sigma in the rows and pi in the columns, the plot has sigma on the x-axis and pi on the y-axis. For all combinations of sigma and pi, the colour indicates the loss. If the R package RColorBrewer is installed, blue represents a low loss; otherwise, red represents a low loss. White always represents a high loss.
n <- 100; p <- 200
y <- rnorm(n)
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
net <- cornet(y=y,cutoff=0,X=X)
plot(net)
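To relate the heat map to the underlying matrix described above, reusing the net object from the example:
# Rows of the loss matrix correspond to sigma, columns to pi.
dim(net$cvm)                               # nsigma x npi
which(net$cvm==min(net$cvm),arr.ind=TRUE)  # location of the minimal loss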
Predicts the binary outcome with linear, logistic, and combined regression.
## S3 method for class 'cornet' predict(object, newx, type = "probability", ...)
| Argument | Description |
|---|---|
| object | cornet object |
| newx | covariates: numeric matrix with n rows (samples) and p columns (variables) |
| type | type of prediction: character (default "probability") |
| ... | further arguments (not applicable) |
For linear regression, this function tentatively transforms the predicted values to predicted probabilities, using a Gaussian distribution with a fixed mean (threshold) and a fixed variance (estimated variance of the numeric outcome).
n <- 100; p <- 200
y <- rnorm(n)
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
net <- cornet(y=y,cutoff=0,X=X)
predict(net,newx=X)
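The transformation described above can be illustrated as follows; this is only a sketch of the idea (a Gaussian tail probability), not the package's internal code, and the fitted value y_hat is a hypothetical number (y is reused from the example):
# Sketch: probability that the underlying numeric outcome exceeds the
# cut-off, for a hypothetical linear prediction y_hat, using a Gaussian
# with mean equal to the threshold and the estimated variance of y.
y_hat <- 0.3
cutoff <- 0
pnorm(q=y_hat,mean=cutoff,sd=sd(y))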
Prints summary of cornet object.
## S3 method for class 'cornet' print(x, ...)
| Argument | Description |
|---|---|
| x | cornet object |
| ... | further arguments (not applicable) |
Returns the sample size n, the number of covariates p, information on the dichotomisation, the tuned scaling parameter (sigma), the tuned weighting parameter (pi), and the corresponding loss.
n <- 100; p <- 200
y <- rnorm(n)
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
net <- cornet(y=y,cutoff=0,X=X)
print(net)