Title: | Reads, Annotates, and Normalizes Reverse Phase Protein Array Data |
---|---|
Description: | Reads in sample description and slide description files and annotates the expression values taken from GenePix results files (text file format used by many microarray scanner and software providers). After normalization data can be visualized as boxplot, heatmap or dotplot. |
Authors: | Heiko Mannsperger with contributions of Stephan Gade |
Maintainer: | Torsten Schoeps <[email protected]> |
License: | LGPL |
Version: | 1.4.9 |
Built: | 2024-11-22 04:37:18 UTC |
Source: | https://github.com/cran/RPPanalyzer |
The package reads pheno and feature data of an RPPA experiment from text files and annotates the expression values in GenePix result files (gpr files). For background correction, the backgroundCorrect function from the limma package is used. After normalization, data can be plotted for quality control or to get a first impression of the biological relevance of the data set.
Maintainer: Heiko Mannsperger <[email protected]>
## Not run:
data(dataI)
bgcorrected <- correctBG(dataI)
normalized <- normalizeRPPA(bgcorrected,method="proteinDye")
aggregated <- sample.median(normalized)
## End(Not run)
The function assumes that each signal originates from an underlying true value which is scaled by a scaling factor depending on the slide and replicate. The method optimizes the scaling and truth parameters such that the distance between predicted and actual signals is minimized. Arguments specify which factors the scaling factors and truth parameters depend on.
averageData(subsample, scaling = c("slide", "replicate"), distinguish = c("cellline", "treatment"))
subsample |
data.frame with columns "slide" (factor, the slide names), "ab" (factor, the antibody/target names), "time" (numeric, the time points), "signal" (numeric, signal values), "var0" (numeric, error parameter for the constant error), "varR" (numeric, error parameter for the relative error). The data.frame may contain further columns that can then be used in the |
scaling |
character. One scaling parameter is estimated for each occurring combination of the corresponding factors. |
distinguish |
character. One truth parameter is estimated for each occurring combination of the factors "time", "ab" (antibody/target) and the factors in |
Averaging is based on the assumption that for each level of scaling there is an underlying "true" antibody time-course for each level of distinguish. The signals of different scaling levels are assumed to differ by a scaling factor. Both antibody time-course values and scaling parameters are estimated simultaneously by generalized least squares estimation, where the truth parameters correspond to the levels of c("time", "ab", distinguish) and the scaling parameters to the levels of scaling.
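As a sketch of what a generalized least squares criterion of this kind can look like, the following uses symbols that are assumptions introduced here and not taken from the package: $y_i$ for the truth parameters, $s_j$ for the scaling parameters, $S_{ij}$ for the measured signals and $\sigma_{ij}^2$ for the signal variances derived from "var0" and "varR". Whether the scaling factor multiplies or divides the true value is also an assumption.

```latex
\min_{y,\,s} \; \sum_{i,j} \frac{\left(S_{ij} - s_j\, y_i\right)^2}{\sigma_{ij}^2}
```

Because $s_j y_i$ is unchanged when all $s_j$ are multiplied and all $y_i$ divided by the same constant, a criterion of this form needs one additional constraint to fix the overall scale.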
data.frame |
with columns "time", "ab", "signal" (the truth parameters returned by |
Daniel Kaschek, Physikalisches Institut, Uni Freiburg. Email: [email protected]
Calculates sample concentrations of an RPPA data set using the parameters of a linear model fitted to the dilution series.
calcLinear(x, sample.id = c("sample", "sample.n"), dilution = "dilution" , method = "quantreg", plot = F, detectionLimit = T)
x |
List containing background corrected RPPA data set |
sample.id |
character vector referring to the column names by which samples can be separated |
dilution |
column name from the column in feature data that describes the dilution steps of each sample |
method |
character string describing the method used for the linear fit |
plot |
logical. If true dilution curves are plotted |
detectionLimit |
logical. If true model is fitted on dilution steps above the detection limit. If false, all data points are used to fit the model |
expression |
matrix with protein expression data |
dummy |
matrix with protein expression data |
arraydescription |
data frame with feature data |
sampledescription |
data frame with pheno data |
For calculation of serially diluted samples only.
Heiko Mannsperger <[email protected]>,Stephan Gade <[email protected]>
## Not run:
library(RPPanalyzer)
data(ser.dil.samples)
predicted.data <- calcLinear(ser.dil.samples,sample.id=c("sample","sample.n"),
    dilution="dilution")
## End(Not run)
Calculates sample concentrations of an RPPA data set, as a wrapper for curveFitSigmoid.
calcLogistic(x, sample.id = c("sample", "sample.n"), dilution = "dilution", xVal = NULL, plot = F, detectionLimit = F)
x |
|
sample.id |
character vector referring to the column names by which samples can be separated |
dilution |
column name from the column in feature data that describes the dilution steps of each sample |
xVal |
defines the dilution value for which the concentration is calculated. If NULL, the highest dilution value is used |
plot |
logical. If true dilution curves are plotted |
detectionLimit |
logical. If true model is fitted on dilution steps above the detection limit. If false, all data points are used to fit the model |
expression |
matrix with protein expression data |
dummy |
matrix with protein expression data |
arraydescription |
data frame with feature data |
sampledescription |
data frame with pheno data |
Heiko Mannsperger <[email protected]>, Stephan Gade <[email protected]>
## Not run:
library(RPPanalyzer)
data(ser.dil.samples)
predicted.data <- calcLogistic(ser.dil.samples, sample.id=c("sample","sample.n"),
    dilution="dilution")
## End(Not run)
Calculates the protein concentration of a serially diluted sample stored in an RPPA data list using the serial dilution curve algorithm published by Zhang et al., Bioinformatics 2009.
calcSdc(x, sample.id = c("sample", "sample.n"), sel = c("measurement", "control"), dilution = "dilution", D0 = 2, sensible.min = 5, sensible.max = 1.e9, minimal.err = 5, plot = T, r = 1.2)
x |
RPPA data list with replicates aggregated with median |
sample.id |
Attributes to identify the samples |
sel |
The sample type that should be calculated. Has to be "measurement", "control", "neg_control", or "blank". |
dilution |
Name of the column in the feature data matrix describing the dilution steps of the samples. |
D0 |
Dilution factor. |
sensible.min |
Signals below this value are marked as undetected |
sensible.max |
Signals above the value are marked as saturated |
minimal.err |
Minimal valid estimate for the background noise |
plot |
Logical. If true, model fits are plotted |
r |
Constant factor used to determine the confidence interval for the saturation limit $M$ and the background noise $a$; should be $>1$. Can be lower if the accuracy of the signals is improved. |
The method of Zhang et al. does not fit the dose response curve but a derived model describing the functional relationship between the signals of two consecutive dilution steps. Since this new model no longer contains the protein concentration, all spots of one array can be used for the fit, allowing a much more robust estimation of the underlying parameters.
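To illustrate how a relation between consecutive dilution steps can be free of the concentration, the following derivation assumes a Sips-type response curve. This is a sketch under that assumption and not necessarily the exact parameterization used by calcSdc; $K$ and $\gamma$ are hypothetical curve constants, $C_k$ is the unknown concentration at dilution step $k$, and $a$, $M$ and $D_0$ are the background noise, saturation limit and dilution factor from the argument list.

```latex
S(C) = a + \frac{M - a}{1 + (K/C)^{\gamma}}
\quad\Longrightarrow\quad
\frac{1}{S(C) - a} - \frac{1}{M - a} = \frac{(K/C)^{\gamma}}{M - a}

% consecutive dilution steps: C_k = D_0 \, C_{k+1}
\frac{1}{S_k - a} - \frac{1}{M - a}
  = D_0^{-\gamma} \left( \frac{1}{S_{k+1} - a} - \frac{1}{M - a} \right)
```

The second line relates $S_k$ to $S_{k+1}$ through $a$, $M$, $D_0$ and $\gamma$ only, so every pair of consecutive dilution steps on an array can contribute to one common fit.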
expression |
matrix with expression values |
error |
matrix with error values |
arraydescription |
data frame with feature data |
sampledescription |
data frame with pheno data |
Heiko Mannsperger <[email protected]>, Stephan Gade <[email protected]>
Zhang et al., Bioinformatics 2009. Serial dilution curve: a new method for analysis of reverse phase protein array data.
## Not run:
library(RPPanalyzer)
data(ser.dil.samples)
ser.dil_median <- sample.median(ser.dil.samples)
predicted.data <- calcSdc(ser.dil_median,D0=2,sel=c("measurement"),
    dilution="dilution")
## End(Not run)
Corrects for background in an RPPA data set using different algorithms (e.g. from the limma package), avoiding negative values.
correctBG(x, method = "normexp")
x |
List with RPPA data set |
method |
any method from the function |
This function is a wrapper for the backgroundCorrect function of the limma package. As an additional method, "addmin" is implemented.
expression |
matrix with background corrected expression data |
background |
matrix with background data |
arraydescription |
data frame with feature data |
sampledescription |
data frame with pheno data |
Heiko Mannsperger <[email protected]>, Stephan Gade <[email protected]>
Ritchie, ME, Silver, J, Oshlack, A, Holmes, M, Diyagama, D, Holloway, A, and Smyth, GK (2007). A comparison of background correction methods for two-colour microarrays. Bioinformatics 23, 2700-2707.
For detailed information about the background correction methods see: backgroundCorrect
## Not run:
library(RPPanalyzer)
data(dataI)
dataBGcorrected <- correctBG(dataI,method="normexp")
## End(Not run)
Consists of 3 functions: getIntercepts(), analyzeIntercepts() and getSignals().
The first one derives intercepts of dilution series in dependence of dilSeriesID (column in sampledescription.txt) and slide/pad/incubationRun/spottingRun number (colnames of arraydescription). A smoothing spline is used to extrapolate to 0. Nonparametric bootstrap is used to estimate uncertainty of the intercept estimate.
The second function is used in the last one and performs an analysis of variance (ANOVA) for nested models.
The last one updates the original timeseries signal to (foreground expression - intercept).
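The intercept idea can be sketched in base R on simulated data. This is an illustration only and not the package's implementation: the concentrations, the noise level, the fixed `df = 3` and the helper name `intercept_at_zero` are all assumptions. A smoothing spline is fitted to a dilution series, extrapolated to concentration 0, and a nonparametric bootstrap that resamples replicates within each dilution step gives a rough uncertainty for the intercept.

```r
# Illustrative only: simulated dilution series, not RPPanalyzer internals.
set.seed(1)
conc   <- rep(2^-(0:5), each = 3)   # six dilution steps, three replicates each
signal <- 500 + 4000 * conc + rnorm(length(conc), sd = 50)

# Fit a smoothing spline and extrapolate it to concentration 0.
# (hypothetical helper; df = 3 is an arbitrary choice for this sketch)
intercept_at_zero <- function(x, y) {
  fit <- smooth.spline(x, y, df = 3)
  predict(fit, x = 0)$y
}

estimate <- intercept_at_zero(conc, signal)

# Nonparametric bootstrap: resample replicates within each dilution step,
# so every resample still contains all six concentrations.
boot <- replicate(200, {
  idx <- unlist(lapply(split(seq_along(conc), conc),
                       function(i) sample(i, replace = TRUE)))
  intercept_at_zero(conc[idx], signal[idx])
})
boot_se <- sd(boot)

# Corrected signal: foreground minus intercept.
corrected <- signal - estimate
```

Resampling within dilution steps is a design choice of this sketch; it keeps the number of distinct concentrations constant so the spline fit never fails on a degenerate resample.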
correctDilinterc(dilseries, arraydesc, timeseries, exportNo)
getIntercepts(dilseries, arraydesc)
analyzeIntercepts(intercepts, test="F", export)
getSignals(timeseries, intercepts, arraydesc, exportNo)
as.my(v)
dilseries |
foreground signal matrix as result of |
arraydesc |
"arraydescription" matrix of the RPPA data set list |
timeseries |
foreground signal matrix as result of |
exportNo |
integer of 1-4 which of the linear fits should be exported to the attribute of the result, variable for |
intercepts |
output of |
test |
test parameter for ANOVA (see documentation of |
export |
see |
v |
some variable |
matrix with adapted signal intensities via subtraction of dilution intercept at concentration 0
Daniel Kaschek, Silvia von der Heyde
## Not run:
library(RPPanalyzer)
# read data
dataDir <- system.file("extdata", package="RPPanalyzer")
setwd(dataDir)
rawdata <- read.Data(blocksperarray=12, spotter="aushon", printFlags=FALSE)
# write data
write.Data(rawdata,FileNameExtension="test_data")
# import raw data
fgRaw.tmp <- read.delim("test_dataexpression.txt",
    stringsAsFactors=FALSE, row.names=NULL, header=TRUE)
fgRaw <- read.delim("test_dataexpression.txt",
    skip=max(which(fgRaw.tmp[,1]==""))+1,
    stringsAsFactors=FALSE, row.names=NULL, header=TRUE)
# remove NAs
fgNAVec <- which(is.na(fgRaw[,"ID"]))
if(length(fgNAVec) > 0){
  fgRaw <- fgRaw[-fgNAVec,]
}
colnames(fgRaw) <- sub("X","", gsub("\\.","-", colnames(fgRaw)))
# correct data for BG noise
correctedData <- correctDilinterc(
    dilseries=fgRaw[which(fgRaw$sample_type=="control" & !is.na(fgRaw$dilSeriesID)),],
    arraydesc=rawdata$arraydescription,
    timeseries=fgRaw[which(fgRaw$sample_type=="measurement"),],
    exportNo=2)
## End(Not run)
3-parameter sigmoidal curve.
curvePredictSigmoid(x, params)
x |
Input value(s). |
params |
Parameter vector containing three parameters alpha, beta and gamma. |
The model is defined as alpha + beta*(2^(x*gamma))/(1+2^(x*gamma)).
The prediction f(x) of the input value(s).
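The formula can be written out directly in base R. The function name `sigmoid_sketch` is hypothetical and only mirrors the stated model; it is not the package's curvePredictSigmoid.

```r
# Stand-alone sketch of the stated 3-parameter model (hypothetical helper name).
sigmoid_sketch <- function(x, params) {
  alpha <- params[["alpha"]]   # lower asymptote
  beta  <- params[["beta"]]    # height of the curve above alpha
  gamma <- params[["gamma"]]   # steepness
  z <- 2^(x * gamma)
  alpha + beta * z / (1 + z)
}

pars <- c(alpha = 2, beta = 1, gamma = 1.5)
x <- seq(-5, 5, by = 0.1)
y <- sigmoid_sketch(x, pars)

# At x = 0 the term 2^(x*gamma) equals 1, so the curve sits at alpha + beta/2.
mid <- sigmoid_sketch(0, pars)
```

With these parameters the curve rises from just above alpha toward alpha + beta as x increases.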
## Not run:
x <- seq(-5, 5, by=0.1)
y <- curvePredictSigmoid(x, c(alpha=2, beta=1, gamma=1.5))
plot(x, y)
## End(Not run)
The data set is a list of four elements. Expression and background are matrices containing signal intensities; the data frames arraydescription and sampledescription comprise feature and pheno data.
data(dataI)
list
The data set is a list of four elements with data of an original reverse phase array experiment. The elements expression and background are 2304 times 26 matrices containing integers describing the signal intensities and local background for every spot of the experiment as generated with image analysis software. Arraydescription is a data frame describing the incubation of every array, referring to the columns of the expression and background matrices. Required rows are target and AB_ID with characters and array.id (four integers linked with "-"). Sampledescription is a data frame corresponding to the rows of the expression and background matrices and annotates the samples. Sampledescription requires the columns "ID", "sample_type", "sample", "concentration", and "dilution" as minimal information and "sample.n" to separate different sample groups.
The data set contains original reverse phase protein array signals with randomized pheno and feature data.
data(dataI)
str(dataI)
The data set is a list of four elements. Sample.median and sample.mads are matrices containing logged signal intensities and errors; the data frames arraydescription and sampledescription comprise feature and pheno data.
data(dataII)
List
The data set is a list of four elements with data of an original reverse phase array experiment. The elements Sample.median and sample.mads are 624 times 12 matrices containing logged signal intensities and errors for every sample of the experiment. The values are background corrected and normalized against total protein content. Arraydescription is a data frame describing the incubation of every array, referring to the columns of the matrices. Required rows are target and AB_ID with characters and array.id (four integers linked with "-"). Sampledescription is a data frame corresponding to the rows of the matrices and annotates the samples. The columns "sample", "stimulation", "inhibition", "stim_concentration", and "time" describe the time course experiment.
The data set contains original reverse phase protein array signals from a stimulation time course experiment with randomized pheno and feature data.
data(dataII)
str(dataII)
The data set is a list of four elements. Expression and background are matrices containing signal intensities; the data frames arraydescription and sampledescription comprise feature and pheno data.
data(dataIII)
List
The data set is a list of four elements with data of an original reverse phase array experiment. The elements expression and background are 384 times 75 matrices containing integers describing the signal intensities and local background for every spot of the experiment as generated with image analysis software. Arraydescription is a data frame describing the incubation of every array, referring to the columns of the expression and background matrices. Required rows are target and AB_ID with characters and array.id (four integers linked with "-"). Sampledescription is a data frame corresponding to the rows of the expression and background matrices and annotates the samples.
The data set contains original reverse phase protein array signals from cancer specimens with randomized pheno and feature data.
data(dataIII)
str(dataIII)
Function for import, normalization and quality checks of data prior to the actual analysis. The preprocessing steps include subtraction of dilution series intercepts and FCF normalization. Additionally plots for quality checks are generated including dilutions and BLANK measurements.
dataPreproc(dataDir=getwd(), blocks=12, spot="aushon", exportNo=3, correct="both", remove_flagged=NULL)
dataDir |
directory of gpr files, slidedescription.txt and sampledescription.txt, default is the current working directory |
blocks |
see |
spot |
see |
exportNo |
see |
correct |
"both" applies |
remove_flagged |
Either NULL or an integer. If an integer, looks into column |
A list of 4 elements is returned.
rawdat |
list of 4 raw data elements ( |
cordat |
list of 4 elements like |
normdat |
list of 4 elements like |
DIR |
directory for storing the generated outputs |
All output files are stored in an analysis folder labeled by the date of analysis.
The txt files Dataexpression and Databackground result from write.Data and store the raw data.
The pdf files getIntercepts_Output and anovaIntercepts_Output result from correctDilinterc.
getIntercepts_Output shows the derived intercepts and smoothing splines of dilution series in dependence of the dilSeriesID column in sampledescription.txt and the slide/pad/incubationRun/spottingRun columns of the arraydescription matrix.
anovaIntercepts_Output.pdf results from the ANOVA in correctDilinterc, comparing different linear models of the dilution series intercepts. The barplot displays the residual sum of squares (RSS) of the individual model fits. It helps to choose the appropriate exportNo parameter. As RSS decreases, the model fits better.
Finally, three pdf files for quality checking are returned.
QC_dilutioncurve_raw.pdf plots target and blank (2nd antibody only) signals from serially diluted control samples of the raw RPPA data set, see plotQC.
QC_targetVSblank_normed.pdf plots blank signals vs. target specific signals of dilution intercept corrected and FCF normalized RPPA data, see plotMeasurementsQC.
QC_qqPlot_normed.pdf contains qq-plots of dilution intercept corrected and FCF normalized RPPA data, see plotqq.
Silvia von der Heyde
## Not run:
library(RPPanalyzer)
# get output list
dataDir<-system.file("extdata",package="RPPanalyzer")
res<-dataPreproc(dataDir=dataDir,blocks=12,spot="aushon",exportNo=4,correct="both")
# get individual elements
# raw data
rawdat<-res$rawdat
# dilution intercept corrected data
cordat<-res$cordat
# dilution intercept corrected and FCF normalized data
normdat<-res$normdat
# output directory
DIR<-res$DIR
## End(Not run)
The method is based on a maximum-likelihood estimation. The model prediction is the expected variance given the signal, depending on var0 and varR.
getErrorModel(dataexpression, verbose=FALSE)
dataexpression |
data.frame, standard output from RPPanalyzer's |
verbose |
logical, if TRUE, the function prints out additional information and produces a PDF file in the working directory with the signal vs. variance plots. |
The empirical variance estimator is $\chi^2$ distributed with $n-1$ degrees of freedom, where $n$ is the number of technical replicates. The estimated error parameters maximize the corresponding log-likelihood function. At the moment, the code assumes a fixed $n$. For cases outside this assumption, the error parameters are slightly overestimated, thus providing a conservative result. The explicit error model is $\sigma^2(S) = \sigma_0^2 + \sigma_R^2 S^2$, where $S$ is the signal strength.
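A maximum-likelihood fit of this kind can be sketched in base R on simulated data. Everything here is an assumption made for illustration and not the code of getErrorModel: the variance model is taken as var0 + varR * signal^2 (a constant plus a relative error term), the replicate number `n_rep` is set to 3, and the empirical variance scaled by (n_rep - 1) / sigma^2 is treated as chi-squared distributed with n_rep - 1 degrees of freedom.

```r
# Illustrative only: simulate empirical variances and recover var0 / varR by ML.
set.seed(42)
n_rep  <- 3                                   # assumed number of technical replicates
signal <- runif(300, min = 100, max = 10000)

true_var0 <- 50^2                             # constant error (sigma0^2)
true_varR <- 0.1^2                            # relative error (sigmaR^2)
sigma2    <- true_var0 + true_varR * signal^2 # assumed error model

# Empirical variance of n_rep replicates: sigma2 * chi2(n_rep - 1) / (n_rep - 1)
emp_var <- sigma2 * rchisq(length(signal), df = n_rep - 1) / (n_rep - 1)

# Negative log-likelihood; parameters are optimized on the log scale
# so that var0 and varR stay positive.
negloglik <- function(par) {
  s2 <- exp(par[1]) + exp(par[2]) * signal^2
  scaled <- emp_var * (n_rep - 1) / s2
  -sum(dchisq(scaled, df = n_rep - 1, log = TRUE) + log((n_rep - 1) / s2))
}

fit <- optim(c(log(1000), log(0.05)), negloglik)
est <- exp(fit$par)   # est[1]: var0 estimate, est[2]: varR estimate
```

Optimizing on the log scale is a convenience of the sketch; it avoids box constraints while keeping both variance parameters positive.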
data.frame |
with columns "slide" (factor, the slide names), "ab" (factor, the antibody/target names), "time" (numeric, the time points), "signal" (numeric, signal values), "var0" (numeric, error parameter for the constant error, equivalent to sigma0^2), "varR" (numeric, error parameter for the relative error, equivalent to sigmaR^2) and other columns depending on the input data.frame |
Daniel Kaschek, Physikalisches Institut, Uni Freiburg. Email: [email protected]
The data set is a list of four elements. Expression and background are matrices containing signal intensities; the data frames arraydescription and sampledescription comprise feature and pheno data.
data(HKdata)
List
The data set is a list of four elements with data of an original reverse phase array experiment. The elements expression and background are 768 times 21 matrices containing integers describing the signal intensities and local background for every spot of the experiment as generated with image analysis software. Arraydescription is a data frame describing the incubation of every array, referring to the columns of the expression and background matrices. Required rows are target and AB_ID with characters and array.id (four integers linked with "-"). Sampledescription is a data frame corresponding to the rows of the expression and background matrices and annotates the samples.
The data set contains original reverse phase protein array signals of an siRNA transfected cell line with randomized pheno and feature data.
data(HKdata)
str(HKdata)
Function to logarithmize (log2) the first two RPPA list elements, i.e. foreground and background signal intensities.
logList(x)
x |
list of 4 elements ( |
x.log |
list of 4 elements like the input but with log2 values of |
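The transformation can be sketched on a toy list in base R. The helper name `log_list_sketch`, the toy values and the element names are assumptions chosen to resemble an RPPA data list; this is not the package's logList.

```r
# Hypothetical stand-in: log2-transform the first two elements of a 4-element list.
log_list_sketch <- function(x) {
  x[[1]] <- log2(x[[1]])   # foreground signal intensities
  x[[2]] <- log2(x[[2]])   # background signal intensities
  x                        # annotation elements are returned unchanged
}

toy <- list(
  expression        = matrix(c(1024, 2048, 4096, 512), nrow = 2),
  background        = matrix(c(64, 128, 32, 256), nrow = 2),
  arraydescription  = data.frame(target = c("A", "B")),
  sampledescription = data.frame(sample = c("s1", "s2"))
)

toy_log <- log_list_sketch(toy)
```

Only the two signal matrices change; the two annotation data frames pass through untouched.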
Silvia von der Heyde
## Not run:
library(RPPanalyzer)
# input data
dataDir<-system.file("extdata",package="RPPanalyzer")
x<-dataPreproc(dataDir=dataDir, blocks=12, spot="aushon", exportNo=4)
x.norm<-x$normdat
# get log2 list
x.log<-logList(x.norm)
## End(Not run)
Normalizes data in an RPPA data list. Four different normalization methods are provided: using externally measured protein concentration, signals from housekeeping proteins or protein dyes and row normalization.
normalizeRPPA(x, method = "row", normalizer = "housekeeping", useCol = "BCA", writetable = F,vals="logged")
x |
List containing RPPA data set |
method |
character string: one of |
normalizer |
character string describing the target in slidedescription that should be used for normalization using |
useCol |
character string describing the column in sampledescription that should be used for normalization using the method |
writetable |
logical. If true data are exported as tab delimited text files to current working directory |
vals |
the data is returned at log2 scale with subtracted normalizer value per default. If argument is set to |
The function provides four different methods to normalize RPPA data to ensure optimal data quality. The default method row uses the expression matrix: after taking the logarithm, the row median is subtracted from each value of one row, assuming that the median expression over all targets of one sample represents the total protein amount of the spots. For the method proteinDye, arrays with the pattern protein in the target description are used for normalization. For every spotting run a separate protein slide is required. If the slides contain more than one array, the arrays will be normalized by the corresponding protein array. To use external protein assay data for normalization, a column containing the protein concentration has to be added to the sampledescription file. The name of this column is addressed via the useCol argument. To use any other target for normalization, the method housekeeping can be used. The target for this method has to be addressed via the normalizer argument.
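The row method described above can be sketched in base R on a toy matrix. The matrix values and names are invented for illustration, and the code shows only the log2 and median-subtraction step, not the package's normalizeRPPA.

```r
# Toy expression matrix: rows are samples, columns are targets (invented values).
expr <- matrix(c(200, 400,  800,
                 100, 300,  900,
                 500, 500, 4000),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("sample", 1:3), paste0("target", 1:3)))

log_expr <- log2(expr)

# Row median as a proxy for the total protein amount of each sample.
row_med <- apply(log_expr, 1, median)

# Subtract each row's median from every value in that row.
normalized <- sweep(log_expr, 1, row_med, FUN = "-")
```

After this step every row has a median of zero on the log2 scale, so values read as log2 ratios relative to the sample's typical signal.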
expression |
matrix with protein expression data |
dummy |
matrix with protein expression data |
arraydescription |
data frame with feature data |
sampledescription |
data frame with pheno data |
Heiko Mannsperger <[email protected]>
## Not run:
library(RPPanalyzer)
data(dataI)
dataI_bgcorr <- correctBG(dataI,method="normexp")
dataIb <- pick.high.conc(dataI_bgcorr,highest="dilution")
normRow <- normalizeRPPA(dataIb,method="row")
normDye <- normalizeRPPA(dataIb,method="proteinDye")
normPassay <- normalizeRPPA(dataIb,method="extValue",useCol="concentration")
normHK <- normalizeRPPA(dataIb,method="housekeeping",normalizer="housekeeping")
## End(Not run)
Picks the dilution step with the value 1 from serially diluted samples in an RPPA data set.
pick.high.conc(x, highest = ("dilution"), sample.id=c("sample","sample.n"))
x |
Any RPPA data list with 4 elements |
highest |
Character string describing the column that contains the dilution steps |
sample.id |
Attributes to identify the samples |
The function selects all spots or samples from an RPPA data set with the value 1 in the column of the sampledescription denoted in argument highest.
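The selection rule can be sketched on a toy list in base R. The helper name `pick_undiluted_sketch`, the toy values and the element names are assumptions for illustration; the sketch ignores the sample.id argument and is not the package's pick.high.conc.

```r
# Hypothetical stand-in: keep only rows whose dilution column equals 1.
pick_undiluted_sketch <- function(x, highest = "dilution") {
  keep <- x$sampledescription[[highest]] == 1
  x$expression        <- x$expression[keep, , drop = FALSE]
  x$background        <- x$background[keep, , drop = FALSE]
  x$sampledescription <- x$sampledescription[keep, , drop = FALSE]
  x
}

toy <- list(
  expression        = matrix(1:8, nrow = 4),
  background        = matrix(8:1, nrow = 4),
  arraydescription  = data.frame(target = c("A", "B")),
  sampledescription = data.frame(sample   = c("s1", "s1", "s2", "s2"),
                                 dilution = c(1, 0.5, 1, 0.5))
)

picked <- pick_undiluted_sketch(toy, highest = "dilution")
```

Rows of the signal matrices and of the sample annotation are filtered with the same logical index, so they stay aligned.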
An RPPA data list containing only the samples with the highest concentration of each dilution series.
Heiko Mannsperger <[email protected]>, Stephan Gade <[email protected]>
## Not run:
library(RPPanalyzer)
data(ser.dil.samples)
dataHighcon <- pick.high.conc(ser.dil.samples,highest="dilution")
## End(Not run)
Plots the blank signals and the target specific signals of an RPPA data list in a PDF file.
plotMeasurementsQC(x, file = "QC_plots.pdf", arrays2rm = c("protein"))
x |
RPPA data list as output from |
file |
name of the PDF file that will be exported |
arrays2rm |
character describing the arrays that do not have to be plotted |
This function generates scatter plots in a pdf file from not yet normalized samples (annotated as measurement in the sample_type column of the sampledescription file) of RPPA data to get an impression of the distance from the blank signal to the target specific signal. An array with blank as target description is needed.
Generates a PDF file.
Heiko Mannsperger <[email protected]>
## Not run:
library(RPPanalyzer)
data(dataIII)
plotMeasurementsQC(dataIII,file="control_plot.pdf")
## End(Not run)
Plots target and blank signal from control samples of an RPPA data set in one plot. Exports pdf file.
plotQC(x, file = "target_vs_blank.pdf", arrays2rm = c("protein"))
x |
RPPA data list as output from |
file |
name of the PDF file |
arrays2rm |
character describing the arrays that do not have to be plotted |
This function generates scatter plots in a pdf file from not yet normalized, serially diluted control samples (annotated as control in the sample_type column of the sampledescription file) of RPPA data to get an impression of the antibody dynamic. An array with blank as target description is needed.
Generates a PDF file.
Heiko Mannsperger <[email protected]>
## Not run:
library(RPPanalyzer)
data(dataIII)
plotQC(dataIII,file="plotQC.pdf")
## End(Not run)
Draws a qq-plot and qq-line from measurement samples of an RPPA data set.
plotqq(x, fileName = "qqplot_and_line.pdf")
x |
RPPA data list as output from |
fileName |
name of the PDF file |
This function uses the functions qqnorm and qqline from the stats package to get an impression of the data distribution in an RPPA data set.
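The underlying stats calls can be tried on their own with simulated values. The data, the plot title and the temporary file name are assumptions for illustration; this is not the package's plotqq.

```r
# Simulated stand-in for log2 signal intensities.
set.seed(7)
values <- rnorm(200, mean = 10, sd = 2)

# Write a normal QQ plot with reference line to a PDF in the temp directory.
out_file <- file.path(tempdir(), "qqplot_sketch.pdf")
pdf(out_file)
qqnorm(values, main = "QQ plot of simulated log2 signals")
qqline(values)        # line through the first and third quartiles
invisible(dev.off())  # close the PDF device
```

Points that follow the line closely indicate approximately normally distributed values; systematic curvature at the ends points to heavy or light tails.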
generates a PDF file.
Heiko Mannsperger <[email protected]>
## Not run:
library(RPPanalyzer)
data(dataIII)
plotqq(dataIII,fileName="dataIII_qqplot.pdf")
## End(Not run)
Draws time course data from an RPPA data list and fits a mathematical model to the time course data.
plotTimeCourse(x, tc.identifier = c("sample", "stimulation", "inhibition", "stim_concentration"), tc.reference=NULL, plot.split = "experiment", file = NULL, arrays2rm = c("protein", "Blank"), plotformat = "stderr", log=TRUE, color=NULL, xlim = NULL, ylim = NULL)
x |
List containing RPPA data set |
tc.identifier |
character string describing the column names in the sampledescription that identifies the individual time course experiments |
tc.reference |
character string describing the sample that will be used as reference for the time course plots. |
plot.split |
character string describing the column name in sampledescription that defines how the data set is divided into different plots |
file |
character string for the name of the exported file |
arrays2rm |
character strings identifying the targets that should be removed from the time course plots |
plotformat |
character string defining the plot type: |
log |
logical, if true time courses signal intensities will be plotted at log2 scale |
color |
Vector holding the colors for the samples to be plotted. If NULL, colors will be generated. |
xlim |
Limits for x-axis. If NULL (default) limits are generated for each timeseries plot. If a range (numeric vector of length 2) is given, this is used for all plots. |
ylim |
Analogous to xlim |
This function plots RPPA time course experiments from data sets with aggregated replicate spots. A column time containing numeric values is required in the sampledescription file.
One or several columns in the sampledescription file should identify the individual experiments; these columns are named in the argument tc.identifier.
One column, named in the argument plot.split, should split the whole data set into comparable time courses that are plotted together.
Different plotting options can be specified with the argument plotformat. The option both is the most informative, since it shows the original data plus standard deviations at each time point, combined with a spline fit and the standard error of the fit.
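The two ingredients named above, per-time-point standard deviations and a spline fit, can be approximated in base R. The following is only a sketch under stated assumptions: the data frame tc and its values are invented, and smooth.spline from the stats package stands in for whatever spline fit plotTimeCourse uses internally, which the text does not specify.

```r
# Toy time course: 3 replicates at 6 time points (all values are invented).
set.seed(1)
time <- rep(c(0, 5, 15, 30, 60, 120), each = 3)
tc <- data.frame(
  time = time,
  signal = 10 + 2 * log2(time + 1) + rnorm(length(time), sd = 0.3)
)

# Mean and standard deviation per time point (the error bar part).
tc_mean <- tapply(tc$signal, tc$time, mean)
tc_sd <- tapply(tc$signal, tc$time, sd)

# Smoothing spline through all replicates (the spline part).
# smooth.spline is an assumed stand-in, not the package's documented method.
fit <- smooth.spline(tc$time, tc$signal, df = 4)
grid <- seq(min(tc$time), max(tc$time), length.out = 50)
pred <- predict(fit, grid)
```

Plotting tc_mean with error bars of tc_sd and overlaying pred$y against pred$x gives a picture comparable in spirit to the both option.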
Generates a PDF file.
Heiko Mannsperger <[email protected]>
## Not run:
library(RPPanalyzer)
data(dataII)
plotTimeCourse(dataII,
  tc.identifier=c("sample","stimulation","stim_concentration","inhibition"),
  plot.split="experiment",plotformat="stderr")
plotTimeCourse(dataII,
  tc.identifier=c("sample","stimulation","stim_concentration","inhibition"),
  plot.split="experiment",plotformat="errbar")
plotTimeCourse(dataII,
  tc.identifier=c("sample","stimulation","stim_concentration","inhibition"),
  plot.split="experiment",plotformat="both")
plotTimeCourse(dataII,
  tc.identifier=c("sample","stimulation","stim_concentration","inhibition"),
  plot.split="experiment",plotformat="rawdata")
plotTimeCourse(dataII,
  tc.identifier=c("sample","stimulation","stim_concentration","inhibition"),
  plot.split="experiment",plotformat="spline")
plotTimeCourse(dataII,
  tc.identifier=c("sample","stimulation","stim_concentration","inhibition"),
  plot.split="experiment",plotformat="spline_noconf")
## End(Not run)
plotTimeCourseII creates multiplot rectangular PDF files for time course datasets. Page layout (number of plots per page, arrangement of plots) and plot layout can be customized within the function.
plotTimeCourseII(x,plotgroup="",filename="timeseries_multiplot.pdf",numpage=4, cols=2,xname="time",yname="signal",legpos="top",legrow=2,legtitle="treatment", legtitlepos="top",legtextsize=10,legtextcolor="black",legtitlesize=10, legtitlecolor="black",legtitleface="bold",legitemsize=1,plottitlesize=12, plottitleface="bold",xaxissize=10,yaxissize=10,xaxisface="bold", yaxisface="bold",xaxistextsize=8,xaxistextangle=0,yaxistextsize=8, linecolor="Set1")
x |
RPPA time course dataset preprocessed with the getErrorModel and averageData functions |
plotgroup |
the feature (e.g. treatment) whose levels should be plotted together in one plot |
filename |
name of the output file, including the .pdf extension; the target directory needs to be set as your working directory |
numpage |
number of plots per page |
cols |
number of plot columns per page |
xname |
title of the x axis |
yname |
title of the y axis |
legpos |
position of the legend relative to the plot ("top","bottom","right","left"); "none" removes the legend from the plot |
legrow |
number of item rows within the legend |
legtitle |
title of the legend |
legtitlepos |
position of the legend title |
legtextsize |
font size of the legend text |
legtextcolor |
color of the legend text |
legtitlesize |
font size of the legend title |
legtitlecolor |
color of the legend title |
legtitleface |
font face of the legend title (e.g. "bold") |
legitemsize |
size of the legend item pictures |
plottitlesize |
size of the plot title |
plottitleface |
font face of the plot title |
xaxissize |
font size of the x axis title |
yaxissize |
font size of the y axis title |
xaxisface |
font face of the x axis title |
yaxisface |
font face of the y axis title |
xaxistextsize |
font size of the x axis text |
xaxistextangle |
angle of the x axis text |
yaxistextsize |
font size of the y axis text |
linecolor |
color of the plot lines: either choose a scheme ("Set1","Dark2","Paired") or pass a vector of color names |
The plotTimeCourseII function plots RPPA time course datasets in multiple line charts. For each cell line and target protein a separate plot is created. The average fold change values of different replicates and the error bars are visualized. In order to be visualized by the plotTimeCourseII function, the dataset needs to be preprocessed by the getErrorModel and averageData functions from the RPPanalyzer package. Additionally, the plotgroup needs to be defined if it is not named "treatment". The remaining arguments are optional.
Generates a PDF file.
Johannes Bues ([email protected])
## Not run:
# pre-process the data
dataDir <- system.file("extdata", package="RPPanalyzer")
res <- dataPreproc(dataDir=dataDir, blocks=12, spot="aushon", exportNo=2)
# remove arrays
normdat_rm <- remove.arrays(res$normdat, param="target", arrays2rm=c("protein","blank"))
# select samples and export data
sel_sampels_A549 <- select.sample.group(normdat_rm, params=list("cell_line"="A549"), combine= FALSE)
write.Data(sel_sampels_A549, FileNameExtension="HGF_sample_data_A549")
# read selected data
dataexpression_1 <- read.table("HGF_sample_data_A549expression.txt")
# use getErrorModel function
dataexpression_2 <- getErrorModel(dataexpression_1, verbose=FALSE)
# use averageData function
dataexpression_3 <- averageData(dataexpression_2, scaling=c("slide","replicate"),
  distinguish=c("cell_line","treatment"))
# plot time course data
plotTimeCourseII(dataexpression_3, filename="timecourse_HGF_sample_data_A549.pdf",
  legpos="top", xname="time [min]", yname="signal [a.u.]",
  linecolor=c("red","green","blue","black","orange","grey"))
## End(Not run)
Reads the sampledescription and slidedescription txt files and annotates the median expression values in the GenePix result files stored in the current working directory.
read.Data(blocksperarray = 4, spotter = "arrayjet", writetable = FALSE, printFlags=FALSE,fileName="Flagged_spots.csv", remove_flagged=NULL, ...)
blocksperarray |
Integer describing the number of blocks in one array. |
spotter |
character string naming the spotter: "arrayjet" (default) or "aushon" |
writetable |
logical. If TRUE, data are exported as tab delimited text files to the current working directory |
printFlags |
logical. If TRUE, flagged spots will be exported as a csv file |
fileName |
character string naming the csv file for the flagged spots |
remove_flagged |
Either NULL or an integer. If an integer, looks into column |
... |
any other arguments passed to read.gpr |
This function reads and annotates RPPA raw data provided in three different kinds of files. It is very important that these data files are in the correct format and stored in the same folder.
The file sampledescription.txt has to be a tab delimited text file with at least 6 columns named plate, column, row, sample_type, sample and concentration; in case of serially diluted samples a column dilution is also required. The first 3 columns describe the location of the sample in the source well plate. The 4th column distinguishes four different types of samples: measurement, control, neg_control or blank. The column sample may contain any character string describing the sample. The column concentration must contain only numerical values. Columns with further phenodata can be added.
The file slidedescription.txt describes the array properties. Required columns are gpr (the name of the corresponding gpr file) and pad, slide, incubation_run and spotting_run, which contain integers and together form a unique array identifier. The column target describes the analyzed target and AB_ID the antibody used. Columns with further feature data can be added.
The third kind of files are the gpr files, the results of the image analysis software GenePix using the gal file from an aushon or arrayjet spotter.
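The two description files can be prepared and checked with base R before read.Data is called. The sketch below only illustrates the required column names listed above. All cell values, the helper read_checked and the temporary directory are invented for illustration, and whether the gpr column carries the file extension is not stated in the text.

```r
# Required columns as listed in the description above.
sample_cols <- c("plate", "column", "row", "sample_type", "sample", "concentration")
slide_cols <- c("gpr", "pad", "slide", "incubation_run", "spotting_run",
                "target", "AB_ID")

# Invented two-row tables; real files describe every spotted well and array.
sampledescription <- data.frame(
  plate = c(1, 1), column = c(1, 2), row = c(1, 1),
  sample_type = c("measurement", "blank"),
  sample = c("lysate_A", "buffer"),
  concentration = c(1.5, 0),
  stringsAsFactors = FALSE
)
slidedescription <- data.frame(
  gpr = c("array_1", "array_2"), pad = c(1, 2), slide = c(1, 1),
  incubation_run = c(1, 1), spotting_run = c(1, 1),
  target = c("ERK", "protein"), AB_ID = c("AB001", "AB002"),
  stringsAsFactors = FALSE
)

# Write both tables as tab delimited text into a temporary folder.
input_dir <- tempfile("rppa_input_")
dir.create(input_dir)
write.table(sampledescription, file.path(input_dir, "sampledescription.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE)
write.table(slidedescription, file.path(input_dir, "slidedescription.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE)

# Hypothetical helper: read a file back and stop if a required column is absent.
read_checked <- function(file, required) {
  tab <- read.delim(file, stringsAsFactors = FALSE)
  absent <- setdiff(required, colnames(tab))
  if (length(absent) > 0) {
    stop("missing columns: ", paste(absent, collapse = ", "))
  }
  tab
}
samples <- read_checked(file.path(input_dir, "sampledescription.txt"), sample_cols)
slides <- read_checked(file.path(input_dir, "slidedescription.txt"), slide_cols)
```

A check of this kind catches misspelled column headers early, before the gpr files are read.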
expression |
matrix with protein expression data |
background |
matrix with background data |
arraydescription |
data frame with feature data |
sampledescription |
data frame with pheno data |
Heiko Mannsperger <[email protected]>
## Not run:
library(RPPanalyzer)
dataDir <- system.file("extdata", package="RPPanalyzer")
setwd(dataDir)
rawdata <- read.Data(blocksperarray=12, spotter="aushon", printFlags=FALSE,
  remove_flagged=NULL)
print(dim(rawdata$expression))
rawdata <- read.Data(blocksperarray=12, spotter="aushon", printFlags=FALSE,
  remove_flagged=50)
print(dim(rawdata$expression))
## End(Not run)
Removes arrays from the RPPA data set which are not used in following calculations.
remove.arrays(x, param = "target", arrays2rm = c("protein", "blank", "housekeeping"))
x |
List with RPPA data set |
param |
character describing a row in the arraydescription (column in slidedescription file) |
arrays2rm |
character defining the arrays to remove |
The RPPA data list without the arrays specified by arrays2rm.
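The effect of removing arrays can be pictured on a toy list. This is a sketch, not the package's code: the list layout (samples in rows, arrays in columns, one arraydescription row per feature) is an assumption derived from the argument descriptions, and drop_arrays is a hypothetical helper.

```r
# Toy RPPA list: 3 samples (rows) measured on 4 arrays (columns).
expr <- matrix(1:12, nrow = 3,
               dimnames = list(paste0("s", 1:3), paste0("a", 1:4)))
toy <- list(
  expression = expr,
  background = expr * 0,
  arraydescription = data.frame(a1 = "ERK", a2 = "protein", a3 = "AKT",
                                a4 = "blank", row.names = "target",
                                stringsAsFactors = FALSE),
  sampledescription = data.frame(sample = rownames(expr))
)

# Hypothetical helper: drop every array whose entry in row `param`
# of the arraydescription is listed in `arrays2rm`.
drop_arrays <- function(x, param = "target", arrays2rm = c("protein", "blank")) {
  keep <- !(unlist(x$arraydescription[param, ]) %in% arrays2rm)
  x$expression <- x$expression[, keep, drop = FALSE]
  x$background <- x$background[, keep, drop = FALSE]
  x$arraydescription <- x$arraydescription[, keep, drop = FALSE]
  x
}

reduced <- drop_arrays(toy)
colnames(reduced$expression)
```

The sample annotation is left untouched, since only arrays (columns) are removed.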
Heiko Mannsperger <[email protected]>
## Not run:
library(RPPanalyzer)
data(dataIII)
DT <- remove.arrays(dataIII, param = "target", arrays2rm = c("protein"))
## End(Not run)
Draws boxplots of groups of an RPPA data set and compares the expression values to a reference group (control) if provided (wilcox.test). Otherwise a test on general differences is performed (kruskal.test). Additionally a grouping order for plotting can be provided here.
rppa2boxplot(x, param, control=NULL, orderGrp=NULL, file = "boxplot_groups.pdf")
x |
List with RPPA data with aggregated replicate spots |
param |
Character value of one of the columns of the sampledescription matrix, i.e. x[[4]], describing the phenodata that should be analyzed |
control |
Character value of one of the columns of the sampledescription matrix, i.e. x[[4]], describing the sample group of |
orderGrp |
defines the ordering of the subgroups in |
file |
Title of the file that will be exported. |
Generates a PDF file
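The two testing modes named in the description (wilcox.test against a control group, kruskal.test otherwise) can be tried on simulated data with base R. This sketch does not reproduce rppa2boxplot itself; the group labels are borrowed from the package example and the signal values are simulated.

```r
set.seed(42)
groups <- factor(rep(c("vx", "zx", "yzr"), each = 8),
                 levels = c("vx", "zx", "yzr"))
signal <- rnorm(24, mean = c(10, 10.5, 12)[as.integer(groups)])

# With a control group: compare every other group against the control.
control <- "vx"
others <- setdiff(levels(groups), control)
p_wilcox <- sapply(others, function(g) {
  wilcox.test(signal[groups == g], signal[groups == control])$p.value
})

# Without a control group: one global test for differences between groups.
p_kruskal <- kruskal.test(signal ~ groups)$p.value
```

The first mode yields one p-value per non-control group, the second a single p-value for the whole grouping.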
Silvia von der Heyde, Heiko Mannsperger
## Not run:
library(RPPanalyzer)
data(dataIII)
dataIII_median <- sample.median(dataIII)
rppa2boxplot(x=dataIII_median, param="rank", control="vx",
  orderGrp=c("vx","zx","yzr","rxi"), file="wilcoxonBoxplot.pdf")
rppa2boxplot(x=dataIII_median, param="rank", control=NULL,
  orderGrp=c("vx","zx","yzr","rxi"), file="kruskalBoxplot.pdf")
## End(Not run)
Converts an RPPA data list into an ExpressionSet.
rppaList2ExpressionSet(x)
x |
List with RPPA data set |
This function builds an Expression Set from RPPA data. Due to the design of RPPA experiments, pheno and feature data are inverted compared to DNA/RNA array data sets.
Object of class ExpressionSet
Heiko Mannsperger <[email protected]>
## Not run:
library(RPPanalyzer)
data(dataI)
dataI_bgcorr <- correctBG(dataI,method="normexp")
dataI_median <- sample.median(dataI_bgcorr)
expr.set <- rppaList2ExpressionSet(dataI_median)
## End(Not run)
Draws a heatmap from an RPPA data set and adds column side colors visualizing groups of selected phenodata.
rppaList2Heatmap(x, sampledescription = "sample", side.color = "tissue", remove = c("blank", "protein", "Abmix"), distance = "eucsq", dendros = "both", cutoff = 0.005, fileName = NULL, cols = colorpanel(100, low = "blue", mid = "yellow", high = "red"), hclust.method="ward", scale = "row")
x |
List with RPPA data set with aggregated replicates |
sampledescription |
character describing the sample identifier |
side.color |
character describing the parameter for the side colors of the heatmap |
remove |
character describing the arrays that should be removed from the heatmap data |
distance |
character describing the method for the dendrogram |
dendros |
character: "both" for row and column dendrogram |
cutoff |
numeric describing the percentage of values that are identified as outliers for the heatmap color distribution |
fileName |
character for the file where the pdf file will be stored. If NULL, plot to standard plotting device. |
cols |
color key for the heatmap |
hclust.method |
The method to be used for cluster agglomeration. Defaults to "ward" |
scale |
String. Either |
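One plausible reading of the cutoff argument is that the most extreme values at both ends are clipped before the color scale is built, so that a few outliers do not compress the colors of all other cells. The sketch below shows that idea with base R only. It is an assumption about the mechanism, not the documented behaviour of rppaList2Heatmap, and the values are simulated.

```r
set.seed(3)
values <- c(rnorm(1000), 25, -30)  # simulated signals with two extreme outliers
cutoff <- 0.005

# Clip everything outside the [cutoff, 1 - cutoff] quantile range.
limits <- quantile(values, probs = c(cutoff, 1 - cutoff))
clipped <- pmin(pmax(values, limits[1]), limits[2])

range(values)
range(clipped)
```

After clipping, the color key spans the bulk of the data rather than the distance between the two outliers.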
Generates a PDF file
Heiko Mannsperger <[email protected]>
## Not run:
library(RPPanalyzer)
data(dataIII)
dataIII_median <- sample.median(dataIII)
rppaList2Heatmap(dataIII_median)
## End(Not run)
GenePix result files are tab delimited text files exported from the commonly used microarray image analysis tool GenePix.
tab delimited text file
The GenePix result files are files from original reverse phase protein arrays.
Aggregates the replicates in an RPPA data list using the median function.
sample.median(x)
x |
List with RPPA data set |
expression |
matrix with protein expression data |
error_mad |
matrix with error values |
arraydescription |
data frame with feature data |
sampledescription |
data frame with pheno data |
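Median aggregation of replicate spots can be illustrated with base R. The sketch below is not the package's implementation: the toy matrix and the grouping vector sample_id are invented, and reading error_mad as the median absolute deviation is an assumption based on the element name.

```r
# Toy data: 2 samples with 3 replicate spots each, measured on 2 arrays.
expr <- matrix(c(10, 11, 30, 20, 21, 22,
                  5,  6,  7,  8,  9, 40),
               ncol = 2, dimnames = list(NULL, c("ERK", "AKT")))
sample_id <- rep(c("A", "B"), each = 3)

# Median per sample and array: robust against a single outlying replicate.
agg_median <- apply(expr, 2, function(v) tapply(v, sample_id, median))

# Median absolute deviation per sample and array as an error estimate (assumed).
agg_mad <- apply(expr, 2, function(v) tapply(v, sample_id, mad))

agg_median
```

The outlying replicates (30 for sample A on ERK, 40 for sample B on AKT) do not shift the aggregated values, which is the usual motivation for choosing the median over the mean.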
Heiko Mannsperger <[email protected]>
## Not run:
library(RPPanalyzer)
data(dataI)
dataI_bgcorr <- correctBG(dataI,method="normexp")
data.median <- sample.median(dataI_bgcorr)
## End(Not run)
The sample description file contains all information concerning the samples of a reverse phase protein experiment.
tab delimited text file
The sample description file contains information for sample annotation and data analysis.
To identify the sample in the source well plate, the columns plate, row and column are obligatory. It is necessary that every spotted well is described.
The columns sample_type and sample, as well as concentration and, for serially diluted samples, dilution, are required for data analysis.
To fit a model to a serial dilution, e.g. using the calcSdc function, it is necessary to mark the highest concentration in the dilution column with the value 1.
Any additional column can be added to describe further phenodata of interest.
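A dilution column that marks the highest concentration of each series with the value 1 can be derived from the concentration column. The sketch below is only an illustration: the sample names and concentrations are invented, and expressing the remaining steps as fractions of the highest concentration is an assumption, since the text only fixes the value 1 for the highest step.

```r
# Invented sample description with two four-step dilution series.
sampledescription <- data.frame(
  sample = rep(c("lysate_A", "lysate_B"), each = 4),
  concentration = rep(c(2, 1, 0.5, 0.25), times = 2),
  stringsAsFactors = FALSE
)

# Relative dilution: the highest concentration of each sample becomes 1.
sampledescription$dilution <- with(
  sampledescription,
  concentration / ave(concentration, sample, FUN = max)
)

# Check that every series contains the required value 1 as its maximum.
tapply(sampledescription$dilution, sampledescription$sample, max)
```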
The data set contains original reverse phase protein array signals with randomized pheno and feature data.
The sample description file contains all information concerning the samples of a reverse phase protein experiment.
tab delimited text file
The sample description file contains information for sample annotation and data analysis.
To identify the sample in the source well plate, the columns plate, row and column are obligatory. It is necessary that every spotted well is described.
The columns sample_type and sample, as well as concentration and, for serially diluted samples, dilution, are required for data analysis.
The column dilSeriesID is required for background correction based on serial dilutions.
Any additional column can be added to describe further phenodata of interest.
The data set contains original reverse phase protein array signals. A549 cells were starved for 24 h and subsequently stimulated with six different HGF concentrations ranging from 0 - 100 ng/ml. Samples were obtained at six different time points ranging from 0 - 120 min. The experiment was done in triplicates, and the samples were analysed by RPPA using antibodies directed against proteins and phosphoproteins of MET receptor signalling.
Selects the measurement samples defined as "measurement" in sample_type from an RPPA data list
select.measurements(x)
x |
List with RPPA data set |
expression |
matrix with protein expression data |
background |
matrix with protein background data or error values, depending on the input files |
arraydescription |
data frame with feature data |
sampledescription |
data frame with pheno data |
Heiko Mannsperger <[email protected]>
## Not run:
library(RPPanalyzer)
data(dataIII)
dataIII_median <- sample.median(dataIII)
measures <- select.measurements(dataIII_median)
## End(Not run)
Selects samples from an RPPA data list according to the selected parameter.
select.sample.group(x, params=list("tissue" = c("T", "N")), combine = F )
x |
List with RPPA data set |
params |
List of parameters the selection of samples is based on. The names of the list refer to columns of the sampledescription matrix. The corresponding values are the values in these columns that will be selected. |
combine |
Logical value. Indicates whether the samples should match at least one criterion given in the params list ( |
An RPPA data list containing only those samples that match the criteria given in the params list.
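The difference between matching at least one criterion and matching all criteria in a params list can be shown on a toy sample description. This is a sketch with invented data. It does not state which value of combine selects which mode, because the surviving argument description does not say.

```r
# Invented pheno data for six samples.
pheno <- data.frame(
  sample = paste0("s", 1:6),
  tissue = c("T", "N", "T", "X", "N", "T"),
  stimulation = c("A", "A", "B", "C", "C", "C"),
  stringsAsFactors = FALSE
)
params <- list(tissue = c("T", "N"), stimulation = c("A", "B"))

# One logical column per criterion: does the sample carry an allowed value?
hits <- sapply(names(params), function(p) pheno[[p]] %in% params[[p]])

match_all <- apply(hits, 1, all)  # every criterion fulfilled
match_any <- apply(hits, 1, any)  # at least one criterion fulfilled

pheno$sample[match_all]
pheno$sample[match_any]
```

With these toy values, three samples fulfil both criteria, while five fulfil at least one.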
Heiko Mannsperger <[email protected]>, Stephan Gade <[email protected]>
## Not run:
library(RPPanalyzer)
data(dataII)
selectedData <- select.sample.group(dataII,params=list("stimulation"=c("A","B")))
## End(Not run)
The data set is a list of four elements. Expression and background are matrices containing signal intensities; the data frames arraydescription and sampledescription contain feature and phenodata.
data(ser.dil.samples)
list
The data set is a subset of the data set dataI to shorten the running time during the R CMD check process. The data set contains information about the localization of the samples.
The data set contains original reverse phase protein array signals with randomized pheno and feature data.
## Not run:
data(ser.dil.samples)
str(ser.dil.samples)
## End(Not run)
Draws boxplots of groups of an RPPA data set. Additionally a grouping order for plotting can be provided here.
simpleBoxplot(x, param, orderGrp=NULL, file = "boxplot_groups.pdf")
x |
List with RPPA data with aggregated replicate spots |
param |
Character value of one of the columns of the sampledescription matrix, i.e. x[[4]], describing the phenodata that should be analyzed |
orderGrp |
defines the ordering of the subgroups in |
file |
Title of the file that will be exported. |
Generates a PDF file
Silvia von der Heyde, Heiko Mannsperger
## Not run:
library(RPPanalyzer)
data(dataIII)
dataIII_median <- sample.median(dataIII)
simpleBoxplot(x=dataIII_median, param="rank",
  orderGrp=c("vx","zx","yzr","rxi"), file="simpleBoxplot.pdf")
## End(Not run)
The slide description file contains all information concerning the arrays of a reverse phase protein experiment.
tab delimited text file
The slide description file contains information for array annotation and data analysis.
To find the GenePix result files (gpr files) in the current working directory, the names of the gpr files have to match the entries in the gpr column.
To identify the array on the slides, the columns pad, slide, spotting_run and incubation_run are obligatory. It is necessary that every spotted well is described.
The columns sample_type and sample, as well as concentration and (for serially diluted samples) dilution, are required for data analysis.
The column target describes the analyzed proteins and AB_ID contains an identifier for the antibody used for the detection.
Any additional column can be added to describe further phenodata of interest.
The data set contains the incubation data from reverse phase protein arrays with randomized feature data.
The slide description file contains all information concerning the arrays of a reverse phase protein experiment.
tab delimited text file
The slide description file contains information for array annotation and data analysis.
To find the GenePix result files (gpr files) in the current working directory, the names of the gpr files have to match the entries in the gpr column.
To identify the array on the slides, the columns pad, slide, spotting_run and incubation_run are obligatory. It is necessary that every spotted well is described.
The columns sample_type and sample, as well as concentration and (for serially diluted samples) dilution, are required for data analysis.
The column target describes the analyzed proteins and AB_ID contains an identifier for the antibody used for the detection.
Any additional column can be added to describe further phenodata of interest.
The data set contains the incubation data from reverse phase protein arrays for the HGF data set. These are 3 sample slides plus one slide for FCF normalization.
Tests for correlation between protein expression value and any continuous data using cor.test.
test.correlation(x, param, method.cor = "kendall", method.padj = "BH", file = "correlation_plot.pdf")
x |
List containing RPPA data set |
param |
character describing the parameter |
method.cor |
character string describing the correlation |
method.padj |
character string describing the method for the p-value correction for multiple testing. |
file |
character string naming the exported PDF file |
Generates a PDF file
Heiko Mannsperger <[email protected]>
For information about the argument method.cor see cor.test; information about method.padj can be found under p.adjust.
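The combination of cor.test and p.adjust referred to above can be tried directly in base R. The following sketch uses simulated data and the defaults named in the usage ("kendall" and "BH"). It shows the general pattern only, not the internals of test.correlation, and the target names and the staging variable are invented.

```r
set.seed(7)
staging <- rep(1:4, each = 5)  # invented continuous or ordinal phenodata
expr <- cbind(
  ERK = staging + rnorm(20),   # positively associated with staging
  AKT = rnorm(20),             # unrelated to staging
  MET = -staging + rnorm(20)   # negatively associated with staging
)

# One Kendall correlation test per target; exact = FALSE because of ties.
res <- t(apply(expr, 2, function(v) {
  ct <- cor.test(v, staging, method = "kendall", exact = FALSE)
  c(tau = unname(ct$estimate), p = ct$p.value)
}))

# Benjamini-Hochberg correction across all tested targets.
res <- cbind(res, p_adj = p.adjust(res[, "p"], method = "BH"))
res
```

Adjusting across targets matters because one test is run per antibody, so the number of comparisons grows with the size of the array panel.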
## Not run:
library(RPPanalyzer)
data(dataIII)
dataIII_median <- sample.median(dataIII)
test.correlation(dataIII_median,param="staging")
## End(Not run)
Writes the 3 or 4 elements of an RPPA data list into one or two csv files which can easily be imported into spreadsheet software.
write.Data(x,FileNameExtension="Data")
x |
List with RPPA data set |
FileNameExtension |
character string which will be added to the name of the exported file |
One or two csv files, depending on the length of the RPPA data list
Heiko Mannsperger <[email protected]>
## Not run:
library(RPPanalyzer)
data(dataII)
write.Data(dataII)
## End(Not run)