The R Statistics Component Collection integrates the widely used, open-source R statistics package into Pipeline Pilot workflows. It serves both the expert R user and the person who wants to use R’s statistical capabilities without having to learn R.
The collection allows you to perform exploratory analyses, create informative graphics, and make informed decisions. It includes components that implement statistical methods for data manipulation, clustering, model building, and data analysis. You can feed results from R directly back into your pipeline for further analysis with other components in the Pipeline Pilot framework. You can also use your existing R scripts in custom Pipeline Pilot components, so that you can reuse them in different protocols or share them across the organization.
With the R Statistics Component Collection, you can:
Correlate multiple properties in a heat map display to see which ones are most relevant
View distributions among population subgroups using box plots
Perform an ANOVA to determine whether the means of multiple data sets differ
Model data with logistic regression, a support vector machine (SVM), a neural network, or any of 10 other learning methods
Apply the models you build to make predictions for new data sets, including model applicability domain (MAD) support to ensure the models are applied properly
Save training data with any model, allowing the model to be extended as more experimental data becomes available
Apply a variety of clustering methods
Apply your own R script to each Pipeline Pilot data record or to the data stream as a whole
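To give a sense of the kind of R code that sits behind two of the capabilities listed above (ANOVA and logistic regression), here is a minimal sketch in plain R. It is an illustration only: it runs in a standalone R session, uses the PlantGrowth and mtcars example data sets that ship with base R, and does not show the Pipeline Pilot component interface, which is not described here. The variable names are chosen for this example.

```r
# Plain R sketch, run outside Pipeline Pilot; uses only data sets bundled with base R.

# One-way ANOVA: do mean plant weights differ across the three treatment groups?
fit_aov <- aov(weight ~ group, data = PlantGrowth)
aov_table <- summary(fit_aov)[[1]]          # ANOVA table as a data frame
p_value <- aov_table[["Pr(>F)"]][1]         # p-value for the group effect

# Logistic regression: model transmission type (am: 0 = automatic, 1 = manual)
# from car weight, then apply the fitted model to new records.
fit_glm <- glm(am ~ wt, data = mtcars, family = binomial)
new_cars <- data.frame(wt = c(2.0, 3.5))    # hypothetical new data
predicted_prob <- predict(fit_glm, newdata = new_cars, type = "response")

print(aov_table)
print(predicted_prob)
```

A script along these lines is the sort of existing R code that could be placed in a custom component, with the built-in example data replaced by records from the pipeline.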