The Modeling Collection offers a comprehensive set of learning and data modeling capabilities, statistical filters, and clustering components optimized for large real-world data sets. This collection of components extends Pipeline Pilot's standard capabilities to include statistics and predictive modeling for data mining applications.
Create protocols with powerful methods such as:
Fast data clustering
Unsupervised categorical learning using Bayesian Statistics
Principal component analysis
Linear regression and partial least squares regression
The modeling components include built-in validation techniques, such as several forms of cross-validation, to help ensure the quality of the models built. They also provide methods to assess the quality of each prediction as the model is subsequently applied. This is particularly important as models are increasingly deployed to end users who may not be familiar with the training data or the limits of the modeling method.
When you combine the Modeling Collection with the separately available Chemistry Collection, you can perform:
Structure activity modeling
Compound clustering
Maximal common substructure search
Use the Modeling Collection components when you need to create predictive models and characterize and mine data.
The Modeling Collection includes several enhanced viewers tailored for interpreting model results, such as enrichment plots.