PIPELINE PILOT

Empower your research team with a flexible scientific platform that drives efficiency, collaboration and innovation.

Data Modeling Component Collection for Pipeline Pilot

The Data Modeling Component Collection offers a comprehensive set of learning and data modeling capabilities, statistical filters, and clustering components optimized for large real-world data sets. This collection of components extends Pipeline Pilot's standard capabilities to include statistics and predictive modeling for data mining applications.

Analyze and model your data using methods such as:

  • Fast data clustering
  • Categorical learning using Bayesian statistics
  • Principal component analysis (PCA)
  • Linear regression, partial least squares (PLS) regression, and k-nearest neighbor (kNN) regression
  • ROC plots, enrichment plots, and other visualization techniques for evaluating model quality
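As an illustration of what a ROC-based quality check measures, the area under the ROC curve can be computed as the fraction of active/inactive pairs that a model ranks correctly. The sketch below is plain Python and is not Pipeline Pilot code; the function name `roc_auc` and the sample scores and labels are invented for the example.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise ranking.

    scores: model outputs, higher means "more likely active".
    labels: 1 for active, 0 for inactive (hypothetical sample data).
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # active ranked above inactive
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Illustrative data only: six samples, three of them active.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
```

A value of 1.0 means every active is ranked above every inactive, while 0.5 is what random ranking would give on average.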

The model-building components provide validation methods such as cross-validation to estimate how well each model will perform on new data. They also provide model applicability domain (MAD) methods to assess the reliability of each prediction as the model is subsequently applied. This is particularly important as models are increasingly deployed to end users who may not be familiar with the training data or the limitations of a particular model. Training data can be saved with any model, allowing the model to be extended as more experimental data becomes available.
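The two ideas in this paragraph can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the collection's implementation: it uses a one-variable least-squares model, a simple interleaved k-fold split, and a range check as a stand-in for an applicability domain test. The names `fit_line`, `k_fold_rmse`, and `in_domain`, and the sample data, are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def k_fold_rmse(xs, ys, k=5):
    """Cross-validated RMSE: each sample is held out exactly once."""
    n = len(xs)
    sq_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th sample forms a fold
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        a, b = fit_line(train_x, train_y)
        for i in test_idx:
            sq_errors.append((ys[i] - (a + b * xs[i])) ** 2)
    return (sum(sq_errors) / len(sq_errors)) ** 0.5

def in_domain(x, train_xs):
    """Range-based check (an assumption): is x inside the training range?"""
    return min(train_xs) <= x <= max(train_xs)

# Illustrative, noise-free data: y = 2x + 1.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
cv_rmse = k_fold_rmse(xs, ys, k=5)
a, b = fit_line(xs, ys)
```

Here a prediction at `x = 4.5` falls inside the training range, while one at `x = 25.0` is an extrapolation that `in_domain` would flag. Real applicability domain methods are typically richer than a range check, for example distance- or density-based.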

Combined with the separately available Chemistry Collection, the Data Modeling Component Collection also supports:

  • Structure-activity modeling
  • Compound clustering
  • Structural similarity searching

