PIPELINE PILOT

Empower your research team with a flexible scientific platform that drives efficiency, collaboration and innovation.

Advanced Data Modeling Component Collection

The Pipeline Pilot Advanced Data Modeling Component Collection provides components for Recursive Partitioning (RP) classification models, Genetic Function Approximation (GFA) regression models, and multi-objective Pareto Optimization.

  • The Recursive Partitioning (RP) components provide multiple methods for building single-tree or forest models, for either a single response property or several at once.
  • Genetic Function Approximation (GFA) components use a sophisticated genetic algorithm to perform variable selection and build multiple models, which can be combined into a consensus or ensemble model. The models identify relationships between properties and the response in a set of compounds or other data.
  • The Pareto Optimization components solve multi-objective optimization problems, returning the solutions that give the best tradeoff among two or more conflicting goals.

With the Recursive Partitioning Components you can:

  • Perform rapid learning and data mining experiments on large data sets with large numbers of descriptors, including molecular data sets using fingerprints as descriptors
  • Visualize trees to understand the relationships between descriptors and responses
  • Analyze variable importance to identify the most discriminating descriptors
  • Rapidly apply models to make predictions for new data sets, with model applicability domain (MAD) support to help ensure a model is applied only to data it is suited for (MAD support also applies to GFA models)

With GFA Components you can:

  • Return multiple models rather than a single "best" model by creating a number of trial models, thereby generating multiple hypotheses for further investigation
  • Combine multiple models into a single ensemble model, which often yields better predictive performance than any one of its component models
  • Plot variable usage statistics over the evolution of the model population, giving insight into the descriptors most responsible for determining the response
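
To illustrate the ensemble idea, here is a minimal Python sketch that combines several trial models into one consensus model by averaging their predictions. It is not Pipeline Pilot code: the three trial models, their coefficients, and the use of a simple mean are all assumptions made for illustration, and the GFA components may combine models differently.

```python
from statistics import mean
from typing import Callable, List, Sequence

# A "model" here is any function mapping a descriptor vector to a predicted response.
Model = Callable[[Sequence[float]], float]


def consensus(models: List[Model]) -> Model:
    """Build an ensemble that averages the predictions of its member models."""
    def predict(x: Sequence[float]) -> float:
        return mean([m(x) for m in models])
    return predict


# Hypothetical trial models, each using a different subset of two descriptors.
trial_models: List[Model] = [
    lambda x: 1.0 + 2.0 * x[0],               # uses descriptor 0 only
    lambda x: 0.5 + 1.5 * x[0] + 0.5 * x[1],  # uses both descriptors
    lambda x: 3.0 * x[1],                     # uses descriptor 1 only
]

ensemble = consensus(trial_models)
prediction = ensemble((1.0, 2.0))
print(prediction)
```

Averaging smooths out the errors of individual models, which is one common reason an ensemble can predict better than any single member.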

With the Pareto Optimization Components you can:

  • Optimize solutions for problems as diverse as combinatorial library design, formulation ingredient optimization, or stock portfolio risk management
  • Find individual samples within a data set that have the best tradeoff among desired property values
  • Find subsets of samples from a larger data set that collectively have the best tradeoffs among desired property values
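
As a rough illustration of finding the samples with the best tradeoff, the Python sketch below selects the non-dominated (Pareto-optimal) samples from a small data set. It is not the collection's implementation: the function names, the sample data, and the convention that every objective is minimized are assumptions made for this example.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on at least one.

    All objectives are assumed to be minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(samples: List[Sequence[float]]) -> List[int]:
    """Return the indices of samples that no other sample dominates."""
    return [
        i
        for i, s in enumerate(samples)
        if not any(dominates(t, s) for j, t in enumerate(samples) if j != i)
    ]


# Hypothetical candidates scored on two conflicting objectives: (cost, toxicity).
candidates = [(1.0, 9.0), (2.0, 7.0), (3.0, 8.0), (4.0, 3.0), (5.0, 4.0)]
front = pareto_front(candidates)
print(front)  # → [0, 1, 3]
```

Here (3.0, 8.0) is excluded because (2.0, 7.0) is better on both objectives, and (5.0, 4.0) is excluded because (4.0, 3.0) is. This pairwise comparison is quadratic in the number of samples, so it suits small data sets only.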


 
