This talk was given by Mark Kotanchek at the Inaugural Workshop of the Program on Law and Computation at the University of Houston Law Center:
The real world violates many of the assumptions underlying conventional data analysis: it presents superfluous variables, correlated inputs, nonlinear couplings, noise and measurement errors, redundant data, too many factors with too little data, and so on. DataModeler’s symbolic regression approach copes with these realities by developing diverse, human-interpretable model forms that automatically focus on the driving factors and favor minimal complexity. Effectively, we let the data speak for itself and avoid imposing constraints that stem from the limits of human imagination and creativity.
These diverse expressions form the foundation of trustworthy models that can detect inappropriate model use (e.g., in hitherto unexplored data regions) or changes in the fundamentals of the targeted response behavior. Furthermore, we can use these models to detect outliers (points that are either critically important or should be ignored) and to guide future data collection that improves model quality and fidelity. This talk is a whirlwind review of the key concepts and implications of this unique and powerful technology.