Researchers use "big" molecular and imaging data together with machine learning (ML) algorithms to build predictive models for specific diseases or phenotypes. Unfortunately, most of these models perform worse when tested on unseen data, often because of over-fitting. One possible mitigation of this pervasive problem is to embed prior biological knowledge (for example, interacting gene or feature pairs and networks) directly into the decision rules, with the aim of reducing over-fitting and obtaining more consistent and robust predictive signatures. This approach also enhances the translational value of the derived classifiers by suggesting candidate causal explanations for the disease phenotypes. In conclusion, embedding biological mechanisms into statistical learning holds the promise of moving the field towards personalised health care.
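
As a rough illustration of what embedding prior knowledge into decision rules could look like, the sketch below builds a small rank-based classifier whose candidate rules are limited to gene pairs supplied as prior knowledge. It is a minimal sketch under stated assumptions, not the method presented in the talk: the gene names (G1 to G4), the expression values, the function names and the majority-vote scheme are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Sample = Dict[str, float]   # gene name -> expression value
Pair = Tuple[str, str]      # (gene_a, gene_b), read as the rule "gene_a > gene_b"


def pair_score(samples: Sequence[Sample], labels: Sequence[int], pair: Pair) -> float:
    """Return P(a > b | label 1) - P(a > b | label 0) for one gene pair."""
    a, b = pair
    pos = [s[a] > s[b] for s, y in zip(samples, labels) if y == 1]
    neg = [s[a] > s[b] for s, y in zip(samples, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in the training data")
    return sum(pos) / len(pos) - sum(neg) / len(neg)


def fit_pair_rules(samples: Sequence[Sample], labels: Sequence[int],
                   prior_pairs: Sequence[Pair], k: int = 3) -> List[Pair]:
    """Keep the k prior-knowledge pairs whose ordering best separates the classes.

    Only pairs listed in prior_pairs are considered, so the search space is
    constrained by biology rather than by all possible gene pairs.
    """
    scored = []
    for a, b in prior_pairs:
        score = pair_score(samples, labels, (a, b))
        # Orient each rule so that "first gene > second gene" votes for class 1.
        oriented = (a, b) if score >= 0 else (b, a)
        scored.append((abs(score), oriented))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in scored[:k]]


def predict(sample: Sample, rules: Sequence[Pair]) -> int:
    """Majority vote over the selected pair rules (ties go to class 0)."""
    votes = sum(sample[a] > sample[b] for a, b in rules)
    return 1 if 2 * votes > len(rules) else 0


# Hypothetical toy data: two samples per class, four made-up genes.
train = [
    {"G1": 5.0, "G2": 1.0, "G3": 2.0, "G4": 3.0},
    {"G1": 4.0, "G2": 2.0, "G3": 1.0, "G4": 3.0},
    {"G1": 1.0, "G2": 5.0, "G3": 2.0, "G4": 3.0},
    {"G1": 2.0, "G2": 4.0, "G3": 1.0, "G4": 3.0},
]
labels = [1, 1, 0, 0]
# Assumed prior knowledge: these pairs are said to interact.
prior_pairs = [("G1", "G2"), ("G3", "G4")]

rules = fit_pair_rules(train, labels, prior_pairs, k=1)
```

Because each rule depends only on the relative ordering of two measurements within a sample, a classifier of this kind has very few free parameters, which is one intuition for why restricting rules to known interactions might limit over-fitting.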


Luigi Marchionni, M.D., Ph.D., is Associate Professor and Vice-Chair for Computational and Systems Pathology at Weill Cornell Medicine. Prof. Marchionni works in close collaboration with “wet lab” researchers to uncover molecular contributions to interesting cancer phenotypes. His current research focuses on knowledge integration across different “omics” and imaging data types, the development of novel prediction algorithms for cancer prognostication and therapy selection, and the integration of “omics-based” predictors into the current clinical management of cancer patients.