Event

Research Seminar: Perturbation Validation: A New Heuristic to Validate Machine Learning Models

  • Speaker  Jie Zhang, UCL

  • Venue

    Room E004, JFK Building, 29 Avenue J.F. Kennedy, L-1855 Kirchberg, Luxembourg

This talk introduces Perturbation Validation (PV), a new heuristic for validating machine learning models. PV does not rely on test data. Instead, it perturbs the training-data labels, retrains the model on the perturbed data, and uses the resulting rate of decrease in training accuracy to assess model relevance. This also differs from traditional statistical approaches, which make judgements based on the original training accuracy. We evaluated PV on 10 real-world data sets and 6 synthetic data sets.
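
In outline, the procedure might look like the Python sketch below, assuming a scikit-learn-style estimator. The function name perturbation_validation, the perturbation ratios, and the slope-based decrease-rate statistic are illustrative assumptions, not the authors' exact formulation.

    # A minimal sketch of the PV idea (assumptions: label-flipping
    # perturbation and a slope-based summary statistic; the paper's
    # exact schedule and measure may differ).
    import numpy as np
    from sklearn.base import clone

    def perturbation_validation(model, X, y, ratios=(0.1, 0.2, 0.3), seed=0):
        """Flip a growing fraction of training labels, retrain, and
        report how quickly training accuracy decays."""
        rng = np.random.default_rng(seed)
        classes = np.unique(y)
        base_acc = clone(model).fit(X, y).score(X, y)  # unperturbed training accuracy
        accs = []
        for r in ratios:
            y_pert = y.copy()
            idx = rng.choice(len(y), size=int(r * len(y)), replace=False)
            for i in idx:
                # Reassign each chosen example to a different class.
                y_pert[i] = rng.choice(classes[classes != y[i]])
            accs.append(clone(model).fit(X, y_pert).score(X, y_pert))
        # Rate at which training accuracy falls as label noise grows; a
        # well-fitted model's accuracy should drop roughly with the noise.
        decrease_rate = -np.polyfit(ratios, accs, 1)[0]
        return base_acc, accs, decrease_rate

    # Example usage (hypothetical data):
    # from sklearn.datasets import make_classification
    # from sklearn.linear_model import LogisticRegression
    # X, y = make_classification(n_samples=500, random_state=0)
    # print(perturbation_validation(LogisticRegression(max_iter=1000), X, y))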

Our results demonstrate that PV is more discriminating about model fit than existing validation approaches, and that it accords well with widely held intuitions concerning the properties of a good measure of model fit. We also show that PV complements existing validations, allowing us to disentangle some of the issues present in the recently debated “apparent paradox” that “high capacity, potentially overfitted” models may nevertheless exhibit good generalisation ability.

Jie Zhang is a research fellow at CREST, UCL, working with Prof. Mark Harman. She received her PhD from Peking University in 2018. She won the 2016 MSRA Fellowship, the Top-Ten Research Excellence Award of EECS, Peking University, and the 2015 National Scholarship. She is co-chair of ASE SRC 2019 and Mutation 2020, and a program committee member of FSE 2020, ISSTA 2020, ASE 2019, and ASE 2020. Her main research interests are software testing and machine learning testing.
