Dylan Slack

Hello! I am a second-year PhD student at the University of California, Irvine, advised by Sameer Singh and co-advised by Hima Lakkaraju.

I work on machine learning and natural language processing as part of UCI NLP, UCI CREATE, and the HPI Research Center. My research is supported by an HPI fellowship. I've previously interned at AWS and will be interning at Google AI in Summer 2021 👨‍💻.

Blog /  CV /  Bio /  Google Scholar /  Github /  Twitter


I work on machine learning, natural language processing, interpretability, and fairness. Much of my research focuses on developing models that are robust, trustworthy, and equitable. * denotes equal contribution.

How Much Should I Trust You? Modeling Uncertainty of Black Box Explanations
Dylan Slack, Sophie Hilgard, Sameer Singh, and Hima Lakkaraju
arXiv, 2020
arXiv / bibtex

Bayesian variants of LIME and KernelSHAP provide guarantees on the quality of local explanations and can generate them faster.

Defuse: Harnessing Unrestricted Adversarial Examples for Debugging Models Beyond Test Accuracy
Dylan Slack, Nathalie Rauschmayr, and Krishnaram Kenthapadi
arXiv, 2020
arXiv / bibtex

Defuse generates and corrects novel errors in classifiers.

Differentially Private Language Models Benefit from Public Pre-training
Gavin Kerrigan*, Dylan Slack*, and Jens Tuyls*
EMNLP PrivateNLP Workshop, 2020
code / arXiv / bibtex

Pre-training on public data helps differentially private models learn more quickly.

Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods
Dylan Slack*, Sophie Hilgard*, Emily Jia, Sameer Singh, and Hima Lakkaraju
AIES, 2020   (Oral Presentation)
Work also presented at SafeAI Workshop, AAAI, 2020
code / video / arXiv / bibtex

Adversaries can train models to make LIME & SHAP return any explanation they want.

Press: Deeplearning.ai / Harvard Business Review

Fairness Warnings and Fair-MAML: Learning Fairly with Minimal Data
Dylan Slack, Sorelle Friedler, and Emile Givental
FAccT, 2020  
Work also presented at HCML Workshop, NeurIPS, 2019
code / video / arXiv / bibtex

Fairness Warnings tell you when a fair model won't transfer well to a new setting; when a few data points are available, we can use meta-learning to help.

Assessing the Local Interpretability of Machine Learning Models
Dylan Slack*, Sorelle Friedler, Carlos Scheidegger, and Chitradeep Dutta Roy
HCML Workshop, NeurIPS, 2019
arXiv / bibtex

The number of computations (e.g., +, -, >) a user must perform to trace a model's output can serve as a proxy for the model's interpretability.


Automatic Failure Diagnosis and Correction in Machine Learning Models
Nathalie Rauschmayr, Krishnaram Kenthapadi, and Dylan Slack
Patent Application Filed


Here are a few recent talks!


Speaking at AISC, virtually.


Speaking at FAccT, in Barcelona, Spain.

Source modified from this website.