Borja de Balle Pigem

Research Scientist @ DeepMind

From April 2017 until May 2019 I was a senior machine learning scientist in Neil Lawrence's team at Amazon (Cambridge, UK). Before that I was a lecturer (≈ assistant professor) at Lancaster University, affiliated with the Department of Mathematics and Statistics and the Data Science Institute. From October 2013 until September 2015 I was a post-doctoral fellow in the Reasoning and Learning Laboratory at McGill University, where I worked with Prakash Panangaden, Joelle Pineau, and Doina Precup. I obtained my PhD in 2013 from UPC after working in the LARCA research group under the supervision of Jorge Castro and Ricard Gavaldà. During my PhD I spent several months visiting Mehryar Mohri at the Courant Institute (NYU).

Contact

borja /dot/ balle /at/ gmail /dot/ theusual

My research interests revolve around all aspects of Machine Learning: theory, algorithms, and applications.

Currently I focus on the foundations of privacy-preserving data analysis, including Differential Privacy and Private Multi-Party Machine Learning.

In the past I worked on scalable spectral algorithms for learning latent-variable models inspired by Language Theory and Dynamical Systems, and motivated by applications in Natural Language Processing and Reinforcement Learning.

Papers available here may be subject to copyright and are intended for personal, non-commercial use only.

S. Ghalebikesabi, L. Berrada, S. Gowal, I. Ktena, R. Stanforth, J. Hayes, S. De, S. L. Smith, O. Wiles, and B. Balle
Differentially Private Diffusion Models Generate Useful Synthetic Images
ArXiv Preprint, 2023
[arXiv]

M. Nasr, J. Hayes, T. Steinke, B. Balle, F. Tramèr, M. Jagielski, N. Carlini, A. Terzis
Tight Auditing of Differentially Private Machine Learning
ArXiv Preprint, 2023
[arXiv]

J. Hayes, S. Mahloujifar, B. Balle
Bounding Training Data Reconstruction in DP-SGD
ArXiv Preprint, 2023
[arXiv]

N. Carlini, J. Hayes, M. Nasr, M. Jagielski, V. Sehwag, F. Tramèr, B. Balle, D. Ippolito, E. Wallace
Extracting Training Data from Diffusion Models
ArXiv Preprint, 2023
[arXiv]

S. De, L. Berrada, J. Hayes, S. L. Smith, and B. Balle
Unlocking High-Accuracy Differentially Private Image Classification through Scale
ArXiv Preprint, 2022
[arXiv] [code]

B. Balle, G. Cherubin, and J. Hayes
Reconstructing Training Data with Informed Adversaries
IEEE Symposium on Security and Privacy (S&P), 2022
[doi] [arXiv] [code]

L. Weidinger, J. Uesato, M. Rauh, C. Griffin, P.-S. Huang, J. Mellor, A. Glaese, M. Cheng, B. Balle, A. Kasirzadeh, C. Biles, S. Brown, Z. Kenton, W. Hawkins, T. Stepleton, A. Birhane, L. A. Hendricks, L. Rimell, W. S. Isaac, J. Haas, S. Legassick, G. Irving, and I. Gabriel
Ethical and Social Risks of Harm from Language Models
ACM Conference on Fairness, Accountability, and Transparency (FAccT), 2022
[arXiv (long version)] [proceedings (short version)]

B. Balle, P. Gourdeau, and P. Panangaden
Bisimulation Metrics and Norms for Real-Weighted Automata
Information and Computation, Vol. 282, 2022
[doi]

B. Balle and G. Rabusseau
Approximate Minimization of Weighted Tree Automata
Information and Computation, Vol. 282, 2022
[doi]

B. Balle, C. Lacroce, P. Panangaden, D. Precup, and G. Rabusseau
Optimal Spectral-Norm Approximate Minimization of Weighted Finite Automata
International Colloquium on Automata, Languages, and Programming (ICALP), 2021
[proceedings]

B. Balle, P. Kairouz, H. B. McMahan, O. Thakkar, and A. Thakurta
Privacy Amplification via Random Check-Ins
Neural Information Processing Systems (NeurIPS), 2020
[arXiv]

B. Balle, J. Bell, A. Gascon, and K. Nissim
Private Summation in the Multi-Message Shuffle Model
ACM Conference on Computer and Communications Security (CCS), 2020
[arXiv]

G. Vietri, B. Balle, A. Krishnamurthy, and S. Wu
Private Reinforcement Learning with PAC and Regret Guarantees
International Conference on Machine Learning (ICML), 2020

B. Avent, J. Gonzalez, T. Diethe, A. Paleyes, and B. Balle
Automatic Discovery of Privacy-Utility Pareto Fronts
Proceedings on Privacy Enhancing Technologies (PoPETS), 2020
[code]
(Andreas Pfitzmann Best Student Paper Award)

P. Schoppmann, L. Vogelsang, A. Gascon, and B. Balle
Secure and Scalable Document Similarity on Distributed Databases: Differential Privacy to the Rescue
Proceedings on Privacy Enhancing Technologies (PoPETS), 2020
[code]

Y.-X. Wang, B. Balle, and S. Kasiviswanathan
Subsampled Rényi Differential Privacy and Analytical Moments Accountant (Journal Version)
Journal of Privacy and Confidentiality, Vol. 10, Num. 2, 2020
[doi]

B. Balle, G. Barthe, and M. Gaboardi
Privacy Profiles and Amplification by Subsampling
Journal of Privacy and Confidentiality, Vol. 10, Num. 1, 2020
[doi]

A.-H. Karimi, G. Barthe, B. Balle, and I. Valera
Model-Agnostic Counterfactual Explanations for Consequential Decisions
Artificial Intelligence and Statistics Conference (AISTATS), 2020
[PMLR] [code] [video]

B. Balle, G. Barthe, M. Gaboardi, J. Hsu, and T. Sato
Hypothesis Testing Interpretations and Renyi Differential Privacy
Artificial Intelligence and Statistics Conference (AISTATS), 2020
[PMLR]

H. Husain, B. Balle, Z. Cranko, R. Nock
Local Differential Privacy for Sampling
Artificial Intelligence and Statistics Conference (AISTATS), 2020
[PMLR]

K. (Dj) Dvijotham, J. Hayes, B. Balle, Z. Kolter, C. Qin, A. Gyorgy, K. Xiao, S. Gowal, P. Kohli
A Framework for Robustness Certification of Smoothed Classifiers Using f-Divergences
International Conference on Learning Representations (ICLR), 2020
[OpenReview]

O. Feyisetan, B. Balle, T. Drake, and T. Diethe
Privacy- and Utility-Preserving Textual Analysis via Calibrated Multivariate Perturbations
International Conference on Web Search and Data Mining (WSDM), 2020
[arXiv]

B. Balle, G. Barthe, M. Gaboardi, and J. Geumlek
Privacy Amplification by Mixing and Diffusion Mechanisms
Neural Information Processing Systems (NeurIPS), 2019
[arXiv]

B. Balle, J. Bell, A. Gascon, and K. Nissim
The Privacy Blanket of the Shuffle Model
International Cryptology Conference (CRYPTO), 2019
[arXiv] [code]

Y.-X. Wang, B. Balle, and S. Kasiviswanathan
Subsampled Rényi Differential Privacy and Analytical Moments Accountant
Artificial Intelligence and Statistics Conference (AISTATS), 2019
[arXiv]
(Notable Paper Award; Oral Presentation)

B. Balle, P. Panangaden, and D. Precup
Singular Value Automata and Approximate Minimization
Mathematical Structures in Computer Science, 2019
[doi] [arXiv]

B. Balle, G. Barthe, and M. Gaboardi
Privacy Amplification by Subsampling: Tight Analyses via Couplings and Divergences
Neural Information Processing Systems (NeurIPS), 2018
[arXiv]

B. Balle and Y.-X. Wang
Improving the Gaussian Mechanism for Differential Privacy: Analytical Calibration and Optimal Denoising
International Conference on Machine Learning (ICML), 2018
[arXiv] [code]

Y. Grinberg, H. Aboutalebi, M. Lyman-Abramovitch, B. Balle, and D. Precup
Learning Predictive State Representations from Non-uniform Sampling
AAAI Conference on Artificial Intelligence (AAAI), 2018

M. Ruffini, G. Rabusseau, and B. Balle
Hierarchical Methods of Moments
Neural Information Processing Systems (NIPS), 2017
[proceedings] [supplementary]

G. Rabusseau, B. Balle, and J. Pineau
Multitask Spectral Learning of Weighted Automata
Neural Information Processing Systems (NIPS), 2017
[proceedings] [supplementary]

B. Balle and O.-A. Maillard
Spectral Learning from a Single Trajectory under Finite-State Policies
International Conference on Machine Learning (ICML), 2017
[PMLR] [supplementary]

B. Balle, P. Gourdeau, and P. Panangaden
Bisimulation Metrics for Weighted Automata
International Colloquium on Automata, Languages, and Programming (ICALP), 2017
[arXiv]

A. Gascon, P. Schoppmann, B. Balle, M. Raykova, J. Doerner, S. Zahur, and D. Evans
Privacy-Preserving Distributed Linear Regression on High-Dimensional Data
Proceedings on Privacy Enhancing Technologies (PoPETS), 2017
[doi]

B. Balle and M. Mohri
Generalization Bounds for Learning Weighted Automata
Theoretical Computer Science, 2017
[arXiv] [doi]

B. Balle, M. Gomrokchi, and D. Precup
Differentially Private Policy Evaluation
International Conference on Machine Learning (ICML), 2016
[arXiv]

L. Langer, B. Balle, and D. Precup
Learning Multi-Step Predictive State Representations
International Joint Conference on Artificial Intelligence (IJCAI), 2016

G. Rabusseau, B. Balle, and S. B. Cohen
Low-Rank Approximation of Weighted Tree Automata
Artificial Intelligence and Statistics Conference (AISTATS), 2016

C. Zhou, B. Balle, and J. Pineau
Learning Time Series Models for Pedestrian Motion Prediction
IEEE International Conference on Robotics and Automation (ICRA), 2016
[code]

B. Wang, B. Balle, and J. Pineau
Multitask Generalized Eigenvalue Program
AAAI Conference on Artificial Intelligence (AAAI), 2016

B. Balle and M. Mohri
On the Rademacher Complexity of Weighted Automata
Algorithmic Learning Theory (ALT), 2015

B. Balle and M. Mohri
Learning Weighted Automata
Conference on Algebraic Informatics (CAI), 2015
(Invited Paper)

P. L. Bacon, B. Balle, and D. Precup
Learning and Planning with Timing Information in Markov Decision Processes
Uncertainty in Artificial Intelligence (UAI), 2015

L. Addario-Berry, B. Balle, and G. Perarnau
Diameter and Stationary Distribution of Random r-out Digraphs
ArXiv Preprint, 2015
[arXiv]

B. Balle, P. Panangaden, and D. Precup
A Canonical Form for Weighted Automata and Applications to Approximate Minimization
Logic in Computer Science (LICS), 2015
[arXiv]

B. Balle, W. Hamilton, and J. Pineau
Methods of Moments for Learning Stochastic Languages: Unified Presentation and Empirical Comparison
International Conference on Machine Learning (ICML), 2014

A. Quattoni, B. Balle, X. Carreras, and A. Globerson
Spectral Regularization for Max-Margin Sequence Tagging
International Conference on Machine Learning (ICML), 2014

B. Balle, X. Carreras, F. M. Luque, and A. Quattoni
Spectral Learning of Weighted Automata: A Forward-Backward Perspective
Machine Learning, Vol. 96, No. 1, 2014
[doi]

B. Balle, J. Castro, and R. Gavaldà
Adaptively Learning Probabilistic Deterministic Automata from Data Streams
Machine Learning, Vol. 96, No. 1, 2014
[doi]

B. Balle
Ergodicity of Random Walks on Random DFA
ArXiv Preprint, 2013
[arXiv]

B. Balle
Learning Finite-State Machines: Algorithmic and Statistical Aspects
PhD Thesis, 2013

B. Balle, B. Casas, A. Catarineu, R. Gavaldà, and D. Manzano-Macho
The Architecture of a Churn Prediction System Based on Stream Mining
International Conference of the Catalan Association of Artificial Intelligence (CCIA), 2013
[doi]

B. Balle, J. Castro, and R. Gavaldà
Learning Probabilistic Automata: A Study In State Distinguishability
Theoretical Computer Science, 473:46-60, 2013
[doi]

B. Balle and M. Mohri
Spectral Learning of General Weighted Automata via Constrained Matrix Completion
Neural Information Processing Systems (NIPS), 2012
[slides] [video]
(Honorable Mention for the Outstanding Student Paper Award)

B. Balle, J. Castro, and R. Gavaldà
Bootstrapping and Learning PDFA in Data Streams
International Colloquium on Grammatical Inference (ICGI), 2012
[slides]
(Best Student Paper Award)

B. Balle, A. Quattoni, and X. Carreras
Local Loss Optimization in Operator Models: A New Insight into Spectral Learning
International Conference on Machine Learning (ICML), 2012
[slides] [video]

F. M. Luque, A. Quattoni, B. Balle, and X. Carreras
Spectral Learning for Non-Deterministic Dependency Parsing
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2012
(Best Paper Award)

B. Balle, A. Quattoni, and X. Carreras
A Spectral Learning Algorithm for Finite State Transducers
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD), 2011
[slides]

B. Balle, J. Castro, and R. Gavaldà
A Lower Bound for Learning Distributions Generated by Probabilistic Automata
Algorithmic Learning Theory (ALT), 2010

B. Balle, J. Castro, and R. Gavaldà
Learning PDFA with Asynchronous Transitions
International Colloquium on Grammatical Inference (ICGI), 2010

B. Balle, E. Ventura, and J.M. Fuertes
An Algorithm to Design Prescribed Length Codes for Single-Tracked Shaft Encoders
IEEE International Conference on Mechatronics (ICM), 2009
[slides]

J.M. Fuertes, B. Balle, and E. Ventura
Absolute-Type Shaft Encoding Using LFSR Sequences With a Prescribed Length
IEEE Transactions on Instrumentation and Measurement, Vol. 57, No. 5, 2008

The Privacy Blanket of the Shuffle Model
Privacy and the Science of Data Analysis, Simons Institute, April 2019

Learning the Privacy-Utility Trade-off with Bayesian Optimization
Data Privacy: From Foundations to Applications, Simons Institute, March 2019

Automata Learning
Summer School on Foundations of Programming and Software Systems, July 2018

Singular Value Automata and Approximate Minimization
Weighted Automata: Theory and Applications, May 2018

A Short Tutorial on Differential Privacy
The Alan Turing Institute, January 2018

Learning Automata with Hankel Matrices
Logic and Learning Workshop, The Alan Turing Institute, January 2018

Theoretical Guarantees for Learning Weighted Automata
International Conference on Grammatical Inference (ICGI), October 2016

Tutorial on (Co-)Algebraic and Analytical Aspects of Weighted Automata Minimisation and Equivalence
Coalgebraic Methods in Computer Science (CMCS), April 2016
(Presented jointly with Alexandra Silva)

Tutorial on Spectral Learning Techniques for Weighted Automata, Transducers, and Grammars
Empirical Methods in Natural Language Processing (EMNLP), October 2014
(Presented jointly with Ariadna Quattoni and Xavier Carreras)

Area Chair / Senior PC — NeurIPS 2019, ICML 2019, AISTATS 2019, NeurIPS 2018, NIPS 2014

Workshops Chair (with Marco Cuturi) — NIPS 2015

Steering Committee — International Conference on Grammatical Inference (since 2016)

Workshop Organizer — Privacy-Preserving Machine Learning (PPML'19) @ CCS 2019

Workshop Organizer — Privacy in Machine Learning and Artificial Intelligence (PiMLAI) @ ICML 2018

Workshop Organizer — Learning and Automata (LearnAut) @ LICS 2017

Workshop Organizer — Fairness and Privacy in Machine Learning @ DALI 2017

Workshop Organizer — Private Multi-Party Machine Learning @ NIPS 2016

Competition Organizer — Sequence Prediction Challenge (SPICE) (2016)

Workshop Organizer — Methods of Moments and Spectral Learning @ ICML 2014

Workshop Organizer — Spectral Learning @ NIPS 2013

Workshop Organizer — Spectral Learning @ ICML 2013