# Bio

I’m a research scientist at Samsung AI Center in Cambridge, UK. Before joining Samsung, I did a postdoc at Inria Sierra with Alexandre d’Aspremont and Francis Bach. I received my PhD in computer science from KAUST, where I worked under the supervision of Peter Richtárik on optimization theory and its applications in machine learning. In 2020, I interned at Google Brain, hosted by Nicolas Le Roux and Courtney Paquette. Prior to that, I obtained a double-degree MSc from École Normale Supérieure Paris-Saclay and Paris-Dauphine, and a BSc from Moscow Institute of Physics and Technology.

My hobbies include ultimate frisbee, squash and bouldering.

Note that I’m currently not taking any interns.

# Interests

• Optimization
• Deep learning
• Federated and distributed learning

# Education

• PhD in Computer Science, KAUST, 2021
• MSc in Data Science, École normale supérieure Paris-Saclay and Paris-Dauphine, 2017
• BSc in Computer Science and Physics, Moscow Institute of Physics and Technology, 2016

# Experience

Research Scientist, Samsung
Jan 2023 – Present, Cambridge, UK
Working on federated learning and embedded AI systems as a member of the Distributed AI team.

Postdoc, Inria
Dec 2021 – Dec 2022, Paris, France
Conducted research on adaptive, second-order, and distributed optimization.

# Recent Posts

Paper on Regularized Newton accepted at SIAM Journal on Optimization (SIOPT)
My paper on Regularized Newton was accepted for publication in the SIAM Journal on Optimization (SIOPT). The main result shows that Newton’s method can be globalized by adding regularization proportional to the square root of the gradient norm. The resulting method achieves global acceleration over gradient descent and converges at the $O(1/k^2)$ rate of cubically regularized Newton.
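The update rule described above can be sketched as follows. This is a minimal illustration, not the paper’s exact algorithm: the constant `H` is an assumed Hessian-Lipschitz-style parameter, and the toy problem is chosen only for demonstration.

```python
import numpy as np

def reg_newton_step(x, grad, hess, H=1.0):
    """One Newton step with regularization ~ sqrt of the gradient norm.

    Illustrative sketch only; H is an assumed smoothness constant,
    not a value taken from the paper.
    """
    g = grad(x)
    lam = np.sqrt(H * np.linalg.norm(g))  # regularizer proportional to sqrt(||grad||)
    return x - np.linalg.solve(hess(x) + lam * np.eye(x.size), g)

# Toy strongly convex problem: f(x) = 0.25*sum(x^4) + 0.5*||x||^2, minimizer x* = 0.
grad = lambda x: x**3 + x
hess = lambda x: np.diag(3 * x**2 + 1)

x = np.full(3, 2.0)
for _ in range(30):
    x = reg_newton_step(x, grad, hess)
```

Since the regularizer vanishes with the gradient, the iteration behaves like damped gradient descent far from the solution and approaches a pure Newton step near it.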
Presenting at 2022 Workshop on FL and Analytics organized by Google
I’m taking part in the 2022 Workshop on Federated Learning and Analytics on 9 and 10 November. I am giving a talk about our work on Asynchronous SGD in the mini-workshop on Federated Systems at Scale on 9 November.
I’m giving a talk at Institut Henri Poincaré
I’m giving a talk at the Séminaire Parisien d’Optimisation at Institut Henri Poincaré on 10 October, presenting my work on second-order optimization, including the Super-Universal Newton paper.

# Recent Papers

(2023). Two Losses Are Better Than One: Faster Optimization Using a Cheaper Proxy.

(2023). Convergence of First-Order Algorithms for Meta-Learning with Moreau Envelopes.

(2022). Super-Universal Regularized Newton Method.

(2022). Asynchronous SGD Beats Minibatch SGD Under Arbitrary Delays.

(2022). ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally! ICML.

(2022). Server-Side Stepsizes and Sampling Without Replacement Provably Help in Federated Optimization.

(2021). Proximal and Federated Random Reshuffling. ICML.

(2020). Random Reshuffling: Simple Analysis with Vast Improvements. NeurIPS.

(2020). Dualize, Split, Randomize: Toward Fast Nonsmooth Optimization Algorithms. JOTA.

(2019). Stochastic Newton and Cubic Newton Methods with Simple Local Linear-Quadratic Rates.

(2019). First Analysis of Local GD on Heterogeneous Data.

(2019). Tighter Theory for Local SGD on Identical and Heterogeneous Data. AISTATS.

(2019). MISO is Making a Comeback With Better Proofs and Rates.

(2019). DAve-QN: A Distributed Averaged Quasi-Newton Method with Local Superlinear Convergence Rate. AISTATS.

(2019). Stochastic Distributed Learning with Gradient Quantization and Double Variance Reduction. Optimization Methods and Software.

(2019). 99% of Worker-Master Communication in Distributed Optimization Is Not Needed. UAI.

(2019). Distributed Learning with Compressed Gradient Differences.

(2018). SEGA: Variance Reduction via Gradient Sketching. NeurIPS.

(2018). A Delay-tolerant Proximal-Gradient Algorithm for Distributed Learning. ICML.

(2018). A Distributed Flexible Delay-tolerant Proximal Gradient Algorithm. SIOPT.