Research Scientist
Samsung AI Center Cambridge


Adjunct Researcher
Machine Learning Systems Lab, University of Cambridge


In November 2022, I joined the Samsung AI Center in Cambridge as a Research Scientist. I have also been an affiliated lecturer at the Cambridge Machine Learning Systems Lab of the University of Cambridge since 2020. I am an Associate Professor on leave from the Laboratoire Informatique d'Avignon (LIA) and Avignon Université (FR), which I joined in 2020. My research focuses on artificial intelligence through the development of new, efficient deep learning methods based on self-supervised / representation learning. I am also interested in the efficiency problem of large deep learning models, along with finding elegant solutions to develop new types of neural networks. To this end, I investigate different potential solutions, including federated learning and efficient large-scale self-supervised and supervised learning. So far, I have mostly applied these new concepts to automatic speech processing. I created community-driven and internationally adopted solutions such as PyTorch-Kaldi and SpeechBrain, two open-source toolkits for speech processing and deep learning written entirely in PyTorch. Finally, I am particularly attached to the concept of AI for Social Good, and I therefore dedicate part of my research time to solving concrete societal problems, such as the astonishing carbon footprint of modern deep learning models or open access to the latest technologies for the wider community.

From February to August 2020, I was part of the Oxford Machine Learning Systems lab (OxMLSys) as a Senior Research Associate, working in collaboration with Prof. Nicholas Lane on the foundations of the topics described above.

Prior to joining Oxford, I completed my PhD at the University of Avignon and the Laboratoire Informatique d'Avignon (LIA) under the supervision of Prof. Georges Linarès and Assoc. Prof. Mohamed Morchid. The thesis was part of an industrial collaboration (CIFRE) with Orkis, a French company specialised in asset and data management. It resulted in more than 15 publications in well-known conferences and journals, and in new tools released to the research community to develop and investigate quaternion neural networks in the context of natural language processing and image processing.

During the thesis, I had the opportunity to spend four months working at the Montréal Institute for Learning Algorithms (MILA), Montréal, under the supervision of Prof. Yoshua Bengio. The collaboration mainly concerned the development of new quaternion convolutional and recurrent neural networks for automatic speech recognition. This period is also at the origin of long-term funded projects, including PyTorch-Kaldi and SpeechBrain.

Funded Projects

E-SSL: Efficient Self-Supervised Learning for Inclusive and Innovative Speech Technologies (2023 - 2026)
National Research Agency (ANR) — Former Principal Investigator — 469 000€

As a former Associate Professor, I was awarded one of the largest available French grants for innovating in the field of SSL for speech technologies. More precisely, I was leading a consortium gathering three universities (Avignon University, Université Grenoble-Alpes and PSL Paris) and composed of Prof. François PORTET (LIG), Assoc. Prof. Solange ROSSATO (LIG), Assoc. Prof. Benjamin LECOUTEUX (LIG), Assoc. Prof. Didier SCHWAB (LIG), Assoc. Prof. Fabien RINGEVAL, Dr. Marco DINARELLI (CR, CNRS, LIG), Prof. Alexandre ALLAUZEN (LAMSADE), Assoc. Prof. Titouan PARCOLLET (LIA), Prof. Yannick ESTÈVE (LIA). Unfortunately, as I am currently on leave, I stopped leading this project (Prof. Yannick Estève is the new PI). I am still involved in the project as a PhD student co-adviser as well as an external scientist, as my research at Samsung and the University of Cambridge is closely linked to the topics of this grant.

Abstract of the E-SSL project: Following previous major advances, self-supervised learning (SSL) has recently emerged as one of the most promising artificial intelligence (AI) methods. With this technique, it becomes feasible to take advantage of the colossal amounts of existing unlabeled data to significantly improve the results of various AI systems. In particular, the field of speech processing (SP) is being rapidly transformed by the rise of SSL due to massive industrial investments and the explosion of data, both made available by a few companies. Although incredibly powerful, the complexity of SSL models requires researchers and the industry to acquire extraordinary computing capacities, which drastically reduces both the access to fundamental research in this field and its deployment in real products. For instance, existing works based on SSL models for speech in fact rely on a system maintained and made available by a single company (wav2vec 2.0). The entire life cycle of the technology, from its theoretical foundations to its practical deployment, including the analysis of societal aspects, is therefore dependent only on institutions with the physical and financial means to support the intensity of the development of this technique. The E-SSL project aims at re-empowering the scientific community and the speech industry with the necessary control over self-supervised learning in order to ensure its fair evolution and deployment by facilitating both academic research and its transfer to industry. In practice, E-SSL holistically integrates three key issues of self-supervised learning for speech representations: its effective computational efficiency, its societal impacts and the feasibility of its extension to future products.

SpeechBrain: simplify the access to speech technologies (2021 - 2022)
Principal Investigator — 334 000€

The SpeechBrain project (see below for more info) has been granted 500K hours of GPU time in the French datacenter Jean Zay. This project is a joint collaboration between the Laboratoire Informatique d'Avignon (LIA), the Laboratoire de Traitement et Communication de l'Information (LTCI), the Laboratoire des Sciences du Numérique de Nantes (LS2N) and the Laboratoire Interdisciplinaire des Sciences du Numérique (LISN) that aims to support the effort devoted to SpeechBrain towards a democratization of the research and development of speech technologies. This project will gather top-tier researchers to build and release state-of-the-art and ground-breaking systems for speech translation, self-supervised learning of speech representations, speaker verification and identification, voice privacy, spoken language understanding, speech synthesis, speech for e-health and speech enhancement.

The consortium is composed of: Assoc. Prof. Titouan PARCOLLET (PI, LIA), Prof. Yannick ESTÈVE (LIA), Prof. Corinne FREDOUILLE (LIA), Prof. Jean-François BONASTRE (LIA), Prof. Richard DUFOUR (LS2N), Prof. Slim ESSID (LTCI), Assoc. Prof. Sahar GANNAY (LISN).

LeBenchmark (2020 - 2022)
Co-Principal Investigator for SSL models — 117 000€

The LeBenchmark project was granted 200K hours of GPU time in the French datacenter Jean Zay. This project is a joint collaboration between the Laboratoire d'Informatique de Grenoble (LIG) and the Laboratoire Informatique d'Avignon (LIA) that aimed to collect large quantities of raw speech in French (i.e. several thousand hours) with different styles (read speech, prepared speech, spontaneous speech) from various speakers, and to use them to learn self-supervised models to be shared with the research community. Furthermore, we also established a new benchmark dataset for several speech processing tasks. I was in charge of finding appropriate self-supervised methods (e.g. wav2vec) and deploying them on our new dataset. We released various pre-trained models with various training sets accounting for up to 14,000 hours of speech in French. Everything is nicely integrated into SpeechBrain for people interested in re-using our models for downstream tasks!

The consortium is composed of: Prof. Laurent BESACIER (PI, LIG), Prof. François PORTET (LIG), Assoc. Prof. Solange ROSSATO (LIG), Assoc. Prof. Benjamin LECOUTEUX (LIG), Assoc. Prof. Didier SCHWAB (LIG), Assoc. Prof. Fabien RINGEVAL, Dr. Marco DINARELLI (CR, CNRS, LIG), Prof. Alexandre ALLAUZEN (LAMSADE), Assoc. Prof. Titouan PARCOLLET (LIA), Prof. Yannick ESTÈVE (LIA).

Models are available on HuggingFace.

SpeechBrain (2019 - )
Creator & Co-Principal Investigator — 234 000€ but looking for sponsors 😁

The SpeechBrain project aims to develop an open-source and all-in-one toolkit based on PyTorch. The goal is to develop a single, flexible, and user-friendly toolkit that can be used to easily develop state-of-the-art speech systems for speech recognition (both end-to-end and HMM-DNN), speaker recognition, speech separation, multi-microphone signal processing (e.g., beamforming), self-supervised learning, and many others. The project is funded thanks to generous donations from an ever-growing number of sponsors: the Laboratoire Informatique d'Avignon (LIA), the Montréal Institute for Learning Algorithms (MILA), Samsung, HuggingFace, NVIDIA, Dolby, ViaDialog, OVH and NAVER Labs. SpeechBrain also benefits from the collaboration and expertise of around 20 different partner institutions (both academic and industrial) ranging from the University of Cambridge to the PyTorch Team. SpeechBrain already outperforms the other toolkits on the considered datasets while offering a much easier interface to work with. SpeechBrain reached 4.6K stars on GitHub in a year, demonstrating a clear interest from the community in our toolkit. I am the co-creator and co-leader of SpeechBrain with Dr. Mirco Ravanelli, currently an Assistant Professor at Concordia University and the Montréal Institute for Learning Algorithms (Mila, CA).

Have a look at SpeechBrain, or at our open-access paper describing the toolkit!

Advised & Mentored Ph.D. Students and Interns

Jarod DURET — Ph.D. Student — Started in October 2021 and is co-advised with Prof. Yannick ESTÈVE from LIA
Title: Expressive Speech to Speech Translation.
Ideas: Current speech-to-speech translation systems rely on cascade systems that combine a speech-to-text block and a text-to-speech block. In practice, existing end-to-end speech-to-speech translation systems do not work well and often completely bypass the concept of expressivity. With this thesis, we aim at inventing a simple end-to-end model enabling an expressivity transfer alongside the translation. This appears to be particularly challenging as expressivity may be defined differently from one language to another. Multiple directions will be investigated, starting with the design of better speech representations capturing the expressivity to condition the generation of the speech signal.

Ryan Whetten — Ph.D. Student — Started in Sept. 2023 and is co-advised with Prof. Yannick ESTÈVE from LIA and Dr. Marco Dinarelli from CNRS
Title: Efficient Self-Supervised Learning.
Ideas: This thesis is part of the E-SSL ANR project. The key idea is to make SSL pre-training faster. Right now, hundreds of thousands of GPU hours are necessary to train a model. Ryan will try to lower this number to something more accessible to the community. For instance, he will investigate linear time-complexity alternatives to self-attention, or investigate the theory behind the training objectives to find more appropriate training losses.

Salah ZAIEM — Ph.D. Student — October 2020 - March 2024 and was co-advised with Prof. Slim ESSID from Telecom Paris Sud.
Title: Informed Self-Supervised Speech Representations Learning.
Ideas: Self-supervised learning methods for speech are mostly empirically driven. In particular, there is very little theoretical evidence on why one method performs better than another. With this thesis, we aim at providing theoretically grounded tools to design SSL models in an informed manner. For instance, we developed a solution to design a PASE-like architecture without the need for pretext task search with empirical validation, potentially saving weeks of training (and compute / carbon emissions).

Xinchi QIU — Mentored Ph.D. Student — Started in Sept. 2019, advised by Prof. Nicholas Lane from the University of Cambridge.
Title: Efficient Federated Learning
Ideas: With Xinchi, we wondered whether federated learning, which will soon cover a large part of deep learning use cases, could be more efficient than centralised training. With that in mind, we started by investigating its energy footprint before moving on to practical ways of increasing its efficiency, including high-dimensional neural networks, parameter sharing and sparsity.

Yan Gao — Mentored Ph.D. Student — Sept. 2019 - Dec. 2023 , advised by Prof. Nicholas Lane from the University of Cambridge.
Title: Federated Self-supervised Learning
Ideas: Yan explores different ways of enabling on-device SSL training with edge data. In practice, state-of-the-art speech recognizers are highly resource intensive, and we build the foundations necessary to properly assess the difficulties arising when such models are deployed on device. For instance, Yan investigates different solutions to aggregate numerous acoustic models coming from large pools of devices under a federated learning setup. His latest work tries to highlight the unsustainability of current large-scale self-supervised speech models under constrained resources.

Adel Moumen — Apprenticeship — Sept. 2022 to August 2024.
Title: SpeechBrain
Ideas: Along with other members, Adel is part of the active core of SpeechBrain. He designs, develops and maintains the toolkit.

Adel Moumen — Undergrad. Intern — June 2021 to August 2021 and June 2022 to August 2022.
Title: On the limitations of LiGRU networks.
Ideas: Adel investigates different ways of making LiGRU a go-to alternative to LSTM/GRU for speech processing. There is theoretical and empirical evidence that LiGRUs are simply better than LSTMs and GRUs. Unfortunately, they suffer from a poor GPU implementation and an instability in the recurrent connection. These problems will be tackled during the internship.

SpeechBrain Full-Time Research Engineers — Started on January 2022.
Format: this is a list of all the researchers working full-time or part-time on SpeechBrain that I recruited and supervised.
Topics: all the topics and concepts developed within SpeechBrain.

Dr. Andreas Nautsch (2022-2023)
Adel Moumen (2022-2024)

SpeechBrain Interns — Started on January 2020.
Format: this is a list of all the researchers that I mentored during their internship to work on SpeechBrain. Internships took place either in Avignon (LIA) or Montréal (Mila) and were co-advised with Dr. Ravanelli.
Topics: all the topics and concepts developed within SpeechBrain.

Mohamed Anwar, Naver LABS Europe (FR).
Aku Rouhe, Aalto University (FI).
Peter Plantinga, now at JP Morgan.
Loren Lugosch, Mila (CA).
Nauman Dawalatabad, now at MIT (USA).
Ju-Chieh Chou, National Taiwan University (TW).
Sung-Lin Yeh, now at University of Edinburgh (UK).
Hwidong Na, Samsung SAIL (CA).
Abdel Heba, Linagora / University of Toulouse (FR).
Samuele Cornell, now at Amazon.
Jianyuan Zhong, University of Rochester (USA).
Cem Subakan, University of Montréal (CA).
Szu-Wei Fu, Academia Sinica (TW).

Community Contributions

General Co-Chair and Area Chair

Area Chair, NeurIPS, 2022, 2023, 2024.
Chair, RECITAL (TALN session), June 28th (2022), Avignon (France).

Workshop & Session Co-Organizer

IEEE ICASSP Workshop on Self-Supervision in Audio, Speech and Beyond (website), June 10th (2023), Rhodes (Greece).
ICML Self-Supervision in Audio and Speech (held virtually), July 17th (2020), Vienna (Austria).


Interspeech 2022: "State of the PyTorch Ecosystem for Speech Technologies", August 2022.
IEEE ASRU 2021: "SpeechBrain: Unifying Speech Technologies and Deep Learning With an Open Source Toolkit", December 2021.
Interspeech 2021: "SpeechBrain: Unifying Speech Technologies and Deep Learning With an Open Source Toolkit", August 2021.
University of Sheffield: "SpeechBrain: Unifying Speech Technologies and Deep Learning With an Open Source Toolkit", June 2021.

Invited Talks

Flower Labs: "Federated Self-Supervised Speech Technologies: Opportunities and Challenges.", May (2023).
University of Cambridge (UK): Invited Lecture on "Federated Self-Supervised Speech Technologies: Opportunities and Challenges.", February (2023).
Samsung AI Cambridge (UK): "Federated Speech Technologies", July (2022).
ADASP Group, Télécom Paris: "SpeechBrain: A General-Purpose Speech Toolkit", March 6th (2022).
Idiap Research Institute: "SpeechBrain: A General-Purpose Speech Toolkit", April 27th (2022).
Machine Intelligence Laboratory, University of Cambridge: "SpeechBrain: A General-Purpose Speech Toolkit", March 14th (2022).
Naver Labs Europe: "SpeechBrain: A General-Purpose Speech Toolkit", November 30th (2021).
FestivalIA Avignon: "SpeechBrain : un outil polyvalent pour le traitement automatique de la parole", November 17th (2021).
Microsoft Research Summit Workshop on Federated Learning and Confidential Computing: "Federated Speech Technologies", October 21st (2021).
Machine Learning Summer Schools — Taipei: "Task Agnostic and Task Specific Self-Supervised Learning from Speech with LeBenchmark", August 20th (2021).
The Machine Learning / Data Science Meetup of Rome: "SpeechBrain: A General-Purpose Speech Toolkit", July 7th (2021).
The 2nd Annual Federated & Distributed / Decentralized Machine Learning Conference (remote): "Can Federated Learning Save the Planet", June 16th (2021).
Flower Summit 2021 (remote): "Federated speech technologies made easy: Flower and SpeechBrain", March 11th (2021).
Centre de Recherche en Automatique de Nancy (FR): "Should we use quaternion neural networks? Recent advances and limitations.", March 29th (2021).
Samsung AI Cambridge (UK): "SpeechBrain", February 27th (2021).
Samsung AI Cambridge (UK): "Quaternion neural networks", July 10th (2019).
University of Oxford (UK): "Quaternion neural networks", July 9th (2019), Computer Science Department.

Program Committee Member

  • Nature Biotechnology. 2023
  • IEEE Journal of Selected Topics in Signal Processing. 2022
  • IEEE Signal Processing Letters. 2021
  • IEEE Transactions on Neural Networks and Learning Systems. 2019, 2020
  • IEEE International Journal of Wavelets, Multiresolution and Information Processing. 2020
  • ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies. 2020
  • Elsevier, Computers & Graphics, 2021
  • ACM Multimedia. 2021
  • Springer, Neural Processing Letters. 2020
  • IEEE Transactions on Image Processing. 2020
  • NeurIPS. 20(18-21-23), top 10% best reviewer 2020.
  • ICLR. 20(19-20-21), top 10% best reviewer 2020.
  • INTERSPEECH. 20(19-20-21-22-23-24)
  • ICASSP. 20(20-23)

Expert for Research Projects Evaluation

  • Agence Nationale de la Recherche. 2021
  • Idex Lyon St-Étienne. 2021

Press Coverage

Le Devoir (Quebec News) — "Google, dis-moi si tu comprends mon accent québécois", April 10th (2021).


International Journals

International conferences

  • SpeechBrain: A General-Purpose Speech Toolkit
  • Mirco Ravanelli, Titouan Parcollet, Peter Plantinga, Aku Rouhe, Samuele Cornell, Loren Lugosch, Cem Subakan, Nauman Dawalatabad, Abdelwahab HEBA, Jianyuan Zhong, Ju-Chieh Chou, Sung-Lin Yeh, Szu-Wei Fu, Elena Rastorgueva, François Grondin, William Aris, Hwidong Na, Yan Gao, Renato De Mori, Yoshua Bengio
  • Open access on arXiv.
  • On-device federated learning with flower
  • Akhil Mathur, Daniel J. Beutel, Pedro Porto Buarque de Gusmao, Javier Fernandez-Marques, Taner Topal, Xinchi Qiu, Titouan Parcollet, Yan Gao, Nicholas D. Lane
  • MLSys conference.
  • Can Federated Learning Save The Planet?
  • Xinchi Qiu, Titouan Parcollet, Nicholas Lane
  • NeurIPS 2020 : Tackling Climate Change with Machine Learning Workshop
  • December 11-12, Virtual Conference (COVID)
  • Quaternion Recurrent Neural Networks
  • Titouan Parcollet, Mirco Ravanelli, Mohamed Morchid, Georges Linarès, Chiheb Trabelsi, Renato De Mori, Yoshua Bengio
  • ICLR 2019
  • May 6-9 2019, New Orleans, (USA)

National conferences


All the tutorials and practical work classes are delivered in addition to my standard research job.

Year 2023/2024

Now at the University of Cambridge, UK — (on leave from academia)

Teaching hours per course (columns: Lectures, Tutorials, Practical Work, Total):
  • Principles of Machine Learning Systems: 4h (total: 4h)
  • Federated Speech Technologies: 1h (total: 1h)

Year 2021/2022

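Teaching hours per course (columns: Lectures, Tutorials, Practical Work, Total):
  • Deep Knowledge Representation: 6h, 6h (total: 12h)
  • Lab Research Meetings: 30h (total: 30h)
  • Apprenticeship Management: 15h (total: 15h)
  • Advanced Programming Project: 42h (total: 42h)
  • Human-Machine Interfacing: 3h, 13.5h (total: 27h)
  • C++ Advanced Programming: 30h (total: 30h)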

Year 2020/2021

Teaching hours per course (columns: Lectures, Tutorials, Practical Work, Total):
  • Deep Knowledge Representation: 6h, 6h (total: 12h)
  • Human-Machine Interfacing: 7.5h, 19.5h (total: 27h)
  • IT Systems Security: 24h (total: 24h)
  • C++ Advanced Programming: 30h (total: 30h)

Year 2019/2020

Teaching hours per course (columns: Tutorials, Practical Work, Total):
  • Innovation Application (AI): 12h (total: 12h)

Year 2018/2019

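Teaching hours per course (columns: Tutorials, Practical Work, Total):
  • Computer Science Basics: 10h (total: 10h)
  • Object Oriented Programming C++: 14h tutorials, 14h practical work (total: 28h)
  • Tools for Machine Learning: 12h (total: 12h)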

Year 2017/2018

Teaching hours per course (columns: Tutorials, Practical Work, Total):
  • Advanced programming and projects (B.Sc.): 12h tutorials, 19h practical work (total: 31h)

Year 2016/2017

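Teaching hours per course (columns: Tutorials, Practical Work, Total):
  • Advanced programming and projects (B.Sc.): 12h tutorials, 18h practical work (total: 30h)
  • Algorithms and programming (B.Sc.): 12h tutorials, 18h practical work (total: 30h)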

Year 2015/2016

Teaching hours per course (columns: Tutorials, Practical Work, Total):
  • C2I Certification (B.Sc.): 144h

Year 2013/2014

Teaching hours per course (columns: Tutorials, Practical Work, Total):
  • C2I Certification (B.Sc.): 144h



  • PhD in computer science (CIFRE) - 2019
  • Thesis: Quaternion neural networks
  • Advisors: Georges Linarès, Mohamed Morchid
  • Reviewers: Thierry Artières, Alexandre Allauzen
  • Committee: Yoshua Bengio, Benjamin Lecouteux, Xavier Bost, Nathalie Camelin
  • Avignon Université, France & Orkis, Aix-en-Provence, France
  • Master Research in computer science - 2016
  • Thesis: Quaternions and deep neural networks
  • Avignon Université, France
  • Bachelor in computer science - 2014
  • Avignon Université, France


  • Adjunct Researcher - 2022
  • Cambridge Machine Learning Systems Lab
  • University of Cambridge
  • Research Scientist - 2022
  • Samsung AI Center Cambridge
  • 50/60 Station Road, Cambridge (UK)
  • Visiting Scholar (remote) - 2020 / 2022
  • Cambridge Machine Learning Systems Lab
  • University of Cambridge
  • Associate Professor at Avignon Université - 2020 / 2022
  • Laboratoire Informatique d'Avignon (LIA)
  • 339 chemin des Meinajaries, 84000 Avignon, France
  • Senior Research Associate - 2020
  • Advisor: Nicholas Lane

  • The research focuses on efficient automatic speech recognition with an emphasis on representation learning, new ways of representing artificial neurons and self-supervised learning. I'm also involved in the supervision of students within the group.

  • University of Oxford, Department of Computer Science, OxMLSys lab, Oxford, United Kingdom
  • CIFRE PhD Student - 2017 / 2019
  • Thesis: Quaternion neural networks
  • Advisors: Georges Linarès, Mohamed Morchid

  • PhD Student in machine learning working on quaternion-valued neural networks with applications to speech recognition, spoken language understanding and image processing. Research engineer at ORKIS, working on new techniques for ASR and document management with deep learning. Teaching assistant for Master and Bachelor students. Courses given mainly focused on machine learning, programming and algorithms.

  • Avignon Université, France & Orkis, Aix-en-Provence, France
  • Montréal Institute for Learning Algorithms (MILA) - Jan. 2018 / Apr. 2018
  • Advisor: Yoshua Bengio

  • Collaboration and release of the PyTorch-Kaldi toolkit. Research on quaternion convolutional neural networks for end-to-end automatic speech recognition. Introduction of quaternion-valued recurrent neural networks to speech recognition.

  • Université de Montréal, Québec, Canada
  • Research Engineer - Sep. 2016 / Mar. 2017
  • Advisor: Mohamed Morchid

  • Hyper-complex neural networks implementation on GPU (CUDA). Research on Spoken Language Understanding (Pattern Recognition).

  • Avignon Université, France




Samsung AI Centre, 50-60 Station Road, CB1 2JH, Cambridge, United Kingdom.