Pleias is building the data infrastructure layer for enterprise agentic AI. Efficient, competitive, and adapted to enterprise and institutional constraints.
Its dual-product stack - Stratum for AI-native enterprise data processing and Synth for synthetic data generation - enables organizations to train and implement language models that rival systems 100× larger.

Over 200× more training-efficient. Our engineered data reaches state-of-the-art results in ~100B tokens, where raw data needs trillions.

Can run entirely on your own infrastructure - on-premise or on-device. Your data never leaves your walls and never touches an external API.

Every data point is rights-cleared and traceable to its source. Common Corpus - our 2-trillion-token open dataset - is EU AI Act-compliant by construction.

Our data tooling does the slow, messy prep that stalls most projects - so internal AI use cases ship in weeks instead of months.



PhD, KU Leuven; ex-Aleph Alpha, ex-Apple

M.Sc. University of Lorraine & Saarland University; B.A. Higher School of Economics

M.Eng. EPFL

M.Sc. Maastricht University; B.Eng. ITMO

PhD candidate, ENS Ulm; M.S. Sorbonne Université

M.Eng. CentraleSupélec

M.Eng. École 42

M.Eng. CentraleSupélec

PhD candidate, THWS; ex-Yandex

M.Eng. CentraleSupélec; ex-Engie Research Lab

BSc, University of Greenwich

M.Eng. THWS

M.Eng. École 42

M.A. École du Louvre