We believe it is essential to provide the latest academic research tools on privacy-preserving machine learning to AI industrial production systems. Indeed, data privacy has become an essential prerequisite for many industries ranging from healthcare to finance, and although tools to ensure user confidentiality exist, they remain out of reach for non privacy experts.
Our ambition is to provide these tools to the largest number of people through open and community-supported projects. That's why we are deeply involved in several open-source projects including the PySyft library of OpenMined, a 5000 people community that combines the latest research advances and business-oriented challenges. These roots in open-source and academia help us to be pioneers in open Federated Learning in healthcare and to provide concrete solutions to health services.
Currently, most machine learning projects or services rely on massive data collection to build centralized data lakes on which models are trained. However, this vision faces several obstacles, including concerns about user privacy, proprietary data issues and data regulations, such as the GDPR. This puts at risk data collection and hence models performance.
Federated Learning offers a solution by reversing the paradigm: instead of collecting data to a central server, data is stored locally in decentralized and standardized data warehouses. The model is now sent to these remote nodes to train on local datasets, model updates are aggregated and then sent back to the nodes. Thus, the data never leaves the nodes that can be hospitals or banks for example and nothing but the model is exposed.
We intend to provide each AI project with a unified interface to scale their production models to all healthcare facilities in our network. This platform is agnostic of the machine learning framework used and removes the technical barriers to deploy federated learning, traceability and security tools to protect their models and data used. In addition, we provide health data using the standard format FHIR, so it is already structured and cleaned. These two features allow data scientists to only focus on building their model and addressing health challenges.
Last, because we value transparency and community feedback, all our code is open-source and available on GitHub.
We collaborate with many partners in academia to help building the tools of tomorrow.
In addition to Federated Learning, we are actively involved in the development of new tools to improve private AI, including Secure Multi-Party Computation, a technique that allows computation on encrypted data, and Differential Privacy that prevents models from storing specific individual data and only allows them to learn statistical behaviours.
Ambitious projects need a community to support them
and should be open to the largest number.