Federated Learning: Collaborative Machine Learning without Centralized Training
ai.googleblog.com/2017/04/federated-learning-collaborativ...
This Google AI Blog post introduced federated learning to a broad audience; it is relevant to AI safety discussions around data privacy, decentralized AI governance, and reducing risks of large-scale data centralization by powerful AI developers.
Metadata
Importance: 62/100 · blog post · primary source
Summary
Google introduces federated learning, a technique that trains machine learning models across many decentralized devices (like smartphones) without centralizing raw user data. Instead of sending data to a server, the model is sent to each device, trained locally, and only model updates (gradients) are aggregated centrally. This approach offers privacy benefits by keeping sensitive user data on-device while still enabling powerful shared models.
Key Points
- Federated learning trains models on distributed devices by sharing model updates rather than raw data, preserving user privacy.
- Developed primarily for mobile devices (e.g., Gboard), where personal data is sensitive and bandwidth for uploading raw data is limited.
- Aggregation of model updates (via Federated Averaging) allows a global model to improve without any single party seeing individual user data.
- Raises important questions about differential privacy, secure aggregation, and whether gradients themselves can leak private information.
- Represents a shift in ML infrastructure with significant implications for privacy-preserving AI deployment and data governance.
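The aggregation step named in the key points (Federated Averaging) can be sketched as a weighted average of client updates, where each client's weight is its local example count. This is a minimal illustration; the function and variable names are assumptions, not from the post:

```python
import numpy as np

def federated_average(updates, num_examples):
    """Combine per-client model updates into one global update.

    Each client's update is weighted by the number of local training
    examples it used, so clients with more data contribute
    proportionally more, as in Federated Averaging.
    """
    total = sum(num_examples)
    return sum(u * (n / total) for u, n in zip(updates, num_examples))

# Three simulated client updates to a 2-parameter model.
updates = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
counts = [10, 10, 20]
avg = federated_average(updates, counts)  # weights 0.25, 0.25, 0.5 -> [0.75, 0.75]
```

Note that the server only ever sees `updates`, never the raw examples behind them; the counts are the only per-client statistic used for weighting.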
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| AI-Driven Concentration of Power | Risk | 65.0 |
Cached Content Preview
HTTP 200 · Fetched Apr 9, 2026 · 9 KB
Federated Learning: Collaborative Machine Learning without Centralized Training Data
April 6, 2017
Posted by Brendan McMahan and Daniel Ramage, Research Scientists
Standard machine learning approaches require centralizing the training data on one machine or in a datacenter. And Google has built one of the most secure and robust cloud infrastructures for processing this data to make our services better. Now for models trained from user interaction with mobile devices, we're introducing an additional approach: Federated Learning.
Federated Learning enables mobile phones to collaboratively learn a shared prediction model while keeping all the training data on device, decoupling the ability to do machine learning from the need to store the data in the cloud. This goes beyond the use of local models that make predictions on mobile devices (like the Mobile Vision API and On-Device Smart Reply) by bringing model training to the device as well.
It works like this: your device downloads the current model, improves it by learning from data on your phone, and then summarizes the changes as a small focused update. Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model. All the training data remains on your device, and no individual updates are stored in the cloud.
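The round just described (download the current model, improve it locally, send back a small focused update, average updates on the server) can be sketched as a toy simulation. Everything here, the linear model, the squared-error loss, and all names, is an illustrative assumption, not Google's implementation, and the real system additionally encrypts updates in transit:

```python
import numpy as np

def local_update(global_model, x, y, lr=0.1):
    """One on-device training pass: a single gradient step on squared
    error for a linear model. Only the model delta leaves the device."""
    grad = x.T @ (x @ global_model - y) / len(y)
    return -lr * grad  # the "small focused update" sent to the server

def run_round(global_model, clients):
    """Server-side step: average the client deltas and apply them.
    Raw (x, y) data never leaves local_update."""
    deltas = [local_update(global_model, x, y) for x, y in clients]
    return global_model + np.mean(deltas, axis=0)

# Five simulated phones, each holding its own private data drawn
# from the same underlying function.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = [(x, x @ true_w) for x in
           (rng.normal(size=(20, 2)) for _ in range(5))]

model = np.zeros(2)
for _ in range(50):
    model = run_round(model, clients)
# After repeated rounds, the shared model approaches true_w even though
# the server only ever saw averaged deltas.
```

The averaging over many clients is also what makes any single user's contribution hard to isolate, which is the intuition behind pairing this scheme with secure aggregation and differential privacy.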
Your phone personalizes the model locally, based on your usage (A). Many users' updates are aggregated (B) to form a consensus change (C) to the shared model, after which the procedure is repeated. Federated Learning allows for smarter models, lower latency, and less power consumption, all while ensuring privacy. And this approach has another immediate benefit: in addition to providing an update to the shared model, the improved model on your phone can also be used immediately, powering experiences personalized by the way you use your phone.
We're currently testing Federated Learning in Gboard on Android, the Google Keyboard. When Gboard shows a suggested query, your phone locally stores information about the current context and whether you clicked the suggestion. Federated Learning processes that history on-device to suggest improvements to the next iteration of Gboard's query suggestion model.
To make Federated Learning possible, we had to overcome many algorithmic and technical challenges. In a typical machine learning system, an optimization algorithm like Stochastic Gradient Descent (SGD) runs on a large dataset pa
... (truncated, 9 KB total)
Resource ID: a47933706c3362a7 | Stable ID: sid_qkGIxPgSRK