Hugging Face's BLOOM
web · bigscience.huggingface.co · bigscience.huggingface.co/
BLOOM is frequently cited in AI safety discussions around open-source model release governance and the tradeoffs between accessibility and misuse risk for large language models.
Metadata
Importance: 55/100 · homepage
Summary
BLOOM is a large open-source multilingual language model developed collaboratively by the BigScience workshop, a year-long research initiative involving more than 1,000 researchers. It was designed as a transparent, accessible alternative to proprietary large language models, with attention to governance, ethics, and responsible release practices. The project represents a major effort to democratize access to frontier AI capabilities while establishing governance norms for open model releases.
Key Points
- BLOOM is a 176-billion-parameter multilingual LLM, comparable in scale to GPT-3, trained on 46 natural languages and 13 programming languages
- Developed via a collaborative open-science initiative (BigScience) involving more than 1,000 researchers from 60 countries, and trained on the French Jean Zay (IDRIS) supercomputer
- Released under a Responsible AI License (RAIL) restricting certain harmful use cases, representing an early experiment in dual-use governance for open models
- Raises important AI safety considerations around open-source release of frontier models, balancing democratization benefits against misuse risks
- Demonstrates that large-scale AI development can be conducted with participatory, multi-stakeholder governance rather than solely by well-resourced corporations
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| AI Proliferation | Risk | 60.0 |
Cached Content Preview
HTTP 200 · Fetched Apr 9, 2026 · 4 KB
BigScience Research Workshop
The Wayback Machine - https://web.archive.org/web/20260220055633/https://bigscience.huggingface.co/
About: The project · In the press
Outcomes: Blog · Models · Datasets · Papers · Code
Events: Kickoff · #1 (ELLIS 2021) · #2 (INLG 2021) · #3 (NeurIPS 2021) · #4 (Reddit AMA) · #5 (ACL 2022)
Resources: Notion · Google Drive · GitHub · Slack
Join
A one-year-long research workshop on large multilingual models and datasets
Update: Introducing The World's Largest Open Multilingual Language Model - BLOOM 🌸
You can find the model here and learn more by reading our blog post.
The acceleration in Artificial Intelligence will have a fundamental impact on society. A considerable part of this effort stems from training larger models on larger datasets.
The resources for this endeavour are found mainly in the hands of big technology giants. The stranglehold on this transformative technology poses problems, from a research advancement, environmental, ethical and societal perspective.
The BigScience project takes inspiration from scientific creation schemes such as CERN and the LHC, in which open scientific collaborations facilitate the creation of large-scale artefacts that are useful for the entire research community.
Summary
Over one year, from May 2021 to May 2022, more than 1,000 researchers from 60 countries and more than 250 institutions are together creating a very large multilingual neural network language model and a very large multilingual text dataset on the 28-petaflop Jean Zay (IDRIS) supercomputer located near Paris, France.
During the workshop, the participants plan to investigate the dataset and the model from all angles: bias, social impact, capabilities, limitations, ethics, potential improvements, specific domain performances, carbon impact, general AI/cognitive research landscape.
All the knowledge and information gathered during the workshop is openly accessible and can be explored on our Notion.
Coming events
BigScience is organizing the ACL 2022 Workshop "Challenges & Perspectives in Creating Large Language Models" in May 2022. This event will also serve as the closing session of this year-long initiative aimed at developing a multilingual large language model.
More information and the program can be found here.
Who is organizing BigScience
BigScience is neither a consortium nor an officially incorporated entity. It is an open collaboration bootstrapped by HuggingFace, GENCI and IDRIS, and organised as a research workshop. This research workshop gathers academic, industrial and independent researchers from many affiliations, whose research interests span many fields across AI, NLP, social sciences, legal,
... (truncated, 4 KB total)
Resource ID: 80fcbf839b8eb40d | Stable ID: sid_13kL2qDtfC