MLCommons

MLCommons is a collaborative open engineering consortium, focused on developing the AI ecosystem through benchmarks, public datasets, and research.

All edits

Edits on 7 Sep, 2023

Katrina-Kay Pettitt

edited on 7 Sep, 2023

Edits made to:

Infobox (+1/-1 properties)

Infobox

Number of Employees (Ranges)

5 – 90

Number of Employees (Ranges)

1 – 4

Edits on 21 Jul, 2023

Amy Tomlinson Gayle

edited on 21 Jul, 2023

Edits made to:

Article (+212/-153 characters)

Article

MLCommons is a collaborative open engineering consortium, focused on developing the AI ecosystem through benchmarks, public datasets, and research. MLCommons' mission is to accelerate machine learning (ML) innovation and increase its positive impact on society. While AI and ML have been around for decades, the technology is often fragmented, bespoke, and poorly understood. MLCommons aims to unlock the next stage of ML adoption by creating useful measures of quality and performance, large-scale open data sets, and common development practices and resources. MLCommons believes its efforts will help democratize ML and enable its widespread adoption into new products and services, growing ML from a research field into a mature industry.

...

Founded in 2020 and headquartered in San Francisco, MLCommons' history goes back to the MLPerf benchmark in 2018. The MLPerf benchmark quickly grew into a suite of industry metrics for measuring machine learning performance and promoting transparency of machine learning techniques. Starting with over fifty founding partners, MLCommons is a community-driven and community-funded effort. Its members include start-ups, leading companies, academics, and nonprofits from around the world. MLCommons promotes open-source and open data development, with most of its software projects available under the Apache 2.0 license and its datasets using CC-BY 4.0.

...

Get everyone involved (be global, inclusive, and fair; bring together academia, small companies, large companies, non-profits, etc.; make it easy to get involved; be as open with its IP as possible while sustaining the community)
Act through collaborative engineering (keep leadership mostly technical, with an emphasis on hands-on involvement; favor data-driven decisions, design simplicity, and focus on real user value)
Make fast but consensus-supported decisions (very low barrier for “experimental” working groups with a well-reviewed path to full endorsement; favor grudging consensus over 51/49 votes, especially for big decisions; make technical contributions easy; favor rapid development and iteration)
Build a community that people want to be part of (be welcoming, informal, and friendly; encourage, recognize, and reward contributions; celebrate with cake)

...

Developing ML benchmarks provides consistent measurements of accuracy, speed, and efficiency. This enables engineers to design reliable products and helps researchers compare innovations, choosing the best ideas for the future of the field. MLCommons' work on benchmarks is divided into training and inference work groups that continue to develop and release benchmarks for the industry.

...

MLCommons releases public datasets to help academics and entrepreneurs develop new technologies and start new companies. Datasets released by MLCommons include the following:

...

Dollar Street is a collection of images showing everyday household items from homes around the world, visually capturing the socioeconomic diversity of traditionally underrepresented populations. It includes 38,479 images collected from 63 different countries, tagged from a set of 289 possible topics. The metadata for each image includes demographic information such as region, country, and total household monthly income. Dollar Street consists of public domain data, licensed for academic, commercial, and non-commercial usage, under CC-BY and CC-BY-SA 4.0.

...

This is a growing audio dataset of spoken words in fifty languages for academic research and commercial applications in keyword spotting and spoken term search. The dataset contains more than 340,000 keywords, totaling 23.4 million 1-second spoken examples (over 6,000 hours). The Multilingual Spoken Words Corpus is licensed under CC-BY 4.0.

...

People's Speech is one of the world’s largest English speech recognition datasets, including 30,000+ hours of transcribed English speech from a diverse set of speakers. The People's Speech dataset is large enough to train speech-to-text systems and is licensed for academic and commercial usage under CC-BY-SA and CC-BY 4.0.

...

The foundations of MLCommons started with the MLPerf benchmarks in 2018 that established industry-standard metrics to measure machine learning performance and quickly grew to encompass data sets and best practices. The community behind the MLPerf benchmarks included members from every continent and grew to over seventy supporting organizations, including software start-ups, researchers at top universities, and cloud computing and semiconductor giants. MLCommons grew out of this effort, and the consortium formed on December 3, 2020.

Edits on 20 Jul, 2023

Arthur Smalley

edited on 20 Jul, 2023

Edits made to:

Description (+147 characters)

Article (+4716 characters)

MLCommons

MLCommons is a collaborative open engineering consortium, focused on developing the AI ecosystem through benchmarks, public datasets, and research.

Article

Overview

MLCommons is a collaborative open engineering consortium, focused on developing the AI ecosystem through benchmarks, public datasets, and research. MLCommons mission is to accelerate machine learning (ML) innovation and increase its positive impact on society. While AI and ML have been around for decades, the technology is often fragmented, bespoke, and poorly understood. MLCommons aims to unlock the next stage of ML adoption by creating useful measures of quality and performance, large-scale open data sets, and common development practices and resources. MLCommons believes their efforts will help democratize ML and enable its widespread adoption into new products and services, growing ML from a research field into a mature industry

...

Founded in 2020 and headquartered in San Francisco, MLCommon's history goes back to the MLPerf benchmark in 2018. The MLPerf benchmark quickly grew into a suite of industry metrics for measuring machine learning performance and promoting transparency of machine learning techniques. Starting with over 50 founding partners, MLCommons is a community-driven and community-funded effort. Its members include startups, leading companies, academics, and non-profits from around the world. MLCommons promotes open-source and open data development, with most of its software projects available under the Apache 2.0 license and its datasets using CC-BY 4.0.

Philosophy

MLCommons has five key principles:

Grow ML markets and make the world a better place
Get everyone involved (Be global, inclusive, and fair; Bring together academia, small companies, large companies, non-profits, etc; Make it easy to get involved; Be as open with our IP as possible while sustaining the community)
Act through collaborative engineering (Keep leadership mostly technical, with an emphasis on hands-on-involvement; Favor data-driven decisions, design simplicity, and focus on real user value)
Make fast but consensus-supported decisions (Very low barrier for “experimental” working groups with well reviewed path to full endorsement; Favor grudging consensus over 51/49 votes, especially for big decisions; Make technical contributions easy; Favor rapid development and iteration)
Build a community that people want to be part of (Be welcoming, informal, and friendly; Encourage, recognize, and reward contributions; Celebrate with cake)

Benchmarks

Developing ML benchmarks provides consistent measurements of accuracy, speed, and efficiency. This enables engineers to design reliable products and helps researchers compare innovations, choosing the best ideas for the future of the field. MLCommons work on benchmarks is divided into training and inference workgroups, that continue to release and develop and release benchmarks for the industry.

Datasets

MLCommons releases public datasets to help academics and entrepreneurs develop new technologies and start new companies. Datasets released by MLCommons include:

Dollar Street

A collection of images showing everyday household items from homes around the world, visually capturing the socioeconomic diversity of traditionally underrepresented populations. It includes 38,479 images collected from 63 different countries, tagged from a set of 289 possible topics. The metadata for each image includes demographic information such as region, country, and total household monthly income. Dollar Street consists of public domain data, licensed for academic, commercial, and non-commercial usage, under CC-BY and CC-BY-SA 4.0.

Multilingual Spoken Words

A growing audio dataset of spoken words in 50 languages for academic research and commercial applications in keyword spotting and spoken term search. The dataset contains more than 340,000 keywords, totaling 23.4 million 1-second spoken examples (over 6,000 hours). The Multilingual Spoken Words Corpus is licensed under CC-BY 4.0.

People’s Speech

One of the world’s largest English speech recognition datasets, including 30,000+ hours of transcribed speech in English languages with a diverse set of speakers. The People's Speech dataset is large enough to train speech-to-text systems and is licensed for academic and commercial usage under CC-BY-SA and CC-BY 4.0.

Founding

The foundations of MLCommons started with the MLPerf benchmarks in 2018 that established industry-standard metrics to measure machine learning performance and quickly grew to encompass data sets and best practices. The community behind the MLPerf benchmarks included members from every continent and grew to over 70 supporting organizations from software startups, to researchers at top universities, and to cloud computing and semiconductor giants. MLCommons grew out of this effort and the consortium formed on December 3, 2020.

Edits on 19 Jul, 2023

Arthur Smalley

edited on 19 Jul, 2023

Edits made to:

Infobox (+9 properties)

Infobox

YouTube Channel

https://www.youtube.com/@mlcommons

Community Forum

https://groups.google.com/a/mlcommons.org/g/public

Founded Date

December 3, 2020