AI EXPRESS

MLCommons releases open source datasets for speech recognition

by seprameen
December 21, 2021
in AI

MLCommons, the nonprofit consortium dedicated to creating open AI development tools and resources, today announced the release of the People’s Speech Dataset and the Multilingual Spoken Words Corpus. The consortium claims that the People’s Speech Dataset is among the world’s most comprehensive English speech datasets licensed for academic and commercial usage, with tens of thousands of hours of recordings, and that the Multilingual Spoken Words Corpus (MSWC) is one of the largest audio speech datasets with keywords in 50 languages.

No-cost datasets such as TED-LIUM and LibriSpeech have long been available for developers to train, test, and benchmark speech recognition systems. But some, like Fisher and Switchboard, require licensing or relatively high one-time payments. This puts even well-resourced organizations at a disadvantage compared with tech giants such as Google, Apple, and Amazon, which can gather large amounts of training data through devices like smartphones and smart speakers. For example, four years ago, when researchers at Mozilla began developing the English-language speech recognition system DeepSpeech, the team had to reach out to TV and radio stations and language departments at universities to supplement the public speech data that they were able to find.

With the release of the People’s Speech Dataset and the MSWC, the hope is that more developers will be able to build their own speech recognition systems with fewer budgetary and logistical constraints than previously, according to Keith Achorn. Achorn, a machine learning engineer at Intel, is one of the researchers who has overseen the curation of the People’s Speech Dataset and the MSWC over the past several years.

“Modern machine learning models rely on vast quantities of data to train. Both ‘The People’s Speech’ and ‘MSWC’ are among the largest datasets in their respective classes. MSWC is of particular interest for its inclusion of 50 languages,” Achorn told VentureBeat via email. “In our research, most of these 50 languages had no keyword-spotting speech datasets publicly available until now, and even those which did had very limited vocabularies.”


Open-sourcing speech tooling

Starting in 2018, a working group formed under the auspices of MLCommons to identify and chart the 50 most-used languages in the world into a single dataset, and to figure out a way to make the dataset useful. Members of the team came from Harvard and the University of Michigan as well as Alibaba, Oracle, Google, Baidu, Intel, and others.

The researchers who put the dataset together were an international group hailing from the U.S., South America, and China. They met weekly for several years via conference call, each bringing a particular expertise to the project.

The project eventually spawned two datasets instead of one, the People’s Speech Dataset and the MSWC, which are individually detailed in whitepapers being presented this week at the annual Conference on Neural Information Processing Systems (NeurIPS). The People’s Speech Dataset targets speech recognition tasks, while MSWC involves keyword spotting, which deals with the identification of keywords (e.g., “OK, Google,” “Hey, Siri”) in recordings.

People’s Speech Dataset versus MSWC

The People’s Speech Dataset involves over 30,000 hours of supervised conversational audio released under a Creative Commons license, which can be used to create the kind of voice recognition models powering voice assistants and transcription software. On the other hand, MSWC, which has more than 340,000 keywords with upwards of 23.4 million examples spanning languages spoken by over 5 billion people, is designed for applications like call centers and smart devices.

Previous speech datasets relied on manual efforts to collect and verify thousands of examples for individual keywords, and were commonly restricted to a single language. Moreover, these datasets didn’t leverage “diverse speech,” meaning that they poorly represented a natural environment, lacking accuracy-boosting variables like background noise, informal speech patterns, and a mixture of recording equipment.

Both the People’s Speech Dataset and the MSWC also have permissive licensing terms, including commercial use, which stands in contrast to many speech training libraries. Datasets typically either fail to formalize their licenses, relying on end-users to take responsibility, or are restrictive in the sense that they prohibit use in products bound for the open market.

“The working group envisioned several use cases during the development process. However, we’re also aware that these spoken word datasets may find further use by models and systems we didn’t yet envision,” Achorn continued. “As both datasets continue to grow and develop under the direction of MLCommons, we’re seeking additional sources of high-quality and diverse speech data. Finding sources which comply with our open licensing terms makes this more challenging, especially for non-English languages. On a more technical level, our pipeline uses forced alignment to match speech audio with transcript text. Although methods were devised to compensate for mixed transcript quality, improving accuracy comes at a cost to the quantity of data.”
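The trade-off Achorn describes between accuracy and quantity can be illustrated with a minimal sketch. It assumes each aligned segment carries a confidence score between 0 and 1; the `Segment` structure, the scores, and the thresholds below are hypothetical and are not taken from the MLCommons pipeline.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Audio paired with transcript text (hypothetical structure for illustration)."""
    text: str
    duration_s: float
    score: float  # assumed alignment confidence in [0, 1]


def filter_segments(segments, min_score):
    """Keep only segments whose assumed alignment score meets the threshold."""
    return [s for s in segments if s.score >= min_score]


def total_hours(segments):
    """Sum segment durations and convert seconds to hours."""
    return sum(s.duration_s for s in segments) / 3600.0


# Invented example data: three segments with differing alignment quality.
segments = [
    Segment("hello world", 1800.0, 0.95),
    Segment("noisy passage", 3600.0, 0.60),
    Segment("clean reading", 1800.0, 0.85),
]

# A stricter threshold keeps cleaner pairs but fewer hours of audio.
for threshold in (0.5, 0.8, 0.9):
    kept = filter_segments(segments, threshold)
    print(f"threshold={threshold}: {len(kept)} segments, {total_hours(kept):.2f} hours")
```

Raising the threshold in this toy setup discards the poorly aligned segment first, which is the sense in which a cleaner dataset is also a smaller one.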


Open source trend

The People’s Speech Dataset complements the Mozilla Foundation’s Common Voice, another of the largest speech datasets in the world, with more than 9,000 hours of voice data in 60 different languages. In a sign of growing interest in the field, Nvidia recently announced that it would invest $1.5 million in Common Voice to engage more communities and volunteers and support the hiring of new staff.

In recent years, voice technology has surged in adoption among enterprises in particular, with 68% of companies reporting they have a voice technology strategy in place, according to Speechmatics, an 18% increase from 2019. And among the companies that don’t, 60% plan to in the next five years.

Building datasets for speech recognition remains a labor-intensive pursuit, but one promising approach coming into wider use is unsupervised learning, which could cut down on the need for bespoke training libraries. Traditional speech recognition systems require examples of speech labeled to indicate what’s being said, but unsupervised systems can learn without labels by picking up on subtle relationships within the training data.

Researchers at Guinea-based tech accelerator GNCode and Stanford have experimented with using radio archives in developing unsupervised systems for “low-resource” languages, particularly Maninka, Pular, and Susu in the Niger-Congo family. A team at MLCommons called 1000 Words in 1000 Languages is creating a pipeline that can take any recorded speech and automatically generate clips to train compact speech recognition models. Separately, Facebook has developed a system, dubbed wav2vec-U, that can learn to recognize speech from unlabeled data.
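As a rough illustration of how keyword clips might be cut from longer recordings, the sketch below assumes word-level timestamps are already available, for instance from an alignment step. The `keyword_windows` function, the one-second clip length, and the sample timings are assumptions made for this example and do not describe the team’s actual pipeline.

```python
def keyword_windows(word_timings, keyword, clip_s=1.0, audio_len_s=None):
    """Return (start, end) windows of clip_s seconds centered on each keyword occurrence.

    word_timings is a list of (word, start_s, end_s) tuples (assumed input format).
    """
    windows = []
    for word, start, end in word_timings:
        if word.lower() != keyword.lower():
            continue
        center = (start + end) / 2.0
        # Clamp the window so it never starts before the recording does.
        win_start = max(0.0, center - clip_s / 2.0)
        win_end = win_start + clip_s
        # If the window would run past the end of the audio, shift it back.
        if audio_len_s is not None and win_end > audio_len_s:
            win_end = audio_len_s
            win_start = max(0.0, win_end - clip_s)
        windows.append((win_start, win_end))
    return windows


# Invented timings for a three-second recording.
timings = [
    ("turn", 0.10, 0.40),
    ("left", 0.45, 0.80),
    ("then", 1.00, 1.20),
    ("left", 2.70, 2.95),
]

print(keyword_windows(timings, "left", clip_s=1.0, audio_len_s=3.0))
```

Fixed-length windows are a common convenience for training small keyword models, since every example then has the same shape; whether a given pipeline pads, trims, or centers clips this way is a design choice.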

© 2021 Aiexpress.io - All rights reserved.
