AI EXPRESS - Hot Deal 4 VCs instabooks.co

MLCommons releases open source datasets for speech recognition

by seprameen
December 21, 2021


MLCommons, the nonprofit consortium dedicated to creating open AI development tools and resources, today announced the release of the People’s Speech Dataset and the Multilingual Spoken Words Corpus. The consortium claims that the People’s Speech Dataset is among the world’s most comprehensive English speech datasets licensed for academic and commercial usage, with tens of thousands of hours of recordings, and that the Multilingual Spoken Words Corpus (MSWC) is one of the largest audio speech datasets with keywords in 50 languages.

No-cost datasets such as TED-LIUM and LibriSpeech have long been available for developers to train, test, and benchmark speech recognition systems. But some, like Fisher and Switchboard, require licensing or relatively high one-time payments. This puts even well-resourced organizations at a disadvantage compared with tech giants such as Google, Apple, and Amazon, which can gather large amounts of training data through devices like smartphones and smart speakers. For example, four years ago, when researchers at Mozilla began developing the English-language speech recognition system DeepSpeech, the team had to reach out to TV and radio stations and language departments at universities to supplement the public speech data that they were able to find.

With the release of the People’s Speech Dataset and the MSWC, the hope is that more developers will be able to build their own speech recognition systems with fewer budgetary and logistical constraints than previously, according to Keith Achorn. Achorn, a machine learning engineer at Intel, is one of the researchers who has overseen the curation of the People’s Speech Dataset and the MSWC over the past several years.

“Modern machine learning models rely on vast quantities of data to train. Both ‘The People’s Speech’ and ‘MSWC’ are among the largest datasets in their respective classes. MSWC is of particular interest for its inclusion of 50 languages,” Achorn told VentureBeat via email. “In our research, most of these 50 languages had no keyword-spotting speech datasets publicly available until now, and even those which did had very limited vocabularies.”


Open-sourcing speech tooling

Starting in 2018, a working group formed under the auspices of MLCommons to identify and chart the 50 most-used languages in the world into a single dataset, and to figure out a way to make the dataset useful. Members of the team came from Harvard and the University of Michigan as well as Alibaba, Oracle, Google, Baidu, Intel, and others.

The researchers who put the dataset together were an international group hailing from the U.S., South America, and China. They met weekly for several years via conference call, each bringing a particular expertise to the project.

The project eventually spawned two datasets instead of one, the People’s Speech Dataset and the MSWC, which are individually detailed in whitepapers being presented this week at the annual Conference on Neural Information Processing Systems (NeurIPS). The People’s Speech Dataset targets speech recognition tasks, while MSWC involves keyword spotting, which deals with the identification of keywords (e.g., “OK, Google,” “Hey, Siri”) in recordings.

People’s Speech Dataset versus MSWC

The People’s Speech Dataset involves over 30,000 hours of supervised conversational audio released under a Creative Commons license, which can be used to create the kind of voice recognition models powering voice assistants and transcription software. MSWC, on the other hand, has more than 340,000 keywords with upwards of 23.4 million examples, spanning languages spoken by over 5 billion people, and is designed for applications like call centers and smart devices.

Previous speech datasets relied on manual efforts to collect and verify thousands of examples for individual keywords, and were commonly restricted to a single language. Moreover, these datasets didn’t leverage “diverse speech,” meaning that they poorly represented a natural environment, lacking accuracy-boosting variables like background noise, informal speech patterns, and a mixture of recording equipment.

Both the People’s Speech Dataset and the MSWC also have permissive licensing terms, including commercial use, which stands in contrast to many speech training libraries. Datasets typically either fail to formalize their licenses, relying on end-users to take responsibility, or are restrictive in the sense that they prohibit use in products bound for the open market.

“The working group envisioned several use cases during the development process. However, we are also aware that these spoken word datasets may find further use by models and systems we did not yet envision,” Achorn continued. “As both datasets continue to grow and develop under the direction of MLCommons, we are seeking additional sources of high-quality and diverse speech data. Finding sources which comply with our open licensing terms makes this more challenging, especially for non-English languages. On a more technical level, our pipeline uses forced alignment to match speech audio with transcript text. Although methods were devised to compensate for mixed transcript quality, improving accuracy comes at a cost to the quantity of data.”
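To put the headline figures in perspective, here is a rough back-of-envelope calculation in Python. The inputs are the numbers quoted in this article, which are stated as lower bounds ("over," "more than," "upwards of"), so the results are only approximate averages and say nothing about how examples are actually distributed across keywords or languages.

```python
# Figures as quoted in the article; all are approximate lower bounds.
mswc_keywords = 340_000          # "more than 340,000" keywords
mswc_examples = 23_400_000       # "upwards of 23.4 million" examples
peoples_speech_hours = 30_000    # "over 30,000 hours" of audio

# Average number of recorded examples per keyword in MSWC.
avg_examples_per_keyword = mswc_examples / mswc_keywords

# How long it would take to listen to the People's Speech audio nonstop.
listening_days = peoples_speech_hours / 24
listening_years = listening_days / 365

print(f"About {avg_examples_per_keyword:.0f} examples per keyword on average")
print(f"About {listening_days:.0f} days ({listening_years:.1f} years) of continuous audio")
```

The average hides a lot: common words in widely spoken languages will likely have far more examples than rare words in smaller languages.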
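The accuracy-versus-quantity trade-off Achorn describes can be illustrated with a minimal Python sketch. It assumes each aligned clip carries a confidence score from the alignment step and that low-scoring clips are dropped. The `AlignedClip` class, the `align_score` field, the thresholds, and the sample clips are all hypothetical and chosen for illustration; this is not the MLCommons pipeline.

```python
from dataclasses import dataclass


@dataclass
class AlignedClip:
    clip_id: str
    duration_s: float   # clip length in seconds
    align_score: float  # hypothetical alignment confidence in [0, 1]


def filter_by_score(clips, min_score):
    """Keep only clips whose alignment score meets the threshold."""
    return [c for c in clips if c.align_score >= min_score]


def total_hours(clips):
    """Sum clip durations and convert seconds to hours."""
    return sum(c.duration_s for c in clips) / 3600.0


# Made-up clips for illustration only; not drawn from either dataset.
clips = [
    AlignedClip("a", 1800.0, 0.95),
    AlignedClip("b", 3600.0, 0.80),
    AlignedClip("c", 1800.0, 0.60),
    AlignedClip("d", 3600.0, 0.40),
]

# A stricter threshold keeps cleaner audio-transcript pairs but less data.
for threshold in (0.0, 0.5, 0.9):
    kept = filter_by_score(clips, threshold)
    print(f"min_score={threshold:.1f}: {len(kept)} clips, {total_hours(kept):.2f} hours")
```

With these toy numbers, raising the threshold from 0.0 to 0.9 shrinks the retained audio from 3 hours to half an hour, which is the kind of cost to data quantity the quote refers to.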


Open source trend

The People’s Speech Dataset complements the Mozilla Foundation’s Common Voice, another of the largest speech datasets in the world, with more than 9,000 hours of voice data in 60 different languages. In a sign of growing interest in the field, Nvidia recently announced that it would invest $1.5 million in Common Voice to engage more communities and volunteers and support the hiring of new staff.

Recently, voice technology has surged in adoption among enterprises in particular, with 68% of companies reporting they have a voice technology strategy in place, an 18% increase from 2019, according to Speechmatics. And among the companies that don’t, 60% plan to in the next five years.

Building datasets for speech recognition remains a labor-intensive pursuit, but one promising approach coming into wider use is unsupervised learning, which could cut down on the need for bespoke training libraries. Traditional speech recognition systems require examples of speech labeled to indicate what’s being said, but unsupervised systems can learn without labels by picking up on subtle relationships within the training data.

Researchers at Guinea-based tech accelerator GNCode and Stanford have experimented with using radio archives in developing unsupervised systems for “low-resource” languages, particularly Maninka, Pular, and Susu in the Niger-Congo family. A team at MLCommons called 1000 Words in 1000 Languages is creating a pipeline that can take any recorded speech and automatically generate clips to train compact speech recognition models. Separately, Facebook has developed a system, dubbed wav2vec-U, that can learn to recognize speech from unlabeled data.


Tags: datasets, MLCommons, open, recognition, releases, source, speech

AI EXPRESS is a news site that covers the latest developments in Artificial Intelligence, Data Analytics, ML & DL, Algorithms, RPA, NLP, Robotics, Smart Homes & Cities, Cloud & Quantum Computing, AR & VR and Blockchains


© 2021 Aiexpress.io - All rights reserved.
