Giving everyone the ability to comprehend one another might soon be much easier.
A team from the University of Oxford's Department of Computer Science has developed new lip-reading software, LipNet, which they claim is the most accurate of its kind to date by a wide margin.
The development of the software, which was supported in part by Alphabet's DeepMind AI program, has been detailed in a paper which reports LipNet has bested the existing top marks in lip-reading accuracy by 13.8 percentage points. The previous best software, with its 79.6 percent mark, was already light-years ahead of human lip-readers, who averaged 52.3 percent accuracy on the same test.
Counter to practical logic, the breakthrough is actually in part thanks to a less refined approach to the task -- at least in terms of scale. The Oxford team expanded their focus from a speaker's individual words, which every previous system had focused on, to the larger constructions at the sentence level.
According to the paper, "All existing [lip-reading approaches] perform only word classification, not sentence-level sequence prediction.... To the best of our knowledge, LipNet is the first lip-reading model to operate at sentence-level."
In other words, the software became more effective as it moved closer to the way the human brain best processes this type of visual data. It takes the video of a speaker and, instead of homing in on each and every word as a distinct entity, uses its deep-learning predictive capabilities to place the words within a larger context for greater understanding (you can see it in action in the video above).
A member of the team, Oxford Professor and Google DeepMind scientist Nando de Freitas, has taken to social media to give the general public more context than they might have been able to find in the cut-and-dried jargon of the paper.
First, he clarified that the software has not yet been tested beyond a baseline dataset and needs further development:
Thanks to CIFAR's support for ambitious research. Note this is still restricted to a simple dataset, but a significant improvement. https://t.co/lHJFfpqyBa
— Nando de Freitas (@NandoDF) November 8, 2016
More hopefully, he hinted at the great potential LipNet has for practical use:
We're excited to use this research to build better human-computer interfaces and hearing aids. https://t.co/lHJFfpqyBa
— Nando de Freitas (@NandoDF) November 8, 2016
Most importantly, this heightened level of accuracy opens up new possibilities. For those who depend on sign language and, to a lesser degree, lip-reading, communication can be extremely challenging.
There are also clear benefits for people in general: Reading lips could potentially become something anyone with a smartphone could do, and voice command systems may become even more accurate with the application of software like LipNet.