Chinese Continuous Visual Speech Recognition Challenge Workshop 2024 Has Concluded Successfully

News
August 30, 2024
On the morning of August 16th, the Chinese Continuous Visual Speech Recognition Challenge Workshop 2024 (CNVSRC Workshop 2024) was held at the 19th National Conference on Man-Machine Speech Communication (NCMMSC 2024) in Urumqi, China. The workshop included an introduction to CNVSRC 2024, addresses from the organizers, the announcement of the rankings, the technical report, and system descriptions shared by participating teams.
The workshop is a forum for exchanging ideas on Chinese large-vocabulary visual speech recognition techniques. It was held as a special event at NCMMSC 2024 and co-hosted by Tsinghua University, Beijing University of Posts and Telecommunications, Dataocean AI, and Speech Home.
The competition attracted 45 teams from China and abroad. After nearly three months of intense competition, teams from Northwestern Polytechnical University, Inner Mongolia University, Wuhan University, and others performed exceptionally well and ranked at the top. Detailed results and the report video will be published on the official competition website: http://cnceleb.org/competition
 
Task 1 Single-speaker VSR Fixed Track
Rank  TeamID  CER on CNVSRC.Single.Eval  Report
1     T237    30.4679%                   Report-T237.pdf
2     T244    39.3110%                   Report-T244.pdf

Task 1 Single-speaker VSR Open Track
Rank  TeamID  CER on CNVSRC.Single.Eval  Report
1     T170    30.0680%                   Anonymous Submission
2     T237    30.4679%                   Report-T237.pdf

Task 2 Multi-speaker VSR Fixed Track
Rank  TeamID  CER on CNVSRC.Multi.Eval   Report
1     T237    34.2955%                   Report-T237.pdf
2     T170    45.3244%                   Anonymous Submission
3     T244    47.9259%                   Report-T244.pdf

Task 2 Multi-speaker VSR Open Track
Rank  TeamID  CER on CNVSRC.Multi.Eval   Report
1     T237    34.2955%                   Report-T237.pdf
2     T170    38.3454%                   Anonymous Submission
3     T405    57.7762%                   Report-T405.pdf
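
The rankings above are ordered by character error rate (CER), where lower is better. As a rough illustration of the metric only, the following Python sketch computes CER as the character-level edit distance (substitutions, deletions, and insertions) between reference and recognized text, divided by the number of reference characters. It is not the challenge's official scoring script: the function names and the sample sentences are hypothetical, and any text normalization the organizers may apply (for example, handling of punctuation or whitespace) is not described here and is therefore left out.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between two strings."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty string takes i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Corpus-level CER in percent: total edit errors / total reference characters."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of utterances")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return 100.0 * total_errors / total_chars


# Hypothetical example sentences, not taken from the CNVSRC evaluation sets.
refs = ["今天天气很好", "我喜欢唇语识别"]
hyps = ["今天天汽很好", "我喜欢语识别"]  # one substitution, one deletion
print(f"CER = {cer(refs, hyps):.4f}%")
```

Errors are pooled over all utterances before dividing, so long utterances weigh more than short ones; averaging per-utterance CER would generally give a different number.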
The workshop was hosted by Professor Wang Dong from Tsinghua University. Helen Wang, CMO of Dataocean AI, and Mr. Bu Hui, founder and CEO of Speech Home, presented awards to the winning teams. Liu Zehua, a student from Beijing University of Posts and Telecommunications, presented the technical report. Representatives from three outstanding participating teams were also invited to share their technical solutions and competition experiences.
CNVSRC 2024 Introduction – Dong Wang, THU
CNVSRC 2024 Address – Helen Wang, Dataocean AI
CNVSRC 2024 Address – Hui Bu, Speech Home
CNVSRC 2024 Technical Report
CNVSRC 2024 Rank Announcement

 

The representative of the Northwestern Polytechnical University team shared technical insights.

The representative of the Inner Mongolia University team shared technical insights online.

The representative of the Wuhan University team shared technical insights via an online presentation.

 

CNVSRC 2024 Photo

CNVSRC 2024 Organizing Committee Members

 

Visual Speech Recognition

Visual speech recognition (VSR), also known as lip reading, is a technology that infers the content of speech from lip movements. It has important applications in public safety, assistance for the elderly and disabled, and video authentication, among other fields. Research on lip reading is still developing: significant progress has been made in recognizing isolated words and phrases, but large-vocabulary continuous recognition remains very challenging. For Chinese in particular, progress has been limited by the lack of suitable data resources. To address this, Tsinghua University released the CN-CVS dataset [1] in 2023, the first large-scale continuous visual-speech dataset in Mandarin Chinese, which makes it possible to further advance large vocabulary continuous visual speech recognition (LVCVSR). In the same year it held the CNVSRC 2023 competition [2] to promote progress in Chinese lip reading.

To further promote this research direction, Tsinghua University, together with Beijing University of Posts and Telecommunications, Dataocean AI, and Speech Home, held CNVSRC 2024 at NCMMSC 2024. Many participating teams achieved significant improvements in lip reading performance, with the best results showing an improvement of over 30% compared to the baseline system. Compared to CNVSRC 2023, scores on all tracks also improved noticeably in 2024. The participating teams proposed a variety of innovative solutions, providing new ideas and methods for research on large-vocabulary continuous visual speech recognition in Chinese.

[1] C. Chen, D. Wang, T. F. Zheng, CN-CVS: A Mandarin Audio-Visual Dataset for Large Vocabulary Continuous Visual to Speech Synthesis, ICASSP, 2023.
[2] C. Chen, Z. Liu, X. Li, L. Li, D. Wang, CNVSRC 2023: The First Chinese Continuous Visual Speech Recognition Challenge, INTERSPEECH, 2024.
