Enhancing Voice Assistant Intelligence through LLM

“Hi Siri, what do you think of the iPhone 15?” R […]
Data Cleaning – Warm-up Before Training Large Language Models

In the competition of AI large language models, dataset […]
ChatGPT Goes Multimodal: Excelling in Audio, Text, and Image Interpretation

Recently, OpenAI released a multimodal voice and image […]
Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels

Audio-visual speech recognition has received a lot of a […]
Chinese Continuous Visual Speech Recognition Challenge 2023

Visual speech recognition, also known as lip reading, i […]
SeamlessM4T: A Multi-Modal Model Beyond the Constraints of LLM

On August 23, Meta released a new large speech recognit […]