Gemini Outperforms ChatGPT 4: Next Steps in Niche Markets?
Gemini is Google’s latest LLM, first revealed by Google CEO Sundar Pichai at the I/O developers conference in June, and is now being rolled out to the public. Gemini was built from the start as a multimodal model, meaning it can generalize and seamlessly understand, manipulate, and combine different types of information, including text, code, […]
Advancements in Conference ASR Amidst the Ubiquity of Foundation Models
Automatic Speech Recognition (ASR) for meeting scenarios is an advanced voice processing technology designed for multi-party conference environments. Its core goal is to solve the problem of ‘who said what and when,’ meaning it accurately identifies and records the speech of each participant in a meeting, and marks the identity and speaking time of each […]
Endowing AI Companions With More Emotions
Caryn.AI is a virtual persona synthesized using AI virtual character technology, based on the internet celebrity Caryn Marjorie. Caryn Marjorie is an influencer with 2 million followers on Snapchat. She launched Caryn AI, an AI chatbot based on the GPT-4 API, which possesses her voice, speech, and personality. Caryn AI does not have […]
Congratulations to all the winning teams of CNVSRC 2023!
On December 9, the NCMMSC-CNVSRC 2023 Workshop was held in Suzhou, where the final results of CNVSRC 2023 were announced. Congratulations to all the winning teams! The challenge was initiated by the organizing committee of NCMMSC 2023 and co-sponsored by Tsinghua University, Beijing University of Posts and Telecommunications, DataOceanAI, and SpeechHome. More than 85 teams […]
How to Make Speech Synthesis Benefit Everyone
The development of artificial intelligence technology is advancing by leaps and bounds, especially after the release of various pre-trained large models. Various industries have been impacted to varying degrees. In developed cities, people may have already become accustomed to intelligent robots serving as guides, home interactive devices, smart driving systems, and various other intelligent devices. […]
Vertical Domain LLM Adapter – Retrieval-Augmented Generation (RAG)
In today’s field of AI, LLMs are undoubtedly the most eye-catching aspect. However, high training costs and lengthy training times are major bottlenecks for many enterprises in adopting large-scale models. For large enterprises, the race is about developing their own LLM, while for smaller businesses, it’s about how to best apply existing LLM […]