Overview Recently, a text-to-speech model called ChatTTS has become popular. The model was developed by a small team in China. Focusing on […]
The release of GPT-4o marks a new milestone for large models with mixed input and output modalities, offering richer and faster conversations. Free access and reduced API fees make the technology available to more people. Compared with Gemini 1.5 Pro, GPT-4o offers a better conversational experience and promotes the development of robots and AIGC applications. This progress is an important step for generative AI, bringing new possibilities to areas such as human-computer interaction and content generation.
Video generation models such as Sora and Stable Video Diffusion often cannot accurately control the output video, especially character movements. A controllable video model can direct the character movements in a video precisely through prompts. Viggle AI, as the first video-3D model with an actual understanding of physics, can freely control character movements and is embedded in the Discord platform. This controllable video technology will significantly reduce the cost of digital human products and enable more diverse digital human video creation.
After testing the newly upgraded multimodal AI model Gemini 1.5 Pro, users found that although it supports a wider range of input types, including text, pictures, videos, files, and folders, its reasoning ability has not improved significantly, especially in distinguishing right from wrong. In addition, processing video, file, and folder inputs takes a long time, and the model has limitations in handling large amounts of data.
On February 16, 2024, OpenAI released its advanced video generation model, Sora, sparking interest almost rivalling that of GPT. Sora, which is not yet available for public use, combines Transformer and diffusion architectures for high-fidelity video simulation. OpenAI's TikTok account showcases Sora's capabilities with unedited videos generated from various prompts, previewing its potential impact on the burgeoning video generation field.
Google Gemini 1.5 Pro overview Google Gemini 1.5 Pro on February 15, 2024 […]
1. Google Trends: Compare “AI”, “gpt”, “palworld” This is a screenshot from today (2024/01/31). […]
On November 6, 2023, WordPress v6.4.2 was released. Two days later, I migrated my blog to another server. Later […]
Today, a friend shared an article. Recently, Jasper, the first unicorn company in AIGC, has gone back to zero. Jasper, based on GPT, is […]
AI is a big opportunity, so everyone is exploring, whether or not they know what to do. At present, the direction of exploration is mainly in […]