Generative AI
Insights
Ethical challenges with Generative AI
How to try (even you do not know AI)
News that explains trends
Giant players News: When it becomes a Fight of titans
French News
Chinese News
Creative AI for Kids
Business applications
Text & Generative AI Business applications
Sound & Generative AI Business applications
Visual & Generative AI Business applications
Games & Generative AI
Failures & what we can learn from it
Successes & What we can learn from them?
About
China & Generative AI
A player in the making?
ChatGPT-like AI is ‘difficult to achieve’, China’s tech minister says
Science and Technology Minister Wang Zhigang raised ethical concerns and said the country must ‘wait and see’ when it comes to developing generative AI products.
China bans AI-generated media without watermarks
China regulates generative AI tech with rules that aim to spur growth and ban deception.
Alibaba, Tencent and Baidu join the ChatGPT rush - Nikkei Asia.pdf
Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios
Due to their complex attention mechanisms and model design, most existing vision Transformers (ViTs) cannot perform as efficiently as convolutional neural networks (CNNs) in realistic industrial deployment scenarios, e.g. TensorRT and CoreML. This poses a distinct challenge: can a visual neural network be designed to infer as fast as CNNs and perform as powerfully as ViTs? Recent works have tried to design CNN-Transformer hybrid architectures to address this issue, yet their overall performance is far from satisfactory. To this end, we propose a next generation vision Transformer for efficient deployment in realistic industrial scenarios, namely Next-ViT, which dominates both CNNs and ViTs from the perspective of the latency/accuracy trade-off. In this work, the Next Convolution Block (NCB) and Next Transformer Block (NTB) are developed to capture local and global information, respectively, with deployment-friendly mechanisms. Then, the Next Hybrid Strategy (NHS) is designed to stack NCB and NTB in an efficient hybrid paradigm, which boosts performance in various downstream tasks. Extensive experiments show that Next-ViT significantly outperforms existing CNNs, ViTs and CNN-Transformer hybrid architectures with respect to the latency/accuracy trade-off across various vision tasks. On TensorRT, Next-ViT surpasses ResNet by 5.5 mAP (from 40.4 to 45.9) on COCO detection and 7.7% mIoU (from 38.8% to 46.5%) on ADE20K segmentation under similar latency. Meanwhile, it achieves performance comparable to CSWin, while inference is 3.6x faster. On CoreML, Next-ViT surpasses EfficientFormer by 4.6 mAP (from 42.6 to 47.2) on COCO detection and 3.5% mIoU (from 45.1% to 48.6%) on ADE20K segmentation under similar latency. Our code and models are made public at: https://github.com/bytedance/Next-ViT
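To make the idea of stacking NCB and NTB concrete, here is a minimal Python sketch of a block layout. It is not the authors' implementation: the abstract does not give the stacking pattern, so the sketch assumes that each stage repeats a group made of several NCBs followed by one NTB. The function names `hybrid_stage` and `build_layout` and the stage sizes are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def hybrid_stage(num_conv_blocks: int, num_groups: int) -> List[str]:
    """Return the block order for one stage.

    Assumed pattern (not stated in the abstract): each group is
    `num_conv_blocks` NCBs followed by a single NTB, and the group is
    repeated `num_groups` times.
    """
    if num_conv_blocks < 0 or num_groups < 1:
        raise ValueError("need num_conv_blocks >= 0 and num_groups >= 1")
    group = ["NCB"] * num_conv_blocks + ["NTB"]
    return group * num_groups


def build_layout(stage_specs: List[Tuple[int, int]]) -> List[List[str]]:
    """Build the layout of every stage from (num_conv_blocks, num_groups) pairs."""
    return [hybrid_stage(n_conv, n_groups) for n_conv, n_groups in stage_specs]


# Hypothetical four-stage configuration, for illustration only.
layout = build_layout([(1, 1), (3, 1), (4, 2), (2, 1)])
for index, blocks in enumerate(layout, start=1):
    print(f"stage {index}: {' -> '.join(blocks)}")
```

In this sketch the convolution blocks handle local information within a group and the single Transformer block at the end of the group handles global information, which mirrors the division of labour the abstract describes.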
Jointist: Simultaneous Improvement of Multi-instrument Transcription and Music Source Separation via Joint Training
In this paper, we introduce Jointist, an instrument-aware multi-instrument framework that is capable of transcribing, recognizing, and separating multiple musical instruments from an audio clip. Jointist consists of an instrument recognition module that conditions the other two modules: a transcription module that outputs instrument-specific piano rolls, and a source separation module that utilizes instrument information and transcription results. The joint training of the transcription and source separation modules serves to improve the performance of both tasks. The instrument module is optional and can be directly controlled by human users. This makes Jointist a flexible user-controllable framework. Our challenging problem formulation makes the model highly useful in the real world given that modern popular music typically consists of multiple instruments. Its novelty, however, necessitates a new perspective on how to evaluate such a model. In our experiments, we assess the proposed model from various aspects, providing a new evaluation perspective for multi-instrument transcription. Our subjective listening study shows that Jointist achieves state-of-the-art performance on popular music, outperforming existing multi-instrument transcription models such as MT3. We conducted experiments on several downstream tasks and found that the proposed method improved transcription by more than 1 percentage point (ppt.), source separation by 5 SDR, downbeat detection by 1.8 ppt., chord recognition by 1.4 ppt., and key estimation by 1.4 ppt., when utilizing transcription results obtained from Jointist. Demo available at https://jointist.github.io/Demo.
How China is building a parallel generative AI universe
The gigantic technological leap that machine learning models have shown in the last few months is getting everyone excited about the future of AI — but also nervous about its uncomfortable consequences. After text-to-image tools from Stability AI and OpenAI became the talk of the town, ChatGPT’s ability to hold intelligent conversations is the new […]
How will China respond to ChatGPT?
Generative A.I. is the next front in the U.S.-China tech rivalry.
How China's synthetic media startup Surreal nabs funding in 3 months
What if we no longer needed cameras to make videos and could instead generate them through a few lines of coding? Advances in machine learning are turning the idea into a reality. We’ve seen how deepfakes swap faces in family photos and turn one’s selfies into famous video clips. Now entrepreneurs with AI research backgrounds […]
Movio wants to make your marketing videos with generative AI
Generative AI is suddenly everywhere. Over the past year, you’ve probably seen people showcasing impressive AI-generated artworks, thanks to progress in text-to-image algorithms introduced by groups like OpenAI and Stability AI. A proliferation of startups is now trying to devise applications for this new class of language model, where the machine is capable of creating […]
ZMO.ai secures $8M led by Hillhouse to create AI generated fashion models
With breakthroughs in machine learning, it’s no longer uncommon to see algorithmically generated bodies that can move and talk authentically like real humans. The question is now down to whether startups offering such products can achieve a sustainable business model. Some of them have demonstrated that potential and attracted investors. ZMO.ai, founded by a team […]