Recently, Professor Qin Shuo attended the CES exhibition in Las Vegas.
Throughout his trip across the United States, a "secret weapon" was always attached to the back of his phone—the DingTalk A1 voice recording card. He used it to capture inspiration during meals, highlight key points during executive interviews, and follow updates at press conferences. From morning till night, from simple to complex scenarios, it was present every step of the way.
Here is his account of the experience.
My Awkward Past with Recorders
Since entering the media industry in 1990, the voice recorder has been an essential tool in my daily work. I started with bulky tape recorders the size of bricks, paired with TDK tapes, and later switched to micro-recorders smaller than a phone, which used compact cassettes. After each recording session, I had to replay the tape repeatedly and transcribe every sentence by hand.
My entire career has thus evolved hand-in-hand with recording devices.
Although I'm fairly capable when it comes to content editing and organization, I’ve often struggled with operating tools. Even for simple recorders—featuring just a few buttons like record, stop, fast forward, rewind, and play—I frequently made mistakes.
The most embarrassing moment happened in 1993, when I represented Nanfeng Chuang magazine in a joint interview with then-Guangzhou Mayor Li Ziliu. A Xinhua News Agency reporter led the questioning while I handled audio recording and note-taking. I must have pressed the wrong button, because soon afterward I noticed the cassette tape bulging out of the recorder. Fortunately, the mayor didn’t notice, so I quickly hit stop and shoved the device into my pocket.
When I reviewed the recording afterward, most of the content was missing. I had to ask a TV journalist for help, falsely claiming I needed to verify versions, just to reconstruct the material. That incident left me with lasting psychological trauma.
Ever since, whether using recorders, voice pens, smart notebooks, or smartphones, I have been unable to stop checking the recording status. For important interviews, I would even record on two phones simultaneously; only then did I feel secure.
It wasn’t until one day I flipped through my daughter’s university textbook and read Donald Norman’s The Design of Everyday Things that I finally found peace. The book argues that when products malfunction, people tend to blame themselves, but the real cause often lies in design flaws. “User errors should not be blamed on users, but attributed to product and design failures.”
So it wasn’t my fault after all!
Even though this eased my mindset, in reality, truly user-centered recording devices featuring “visibility” and “intuitiveness” remained hard to find. Especially today, many interviews are conducted in English, and the era of self-media demands rapid transcription and publishing—pressure remains high.
An AI Add-on: First Experience with the DingTalk A1 Recording Card
It was only recently, when I used the DingTalk AI Voice Recording Card (DingTalk A1) during my visit to CES (the Consumer Electronics Show), that I finally overcame my recording anxiety. Attached to the back of my phone, it supports intelligent voice notes, content summarization, real-time translation in eight languages, and simultaneous interpretation in over 20 languages. Even in environments as noisy and chaotic as a wet market, it captures clearly, records accurately, translates precisely, and summarizes effectively. It has become my first “AI add-on.”
From the heavy analog tape machines of the past to today’s 40-gram AI recording card, and from manual transcription to automatic speech-to-text conversion, content distillation, and meeting summary generation, this evolution reflects my personal journey from the information age to the digital era and now into an intelligence-driven one.
AI Needs a Physical Body
At 11:49 a.m. on January 4, I boarded flight UA2229 from Los Angeles to Las Vegas. In the lounge, I opened the package of the DingTalk AI Recording Card and found the main unit, a protective case, and a magnetic ring. Once I had stuck the magnetic ring onto the back of my phone and attached the device magnetically, setup was complete. The device itself has only two physical buttons, record and voice, and all other operations are managed via the DingTalk app. Downloading the app and activating the device took no effort and required no instructions.
When I stuck this business-card-sized gadget onto my phone, a foreign couple seated nearby became curious and asked what it was. I replied, “Something I’ve never used before—it can record, translate, and convert speech to text.” They exclaimed, “It’s so cool.”
This year’s CES theme is AI, with the core trend shifting from “informational AI” to “physical AI”: AI is now integrating with hardware and giving devices an intelligent soul. For example, AI glasses effectively add real-time subtitles to the real world, while the AI recording card packs large speech models into a card-sized form factor.
This direction is known as “Everything as AI” or “AI at the Edge” (Edge AI), sometimes referred to as “everything becomes computable.” I summarize it as “terminal intelligence, intelligent terminals.” As large models advance, AI is reshaping all physical hardware.
Though resembling a simple card, the DingTalk AI Recording Card actually features a 6-nanometer AI audio chip, five omnidirectional microphones, and one bone-conduction microphone. It supports voiceprint recognition and spatial identification, enabling visualized recording. Audio data is encrypted both locally on the device and in the cloud for security, and supports intelligent AI-driven retrieval.
How I Used the AI Recording Card at CES
On the morning of January 5, my CES schedule officially began. At the Venetian Hotel, I attended Lenovo Group’s pre-launch event, where multiple experts presented new personal AI computing products in English. Sitting at the right end of the front row, with the stage about five to six meters away to my left, I activated the DingTalk recording card and turned on “real-time translation,” viewing the original English transcript and its Chinese translation side by side. By the time the half-hour presentation ended, the AI-generated meeting summary and sectioned outline were ready, and the content could be used directly within DingTalk or exported as documents for sharing.
My first impression was excellent: the functionality matched my needs, operation was simple, and recognition accuracy was high. There were occasional mistranslations of technical terms, but performance should gradually improve if I authorize the device to learn from my own speech data. Traditional speech recognition achieves around 70% accuracy and general large models reach 80%, while DingTalk, backed by Alibaba’s Tongyi Lab and trained on one billion hours of audio-video data, reaches 90%, and up to 97% after specialized training.
At noon on the 5th, I participated in a lunch gathering with executives from a New York PR firm, held outdoors in a noisy environment. Despite overlapping conversations among five people, the recording card maintained high accuracy. Combined with the “real-time translation” feature, efficiency improved significantly.
On the morning of the 6th, I had a conversation with the North Asia-Pacific COO of a globally renowned company at another hotel restaurant. The indoor setting was loud, and parts of the three-person dialogue were muffled—but the recording card captured everything clearly.
Later that morning, I interviewed FIFA's Innovation Director Johannes Holzmuller with media colleagues. In the quiet environment, both the recording quality and AI-generated summary excelled.
Then, on the morning of the 7th, three consecutive group interviews with Lenovo executives followed. By then, I had become fully comfortable using the recording card. Yang Yuanqing mentioned that AI is becoming ubiquitous—once users grant permission, AI agents within endpoints can respond to commands and even initiate actions proactively. Soon, various hardware may evolve from passive executors into active participants.
Throughout CES, I accumulated seven to eight hours of recordings, yet battery consumption was less than 30%. With a claimed 45-hour battery life and Type-C charging compatible with standard phone cables, I had no battery anxiety at all.
On the afternoon of January 7, I made a special trip to the DingTalk booth (No. 22020) to express my gratitude. Since I attached the card to my phone, I have not taken it off once.
For me, it offers three major advantages: first, clear recording even at a distance or in noisy conditions; second, support for real-time and simultaneous translation, ideal for international settings; third, instant transcription and automatic summarization, saving tremendous time. Though not perfect, its capabilities continuously improve with usage—unlike traditional fixed-function hardware.
A New Era of AI Hardware Is Arriving
In fact, the functions I’ve used represent only the tip of the iceberg.
For instance, its built-in AI Q&A function allows querying based on a knowledge base created from recorded content. Instead of endlessly scrolling through long files searching for answers, you can simply ask, “Who said what about X topic?” and get an immediate response.
Another example: multiple recordings can be merged into a single comprehensive summary—especially useful for multi-source interviews or users dealing with large volumes of audio.
I haven’t tried these features yet, which shows I’m still a “slow learner” when it comes to adopting new tech tools.
But unlike before, today’s “slow bird” can fly early thanks to AI assistance.
For enterprises, the value of the DingTalk AI Recording Card is even greater. At the launch event for AI DingTalk 1.1, Xu Xiaoying from Youcheng Company shared how their chairman initially questioned equipping staff with recording cards for a Mexico business trip. However, during a Spanish-Japanese meeting, the card not only provided accurate real-time translation but also caught omissions missed by human interpreters, greatly improving communication quality. The company immediately rolled out the devices company-wide for management and overseas personnel.
At this year’s CES, I saw many novel AI hardware devices—rings, necklaces, earrings, and other wearable accessories. The focus is no longer merely on “wearability,” but on “workability and interactivity”—say something, and they respond. AI is coming down from the cloud and embedding itself into everything.
This progress stems from rapid advancements in AI large models in recent years, along with breakthroughs in five key technologies—chips, algorithms, architecture, sensing, and communications (such as NPU + compute-in-memory, lightweight large models, and multi-sensor fusion)—enabling portable devices, through a “local capture + smartphone/cloud computing” model, to become exceptionally intelligent.
According to Frost & Sullivan, the global edge AI hardware market is projected to surge from 321.9 billion yuan in 2025 to 1.22 trillion yuan by 2029, growing at a compound annual rate of 40%, far outpacing traditional consumer electronics.
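The growth rate quoted above can be sanity-checked with a few lines of arithmetic. This is a minimal sketch, assuming the standard compound annual growth rate formula and treating 2025 to 2029 as four compounding years; the function name `cagr` is illustrative and not taken from the Frost & Sullivan report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


# Market size in billions of yuan, as quoted in the article.
start_2025 = 321.9
end_2029 = 1220.0  # 1.22 trillion yuan

# Assumption: four compounding periods between 2025 and 2029.
rate = cagr(start_2025, end_2029, 2029 - 2025)
print(f"Implied CAGR: {rate:.1%}")
```

Under these assumptions the implied rate comes out just under 40%, which is consistent with the rounded figure cited.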
Despite challenges such as insufficient proprietary data, high computing costs, and network dependency, the biggest advantage of AI hardware lies in the rapid iteration enabled by tight software-hardware integration. For example, robots that couldn’t walk upright during last year’s humanoid robot half-marathon in Beijing were able to perform tasks just months later. NVIDIA has even shortened its AI GPU architecture update cycle from two years to one, accelerating the evolution of the “robot brain.”
DingTalk, China’s largest collaborative office platform with 26 million organizations and 700 million individual users, provides fertile ground for innovative AI hardware due to its vast array of workplace, meeting, and communication scenarios. The DingTalk Recording Card has thus become Alibaba’s flagship consumer-grade hardware for the AI era.
The product leverages Alibaba’s computing power and Tongyi large models. DingTalk’s strategic push into AI hardware across countless industries also determines whether Alibaba’s large models can achieve widespread adoption.
Conclusion
Chinese manufacturing already holds strong competitiveness in supply chains. Now is the time to embed AI into all kinds of hardware. This will drive a shift from “manufacturing” to “smart manufacturing”—not just intelligent production processes, but the production of intelligent hardware itself.
In this process, internet super-apps and large-model companies are increasingly entering the “ecosystem terminalization” race. Beyond DingTalk, other tech giants are exploring enterprise voice AI terminals, smart customer service/meeting hardware, and cross-ecosystem embedded devices.
DingTalk aims to create a closed loop—from task capture, content analysis, to collaborative execution—through seamless software-hardware integration. By providing accessible hardware, it helps organizations accumulate data assets and unlock the full value of AI. This is a full-stack vision spanning software, AI, hardware, and enterprise services, with limitless potential.
From this small recording card, I see the dawn of a grand new era of AI-powered hardware—built upon China’s manufacturing strengths, supply chain advantages, massive organizational scale, and rich application scenarios.
It also signals that the curtain has risen on a new interconnected age—moving beyond mobile internet toward an era where intelligent agents and smart hardware converge.
Author: Qin Shuo