Blog

111 articles

The latest from MikuTools: tool tutorials, product updates, AI tool practice, and engineering notes.

May 8, 2026
MiniMax Music Cover: a $0.15 cover that keeps the melody
MiniMax Music Cover does one thing well: it takes a song you already have and reimagines it in a different style while preserving the melody. Flat $0.15 per generation. No subscription required.
8 min read
May 8, 2026
ACE-Step v1.5 review: open-source AI music after Suno v5
ACE-Step 1.5 went live on January 28, 2026. The headline number, a SongEval score of 8.09, beats Suno v5. Both claims are real. Neither tells the whole story.
11 min read
May 7, 2026
TTS in an accessibility plan: where it helps, where it stops
Adding text-to-speech to a website or app looks like an accessibility win. Sometimes it is. Sometimes it is a checkbox that distracts from the real accessibility work. Here is the honest map of where TTS helps and where it does not.
10 min read
May 7, 2026
Text-to-speech pricing math for long scripts and small teams
TTS is priced by the character. The pricing looks small until your scripts get long. Here is the cost math for typical projects (podcasts, audiobooks, courses, marketing libraries) and the levers that change the bottom line for a small team.
10 min read
May 7, 2026
A five-minute casting worksheet for picking your TTS voice
A short worksheet that shortlists your TTS voice in five minutes by answering five small questions about the script, the audience, and the medium. Designed for first-time users who do not want to scroll a sixty-voice catalog without a plan.
7 min read
May 7, 2026
Word-level timestamps and what to actually build with them
The word-timestamp option on the text-to-speech tool returns the start and end of every word alongside the audio. That metadata is the difference between a player that just plays audio and a player that does follow-along reading, language learning, jump-to-paragraph navigation, and accessibility narration that earns its place in the page.
9 min read
May 7, 2026
Speed and clarity: how fast can you push synthetic narration
The speed slider on a text-to-speech tool runs from 0.5 to 4.0. Above 1.5 the audio degrades in ways that are not obvious until your listener tells you. Here is the honest band where each speed setting works and where it stops working.
7 min read
May 7, 2026
Mandarin text-to-speech without breaking the tones
Mandarin TTS reads most prose acceptably and stumbles in predictable places. Tone sandhi rules, polyphonic characters, and code-switched English are where modern models still fail. Here is the practical guide for Mandarin scripts that need to actually sound right.
10 min read
May 7, 2026
EU AI Act, ACX, and the disclosure rules every TTS user should know in 2026
Synthetic voice is now regulated. The EU AI Act puts hard disclosure obligations on AI-generated audio starting August 2026, with penalties up to 15 million euros. ACX still bans AI narration for general distribution. Here is what changes, when, and how to comply without panic.
10 min read
May 7, 2026
E-learning narration: where TTS holds up and where it doesn't
Modern text-to-speech reaches statistical parity with human narrators on learning outcomes for some content types. For others, it underperforms in measurable ways. Here is the field guide for L&D teams choosing between synthetic narration, hired narrators, and the in-house host.
8 min read
May 7, 2026
Podcast intros and outros without a voice actor on call
A practical recipe for producing a clean podcast intro, outro, and mid-roll bumpers using synthetic voice. The whole package, ready to drop into your editor, in under thirty minutes per episode template.
8 min read
May 7, 2026
Use AI text-to-speech for the audiobook draft, not the final
Synthetic narration is good enough to carry you through a whole book end to end. ACX still does not accept it for distribution. Here is how indie authors should actually use TTS in 2026: as a draft tool, not a finished narrator.
8 min read