Both versions of the free AI voice cloner are still on the website. V1 is small and good enough for basic cloning. V2 is a much larger download but a much better app. Here's the side-by-side so you can decide.
Quick recommendation
- You have an NVIDIA GPU (RTX 3060+) and decent broadband: get V2. The redesigned editor + 7-band EQ + 16 built-in voices are worth the larger download.
- You're on a laptop without a discrete GPU, or your internet is slow: V1 is fine. Smaller download (~248 MB), still does voice cloning + 28+ language TTS, just without the V2 polish.
- You're not sure: get V2. CPU fallback works (just slower), and the better app is worth the wait.
Side-by-side comparison
| Feature | V1 (Voice Cloner) | V2 |
|---|---|---|
| Download size | ~248 MB | ~2.0 GB |
| Built-in voices | 8 | 16 (6 M, 6 F, 2 boy, 2 girl) |
| Voice cloning | 5–30s sample | 15–30s sample (better quality) |
| Languages | 28+ (XTTS v2 native) | 17 (XTTS v2 native) + auto-translate |
| Built-in translator | No | Yes — Google Translate, no API key |
| Audio editor | Basic waveform + trim | Hero waveform, live cursor, drag-select, right-click menu, tabbed effects |
| 7-band parametric EQ | No | Yes — 6 voice presets |
| Keyboard shortcuts | Limited | Full editor + nav shortcuts |
| GPU CUDA support | Manual setup, fragile | Bundled CUDA 12.8 runtime — works with any 12.x/13.x driver |
| Diagnose page | No | Yes — Ctrl+D, GPU mode + per-package check |
| Theming / accent picker | No | Yes — 5 accent colours |
| Logo | Microphone | EQ-bars in circle (new brand) |
Why V2 is so much bigger
V1 is ~248 MB because it expects you to install Python, PyTorch, and CUDA separately. V2 ships those inside the installer (PyTorch with the CUDA 12.8 runtime), so it works on any modern Windows PC with no extra setup. That convenience costs about 1.7 GB of bundled runtime. After install, both versions download the same XTTS v2 model (~2 GB) on first launch.
If your install footprint matters more than convenience and you already have a working Python + PyTorch + CUDA setup, V1 is the lighter option. For everyone else, V2.
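To put the size difference in perspective, here is a rough Python sketch of the arithmetic. The installer and model sizes are the approximate figures quoted above (treating ~2.0 GB as 2000 MB). The connection speeds and the function names are illustrative assumptions, not measurements.

```python
# Installer and model sizes come from the comparison above; the connection
# speeds below are illustrative assumptions, not measurements.
INSTALLER_MB = {"V1": 248, "V2": 2000}
MODEL_MB = 2000  # XTTS v2 model, fetched on first launch by both versions


def total_download_mb(version: str) -> int:
    """Installer plus first-launch model download, in megabytes."""
    return INSTALLER_MB[version] + MODEL_MB


def download_minutes(size_mb: float, speed_mbps: float) -> float:
    """Minutes to transfer size_mb megabytes at speed_mbps megabits per second."""
    return size_mb * 8 / speed_mbps / 60


if __name__ == "__main__":
    for speed_mbps in (10, 50, 100):  # assumed connection speeds
        for version in ("V1", "V2"):
            size = total_download_mb(version)
            minutes = download_minutes(size, speed_mbps)
            print(f"{version} at {speed_mbps} Mbit/s: {size} MB, about {minutes:.0f} min")
```

Once the shared model download is counted, V2 is a bit under twice the total transfer of V1 (about 4000 MB against about 2248 MB), even though its installer alone is roughly eight times larger.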
Can I have both installed?
Yes. V1 and V2 install to separate folders with separate Start-menu entries:
- V1: C:\Program Files\RBS Software\RBS Voice Cloner\
- V2: C:\Program Files\RBS Software\RBS Voice Cloner V2\
Voice profiles are saved to %APPDATA%\RBS Software\ in different sub-folders so they don't collide.
What about the languages — V2 dropped some?
V1 advertised "28+ languages". V2 lists 17. The 17 in V2 are the languages where XTTS v2 produces genuinely good output. The extra languages V1 listed were technically supported by the engine but produced unreliable quality. V2 is honest about what works well. That's not a regression, just clearer marketing. The translator works across all 17.
Quality difference — does V2 actually sound better?
Underneath, both versions use the same XTTS v2 model. So the raw voice quality of a clone is similar between V1 and V2 — the model is the model. Where V2 sounds noticeably better is everything around the model:
- Sentence-level rendering — V2 sends complete sentences to the model rather than chunks split arbitrarily. Prosody (the natural rise and fall of speech) is more consistent.
- The 7-band EQ — most generated voices benefit from a touch of low-cut and presence boost. V2 has 6 voice-tuned presets that fix common issues; V1 has none.
- The redesigned audio editor — V1's editor was basic. V2 lets you cut, fade, and merge clips visually with a live cursor, which makes patching together long content much faster.
On a single short generated phrase you might not be able to tell V1 and V2 apart. On a 5-minute audiobook chapter, the difference becomes clear: V2 is steadier and more natural to listen to.
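To make the sentence-level rendering point concrete, here is a minimal Python sketch contrasting arbitrary fixed-size chunking with sentence-level chunking. It is not the app's actual code: the function names, the regular expression, and the sample text are all assumptions for illustration, and the naive splitting rule would need extra handling for abbreviations.

```python
import re

# Split after ., ! or ? when followed by whitespace. This naive rule also
# splits after abbreviations such as "Dr.", which a real splitter would handle.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_fixed(text: str, size: int) -> list[str]:
    """Arbitrary chunking: cut every `size` characters, even mid-word."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def split_sentences(text: str) -> list[str]:
    """Sentence-level chunking: each chunk is one complete sentence."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


if __name__ == "__main__":
    sample = "The rain stopped. Nobody noticed at first! Did you?"
    print(split_fixed(sample, 20))      # first chunk ends mid-word
    print(split_sentences(sample))      # three complete sentences
```

With fixed-size chunks the model receives fragments that start or stop mid-word, so it has no way to shape intonation across the whole sentence. Sentence-sized chunks give it the full phrase to work with.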
Will V1 keep getting updates?
V1 will continue to be available for download — there are people on slow connections or older hardware who genuinely need the smaller install. But active development has moved to V2. Bug fixes for V1 will continue if anything serious turns up; new features land on V2.
If you've been on V1 for a few months and it's working fine, there's no urgency to switch. V2 is better, but V1 is not going away.
Real-world: switching from V1 to V2
If you decide to migrate:
- Install V2 — it goes to a separate folder, V1 keeps working.
- Voice profiles don't auto-migrate. Re-record your voice samples in V2 (it takes 30 seconds per voice) — V2's clone quality benefits from a fresh recording with the V2 settings.
- If you have generated audio files from V1 you want to edit in V2's editor, just open them — the editor is format-agnostic.
- Once you've used V2 for a week, uninstall V1 from Settings > Apps if you want the disk space back. Voice profiles in V1's %APPDATA% folder will be left behind unless you tick "Delete data" during uninstall.
Verify your download
Both V1 and V2 publish their SHA-256 hash and a VirusTotal scan link on their detail pages. After downloading, run Get-FileHash <path> -Algorithm SHA256 in PowerShell and compare to the hash on the page. If they match, you have an unmodified release.
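If you would rather not compare 64 hex characters by eye, the same check can be sketched in Python. This assumes Python 3 is installed. The function names are hypothetical, and the script expects you to pass in the installer path and the hash copied from the detail page. The comparison ignores case because Get-FileHash prints uppercase hex while many sites publish lowercase.

```python
import hashlib
import sys


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the lowercase hex SHA-256 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published: str) -> bool:
    """Compare hashes, ignoring case and surrounding whitespace."""
    return sha256_of_file(path) == published.strip().lower()


if __name__ == "__main__":
    if len(sys.argv) == 3:
        installer_path, published_hash = sys.argv[1], sys.argv[2]
        ok = matches_published_hash(installer_path, published_hash)
        print("Hash matches" if ok else "Hash does NOT match, do not install")
    else:
        print("usage: python verify_download.py <installer path> <published sha256>")
```

Reading the file in chunks keeps memory use flat, which matters for the ~2 GB V2 installer.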