Vocal Synthesis Tool UTAU (歌声合成ツール UTAU) (commonly shortened to UTAU) is a voice synthesizer program currently available for Windows and Mac OS X systems (the Mac version being named UTAU-Synth).
Unlike VOCALOID, VOCALOID2, and VOCALOID3, which are commercially sold programs with an accompanying voicebank, UTAU is a shareware vocal synthesizer. Distribution of UTAU began in March 2008.
UTAU, meaning "to sing" in Japanese, has its origin in "Jinriki VOCALOID" (人力ボーカロイド, translating to "Manual Vocaloid"), in which a song was created by re-editing an existing singing voice, extracting tones as WAV files, and reassembling them. A support program was created for this purpose. In March 2008, Ameya/Ayame (飴屋／菖蒲) released a free, advanced support tool as UTAU.
The program comes with a voicebank of 142 samples of Japanese syllables generated from the default voice of AquesTone's text-to-speech software "AquesTalk". Any user can load their own voicebank into UTAU. However, using a person's voice without the explicit permission of the voice donor is a violation of copyright laws. Those laws protect the rights of vocalists, such as celebrities, who may not wish for their voices to be used within the program. Any music made through this program can be used in the commercial sector. UTAU can be downloaded for free from the home page. It will not run properly on computers which do not support Japanese text or AppLocale.
Some UTAU voicebanks have been presented as "real" VOCALOIDs, as in the April Fools' joke that gave rise to Teto Kasane. Songs using both UTAU and VOCALOID are also not unheard of. Some users have also begun to enforce their copyright over their voicebanks; creators of UTAU or fanmade VOCALOIDs who plagiarize an UTAU's name or use a voicebank without permission risk violating UTAU software agreements and voicebank copyright ownership.
Usage in Music
UTAU is well supported as an alternative to VOCALOID and is favoured in both the VOCALOID and UTAU fandoms as an alternative to pirating the VOCALOID software itself. The two programs work on the same principle and share many traits and abilities. For those unsure whether they can handle VOCALOID, UTAU can also act as an introduction to synthesized vocals and help in deciding whether to purchase a VOCALOID.
UTAU owes its popularity to some major differences between it and the VOCALOID software (listed below in the "Strengths" and "Weaknesses" sections). Because of these differences, there is some debate as to whether this software is overall better or worse than the VOCALOID software. It is able to compete with VOCALOID because there is a sizable gap between the areas the two programs cover. For the same reasons, UTAU has earned a reputation as the closest rival software to VOCALOID, and it has managed to stay competitive over the course of its existence, whereas other rival software such as Cantor failed to see continued development.
✔ Strengths
UTAU saves data in the .UST (UTAU Sequence Text) format and is capable of converting .VSQ files to .UST. Since few software packages besides VOCALOID itself can read the .VSQ file format, UTAU has been an attractive alternative and partner software to VOCALOID.
UTAU also has the advantage of a faster pace of development. It has plug-in support, and users have made a number of plug-ins that greatly improve the software's handling and experience. This support was established fairly early in the software's existence, whereas VOCALOID did not gain this ability until VOCALOID3 in late 2011, and even now it offers only limited access to source code and plug-in support. The plug-ins for UTAU can therefore prove invaluable to users, as they can greatly affect the software's results and quality.
Triphone ("VCV"; vowel-consonant-vowel) voicebanks were created by 2010, whereas VOCALOID did not gain this capability until 2011, when VOCALOID3 was released. Even in comparison to VOCALOID3, the number of languages offered is much larger, with some vocals able to handle more than 10 languages. For VOCALOID, there are very few VOCALOIDs with bilingual capabilities, and the software offers only 5 languages at most. Voicebanks work with practically any version of the software, so issues seen between different versions of the VOCALOID and VOCALOID2 software (such as those displayed by KAITO and Prima) are usually absent.
The UTAU software is open license, which means that vocals from other software can be used with it, so long as this complies with the other software's agreement (VOCALOID vocals cannot legally be used in UTAU for this reason, as their licensing is restricted). There are hundreds of vocals for the software, and they are much broader in type, covering a variety of genres and vocal types. Most of these vocals can be obtained for free. In VOCALOID, one is restricted to the vocals offered for sale, with no chance of producing one's own vocals for the software should none of the current releases spark one's interest.
✘ Weaknesses
UTAU is one of the few programs able to convert VOCALOID data files for its own use. However, .UST files themselves do not hold as much data as the VOCALOID engines' VSQ or VSQX files, and UTAU does not try to convert many things even into a rough equivalent, only placing the notes. As a result, loss of data may occur.
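The effect of a notes-only conversion can be illustrated with a small sketch. Everything here is an assumption made for illustration: SourceNote is a hypothetical stand-in for a note in a richer sequence file and does not reflect the real VSQ layout, and the section names and keys written out (Tempo, Length, Lyric, NoteNum) follow what is commonly seen in .UST files rather than anything stated in this article. It is not UTAU's own converter.

```python
from dataclasses import dataclass


@dataclass
class SourceNote:
    """Hypothetical note from a richer sequence format (not the real VSQ layout)."""
    length_ticks: int
    lyric: str
    midi_pitch: int
    vibrato_depth: int = 0  # expressive data that a notes-only export drops
    dynamics: int = 64      # likewise dropped


def to_ust_text(notes, tempo=120.0):
    """Write only the note placement into a simplified UST-like text."""
    lines = ["[#SETTING]", f"Tempo={tempo:.2f}"]
    for index, note in enumerate(notes):
        lines.append(f"[#{index:04d}]")            # one section per note
        lines.append(f"Length={note.length_ticks}")
        lines.append(f"Lyric={note.lyric}")
        lines.append(f"NoteNum={note.midi_pitch}")
        # vibrato_depth and dynamics are deliberately not written out
    lines.append("[#TRACKEND]")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    song = [
        SourceNote(480, "a", 60, vibrato_depth=30, dynamics=90),
        SourceNote(240, "ka", 62),
    ]
    print(to_ust_text(song))
```

In this sketch the vibrato and dynamics values never reach the output, which is the kind of data loss described above: the notes land in the right place, but the expression has to be redone by hand.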
As for the engine itself, there is a level of uncertainty in how to grade the results of the software. Because UTAU is simply an interface, many engine plug-ins ("resamplers") have been created, all with different results, and the quality of UTAU's output therefore varies widely.
UTAU is not professional software, while VOCALOID is produced as a professional software package. For this reason, it does not produce the same overall quality of results as VOCALOID. This brings an additional drawback: VOCALOID gives professional singers a much safer means of releasing their vocals, since the singers get something out of each sale and there is a definite structure for using the vocals with and without the singers' consent. In contrast, UTAU vocals may not offer any form of commercial distribution security, so there is less chance of a professional singer considering offering their vocals to the engine. As a result, it can at times be difficult to find a standard level of quality within the vocals offered.
A large part of the vocals offered for the engine are of poor quality in comparison to the standard of vocals offered by VOCALOID. Users creating vocals for the engine may not take full advantage of the tools UTAU offers.
UTAU was created for Japanese vocal synthesis and a large majority of the fanbase is in Japan, so quality non-Japanese vocals are often harder to find, particularly for more complex languages such as English. Additionally, UTAU does not officially support a way to handle final consonants, which are featured in many languages, such as English, Korean, and Chinese.
The standard of practice within the UTAU community also varies widely. Technical support may or may not be offered, and most support is found within the UTAU community itself. Not all vocals are offered freely; some have to be paid for in order to be used, although this is very uncommon. Unfinished vocals may never be completed, or may be abandoned altogether. Some vocals are also not recorded with high-quality microphones or configured properly. This is one of the reasons why users will at times fall back on the few reliable vocals, as these are considered the "safest" to work with.
- UTAU offered the ability to create a voice, with the advantage of legally owning the voicebank and controlling how it was passed around. UTAU led to a decrease in fanmade VOCALOIDs in Japan because their creators did not have this advantage.
- The best-known voicebank for UTAU is Kasane Teto. She is recognized as the first UTAUloid.