Cross-Synthesis (クロスシンセシス), abbreviated XSY, is a parameter that allows one voicebank to blend gradually and progressively into another. For example, a "natural" voicebank can be blended to sound like a "power" voicebank. It is only available in the full version of the software and is not accessible in the "lite" version.
How to use
To use this feature, users must open the Cross-Synthesis Web browser through the Singer Editor, where they can assign a Primary Voice and a Secondary Voice to be used in the Cross-Synthesis. The feature also works with VOCALOID3 voicebanks imported into the VOCALOID4 engine; however, this is limited to a few voicebanks.
A chart of the voicebank packages that support this feature is available on Yamaha's official website.
Cross-Synthesis morphs the Primary Voice, using the Secondary Voice as an analytical data guide for adapting the first. It does not simply switch between two voices as the user draws a parameter curve; instead, it increases the ratio of the Secondary Voice that is combined with the Primary Voice, which makes it more complex than a simple fade-out. Because the Primary Voice is used as the "template" for the morphing, the two voices in a Cross-Synthesis pairing are not interchangeable: the rendering and behavior will differ slightly depending on which voicebank is the primary voice and which is the secondary.
- Example: A Power-Normal setting will not be equal to Normal-Power.
Users can also choose how much the second voice impacts the first, from subtle to extreme. The way XSY works is very similar in concept to how the original VOCALOID engine used recorded analytic data to adapt the engine's noise to sound like the vocalist behind the data.
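The order dependence described above can be illustrated with a toy model. This is only a sketch under stated assumptions and is not Yamaha's actual algorithm, which is not documented here: the `ToyVoice` class, its `timing` and `brightness` fields, the `toy_xsy` function and the 0.0 to 1.0 `amount` scale are all hypothetical names and values chosen for illustration. The model assumes that some traits stay with the primary "template" while others are pulled toward the secondary voice by the chosen amount.

```python
from dataclasses import dataclass


@dataclass
class ToyVoice:
    """Hypothetical stand-in for a voicebank; both traits are invented."""
    name: str
    timing: float      # a trait assumed to stay with the primary "template"
    brightness: float  # a trait assumed to be morphed toward the secondary


def toy_xsy(primary: ToyVoice, secondary: ToyVoice, amount: float) -> ToyVoice:
    """Morph `primary` toward `secondary`.

    `amount` is an assumed scale: 0.0 leaves the primary untouched,
    1.0 applies the secondary's morphed trait in full.
    """
    if not 0.0 <= amount <= 1.0:
        raise ValueError("amount must be between 0.0 and 1.0")
    # Only the morphed trait is interpolated; the template trait is kept.
    brightness = (1.0 - amount) * primary.brightness + amount * secondary.brightness
    return ToyVoice(f"{primary.name}-{secondary.name}", primary.timing, brightness)


if __name__ == "__main__":
    power = ToyVoice("Power", timing=1.0, brightness=0.75)
    normal = ToyVoice("Normal", timing=0.5, brightness=0.25)
    print(toy_xsy(power, normal, 0.5))
    print(toy_xsy(normal, power, 0.5))
```

In this toy model the two half-way mixes share the same blended trait but keep different template traits, so Power-Normal and Normal-Power remain distinct results.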
Note that some packages, such as the Megpoid V4 vocals, may come with their respective pairs already set up for XSY within the feature.
Benefits of XSY?
When used effectively, XSY expands the capabilities of the Vocaloid package in use, increasing its overall abilities beyond what any single vocal within the package offers.
As the Megpoid V4 package demonstrates, the potential created by XSY is large. When vocals are mixed with more extreme settings, the combination of the two creates an effect that mimics having an entirely different voicebank. XSY mixes in traits of the other vocal to create an entirely new vocal sound that didn't exist before. By mixing a "whisper" type vocal with a "power" type, one can essentially achieve a "power-whisper" result. In effect, a Vocaloid with just two vocals can achieve the equivalent of a "third voicebank" and a "fourth voicebank" via the XSY function.
While a single XSY result can be used for the entirety of a song, switching between multiple XSY vocals and tracks within a song can achieve more realistic results. The vocal can go from a normal "whisper" vocal to singing a high ballad passage that expresses joy through a "power-whisper" mix. In the same song, the whisper can also be mixed with a "dark" tone to create a "dark-whisper", producing a sad tone in the process. In the case of the five Megpoid V4 pairs, XSY was intended to let the user ease from one vocal to another, allowing smoother switching between the two vocals in each intended pairing.
Either way, Vocaloids with access to XSY and even just one additional vocal library will see their overall tonal capabilities doubled. This makes them a much more attractive package than Vocaloids that offer only a single vocal.
Voicebank XSY chart variants
The number of potential vocal variations that a given number of voicebanks creates:
- 2 voicebanks; produces 2 variations with XSY, 4 variations altogether including the original vocals
- 3 voicebanks; produces 6 variations with XSY, 9 variations including the original vocals
- 4 voicebanks; produces 12 variations with XSY, 16 variations including the original vocals
- 5 voicebanks; produces 20 variations with XSY, 25 variations including the original vocals
- 6 voicebanks; produces 30 variations with XSY, 36 variations including the original vocals
- 7 voicebanks; produces 42 variations with XSY, 49 variations including the original vocals
- 8 voicebanks; produces 56 variations with XSY, 64 variations including the original vocals
- 9 voicebanks; produces 72 variations with XSY, 81 variations including the original vocals
- 10 voicebanks; produces 90 variations with XSY, 100 variations including the original vocals
- 11 voicebanks; produces 110 variations with XSY, 121 variations including the original vocals
- 12 voicebanks; produces 132 variations with XSY, 144 variations including the original vocals
- 13 voicebanks; produces 156 variations with XSY, 169 variations including the original vocals
- 14 voicebanks; produces 182 variations with XSY, 196 variations including the original vocals
- 15 voicebanks; produces 210 variations with XSY, 225 variations including the original vocals
- It is worth noting that, as XSY is a controllable variable, it is possible to create further variations than this chart displays by mixing more of one vocal against the other. However, how much can be gained from two vocals depends entirely on the two individual vocals involved.
- This chart is based on an "ideal setting" wherein the traits of both vocals are allowed to come through enough to mimic the effect of having an entirely new voicebank. For simplicity's sake, it does not factor in that more than one variation can be created from each pairing.
- Users will at times have to discover the right mix of the two vocals, as not all vocals have major differences between them when used for XSY.
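The figures in the chart follow a simple pattern: because the primary and secondary roles are not interchangeable, n voicebanks give n × (n − 1) ordered XSY pairings, and adding the n unmixed originals gives n × n in total. A minimal sketch of that arithmetic follows; the function names `xsy_pairs` and `xsy_counts` and the three voicebank names are hypothetical, chosen only for illustration.

```python
from itertools import permutations


def xsy_pairs(voicebanks):
    """List every (primary, secondary) pairing; order matters."""
    return list(permutations(voicebanks, 2))


def xsy_counts(n):
    """Return (XSY variations, variations including the originals)."""
    mixed = n * (n - 1)      # ordered pairs of distinct voicebanks
    return mixed, mixed + n  # adding the n unmixed originals gives n * n


if __name__ == "__main__":
    banks = ["Power", "Whisper", "Sweet"]  # hypothetical package contents
    for primary, secondary in xsy_pairs(banks):
        print(f"{primary}-{secondary}")
    for n in range(2, 16):
        mixed, total = xsy_counts(n)
        print(f"{n} voicebanks: {mixed} with XSY, {total} including the originals")
```

With the three example names, Power-Whisper and Whisper-Power are listed as separate pairings, which is why 3 voicebanks yield 6 XSY variations in the chart rather than 3.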
XSY is far from perfect. As with the "GWL" function, this feature may prove very limited for vocals that are too similar. A common complaint about the Megurine Luka V4X English vocals "Straight" and "Soft" is how little use XSY is for the two, as it barely impacts either vocal.
Users should be aware that XSY can produce unpredictable or unexpected results. This is particularly true when the two vocals have great differences between them, or when the vocals involved were created without XSY in mind because the function wasn't in the software at the time. This occurred with many of the VOCALOID3 voicebanks that gained XSY when they were imported into VOCALOID4.
- Example: In the Megpoid V4 package, the respective pairs of vocals (Native and NativeFat, for example) offer a moderate XSY result, while XSY between a vocal and any of the other 8 vocals outside its pair offers more unpredictable results (Power and Sweet, for example). By comparison, XSY is not as effective with the V3 Megpoid vocals, as those voices were never intended to be used with the function. As a result, the V3 Megpoid package is considered lower in quality for XSY than V4 and produces more unsatisfactory results.
Some common issues include:
- Altered optimum range, either increased or decreased depending on the combination
- Creation of new noises or enhancement of known bad ones (such as mild "popping", "crunching/plucking" or "static"), resulting in a loss of quality compared to the two original vocals or in otherwise undesirable results.
- Tones that change unexpectedly as the vocal moves up and down the octaves (particularly with vocals that have large vocal ranges and/or large variation between the two vocals), making them unstable and even prone to tonal collapse.
- Vocals producing croaky results, a bug which does not naturally affect all Vocaloids (see BIG AL for an example of a Vocaloid with a known natural croakiness).
Why do bugs appear?
These results are not necessarily found in either vocal on its own, and they vary from subtle to extreme depending on the strength of the variables and the settings involved with XSY.
The reason for these bad results comes down to how XSY itself works. XSY uses mathematical equations to work out the differences between the primary and secondary vocals and alters the wavelength of the primary vocal accordingly. The calculations don't work particularly well when the voicebanks have libraries built entirely differently from each other, as XSY was not designed to handle this, and it can be impossible to get good quality results.
It can even result in a loss of clarity, because the traits of one voice may alter the other and the maths involved may average out the result. In other words, the resulting vocal may not commit strongly enough to any single trait of either vocal. Some vocals gain their clarity from one or more traits they rely upon, such as a strong attack, powerful/clear tones or high quality recorded samples. For example, "soft" type vocals have looser pronunciation, so mixing with such a vocal can cause the sounds of the primary vocal to loosen to mimic the traits of the soft vocal. This can make the resulting vocal appear to mumble more.
In the opposite direction, a flaw may be exaggerated, particularly if both vocals share the same flaw. In the worst case scenarios, the secondary vocal adds some of its own flaws onto the primary. Without other traits to lessen the impact of the primary vocal's natural flaws, and with the possibility of additional flaws being added, its glitches and bugs become more noticeable.
This is why XSY between VOCALOID3 and VOCALOID4 voicebanks is not always feasible, nor is XSY between languages or between multiple Vocaloids. This is also why features such as E.V.E.C. may cause issues with XSY.
Due to the licensing agreements between Vocaloids and studios, XSY is generally not available between different Vocaloids or between Vocaloids from different studios.
Note that AH-Software has since opened up XSY between all of its vocals.
- See Controversy Concerns for details on XSY Mods.
- "VOCALOID 4 新機能 「クロスシンセシス」 - Cross Synthesis - "