Cross Synthesis (クロスシンセシス), often shortened to "XSY", is a parameter that allows a voicebank to blend gradually and progressively into another voicebank. For example, a "natural" voicebank can be blended to sound like a "power" voicebank. It is only available in the full version of the software and is not accessible in the "lite" version.
How to use
To make use of this, users must access the Cross-Synthesis Web browser through the Singer Editor, where they can assign a Primary Voice and a Secondary Voice to be used in the Cross-Synthesis. This feature also works with VOCALOID3 voicebanks imported into the VOCALOID4 engine; however, this is limited to a few voicebanks.
A chart of the voicebank packages that support this feature is available on Yamaha's official website.
Cross-Synthesis morphs the Primary Voice, using the Secondary Voice as an analytical data guide for adapting the first. Cross-Synthesis does not simply switch between two voices as the user draws a parameter curve; instead, it increases the ratio of the Secondary Voice being combined with the Primary Voice, which is more complex than a simple fade-out. Because the Primary Voice is used as the "template" for the morphing, the vocal combination used in Cross-Synthesis is not interchangeable: the rendering and behavior will be slightly different depending on which voicebank is used as the primary and which as the secondary voice.
- Example: A Power-Normal setting will not be equal to Normal-Power.
Users can also choose how much the second voice impacts the first, from subtle to extreme. The way XSY works is very similar in concept to how the original VOCALOID engine used recorded analytic data to adapt the Vocaloid engine noise to sound like the vocalist behind the data.
Note that some packages, such as the Megpoid V4 vocals, may come with their respective pairs already set up for XSY within the feature.
Benefits of XSY?
When used effectively, XSY expands the capabilities of the Vocaloid package in use, increasing its overall abilities beyond what any single vocal within the package offers.
As demonstrated by the Megpoid V4 package, the potential created by XSY is large. When vocals are mixed at more extreme settings, the combination of the two vocals creates an effect that mimics having an entirely different voicebank. XSY mixes in traits of the other vocal to create an entirely new vocal sound that didn't exist before. By mixing a "whisper" type vocal with a "power" type, one can essentially achieve a "power-whisper" result. In effect, a Vocaloid with just two vocals can achieve the equivalent of a "third voicebank" and a "fourth voicebank" via the XSY function.
While a single XSY result can be used for the entirety of a song, switching between multiple XSY vocals and tracks within a song can achieve more realistic results. The vocal can go from a normal "whisper" vocal to singing a high ballad, expressing a joyful happiness achievable with a "power-whisper" mix. In the same song, the whisper vocal can also be mixed with a "dark" tone to create a "dark-whisper", producing a sad tone in the process. The 5 pairs in Megpoid V4 were intended to use XSY to let the user ease from one vocal to another, allowing smoother switching between the two vocals in each intended pairing.
Either way, Vocaloids with access to even just 1 additional vocal expansion library and XSY will see their overall tone capabilities doubled. This makes them a much more attractive package than Vocaloids that offer just a single vocal.
From Ver.4.3.0 of the VOCALOID4 engine, "groups" were added, which allowed certain vocals to XSY with each other that had never been able to do so before. For the first time, vocals could XSY between characters.
A benefit of this expanded function is that vocals can now exist in a producer's options purely for utility purposes alongside the desired vocals. For example, Megpoid V4's 10 voicebanks can be switched around to act as "tone control", adding a slightly different tone to another vocal depending on how the user wants their primary vocal to change. This is possible because all 10 have the same vocal range and tempo. Others, such as Kokone, can be used to bring a vocal to a more "falsetto" tone of voice. Then there are vocals like Macne Nana, which can give a vocal support to lean on during faster songs thanks to her stability at high tempos.
Voicebank XSY variants calculations
When buying voicebanks with the intention of using XSY, it is rarely noted how many potential tonal variations the package will produce. The number of potential vocals is usually not difficult to calculate; the calculations are given below for users who do not know how to work it out themselves.
Note that "α" represents the number of vocals available to XSY with one another, including the intended vocal itself. Each vocal is counted in its usage both as a primary and as a secondary vocal.
Equivalent number of additional vocals potentially created:
- α x (α - 1)
The theoretical total number of vocals is simply the previous result plus the original number of vocals, so the formula is:
- α x (α - 1) + α, which simplifies to α x α
For example, the 10 voicebanks of the Megpoid V4 releases work out as:
- 10 x 9 = 90 additional possible results
- 10 x 9 + 10 = 100 total theoretical results
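The counting above can be sketched in Python. This is only an illustrative sketch: the function name `xsy_counts` and the vocal names in the list are made up for the example and are not part of the VOCALOID software. It assumes the "ideal setting" used on this page, where every vocal can XSY with every other vocal and each ordered primary/secondary pairing counts as one result.

```python
from itertools import permutations


def xsy_counts(alpha: int) -> tuple[int, int]:
    """Return (additional, total) results for alpha XSY-compatible vocals."""
    if alpha < 1:
        raise ValueError("alpha must be at least 1")
    additional = alpha * (alpha - 1)  # ordered primary/secondary pairings
    total = additional + alpha        # pairings plus the original vocals
    return additional, total


# Hypothetical vocal names, for illustration only.
vocals = ["Normal", "Power", "Whisper"]

# Order matters: ("Power", "Normal") and ("Normal", "Power") are both listed,
# matching the note that Power-Normal is not equal to Normal-Power.
pairs = list(permutations(vocals, 2))

print(len(pairs))      # 6
print(xsy_counts(10))  # (90, 100)
```

Enumerating ordered pairs with `itertools.permutations` gives the same count as α x (α - 1), which is a quick way to check the formula for small packages.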
The following examples detail the results up to 15:
- 2 voicebanks; produces 2 variations with XSY, 4 variations altogether including the original vocals
- 3 voicebanks; produces 6 variations with XSY, 9 variations including the original vocals
- 4 voicebanks; produces 12 variations with XSY, 16 variations including the original vocals
- 5 voicebanks; produces 20 variations with XSY, 25 variations including the original vocals
- 6 voicebanks; produces 30 variations with XSY, 36 variations including the original vocals
- 7 voicebanks; produces 42 variations with XSY, 49 variations including the original vocals
- 8 voicebanks; produces 56 variations with XSY, 64 variations including the original vocals
- 9 voicebanks; produces 72 variations with XSY, 81 variations including the original vocals
- 10 voicebanks; produces 90 variations with XSY, 100 variations including the original vocals
- 11 voicebanks; produces 110 variations with XSY, 121 variations including the original vocals
- 12 voicebanks; produces 132 variations with XSY, 144 variations including the original vocals
- 13 voicebanks; produces 156 variations with XSY, 169 variations including the original vocals
- 14 voicebanks; produces 182 variations with XSY, 196 variations including the original vocals
- 15 voicebanks; produces 210 variations with XSY, 225 variations including the original vocals
- It is worth noting that, as XSY is a controllable variable, it is possible to create further variations than this chart displays by mixing more of one vocal against the other. For example, you may get a different result at 25% as opposed to 50% or 75% influence from the secondary vocal.
- However, how much you can get out of two vocals depends entirely on the two individual vocals involved. For example, two vocals that are relatively close in tone are unlikely to give much more than a single tonal result.
- These calculations are based on an "ideal setting" wherein the traits of both vocals are allowed to come through enough to mimic the effect of having an entirely new voicebank. For simplicity's sake, they do not factor in that more than 1 variation can be created per pairing.
- It is sometimes difficult to pinpoint the exact setting that gives the biggest difference, and users can only find it by trial and error.
- Note that the Vocaloid Wiki's calculation does not take into account that some XSY-compatible vocals are simple updates that add GWL. Some XSY combinations between older and newer versions of a software may therefore show little response, even though the Vocaloid Wiki counts them as an XSY combination. However, because GWL can change a vocal's tone when combined with other GWL-compatible vocals, they are counted as "different" XSY combinations despite being the same vocal.
As mentioned elsewhere on this page, Ver.4.3.0 of the VOCALOID4 engine added "XSY group" assignments, extending XSY beyond its previous capabilities. The XSY groups allow the vocals within them to XSY regardless of character, whereas XSY had previously been restricted to vocals of a single character. This opened the door to very different vocal results from very different Vocaloids, with many even capable of producing a result that sounds like neither of the Vocaloids used in the process.
The same rules for XSY apply to this new group system, so nothing has changed except the potential number of vocals that can now be used with XSY. However, some groups add no new vocals for XSY, such as the "Fukase" release, which only contains his current vocals.
This group contains the vocals produced by the company Internet Co., Ltd. The original Galaco vocal and Rana's vocal are not included: although Internet staff members worked on these vocals, they were not released as "Internet" vocals but are instead part of the "Yamaha" release line-up.
This is by far the largest of the groups and has the most potential for complex experimentation with XSY within the Vocaloid engine. Note that none of the VOCALOID2 vocals are part of this group, as XSY was never opened up for them.
This is the second largest group within the Vocaloid engine.
Note that the original vocal release of Macne Nana is not part of this group, as it was released in the Yamaha line-up. Note also that none of the VOCALOID2 vocals are part of this group, as XSY was never opened up for them.
The "Miku V4X" group contains all the vocals released for the character "Hatsune Miku" in Japanese.
This is currently the third largest group within the Vocaloid engine. Note that her original release is not part of this group as it is for the VOCALOID2 engine.
Miku V4 English
The "Miku V4 English" group is very similar to the "Miku V4X" group, but contains only Hatsune Miku's English vocals.
Luka V4 Eng
This is the same as the "Luka V4X" group, but contains only Luka's English vocals. Her original release is not part of this group as it is for the VOCALOID2 engine.
The same as the "Rin V4X" group, except it focuses on Kagamine Len instead of Kagamine Rin. His original release is not part of this group as it is for the VOCALOID2 engine.
Contains all of IA's vocals.
Since it is a VOCALOID2 vocal, the original vocal is not within this group.
Contains Galaco's officially released commercial VOCALOID3 vocals.
However, her original vocal given out as a prize is not part of this group.
Contains all of Fukase's vocals.
Contains all of ARSLOID's vocals.
XSY is far from perfect. As with the "GWL" function, this feature may prove very limited for vocals that are too similar. A common complaint about the Megurine Luka V4X English vocals "Straight" and "Soft" is how little use XSY is for the two, as it barely impacts either vocal.
Users should be aware that it can produce unpredictable or unexpected results. This is particularly true when the two vocals used have great differences between them, or when the vocals involved were created without XSY in mind because the function wasn't in the software at the time. This occurred with many of the VOCALOID3 voicebanks that gained XSY when they were imported into VOCALOID4.
- Example: In the Megpoid V4 package, the respective pairs of vocals (Native and NativeFat, for example) offer a moderate XSY result, while XSY with any of the other 8 vocals (Power and Sweet, for example) offers more unpredictable results. By comparison, the V3 Megpoid vocals' use of XSY is not as effective, as the voices were never set up for use with the function. As a result, compared to V4, the V3 Megpoid package is considered lower in quality and produces more unsatisfactory results.
Some common issues include:
- Altered optimum range, either increased or decreased depending on the combination
- Creation of new noises or the enhancement of known bad ones (such as mild "popping", "crunching/plucking" or "static"), leading to undesirable results or a loss of quality compared to the original two vocals.
- Tones that change unexpectedly as the vocal goes up and down the octaves (particularly with vocals that have large vocal ranges and/or large variation between the two vocals), which makes them unstable and can even lead to tonal collapse.
- Croaky results, a bug that does not naturally affect all Vocaloids (see BIG AL for an example of a Vocaloid with a known natural croakiness).
Why do Bugs appear?
These results are not necessarily found in either vocal, and they vary from subtle to extreme depending on the variables' strength and the settings involved with XSY. In addition, while the original two vocals may be HQ, the XSY result may turn out to be LQ, or MQ at best.
The reason for these bad results comes down to how XSY itself works. XSY uses mathematical equations to work out the differences between the primary and secondary vocals and alters the wavelength of the primary vocal accordingly. The calculations don't work particularly well when the voicebanks have libraries built entirely differently from each other, as XSY was not designed to handle this, and it can be impossible to get good-quality results.
This is why XSY between VOCALOID3 and VOCALOID4 vocals is not always feasible, nor is XSY between languages or between multiple Vocaloids. This is also why features such as E.V.E.C. may cause issues with XSY.
In many cases, these issues can be fixed with further editing. However, the editing may need to be a lot more frequent or extreme than a non-XSY result would require.
Loss of Clarity;
XSY can result in a loss of clarity because the traits of one voice may alter the other significantly. In other words, the resulting vocal may not retain enough of the trait that originally made the vocal clear.
Some vocals gain their clarity from one or more traits they rely upon, such as a strong attack, powerful/clear tones or high-quality recorded samples, all of which are common to "power" types. Mixing such vocals with very different traits can remove the very reason for their clarity.
"Soft" vocals have been known to impact clarity due to their looser pronunciation. Mixing with such a vocal can cause the sounds of the primary vocal to loosen to mimic the traits of the soft vocal, which can make the primary vocal's results appear to mumble more.
Exaggeration of a flaw;
In the other direction, the exaggeration of a flaw may occur, particularly if both vocals have the same flaw within their data library. While two vocals with similar traits often give higher-quality results than ones without, this does not always hold true.
Often this happens because both vocals were drawn from the same set of (bad) data, or simply because both ended up with a similar or identical glitch (whether the vocals are from the same release or from different releases). In the worst-case scenario, the secondary vocal adds parts of its own flaws onto the primary.
When only 1 vocal contains the flaw, the second vocal has been known to mask the issue (appearing to "fix" it, although it does not actually do so) by lessening the effect of the flaw, since XSY uses the traits of the other vocal in the result. When a flaw instead becomes more noticeable, the "good" traits are lessened by the differences between the two vocals, while the XSY calculations have less room to work with the "flawed" results because both vocals share them more equally. The flaw is therefore kept in the process.
Glitches where there were none;
Sometimes it is simply a case of a flaw in the additional vocal being so strong that the other vocal cannot cope with its impact, so XSY copies the additional vocal's flaw into the result. The Vocaloid then produces a flaw that it would not produce without XSY.
Due to the licensing agreements between Vocaloids and studios, Vocaloids are generally not open for XSY with other Vocaloids or with Vocaloids from other studios.
- See Controversy Concerns for details on XSY Mods.
|VOCALOID 4 新機能 「クロスシンセシス」 - Cross Synthesis -||YouTube|
|Hiyama Kiyoteru Rock + Nekomura Iroha Soft||SoundCloud|
|Yuzuki Yukari Jun + Tohoku Zunko Natural||SoundCloud|
|Otomachi Una V4 Sugar + Megpoid V4 Sweet||SoundCloud|
|Otomachi Una V4 Spicy + Megpoid V4 Power||SoundCloud|
|Megpoid V4 Adult + Lily V3||SoundCloud|
|Gackpoid V4 Native + Gachapoid V3||SoundCloud|
|Tohoku Zunko Natural + Hiyama Kiyoteru Natural||SoundCloud|
|Tohoku Zunko Natural + miki Natural||SoundCloud|
|Tohoku Zunko Natural + Yuzuki Yukari Jun||SoundCloud|
|Macne Nana Natural + Yuzuki Yukari Lin||SoundCloud|
|Macne Nana Natural + Tohoku Zunko Natural||SoundCloud|
|Macne Nana Petit + Nekomura Iroha Soft||SoundCloud|