Japanese Phonetics


The following is a tutorial made for VOCALOID fans by fellow VOCALOID fans.


Japanese VOCALOIDs are VOCALOIDs that can reproduce the Japanese language much more easily than VOCALOIDs built for other languages. The following sections list the phonemes needed to make a VOCALOID sing in Japanese.

About

The origin of the language is mostly unknown, including when it first appeared in Japan. Its main influences are Chinese and Old Japanese. Recent decades have seen many European influences on the language, with many English loanwords in particular being adopted into the Japanese phonetic system. However, the overall lack of influence from other languages, in addition to Japan's isolation from the rest of the world, has contributed much to the precision of the Japanese phonetic system.

For many centuries, far fewer new sounds entered the language than entered languages such as English, which were heavily influenced by other languages.

VOCALOID and the Japanese Language

This lack of outside influence and the isolation of the language have worked in its favour within VOCALOID. Japanese voicebanks are some of the simplest to make for VOCALOID, as the sounds simply have to be correct, and errors in the Japanese output are usually minimal, being the result of general bugs, glitches and the odd mispronunciation. The language is fairly straightforward to produce, as most sounds are more definite than in languages such as English.

In addition, sounds such as diphones and triphones are often obtained from the recording data used for the language's main sounds. As a result, when triphones were introduced in VOCALOID3, old voicebanks had already recorded the necessary triphones. This made updating voicebanks from VOCALOID2 to VOCALOID3 or VOCALOID4 fairly easy. Development is also significantly faster because fewer sounds are required. Japanese is one of the cheapest and fastest languages to produce for VOCALOID overall; even for multi-voicebank releases, the voice provider spends no more than approximately a week in the studio.

One downside to the language is that fewer overall traits can be captured during recording with certain recording techniques. This leads to Japanese VOCALOIDs at times having very little variation in how they sound. This especially impacted VOCALOID3, as this engine introduced the largest number of new Japanese voicebanks and common "vocal types" began to form, especially among the female voicebanks. Internet Co., Ltd. introduced a new recording technique for VOCALOID4 that allows more traits to be captured. As a result, VOCALOID4 and later Japanese voicebanks often do not behave the same even when they sound similar, and they have far more quirks and characteristics than pre-VOCALOID4 ones.^[1]

Another issue is that when a Japanese voicebank says a certain word poorly, there is no workaround of the kind available in voicebanks for languages with larger phoneme sets. In addition, in contrast to languages such as English or Spanish, there is less overall variation of sounds, which means the manner in which a voicebank says a certain sound will often be identical every single time. There is simply no need for many variants of sounds such as "ma", "do", etc., so far fewer diphones and triphones are needed overall, with some sounds being largely the same whether they are at the beginning, middle or end of a word.

While this means Japanese voicebanks are often among the most consistent and stable, they can suffer from the repetition of their limited number of sounds. They are often easier to edit, fix and adjust, because the limited variation makes each voicebank much more consistent and predictable. Once a producer learns something, such as the weakness of the sound "ma" in Hatsune Miku's VOCALOID2 vocal, they can easily pick up on that weakness every single time. In other languages, though it may be common for "ma" to be a weakness, it may not be a weakness in every instance, making those voicebanks much less predictable and consistent.

This also makes them fairly simple to tune. It is often possible to use the same VSQ, VSQX or VPR file with any Japanese VOCALOID, sometimes with only slight tweaks. However, that does not mean a user can skip checking the results for weaknesses in the voicebank's vocal performance and pronunciation; skipping that check is never advisable.

Japanese Scripts

Japanese VOCALOIDs can use a standard YAMAHA script.

The script has been adjusted considerably since the early VOCALOID days, with adjustments even being made during the production of the Hatsune Miku vocal in VOCALOID2 to improve the sounds needed for Japanese.^[2] Thus, there was a slight leap in quality between VOCALOID and VOCALOID2 even without taking the engine itself into account. With the addition of triphones in VOCALOID3, Japanese VOCALOIDs also became much smoother than VOCALOID2 vocals.

Notes on Accents

Despite the general belief that singers completely lose their accents when they sing, this is not the case in every instance, and an accent can still be heard even in singing vocals.

However, many are led to believe this because there are several methods of training singers to disguise or otherwise hide their natural accents; they may even adopt an accent that isn't their own for singing. Examples include genres such as western or country, and Black music genres such as jazz or soul. Singing also uses different muscles from speech, resulting in differences in air pressure and in the way the throat moves. Genres such as opera are the most likely to make an accent appear almost entirely absent, thanks to the impact of the opera vibrato.^[3]^[4]

VOCALOID can capture any form of accent quite easily at times. It depends on the recording method used with the voice provider, the type of sound being recorded per sample (accent impact varies per sample and language), and the overall number of samples that make up the voicebank (the more samples, the greater the chance of an accent slipping in).

For Japanese vocals, accents can appear in a voicebank, but they have very little impact on how it sounds and tend to affect non-native Japanese voicebanks rather than native Japanese voicebanks.

The reason is that Japanese mostly uses pitch accent, and words do not need to be stressed at all. Thus, in the majority of cases, the typical Japanese VSQ/VSQX file will work with little or no adjustment. Any oddities in how a vocal sounds tend to result from a fatal flaw in the voicebank and its samples, or in VOCALOID itself, rather than from an accent. However, it is important to note that an accent can contribute to the common issue with Japanese voicebanks: if a sound is incorrect due to an accent, the language itself can often sound off, and there is rarely an alternative sound that can be used as a replacement.

In general, accents tend to serve only as a contributor to the "traits" of a Japanese voicebank, giving it a little more distinction at times than others, and can be dismissed as such except in extreme cases. As previously mentioned, the limited number of phonetic sounds makes it much harder to record the traits of a voice provider, and the accent may be affected for this reason alone. As such, all of these things can cause the accent to have very little impact on the overall performance of the voicebank compared to the accents captured in voicebanks of other languages.

Native Accented

Non-Native Accented

Misc. Voicebanks

The following are VOCALOIDs who were confirmed to be sampled from multiple voice providers for their Japanese voicebanks.

Luo Tianyi (Voiced by Shan Xin, a Chinese voice actress, and Kano, a Japanese Utaite)^[5]

Phonetic System's Characteristics

There are 41 phonetic pronunciations which make up the Japanese VOCALOID library. These phonetic inputs draw on the estimated 500 total samples needed per pitch to recreate Japanese.^[6]

Due to its moraic nature, the Japanese language has simple phonotactics and syllable structure. For this reason, the Japanese Phonetic System was designed to be encoded as [C V] syllables, and the voicebanks may struggle to pronounce consonant clusters, diphthongs or consonants in coda position.

Vowels

The Japanese Phonetic System includes the 5 vowels of the Japanese Language.

As per the palatalization phenomena found in the Japanese Language, the system is designed so that, when the vowel [i] follows a consonant, that consonant needs to be palatalized for the combination to produce sound. If this isn't the case, the combination will be silent, even if both phonemes are placed on separate notes. The only exceptions are the phonemes [s], [ts] and [dz], which do produce sound when followed by an [i].

It's important to note that some voicebanks may have problems with certain vowel combinations, which can end up sounding choppy. Some techniques exist to help correct this. Generally, this was a more common problem in the first generation of the software, but as more Japanese voicebanks were released, the vowel combination problem became much less apparent. This is due to improvements in recording and processing methods, as well as the companies' growing experience with the synthesis engine. The problem was phased out completely by the third generation.

Consonants

The Japanese Phonetic System includes 36 consonant phonetic pronunciations. Because Japanese has few or no consonant clusters, the system was designed without consideration for standalone consonants. As a result, consonants always need to be accompanied by a vowel. If they are not, the synthesizer won't be capable of reproducing the consonant and will instead generate audio distortion, clicks, electronic buzzing or sound loops.

The exception is the set of nasal consonants associated with the Japanese N (ん), which is the only consonant in the Japanese Language pronounced without a vowel, as this character is considered a mora of its own.
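The basic encoding rule described here can be sketched as a small check. The following Python snippet is only an illustration and is not part of the VOCALOID editor: the function name `is_playable_note` and the space-separated phoneme format are assumptions, and the nasal set is the one listed in the Nasal Consonants section of this article. It does not verify that a consonant symbol actually exists in the system.

```python
# Vowel symbols of the Japanese phonetic system.
VOWELS = {"a", "i", "M", "e", "o"}
# Nasal symbols that may stand alone (the allophones of the Japanese N).
STANDALONE_NASALS = {"n", "J", "m", "m'", "N", "N'", "N\\"}


def is_playable_note(phonemes: str) -> bool:
    """Return True if a note is a lone vowel, a [C V] pair, or a lone nasal.

    `phonemes` is a space-separated string such as "k a" (assumed format).
    """
    symbols = phonemes.split()
    if len(symbols) == 1:
        return symbols[0] in VOWELS or symbols[0] in STANDALONE_NASALS
    if len(symbols) == 2:
        consonant, vowel = symbols
        # Any non-vowel symbol is treated as a consonant in this sketch.
        return consonant not in VOWELS and vowel in VOWELS
    return False
```

For example, `is_playable_note("k a")` is accepted while a lone `"k"` is rejected. The editing techniques described later in this article deliberately go beyond this basic rule.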

Palatalized Consonants

Due to the Yōon and the related palatalization phenomena of the Japanese Language, the system includes two versions of the same consonant phoneme: the standard one and its palatalized version.

Palatalization has two definitions, a phonetic one and a phonological one. In the phonetic sense, it refers to a secondary articulation that adds a small y-like glide sound at the end of the consonant. In the phonological sense, it refers to a kind of sound mutation or assimilation process that changes the sound of a consonant into a more palatal articulation. In the Japanese language, both kinds of phenomena can be found.

In the Japanese Phonetic System, most of the palatalized consonants are differentiated from their standard version by the addition of an apostrophe ('), which is the X-SAMPA equivalent of the IPA's small superscript ‹ʲ›, used to denote the secondary palatal articulation. Example: [t'] for a palatalized [t] (IPA [tʲ]).
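As a rough illustration of the apostrophe convention, the sketch below converts an apostrophe-marked symbol into its IPA spelling. The function name `palatalized_to_ipa` is hypothetical, and the small base mapping only covers the symbols whose IPA letter differs from the phonetic symbol, as read from the Phonetics List in this article. Palatal phonemes that have their own symbol, such as [S] or [J], are outside its scope.

```python
# Base symbols whose IPA letter differs from the phonetic-system symbol
# (taken from the Phonetics List; all other bases are assumed identical).
BASE_IPA = {"N": "ŋ", "p\\": "ɸ", "4": "ɾ"}


def palatalized_to_ipa(symbol: str) -> str:
    """Convert an apostrophe-marked symbol such as "t'" to IPA with ʲ."""
    if not symbol.endswith("'"):
        raise ValueError(f"{symbol!r} is not an apostrophe-marked symbol")
    base = symbol[:-1]
    # Replace the trailing apostrophe with the IPA superscript j.
    return BASE_IPA.get(base, base) + "ʲ"
```

With this mapping, `palatalized_to_ipa("4'")` yields the IPA spelling of the palatalized flap listed in the chart.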

Nasal Consonants

In the Japanese language, the one consonant that is pronounced without a vowel is the N (ん in hiragana, ン in katakana). This character has many assimilation allophones, all of which are nasal consonants. Due to this, all the nasal phonemes ([n], [J], [m], [m'], [N], [N'], [N\]) can be reproduced standalone, without a vowel accompanying them.

Forbidden Combinations

Due to the way the Japanese voicebanks were recorded and the way the VOCALOID editor was made, there are some phoneme combinations that are forbidden or aren't recognized by the synthesizer. If you attempt to enter these combinations, the editor will either produce a broken sounding syllable with the consonant and vowel disconnected, or won't produce sound at all.

Some of these forbidden combinations are:

non-palatalized phoneme + [i] (Exceptions: [s], [ts], [dz])
[w M], [j i] and [h M]: nonexistent in the Japanese Language. The [h M] combination is replaced by [p\ M].
Some palatalized phonemes + a vowel other than [i] (see the chart in the Phonetics List)

Also, there are some consonant phonemes that are restricted to certain vowels. If the combination isn't the correct one, the synthesizer won't produce sound.

[h\]: Restricted to the vowels [e], [o]
[z] and [Z]: Restricted to the vowels [e], [o], and [M]
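The rules listed in this section can be collected into a simple lookup. The sketch below is an assumption-laden illustration, not the editor's actual logic: the function name `check_combination` is hypothetical, and the set of palatalized or palatal consonants is read from the Phonetics List in this article. The third rule (some palatalized phonemes followed by a vowel other than [i]) is left out because the article does not say which phonemes it covers.

```python
# Palatalized and palatal consonants, read from the Phonetics List.
PALATALIZED = {"k'", "g'", "N'", "S", "Z", "t'", "tS", "dZ", "d'",
               "J", "C", "p\\'", "b'", "p'", "m'", "4'"}
# Non-palatalized consonants that may still precede [i].
I_EXCEPTIONS = {"s", "ts", "dz"}
# Combinations listed as nonexistent in Japanese.
FORBIDDEN_PAIRS = {("w", "M"), ("j", "i"), ("h", "M")}
# Consonants that only sound before certain vowels.
RESTRICTED = {"h\\": {"e", "o"}, "z": {"e", "o", "M"}, "Z": {"e", "o", "M"}}


def check_combination(consonant, vowel):
    """Return a reason string if [consonant vowel] is forbidden, else None."""
    if (consonant, vowel) in FORBIDDEN_PAIRS:
        return f"[{consonant} {vowel}] does not exist in Japanese"
    allowed = RESTRICTED.get(consonant)
    if allowed is not None and vowel not in allowed:
        return f"[{consonant}] is restricted to {sorted(allowed)}"
    if (vowel == "i" and consonant not in PALATALIZED
            and consonant not in I_EXCEPTIONS):
        return f"non-palatalized [{consonant}] cannot precede [i]"
    return None
```

A caller could run this over every note before rendering and report the returned reason, for instance flagging [k i] while accepting [k' i].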

Voiceless Phonemes

A new set of phonemes was added with the release of the VOCALOID3 software. These phonemes are unvoiced versions of the vowels and the sonorant consonants (liquids and nasal consonants, including their palatalized versions) found in the Japanese Phonetic System.^[7]

In linguistics, voicelessness is the property of sounds being pronounced without the larynx vibrating. Sometimes sonorants (vowels and sonorant consonants) can become voiceless. When this occurs, you can actually see the person articulate the sonorant, but it's either barely audible or silent altogether.

Example: the Japanese word sukiyaki is pronounced [su̥.ki.ja.ki]. This may sound like [s.ki.ja.ki] to an English speaker, but the lips can be seen compressing for the [u̥]. Something similar happens in English with words like peculiar [pʰə̥ˈkjuːliɚ] and potato [pʰə̥ˈteɪtoʊ].

To use them, the user must add the suffix [◌_0] to the sonorant, which corresponds to the X-SAMPA diacritic for <◌̥>, the IPA diacritic for voiceless phonation.

Example: For a voiceless [o] the user must type [o_0]. For a voiceless [4] the user must type [4_0].

When a Japanese VOCALOID2 voicebank is imported into VOCALOID3, this new set of phonemes is generated from its existing samples.

Sonorant Type	Default	Devoiced
Vowels	[a]; [e]; [i]; [o]; [M]	[a_0]; [e_0]; [i_0]; [o_0]; [M_0]
Nasals	[n]; [J]; [m]; [m']; [N]; [N']; [N\]	[n_0]; [J_0]; [m_0]; [m'_0]; [N_0]; [N'_0]; [N\_0]
Liquids	[4]; [4']	[4_0]; [4'_0]
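The suffix rule in the table above is mechanical, so it can be expressed in a few lines. This is a minimal sketch with a hypothetical helper name, `devoice`; the sonorant set is copied from the table, and anything outside it is rejected because the article only describes devoiced forms for those symbols.

```python
# Sonorants that have a devoiced form, as listed in the table above.
SONORANTS = {
    "a", "e", "i", "o", "M",                    # vowels
    "n", "J", "m", "m'", "N", "N'", "N\\",      # nasals
    "4", "4'",                                  # liquids
}


def devoice(symbol: str) -> str:
    """Return the devoiced spelling of a sonorant by appending "_0"."""
    if symbol not in SONORANTS:
        raise ValueError(f"{symbol!r} has no devoiced form in this table")
    return symbol + "_0"
```

For the sukiyaki example, the first syllable could then be written as `"s " + devoice("M")`, giving `s M_0`.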

Techniques

Fixing choppy vowel combinations

Glides

It is possible to correct the problem of certain choppy vowel combinations with the aid of the phonemes [j], [w], and [h\].

The consonant phonemes [j] and [w] can be used as semivowels or glides for the vowels [i] and [M] respectively, which allows them to fix vowel combinations involving those vowels.
These consonants can either replace their vowel:

Examples:

The first Japanese VOCALOIDs (MEIKO and KAITO) have some problems pronouncing [a i]. This can be fixed by replacing the [i] with a [j]. [a i] → [a j]

or be inserted between the two vowels to join them (don't forget that the combinations [j i] and [w M] are forbidden).

If the combination [M e] sounds choppy, the note can be split in two. [M e] → [M w][e] or [M][w e] (you will most likely have to play with the OPE or VEL parameters to get a smooth pronunciation).
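The two glide options can be sketched as a function that proposes candidate respellings for a vowel pair. This is one possible reading of the technique, not an official procedure: the name `glide_fixes` is hypothetical, replacement is only applied to the second vowel (as in the [a i] example), and insertion is only applied when the first vowel has a matching glide (as in the [M e] example). Each candidate is a list of notes, one string per note.

```python
# Glide consonant that corresponds to each high vowel.
GLIDE_FOR = {"i": "j", "M": "w"}


def glide_fixes(first: str, second: str) -> list:
    """Propose respellings of the vowel pair [first second] using glides."""
    candidates = []
    if first == second:
        return candidates
    # Option 1: replace the second vowel with its glide, e.g. [a i] -> [a j].
    if second in GLIDE_FOR:
        candidates.append([f"{first} {GLIDE_FOR[second]}"])
    # Option 2: insert the glide of the first vowel between the two vowels.
    # Because first != second, this never creates the forbidden [j i] or [w M].
    if first in GLIDE_FOR:
        glide = GLIDE_FOR[first]
        candidates.append([f"{first} {glide}", second])
        candidates.append([first, f"{glide} {second}"])
    return candidates
```

The candidates still need to be auditioned by ear, since which one sounds smoothest depends on the voicebank and on the OPE and VEL settings.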

Blending Phonemes

Comparison using the [h\] phoneme: Miku sings the vowel combination [e a], first without a fix and then fixed with the help of the phoneme [h\]. The second case has a smoother pronunciation than the first. (Media: audio samples of the choppy vowel combination and of the combination fixed with the phoneme [h\]; waveform comparison between both samples.)

In cases where you can't use these phonemes, you can always use the restricted phoneme [h\]. This phoneme only produces sound if it is followed by an [e] or [o]; when combined with the other vowels, it won't produce any sound. However, if you add a vowel on a different note after the mute combination, the synthesizer will skip the mute combination and immediately reproduce the following vowel, allowing users to fix choppy vowel combinations.

Examples:

Hatsune Miku V2 is known to struggle with some of her vowel combinations. For example, a well known transition issue she has is [e i]. When fixing this, it should be input as [e i] → [e h\ i][i] or [e][e h\ i][i].
Kagamine Rin/Len ACT1 and ACT2 are known to have various choppy vowel combinations. This can be fixed using this technique.

In VOCALOID2, the phoneme [Asp] generates a similar effect to the phoneme [h\] with any vowel combination, allowing it to fix choppy vowel combinations as well.

Gemination and Consonant Length

Gemination (consonant lengthening) is when a spoken consonant is pronounced for an audibly longer period of time than a comparable short consonant. This is an important distinctive phonetic process in the Japanese Language.

Example: Two words can have different meanings based solely on the length of a consonant.

河川 kasen IPA:[ka.sẽɴ] 'Rivers'

合戦 kassen IPA:[kas.sẽɴ] or [kasːẽɴ] 'Battle'

For VOCALOID2

As mentioned before, the Japanese Phonetic System wasn't designed to allow a consonant to be reproduced alone. If the user tries to encode one without a vowel, this will generate an almost inaudible loop that sounds like an electronic buzz. However, if the consonant is placed between two reproducible notes or syllables, the system is capable of handling it better, making it possible to encode it alone. This makes it possible to extend some consonants.

To increase the length of a consonant, the user must create a gap between the preceding syllable and the next one, which contains the consonant to extend, and then fill the gap with a short note containing only that consonant phoneme, without a vowel.^[8]

Example:

It's important that the note preceding the standalone consonant ends in a vowel; if it doesn't, the synthesizer won't be capable of handling it and will produce an undesired chop. It's also important to emphasize that although this method allows the user to extend consonants, the system still struggles with consonants encoded alone, especially if they are too long. This can generate sound loops or distortion of the phoneme, so it's important not to abuse this method.
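The VOCALOID2 procedure above amounts to shortening the preceding note and placing a consonant-only note in the freed space. The sketch below models that on a plain list of notes. Everything about the data model is an assumption made for illustration (the `Note` fields, tick units, and the function name `lengthen_consonant`); it does not read or write real VSQ files, and it assumes the preceding note ends exactly where the target note starts.

```python
from dataclasses import dataclass, replace

VOWELS = {"a", "i", "M", "e", "o"}


@dataclass(frozen=True)
class Note:
    start: int      # position in ticks (hypothetical unit)
    length: int     # duration in ticks
    phonemes: str   # space-separated phoneme symbols


def lengthen_consonant(notes, index, gap):
    """Insert a consonant-only note of `gap` ticks before notes[index]."""
    if index < 1:
        raise ValueError("a preceding note is required")
    prev, target = notes[index - 1], notes[index]
    symbols = target.phonemes.split()
    if len(symbols) < 2 or symbols[0] in VOWELS:
        raise ValueError("target note must start with a consonant")
    if prev.phonemes.split()[-1] not in VOWELS:
        raise ValueError("preceding note must end in a vowel")
    if not 0 < gap < prev.length:
        raise ValueError("gap must be shorter than the preceding note")
    # Shorten the preceding note to open the gap, then fill it.
    shortened = replace(prev, length=prev.length - gap)
    filler = Note(prev.start + prev.length - gap, gap, symbols[0])
    return list(notes[:index - 1]) + [shortened, filler] + list(notes[index:])
```

Applied to kasen written as `[k a][s e][N\]`, calling the function on the second note inserts a short `[s]` note, which approximates kassen. The gap should stay short, in line with the warning above about long standalone consonants.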

For VOCALOID3

In the third version of the software, the Velocity (VEL) parameter was corrected, so that modifying it now effectively changes the length of consonants. This, together with the addition of the devoiced phonemes, allows the length of consonants to be modified effectively without the complicated methods or post-processing techniques needed in VOCALOID2.

Phonetics List

Symbol	Classification	IPA Symbol	Sample Hiragana/Kunrei-shiki Romaji	Notes	Related Phonemes
[a]	vowel	ä open central unrounded vowel	あ a
[i]	vowel	i close front unrounded vowel	い i		[j] (glide)
[M]	vowel	ɯᵝ or ɯ͡β close back compressed vowel	う u	The Japanese "u" is neither rounded /u/ nor unrounded /ɯ/, but compressed.	[w] (glide)
[e]	vowel	e̞ mid front unrounded vowel	え e
[o]	vowel	o̞ mid back rounded vowel	お o, を o
[k]	consonant	k voiceless velar plosive	か ka, く ku, け ke, こ ko		[g] (voiced) [k'] (palatalized)
[k']	palatalized consonant	kʲ palatalized voiceless velar plosive	き ki, きゃ kya, きゅ kyu, きぇ kye, きょ kyo	Palatalized /k/.	[g'] (voiced) [k] (depalatalized)
[g]	consonant	g voiced velar plosive	が ga, ぐ gu, げ ge, ご go		[k] (voiceless) [g'] (palatalized) [N] (nasal)
[g']	palatalized consonant	gʲ palatalized voiced velar plosive	ぎ gi, ぎゃ gya, ぎゅ gyu, ぎぇ gye, ぎょ gyo	Palatalized /g/.	[k'] (voiceless) [g] (depalatalized) [N'] (nasal)
[N]	consonant	ŋ velar nasal	が ga, ぐ gu, げ ge, ご go, ん n-n'	Nasalized /g/. Also is an allophone of /n/ before a velar consonant.	[N'] (palatalized) [g] (plosive) [n] (develarized)
[N']	palatalized consonant	ŋʲ palatalized velar nasal	ぎ gi , ぎゃ gya, ぎゅ gyu, ぎぇ gye, ぎょ gyo, ん n-n'	Palatalized nasal /g/. Also is an allophone of /n/ before a palatalized velar consonant.	[N] (depalatalized) [g'] (plosive) [J] (develarized)
[s]	consonant	s voiceless alveolar sibilant	さ sa, す su, せ se, そ so, すぃ si		[z] (voiced) [S] (palatalized) [ts] (affricated)
[S]	palatal consonant	ɕ or ʃʲ voiceless alveolo-palatal sibilant	し shi, しゃ sha, しゅ shu, しぇ she, しょ sho	Palatalized /s/. The X-SAMPA symbol incorrectly suggests that it's a /ʃ/; although both phonemes sound similar, they aren't the same.	[Z] (voiced) [tS] (affricated) [s] (depalatalized)
[z]	consonant	z voiced alveolar sibilant	ざ za, ず zu, づ zu, ぜ ze, ぞ zo	Often used between vowels, however not all the Japanese speakers use this sound.	[s] (voiceless) [Z] (palatalized) [dz] (affricated)
[Z]	palatal consonant	ʑ or ʒʲ voiced alveolo-palatal sibilant	じ ji, ぢ ji, じゃ ja, じゅ ju, じぇ je, じょ jo, ぢゃ ja, ぢゅ ju, ぢぇ je, ぢょ jo	Palatalized /z/, often used between vowels, however not all Japanese speakers use this sound. The X-SAMPA symbol incorrectly suggests that it's a /ʒ/, although both phonemes sound similar they aren't the same.	[S] (voiceless) [z] (depalatalized) [dZ] (affricated)
[t]	consonant	t voiceless alveolar plosive	た ta, て te, と to, とぅ tu		[t'] (palatalized) [tS] (affricated)
[t']	palatalized consonant	tʲ palatalized voiceless alveolar plosive	てぃ ti, てゅ tyu	Palatalized /t/, usually used in non-Japanese words incorporated into the language.	[d'] (voiced) [t] (depalatalized) [tS]
[ts]	consonant	ʦ voiceless alveolar affricate	つ tsu, つぁ tsa, つぃ tsi, つぇ tse, つぉ tso		[dz] (voiced) [t] (deaffricated) [s] (spirantized)
[tS]	palatal consonant	ʨ voiceless alveolo-palatal affricate	ち chi, ちゃ cha, ちゅ chu, ちぇ che, ちょ cho	Palatalized /t/. The X-SAMPA symbol incorrectly suggests that it's a /ʧ/; although both phonemes sound similar, they aren't the same.	[dZ] (voiced) [ts] (depalatalized) [t] (deaffricated) [S] (spirantized)
[dz]	consonant	ʣ voiced alveolar affricate	ざ za, ず zu, づ zu, ぜ ze, ぞ zo, ずぃ zi	Often used at the beginning of a word or after ん; however, some Japanese speakers also use this sound instead of [z] or [Z].	[ts] (voiceless) [dZ] (palatalized) [z] (spirantized) [d] (deaffricated)
[dZ]	palatal consonant	ʥ voiced alveolo-palatal affricate	じ ji, ぢ ji, じゃ ja, じゅ ju, じぇ je, じょ jo, ぢゃ ja, ぢゅ ju, ぢぇ je, ぢょ jo	Palatalized /dz/ or /d/; some Japanese speakers use this sound instead of [z] or [Z]. The X-SAMPA symbol incorrectly suggests that it's a /ʤ/; although both phonemes sound similar, they aren't the same.	[tS] (voiceless) [dz] (depalatalized) [Z] (spirantized) [d] (deaffricated)
[d]	consonant	d voiced alveolar plosive	だ da, どぅ du, で de, ど do		[t] (voiceless) [d'] (palatalized) [dz] (affricated)
[d']	palatalized consonant	dʲ palatalized voiced alveolar plosive	でぃ di, でゅ dyu	Palatalized /d/, usually used in non-Japanese words incorporated into the language.	[t'] (voiceless) [d] (depalatalized)
[n]	consonant	n alveolar nasal	な na, ぬ nu, ね ne, の no, ん n	This consonant can be articulated without a vowel. This phoneme appears before alveolars and alveolo-palatals.	[J] (palatalized) [N] (velarized) [m] (labialized)
[J]	consonant	ɲ or nʲ palatal nasal	に ni, にゃ nya, にゅ nyu, にぇ nye, にょ nyo	Palatalized /n/. This phoneme also appears as an allophone of /n/ before another palatal nasal.	[n] (depalatalized) [N'] (velarized) [m'] (labialized)
[h]	consonant	h voiceless glottal fricative	は ha, へ he, ほ ho		[C] (palatalized) [p\] (labialized) [h\] (voiced)
[h\]	consonant	ɦ voiced glottal fricative	ぁ xa, ぃ xi, ぅ xu, ぇ xe, ぉ xo	Intervowel /h/. Only works for [a], [e] and [o].	[h] (voiceless)
[C]	palatal consonant	ç voiceless palatal fricative	ひ hi, ひゃ hya, ひゅ hyu, ひぇ hye, ひょ hyo	In Japanese it is perceived as a palatalized /h/.	[h] (depalatalized)
[p\]	consonant	ɸ voiceless bilabial fricative	ふ fu, ふぁ fa, ふぇ fe, ふぉ fo		[h] (debuccalized) [p] (spirantized)
[p\']	palatalized consonant	ɸʲ palatalized voiceless bilabial fricative	ふぃ fi, ふゃ fya, ふゅ fyu, ふぇ fye, ふょ fyo	Palatalized /ɸ/.	[p\] (depalatalized) [h] (delabialized)
[b]	consonant	b voiced bilabial plosive	ば ba, ぶ bu, べ be, ぼ bo		[p] (voiceless) [b'] (palatalized)
[b']	palatalized consonant	bʲ palatalized voiced bilabial plosive	び bi, びゃ bya, びゅ byu, びぇ bye, びょ byo	Palatalized /b/.	[p'] (voiceless) [b] (depalatalized)
[p]	consonant	p voiceless bilabial plosive	ぱ pa, ぷ pu, ぺ pe, ぽ po		[b] (voiced) [p'] (palatalized)
[p']	palatalized consonant	pʲ palatalized voiceless bilabial plosive	ぴ pi, ぴゃ pya, ぴゅ pyu, ぴぇ pye, ぴょ pyo	Palatalized /p/.	[b'] (voiced) [p] (depalatalized)
[m]	consonant	m bilabial nasal	ま ma, む mu, め me, も mo	Also is an allophone of /n/ before bilabial consonants. This consonant can be articulated without a vowel.	[m'] (palatalized) [n] (delabialized)
[m']	palatalized consonant	mʲ palatalized bilabial nasal	み mi, みゃ mya, みゅ myu, みぇ mye, みょ myo	Palatalized /m/. Appears before palatalized bilabial consonants.	[m] (depalatalized) [J] (delabialized)
[j]	consonant	j palatal approximant	や ya, ゆ yu, よ yo, いぇ ye		[i] (syllabic) [dZ]
[4]	consonant	ɾ alveolar flap	ら ra, る ru, れ re, ろ ro	Although the X-SAMPA symbol suggests that this phoneme is an alveolar tap, it is technically an apical postalveolar flap undefined for laterality; hence the Japanese /r/ tends to sound somewhere between a ɽ and a ɺ. Whether the consonant sounds more R-like or L-like depends on its context.	[4'] (palatalized)
[4']	palatalized consonant	ɾʲ palatalized alveolar flap	り ri, りゃ rya, りゅ ryu, りょ ryo		[4] (depalatalized)
[w]	consonant	w͍ or wᵝ compressed labio-velar approximant	わ wa, うぃ wi, うぇ we, うぉ wo	Similar to its "u", the Japanese /w/ is compressed.	[M] (syllabic)
[N\]	consonant	ɴ uvular nasal	ん n	/n/ at the end of a word or before a vowel.	[n]

Additional notes

Windows in Japanese locale displays the symbol <\> (backslash) as <¥> (Japanese yen), so it's written as <¥> in the official phonetic system.
Crypton's VOCALOIDs, including KAITO and MEIKO, have almost the same Japanese phonetic system throughout all databases.^[9] To use [z], [Z], [h\], [N] and [N'], users need to edit the phonemes directly rather than entering kana characters.
Kagamine Rin & Len Act 1 can pronounce [h\] while their Act 2 cannot (comparison of consonant sounds Act 1, Act 2).
VOCALOIDs of Internet Co., Ltd., such as Gackpoid or Megpoid, mostly share the same system as Crypton's, but they do not have [z] and [Z] sounds. As is often the case with the Japanese language, they are replaced by [dz] and [dZ].^[10]^[11]
Japanese VOCALOID2 voicebanks can combine the [a] and [i] phonemes (e.g. [w a i]), but the original VOCALOID voicebanks cannot. The workaround is to simply use the [j] consonant instead (e.g. [w a j]).
[N\], [N] or [n] alone tends to be pronounced as "ng". This is the basis for Japanese VOCALOIDs being used for South-East Asian languages.
[N'] followed by a vowel other than [i] may produce odd results; however, within the Japanese language there is no actual need for this phoneme to be followed by a vowel other than [i].
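The yen sign note at the top of this list matters when phoneme strings are copied from a Japanese-locale display into a script or text file. A minimal sketch of a normalizer is shown below; the function name `normalize_backslash` is hypothetical, and it assumes the text contains a real yen sign character (U+00A5) rather than a backslash that is merely rendered as one.

```python
def normalize_backslash(symbol: str) -> str:
    """Replace the yen sign (U+00A5) with the backslash used by X-SAMPA."""
    return symbol.replace("\u00a5", "\\")
```

Symbols without a yen sign pass through unchanged, so the function can be applied to every phoneme in a list.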

Continued Development

The Japanese language is by far the most popular VOCALOID language and is the most well-refined, as it is the language that created the cultural phenomenon that made VOCALOID popular in late 2007. Japanese vocals make up the largest selection of the VOCALOID vocals available for purchase as a result, though this popularity has leveled off considerably since 2014.

A Japanese vocal is much quicker and much cheaper to produce than vocals in the other languages currently available for VOCALOID. Because of this, together with the popularity of Japanese voicebanks, Japanese VOCALOID productions are faster, allowing more content in their releases at an overall lower price than in most other languages.

One of the most notable concerns about this language in regard to VOCALOID is that advancements to it have decreased over time, with the most major event to improve all Japanese voicebanks being the release of VOCALOID3. From 2014 onwards, a slow decrease in the number of voicebanks has been seen, and the only third-party studio to have released a VOCALOID5 voicebank by 1 January 2019 was AH-Software Co. Ltd., with their Haruno Sora vocal.

References


[1] link

[2] link

[3] Explanation for accents in singing and also a lack of

[4] "Why do British singers lose their accents?" http://www.todayifoundout.com/index.php/2013/08/why-british-singers-lose-their-accent-when-singing/

[5] ttps://weibo.com/5146173015/Gckc5ax0y

[6] [1]

[7] ttp://a0010.web.fc2.com/text/v3memo/index.html

[8] ttp://ahou2chome.sakuratan.com/misc/mikuvoice/lesson02.html

[9] Japanese Phonetic System of VOCALOID KAITO

[10] Japanese Phonetic System of Megpoid

[11] Japanese Phonetic System of Gackpoid
