Vocaloid Wiki

English - Japanese


Japanese Phonemes on Phoneme Entry Screen

Language Differences

Because of how Japanese Vocaloids are set up, they are more limited when used for English: the phonology of Japanese, including its phonemes, accents, tones, intonation, moras and assimilations, is very different from that of English. In Japanese, each consonant is always followed by an inseparable vowel and consonants do not form clusters, so consonants are generally pronounced weakly and not independently, except for ん (n), the sokuon and some transliterated phonemes used for non-Japanese words. Because of this, some consonant samples of Japanese Vocaloids contain a slight vowel sound, so that they connect smoothly to the following vowel and sound right in Japanese.[1]

It is also important to know that the symbols borrowed from X-SAMPA do not always match their actual pronunciations, which can lead to errors. For instance, the Vocaloid symbol [S] corresponds to /ʃ/ in English Vocaloids but to /ɕ/ in Japanese ones;[2] the Japanese "a" is a low central vowel that lies between the English "a" in "father" and the English "a" in "dad";[3] and the Japanese "r" is not the same as either "r" or "l" in English.[4][5]

In addition, English often puts emphasis on certain syllables of words (stress accent), while Japanese uses pitch accent.[6] These differences frequently make Japanese Vocaloids retain a Japanese accent where there are no perfectly equivalent phonemes, even if users manage to make them sing in the correct language. Conversely, the same thing can happen to English Vocaloids, which often have an English accent when they sing in other languages.

Another consideration with English Vocaloids is their regional accent. This does not affect a Vocaloid's overall performance or the handling of the VOCALOID engine, and all of them use identical phonemes regardless. The only effect is a particular stress or emphasis on certain vowels and consonants that may not be found in another English Vocaloid, which may make an English Vocaloid sound different from what a user expects. Examples of Vocaloids that may be affected include Sonika, who has a British accent, and Big Al, who has an American accent; Luka Megurine is also included, as she retains a Japanese accent. One noted example of a regional accent affecting a Vocaloid's output is Big Al's pronunciation of vowel sounds, which can make him harder to use for Japanese. In contrast, Japanese Vocaloids do not show as much regional accent variation between them in Japanese.

As of VOCALOID3, Japanese Vocaloids can more closely mimic English sounds thanks to the addition of new sounds that they lacked in VOCALOID2. However, more complex words and sounds are still beyond their reach, which limits how well a Japanese Vocaloid can mimic English.

Other issues are the number of samples per pitch (500 for Japanese, 2500 for English, which makes a typical English vocal five times larger than a Japanese one) and sample length (English Vocaloid samples are often cut at a longer length than Japanese ones). These differences in particular make conversion between these two languages much harder than between some other languages.

English to Japanese

Techniques and Tips

Working the Vowels

English does not have the same vowels as Japanese. In most cases, Japanese vowels fall between English vowels in terms of openness.

Example: The Japanese O is between the O in "core" (more open) and the O in "go" (closer and often diphthongized).

Also, the English U and the Japanese U differ in their rounding, as the latter is a compressed vowel. These differences become more evident when comparing the vowel charts of both languages.

Besides this, English pronunciation tends to be more lax. As a result, a strong English accent can show in the vowels if the user doesn't work on them carefully.

Fortunately, English has a large array of vowels, which gives multiple possibilities for replacing a vowel. For each vowel, the user must consider the similarity between the Japanese vowel and the English candidates in order to minimize the foreign accent. In addition, the user must be aware that the dialect of the voicebank can affect how an English vowel is realized.

Example: Oliver's [{] has been reported to be more centralized, sounding more similar to an /a/ than a /æ/ in comparison to other English voicebanks.[7]

Dialect especially affects the diphones (diphthongs and rhotic diphones). For example, if the accent is non-rhotic, the rhotic vowels can be realized either as long pure vowels or as schwa-ending diphthongs, while if the accent is rhotic, they can be realized as a vowel + [r] combination or as a rhotacized vowel.

If a rhotic vowel is realized as a long vowel, it can be used as a possible semi-equivalent for the Japanese vowel being imitated.

Example: Big Al's [eI] phoneme is realized more as a long /e:/ than a /eɪ/ diphthong.

Knowing this, the user must test which vowels sound better. To make the work easier, it is best to group the vowels according to how similar they are to their Japanese counterpart.

JP Vowel's Symbol | JP Vowel's IPA | Available English Semi-equivalents
[a] | ä (open central unrounded vowel) | [{], [V], [Q], [@], [Q@] (if realized as /ɑː/)
[e] | mid front unrounded vowel | [e], [I], [eI] (if realized as //)
[i] | i (close front unrounded vowel) | [i:], [I], [j] (glide)
[o] | mid back rounded vowel | [Q], [O:], [U], [@U] (if realized as //), [O@] (if realized as /ɔː/)
[M] | u͍ or ɯᵝ (close back compressed vowel) | [u:], [U], [w] (glide)
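
The grouping in the table above can be kept as a small lookup while testing candidates. The following is only an illustrative sketch: the names `JP_VOWEL_CANDIDATES` and `english_candidates` are hypothetical, and it assumes that each "(if realized as ...)" condition applies only to the phoneme immediately before it.

```python
# Candidate English phonemes for each Japanese vowel, copied from the table.
# "always" holds the unconditional candidates; "dialect_dependent" holds the
# ones the table marks "(if realized as ...)". Assumption: each condition
# applies only to the phoneme directly preceding it.
JP_VOWEL_CANDIDATES = {
    "a": {"always": ["{", "V", "Q", "@"], "dialect_dependent": ["Q@"]},
    "e": {"always": ["e", "I"], "dialect_dependent": ["eI"]},
    "i": {"always": ["i:", "I", "j"], "dialect_dependent": []},
    "o": {"always": ["Q", "O:", "U"], "dialect_dependent": ["@U", "O@"]},
    "M": {"always": ["u:", "U", "w"], "dialect_dependent": []},
}


def english_candidates(jp_vowel, include_dialect_dependent=False):
    """Return the English phonemes worth testing for one Japanese vowel."""
    entry = JP_VOWEL_CANDIDATES[jp_vowel]
    candidates = list(entry["always"])
    if include_dialect_dependent:
        candidates += entry["dialect_dependent"]
    return candidates


if __name__ == "__main__":
    for vowel in JP_VOWEL_CANDIDATES:
        print(vowel, english_candidates(vowel, include_dialect_dependent=True))
```

The list only narrows down what to audition; which candidate actually sounds closest still has to be judged by ear for each voicebank.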

After choosing the closest or best-fitting vowel, the pronunciation can be approximated further by adjusting the parameters. Gender Factor (GEN) can be used to change the overall tone of a vowel, as it affects the timbre and the formants, giving the vowel a "lighter" or "darker" sound, while Opening (OPE) controls how open or rounded the sound is, mainly affecting the open vowels. Combining both makes it possible to modify the stress and pronunciation to some degree; however, it is important not to overuse them, as this may distort the sound unpleasantly.

Another possibility is to combine vowels. When two similar vowels are put together, an intermediate pronunciation can sometimes be obtained as the sound glides from one vowel to the other. Playing with this may give better control over the stress and a more native pronunciation. However, this trick requires a smooth transition between the vowels to be effective.

Example: In some cases, a combination of the [V] and [{] phonemes on the same note/syllable can produce a sound closer to the Japanese [a] phoneme.

Finally, it is possible to use diphthongs as replacements for certain vowel combinations.

Example: The last part of the word 外科医 (gekai 'surgeon') can be done using the English diphthong [aI].


Palatalization is a phonological process in which the articulation of a consonant is modified by raising the middle of the tongue toward the palatal position. As a result, the consonant can turn into a palatalized consonant, which has a brief palatal glide or "ee"-like sound, or can shift completely to the closest palatal consonant.

Japanese has clear lexical and grammatical rules for when palatalization occurs, and it is an important phonological process in the language.

Main article: wikipedia:Yōon

In contrast, in English this is an allophonic process that generally goes unnoticed by English speakers. It generally occurs under the influence of the glide [j].

Knowing this, it is possible to take advantage of the allophonic palatalization of English when making an English Vocaloid sing in Japanese. To do so, a brief [j] or "ee"-like sound has to be created after the consonant, for which the user can either:

1)  Insert the glide [j] between the consonant to be palatalized and its vowel: The added palatal approximant [j] will palatalize the consonant. Adjusting the Velocity (VEL) may be required.
  • ぎょうざ (gyōza 'fried dumpling') IPA: /ɡʲoːza/ → JP vb: [g' o z a] → EN vb: [g j O: z a]
2)  Use a short note with the consonant to be palatalized followed by the vowel [i:]: If the note is short enough, the articulation of the [i:] will be incomplete or barely audible, giving the consonant a j-colored sound. The user will probably need to adjust the Velocity (VEL), and it is also important to take the Tempo into consideration.
  • ぎょうざ (gyōza 'fried dumpling') IPA: /ɡʲoːza/ → JP vb: [g' o z a] → EN vb: [g i:][O: z a]
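
The two options above can be written out mechanically for any consonant and vowel pair. This is a minimal sketch with hypothetical helper names (`palatalize_for_english`, `format_notes`). It only builds the phoneme strings for the palatalized syllable and does not set VEL, note length or Tempo, which still need manual adjustment.

```python
def palatalize_for_english(consonant, vowel):
    """Build both candidate layouts for a palatalized consonant + vowel.

    Each layout is a list of notes, and each note is a list of phoneme symbols.
    Returns (glide_layout, short_note_layout).
    """
    # Option 1: insert the glide [j] between the consonant and its vowel.
    glide_layout = [[consonant, "j", vowel]]
    # Option 2: a separate (short) note with consonant + [i:], then the vowel.
    short_note_layout = [[consonant, "i:"], [vowel]]
    return glide_layout, short_note_layout


def format_notes(notes):
    """Render notes in the bracketed style used on this page, e.g. [g i:][O:]."""
    return "".join("[" + " ".join(note) + "]" for note in notes)


if __name__ == "__main__":
    glide, short = palatalize_for_english("g", "O:")
    print(format_notes(glide))
    print(format_notes(short))
```

For the first syllable of ぎょうざ this yields `[g j O:]` and `[g i:][O:]`, matching the start of the two transcriptions given above.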

In the case of the post-alveolar sibilants [S], [tS], [Z] and [dZ], this trick may be required to achieve a more native pronunciation. Although they are similar, it is important to note that the Japanese post-alveolars are actually alveolo-palatal, in contrast to the English post-alveolars, which are palato-alveolar. This gives the Japanese post-alveolars a brief y-like glide and a somewhat weaker, less strident sound.[8]

  • 少女 (shōjo 'girl') IPA: /ɕoːdʑo/ → JP vb: [S o][dZ o] → EN vb: [S j O:][dZ j O:] or [S i:][O:][dZ i:][O:]

Liquid Consonant

The liquid consonants are the group formed by the lateral and rhotic consonants. Languages generally tend to have two liquid consonants, one lateral (generally associated with L) and one rhotic (generally associated with R).

In Japanese there is no clear distinction between the two, so the Japanese R is realized as an undefined post-alveolar liquid consonant whose sound tends to vary depending on its context, and which native Japanese speakers perceive as a single phoneme. The sound usually lies between /ɾ/ (more R-like and similar to the unstressed American D/T) and /ɺ/ (a flapped L, more lateral or L-like), tending toward one or the other depending on the vowel that follows it.[9] For this reason, English speakers tend to perceive it as something between their L, R and D sounds.[10][11]

When aiming for a more L-like sound, the user can simply use the phoneme [l0], or Light L (the [l], or Dark L, is usually not recommended, as its velarization makes it sound awkward or excessively Anglo-Saxon accented).

If the user wants a more [ɾ]-like sound, it is possible to use the alveolar flap phoneme [4], an additional phoneme that has become common in the newest American-accented English voicebanks. Alternatively, if the vocal lacks this phoneme, the sound can be imitated by combining the phoneme [r] with a D-sounding phoneme such as [d], [dh] or [D], a method that some users relied on to imitate the American accent before a proper [4] phoneme was added.[12][13] As these have different degrees of stress and prominence, the user will probably need to test which combination gives the best result.

Example: The word 光 (hikari 'light') could be transcribed as [h i:][k V][r d i:] for an English voicebank.
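
The choice between the real flap and the "faux flap" combinations can be sketched as a small helper. The function name `flap_candidates` and the idea of passing in a voicebank's phoneme set are assumptions made for illustration; the sketch lists the combinations to audition and does not pick a winner.

```python
# D-sounding phonemes that can follow [r] to imitate a flap, in the order
# they are listed in the text.
D_LIKE = ["d", "dh", "D"]


def flap_candidates(available_phonemes):
    """List the phoneme sequences to test for a Japanese R in an English voicebank.

    available_phonemes: iterable of phoneme symbols the voicebank provides
    (hypothetical input; check the voicebank's own phoneme list).
    """
    available = set(available_phonemes)
    if "4" in available:
        # The voicebank has a proper alveolar flap, so use it directly.
        return [["4"]]
    if "r" not in available:
        return []
    # Otherwise fall back to [r] + a D-sounding phoneme.
    return [["r", d] for d in D_LIKE if d in available]


if __name__ == "__main__":
    print(flap_candidates({"r", "d", "dh", "D"}))
    print(flap_candidates({"4", "r", "d"}))
```

With the fallback, the last syllable of 光 would be tried as [r d i:], [r dh i:] and [r D i:] in turn.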

Whether the user is aiming for a more R-like or a more L-like sound, it is important to adjust the Velocity (VEL), because the Japanese R (like other Japanese consonants) tends to be shorter or more "percussive".[14]

Finally, if the user wants a more aggressive or emphatic pronunciation, like the one found in the makijita (巻き舌) phenomenon, it is possible to use the Rolling R phoneme [R] in voicebanks that have it available.

Conversion Chart

Special note: this chart is based on Big Al's help file, with some information added to show the English equivalent or quasi-equivalent phonemes for Japanese phonemes, together with their symbols, and to compare their actual pronunciations. Even where the Vocaloid symbol transcriptions are the same, the actual pronunciations in each language are often different, as the IPA shows. This guide is meant for users who are working to make an English Vocaloid sing in Japanese. However, additional work will be needed to get closer to the target language's phoneme usage.

Japanese Sample in Hepburn Romaji + notes Japanese Symbol IPA for Japanese Symbol Equivalent / Quasiequivalent English Symbol IPA for English Symbol
ai a ä









ima i



[I] (uppercase i)


ɪ (semi-equivalent)

uta M ɯᵝ ᴏʀ u͍




ʊ (semi-equivalent)

egao e





omoi o o̜ or ɔ̜







kokoro k k




kibou (followed by /i/) k' kʲ ᴏʀ kç

[k j] ᴏʀ [kh j]

[k i:] ᴏʀ[kh i:]

kj ᴏʀ kʰj

kiː ᴏʀ kʰiː

genki (note: at the beginning of word or sometimes after んn) g g




giri (note: followed by /i/. at the beginning of word or sometimes after んn) g'

[g j] ᴏʀ [gh j]

[g i:] ᴏʀ [gh i:]

gj ᴏʀ gʰj

giːᴏʀ gʰiː

hanpen, gagaku (note: similar to [m] when followed by occlusive, [ŋ] as nasalized g) N




[n g] ᴏʀ [n gh]

[m] (before occlusive)


ng ᴏʀ ngʰ


kagi (note: palatalized nasalized g') N' ŋʲ

[N j] ᴏʀ [N i:]

[n g j] ᴏʀ [n gh j]

[g i:] ᴏʀ [gh i:]

ŋj ᴏʀ ŋi:

gj ᴏʀ gʰj

giːᴏʀ gʰ iː

sadame s s




ʦ ᴏʀ t͡s

shiawase (palatalized /s/) S ɕ ᴏʀ ʃʲ


[S j] or [S i:]


ʃj ᴏʀ ʃi

kizu (note: generally intervowel, however some Japanese use dz or dZ instead) z z [z] z
iji (note: followed by /i/. often between vowels, however some Japanese use dz or dZ instead) Z ʑ ᴏʀ ʒʲ


[Z j] or [Z i:]


ʒj ᴏʀ ʒi:

zuboshi, kazu (note: often [ʣ] at the beginning of word or after んn, [z] in other cases) dz



[d], [dh] ᴏʀ [D]


[d z], [dh z] ᴏʀ [D z]


d, dʰ ᴏʀ ð


dz, dʰz ᴏʀ ðz

jibun, kaji (note: followed by /i/. often [ʤ] at the beginning of word or after んn, [ʒ] in other cases) dZ ʥ ᴏʀ ʤʲ





taido t





baraetii (note: palatalized /t/, used for non-Japanese words) t' tʲ or ti

[t j] ᴏʀ [th j]

[t i:] ᴏʀ [th i:]

tj ᴏʀ tʰj

tiːᴏʀ tʰiː

tsuki ts ʦ ᴏʀ t͡s

[t s]




inochi (palatalized /t/) tS ʨ ᴏʀ ʧʲ


[tS j]

[tS i:]




daichi, kaden (note: [d] at the beginning of word or after んn, [ð] in other cases) d



[d] ᴏʀ [dh]

[D] (middle of a word)

d or


merodii (note: palatalized /d/, used for non-Japanese words) d' dʲ ᴏʀ di

[d] ᴏʀ [dh] + [j]

[d] ᴏʀ [dh] + [i:]



namida, kanpa (note: [n] when followed by fricative/flap consonant or vowel/semi-vowel, similar to [m] when followed by occlusive) n




[m] (before occlusive)



nioi (note: followed by /i/) J nʲ ᴏʀ ɲ

[n j]

[n i:]



hana h h [h] h
h\ ɦ [h] h
hinagiku (note: palatalized /h/) C ç

[h j]

[h i:] (short note)



fushigi p\ ɸ




fianse (note: used for non-Japanese words) p\' ɸʲ, fi ᴏʀ fj

[f j] ᴏʀ [f i:]

[ph j] ᴏʀ [ph i:]

fj ᴏʀ fi:

pʰj ᴏʀ pʰi:

boku b

b ᴏʀ β


bijin (note: followed by /i/) b'

bʲ ᴏʀ βʲ

[b j]

[b i:] (short note)



tanpo p p




henpi (note: followed by /i/) p'

[p j] or [ph j]

[p i:] ᴏʀ [ph i:]

pj ᴏʀ pʰj

piː ᴏʀ pʰiː

manako m m [m] m
imi (note: followed by /i/) m'

[m j]

[m i:]



yume j j [j] j
renge, sora (note: often [ɺ], [ɭ] or [ɖ] at the beginning of word or after んn, [ɾ] or [ɽ] in other cases) 4

ɺ ~ ɭ

ɖ ~ ɾɽ

r (makijita)


[r d], [r dh] ᴏʀ [r D]



"Faux flap"


rikutsu, teiri (note: followed by /i/, often palatalized when it is not at the beginning of word or after んn) 4' [ɾʲ] ᴏʀ [ɖ]

[r] + [d], [dh], or [D]+ [j] ᴏʀ [i:]

ɹj ᴏʀ ɹi:

watashi (note: compressed /w/) w ɰ͡β̞, w͍ ᴏʀ wᵝ


w or ɰʷ

kantan (note: end of word) N\ ɴ






Additional notes

  • Linguistically, the phonemes that English and Japanese share are k, g, s, z, Z, tS, h, b, p, j and m. Both English and Japanese voicebanks also have e, S, dZ, d, N, n and w; however, these phonemes generally do not sound the same (see the IPA for each language).
  • Since all voicebanks have their own distinctive characteristics, their phonemes do not always produce the same result, especially in languages they are not intended for.
    • This is particularly true for Miku and Rin, who are noted to sound excessively aged when singing in another language, even with normal configurations and in higher octaves.
  • Most of the consonants among the Japanese phonemes (with the exception of the nasal consonants) and certain English phonemes are not intended to be encoded standalone. Using them that way may sometimes result in audio distortion, clicks or sound loops.

Japanese to English

Techniques and Tips

Working the Vowels

As mentioned previously, the Japanese vowels don't match the English vowels. Besides this, the number of vowels available in Japanese is much more limited than in English, which strongly limits the possibilities for minimizing the inherent accent. Also, the restrictive phonotactics and moraic nature of Japanese add further difficulties:

1) Working the Monophthongs or Pure Vowels:
2) Working the Diphthongs and Glides: Being a moraic language with open syllables, Japanese lacks diphthongs like the English ones. This, along with the fact that the Japanese /u/ and /w/ differ from the English ones, can make diphthongs tricky to imitate with a Japanese voicebank.
3) Working the Rhotic Vowels: Depending on the dialect, the rhotic vowels can be pronounced either as a vowel + [r] combination or as a rhotacized vowel, in the case of rhotic accents, or as a long monophthong or a vowel + /ə/ diphthong, in the case of non-rhotic accents. Knowing this, along with the fact that the phoneme [4] and its palatalized counterpart [4'] may have a more L-like than R-like sound, it may be preferable to use a non-rhotic approximation when working with them.

In general terms, the ending schwa can be approximated using a central vowel like [a], or a mid vowel like [e] or [o]. Which one works better depends on the context of the word.

Working the Aspirated Plosives

In Japanese, the plosives are slightly aspirated. Although this makes them easier to work with, the aspiration and stress may not be enough for the context of the word. In these cases, there are two possible ways to work around the issue:

1) Use the devoiced vowels to imitate the aspirated plosives: As the [*_0] phonemes already produce a whisper or breath-like effect, they can be used to imitate the final breath burst of the aspirated plosives. The effect can be further aided by using [h] as a glide or bridge in the syllable.
Example: Pie IPA: /paɪ/ → EN vb: [ph aI] → JP vb: [p a_0][a i] or [p a_0][h a i]
2) Use the affricates as a replacement for the aspirated plosives: In some dialects of English, the affricates [ts] and [dz] may appear as allophones of the aspirated /t/ and /d/. This, along with the fact that the fricative release of an affricate can partially double as an aspirated release, allows them to be used as replacements for the [th] and [dh] phonemes, respectively.[15]
Example: Time IPA: /taɪm/ → EN vb: [th aI m] → JP vb: [ts a i m]

Whichever method is used, it is critical to adjust the note length and the VEL and BRE parameters to achieve a convincing pronunciation.
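
Both workarounds can be expressed as simple rewriting rules. The sketch below uses hypothetical helper names and assumes that the English vowels have already been converted to Japanese symbols by hand (for example [aI] to [a i]). It only produces the phoneme layout; note length, VEL and BRE still have to be set manually.

```python
# Affricate replacements named in the text: [th] -> [ts], [dh] -> [dz].
AFFRICATE_FOR = {"th": "ts", "dh": "dz"}


def devoiced_vowel_variants(aspirated, jp_vowels):
    """Method 1: plain plosive + devoiced vowel, then the voiced vowels.

    aspirated: English aspirated plosive symbol such as "ph", "th", "kh".
    jp_vowels: Japanese vowel symbols for the syllable, already converted.
    Returns two layouts (without and with the [h] bridge).
    """
    if len(aspirated) < 2 or not aspirated.endswith("h"):
        raise ValueError("expected an aspirated plosive symbol such as 'ph'")
    plain = aspirated[:-1]                    # "ph" -> "p"
    burst = [plain, jp_vowels[0] + "_0"]      # e.g. ["p", "a_0"]
    return [
        [list(burst), list(jp_vowels)],
        [list(burst), ["h"] + list(jp_vowels)],
    ]


def affricate_variant(aspirated, rest):
    """Method 2: swap [th]/[dh] for [ts]/[dz]; returns None for other plosives."""
    if aspirated not in AFFRICATE_FOR:
        return None
    return [[AFFRICATE_FOR[aspirated]] + list(rest)]


def format_notes(notes):
    """Render notes in the bracketed style used on this page."""
    return "".join("[" + " ".join(note) + "]" for note in notes)


if __name__ == "__main__":
    for layout in devoiced_vowel_variants("ph", ["a", "i"]):
        print(format_notes(layout))
    print(format_notes(affricate_variant("th", ["a", "i", "m"])))
```

For "pie" this reproduces the two layouts from the first example; the affricate rule applies only to [th] and [dh], so other aspirated plosives fall back to the first method.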

Other useful phonemes

Some other phonemes can also be put to use. As Japanese voicebanks usually struggle with consonant clusters, the phoneme [ts] can be used as a replacement for the /ts/ cluster, which is usually formed by a word ending in /t/ followed by a contracted "is" (as in "it's").

Alternatively, the palatalized consonants can be used to make some diphthongs easier (particularly the ones which start with [j]).

Conversion Chart

English Sample (Received Pronunciation) English Symbol IPA for English Symbol Equivalent / Quasiequivalent Japanese Symbol IPA for Japanese Symbol
aware, synthesis, harmony, the @ ə schwa





strut, unclean, cut V ʌ open-mid back unrounded vowel




them, egg e ɛ open-mid front unrounded vowel [e]

kit, it, synthesis I ɪ near-close near-front unrounded vowel






beef, eat, harmony i: iː  close front unrounded vowel





trap, axe { æ  near-open front unrounded vowel




taught, ought, ball O: ɔː  open-mid back rounded vowel




lot, off Q ɒ  open back rounded vowel




put, look U ʊ  near-close near-back rounded vowel [M] ɯᵝ or u͍
boot, view u: uː  close back rounded vowel [M] ɯᵝ or u͍
urge, bird, marker @r ɚ r-colored schwa




pay, age, date eI

[e i]

[e j]



buy, eye, died aI

[a i] or [a j]

[a e]

äi or äj


boy, oil, choice OI ɔɪ

[o i]

[o j]



oat, soak, show @U əʊ

[o M] or [o w]

[e M] or [e w]

o̞u͍ or o̞w

e̞u͍ or e̞w

loud, out, cow aU

[a M] or [a w]

[a o]

äu͍ or äw


beer, ear I@ ɪə

[i a]

[i e]


bear, air, aware e@ ɛə

[e] [e a]

e̞ e̞ä

poor, surely U@ ʊə

[M a]

[M e]



pour, sort O@ ɔə


[o a]

[o e]



star, are, harmony Q@ ɒə


[a e]




w labio-velar approximant [w] wᵝ or ɰ͡β̞
yellow j palatal approximant [j] j
cab b voiced bilabial plosive [b] b
big bh aspirated voiced bilabial plosive [b] b
bad d d voiced alveolar plosive [d] d
dog dh [d] d
bag g g voiced velar plosive [g] g
god gh [g] g
jeans dZ ʤ voiced postalveolar affricate [dZ] ʥ
vote v v voiced labiodental fricative





their D ð voiced dental fricative





resort z z voiced alveolar fricative





Asia Z ʒ voiced postalveolar fricative







mind m m bilabial nasal [m] m
night n n alveolar nasal [n] n
long N ŋ velar nasal [N] ŋ
red r ɹ alveolar approximant

[4 w]




feel l ɫ Velarized alveolar lateral approximant



[M] or [w]1


ɯᵝ or wᵝ

list l0 l alveolar lateral approximant [4]


dip p p voiceless bilabial plosive [p]


peace ph


[p p\]



sit t t voiceless alveolar plosive [t] t
top th



rock k k voiceless velar plosive [k] k
kiss kh



touch tS ʧ voiceless postalveolar affricate [tS] ʨ
feel f f voiceless labiodental fricative




ɸ ~ f

think T θ voiceless dental fricative




[C dz] or [dz h]




"faux TH"

sea s s voiceless alveolar fricative


[dz] or [z]



share S ʃ voiceless postalveolar fricative [S] ɕ
hat h h voiceless glottal fricative


[C] (front i)

[p\] (front u)





1^ L Vocalization, the /ɫ/ is replaced by a vowel or semivowel.
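
The rows of the chart above that list exactly one Japanese symbol can be collected into a lookup table for a first-pass conversion. This is a rough sketch, not a complete converter: `EN_TO_JP` and `convert` are hypothetical names, and only rows with a single replacement are included. Diphthongs, aspirated voiceless plosives and phonemes with several candidates are reported as unresolved so they can be handled by hand.

```python
# One-to-one rows taken from the English -> Japanese chart above.
EN_TO_JP = {
    "e": "e", "U": "M", "u:": "M",
    "w": "w", "j": "j",
    "b": "b", "bh": "b", "d": "d", "dh": "d", "g": "g", "gh": "g",
    "p": "p", "t": "t", "k": "k",
    "dZ": "dZ", "tS": "tS", "S": "S",
    "m": "m", "n": "n", "N": "N",
    "l0": "4",
}


def convert(english_phonemes):
    """Map English symbols to Japanese ones where the chart gives one answer.

    Returns (converted, unresolved): converted has None in place of any
    phoneme that needs a manual decision, and unresolved lists those phonemes.
    """
    converted, unresolved = [], []
    for phoneme in english_phonemes:
        if phoneme in EN_TO_JP:
            converted.append(EN_TO_JP[phoneme])
        else:
            converted.append(None)
            unresolved.append(phoneme)
    return converted, unresolved


if __name__ == "__main__":
    print(convert(["bh", "e", "d"]))   # "bed"
    print(convert(["th", "U", "k"]))   # "took": [th] needs a manual choice
```

A direct symbol swap like this is only a starting point, since phonemes that share a symbol still sound different in each language, as the IPA columns show.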


English to Japanese

Romaji/English Jidai (Era)
Featuring OLIVER, Sweet ANN (back-up)
Category Cover song

Japanese to English

Castle in a Cloud
Featuring Kaai Yuki
Author(s) Konki-P
Category Cover song
It's a fine day
Featuring Kagamine Rin Append Sweet
Author(s) HorizonsP
Category Cover song


  • The word "Engrish" is commonly used to describe odd Asian-to-English words. The word itself originates from Japanese users' habit of using an "r" instead of an "l" when spelling English words. In the overseas Vocaloid fandom, the word is also often used to describe a Japanese Vocaloid singing in English. This is not meant as an act of disrespect, but rather as a note that Japanese phonetics were used to make "English".
  • Wat commented on how frustrated he felt when developing the Kaito English voicebank, and said that even a native speaker without patience might shoot their computer in response to it.[16] The reason he gave was the huge gap between Japanese and English and how the two operate.[17]

See also

External links


  1. Piapro - Rin/Len Kagamine’s Consonant Sounds
  2. Wikipedia:Shi(kana )
  3. Wikimedia:Japanese vowel chart
  4. Wikipedia:Vowel
  5. Wikipedia:Consonant
  6. Wikipedia:Japanese Pitch Accent
  7. link
  8. Reviewing the Kanji forum - / t͡ʃ / VS / t͡ɕ /
  9. Wikipedia:Japanese phonology
  10. link
  11. Overseas UTAU - How to Pronounce the Japanese "R" (dialect comparisons)
  12. link
  13. link
  14. VocaloidMaster - Ponyo en el acantilado, dúo en japonés
  15. link
  16. link
  17. link
