English Phonetics

English Vocaloids are voicebanks designed to sing in English, so they reproduce the language far more easily than Vocaloids built for other languages. The following is a list of the phonemes needed to make a Vocaloid sing in English.

About

English has one of the widest ranges of dialects in the world. As a result, pronunciation varies much more among English Vocaloids than among Vocaloids that sing in, for example, Japanese.

The English language has about 20 vowel sounds and 24 consonant sounds. English also lacks a systematic orthography, so there is no one-to-one (or near one-to-one) match between letters and sounds, as there is in Japanese. For example, "W" can sound as /w/ in "what" and /u:/ in "few", and "Y" can sound as /j/ in "yes" and /i/ in "play".[1] Spelling also differs between varieties, as in the British and American "colour/color".

Vocaloid and Vocaloid 2 use American spelling for lyrics. Vocaloid 3 is confirmed to support localisation, but it is unknown whether this will allow a choice between American and British spelling.

The phonetic notation, however, does not follow the American convention. It uses Received Pronunciation written in X-SAMPA, with minor modifications where required, as in the case of the allophones.

English Vocaloids

Despite the general belief that singers lose their accents when they sing, this is not the case, and an accent can still be heard in sung vocals. Many are led to believe otherwise because there are several methods of training singers to disguise or hide their natural accents. English is not the only language affected by accents, but English Vocaloids have proven especially prone to accent issues. Even the first two English Vocaloids, Leon and Lola, were noted for their distinctly "British" accent. An accent can make synthesizing software either easier or harder to use, and Vocaloid is no exception. English Vocaloids have ended up with the most variation in how they sound of all the languages offered for the Vocaloid software so far.

The dialect or accent of an English Vocaloid can cause noticeable variation in certain sounds, most of all in the diphthongs and rhotic vowels. Users who are unaware of this may overlook odd pronunciations that need adjusting for better results. This applies even more to voicebanks with non-native accents, because the voice provider may have pronunciation issues with a non-native language.

In some instances, producers have adjusted VSQ and VSQX files so heavily for one particular English Vocaloid that the files become "Vocaloid specific" and do not work well on other English Vocaloids without further adjustment. Such cases are rarer in languages such as Japanese, though not unknown, and many VSQ and VSQX files will work without much adjusting.

British-English Accented

British-English accented Vocaloids are Vocaloids whose provider is known to be of British nationality. As Great Britain is where English originated, British-English Vocaloids sing in a native English accent. This was originally the standard accent type used to develop the English engine. British-accented Vocaloids mostly came from Zero-G, which worked solely with British artists to collect vocal samples.

Note: The term "British" applies to anyone from England, Scotland, Wales and Northern Ireland, so the accent can vary greatly. The British Isles have the greatest variation of English accents in the world per square mile of land. (For more information see Wikipedia.)

American-English Accented

American-accented Vocaloids have providers from the United States of America, who are therefore native speakers of English. Because there is only one American-English accented Vocaloid, there is in practice no other voicebank to compare it against. The most noticeable difference from the British-accented voicebanks is in the rhotic vowels.[2] This is because British dialects are usually non-rhotic, while in North America rhotic dialects of English predominate.[3] (For more information see Wikipedia.)

Australian Accented

Australian accents are the normal English accents of individuals from Australia. The accent is usually very distinct from all other English accents, with features unique among English dialects. (For more information see Wikipedia.)

  • Sweet Ann - her provider "Jody" supposedly came from Australia.

South-African Accented

South African accents are the accents of individuals from South Africa. English is not native to Africa; it was introduced during the colonisation of African countries by English colonists, and it became widely used in South Africa as the general lingua franca between regions. The varying influence of native languages on English produces a wide range in the strength and tone of the accent, though in general most South African accents closely resemble accents from the south of England. (For more information see Wikipedia.)

Japanese-English Accented

Japanese-English accented Vocaloids are English Vocaloids whose providers come from Japan. The providers are native speakers of Japanese but were used to produce English voicebanks. The Japanese-English accent is therefore a non-native English accent, with significant and noticeable differences from the native English accents. As studios have released more such voicebanks, common traits have become easy to pick out among these vocals.

The major issue with Japanese accents is that they often struggle to distinguish some sounds. This usually happens because the providers are not familiar with these foreign sounds. The most common issues are:

  • Lack of distinction in vowel sounds. The vowels are usually either too tense or too lax, as the speaker tends to approximate each vowel to the Japanese 5-vowel system.
  • Lack of distinction between the liquid consonants. The most famous case is Luka's English pronunciation of the words "Road Roller", which risks coming out as "roe rorora".
  • Distortion of some sounds toward similar Japanese sounds. For example, the [f] phoneme is usually realized as a bilabial fricative instead of the labiodental fricative it should be.

The difficulty a Vocaloid has in making pronunciations distinct depends on the provider's proficiency in English. For instance, Miku's provider Saki Fujita had trouble pronouncing English triphones.[citation needed] Despite this, Japanese-English Vocaloids can mimic English more closely than a purely Japanese voicebank.

Korean-English Accented

Korean-English accented Vocaloids are Vocaloids whose providers come from Korea. As the only voicebank with this accent is unreleased, no details are available.

SeeU's Korean voicebank was given English phonemes to mimic English. However, the results are not of sufficient quality to comment on.

  • SeeU - An English voicebank is in production (not yet released, as it is currently delayed; Kim Dahee)

Misc.

  • Prima - Accent unconfirmed
  • Sonika - Accent unconfirmed
  • Tonio - Accent unconfirmed

Phonetic System's Characteristics

The English Vocaloid library is made up of 52 phonetic symbols. These inputs draw on an estimated 2,500 diphone samples needed to recreate English (Vocaloid uses approximately 8,500 samples altogether for English). According to English Gumi's development notes, that vocal alone had over 4,000 phonetic connections[4]; a similar number is therefore likely for all English Vocaloids.

Vowels

The English phonetic system includes 3 types of vowels: monophthongs, diphthongs and R-colored vowels. As the nucleus of the syllable, a vowel can be encoded alone.

The system includes 10 of the 11 monophthongs (pure vowels) of the English language; the missing phoneme is /ɑː/, the open back unrounded vowel.

The pronunciation of some vowels may vary slightly depending on the dialect or on how the Vocaloid was recorded.

  • Example: Oliver's [{] phoneme has been reported to sound more like an /a/ than an /æ/.

The system also includes 5 diphthongs (gliding vowels): 3 y-colored diphthongs and 2 w-colored diphthongs. A diphthong behaves as a single vowel, despite the glide at its end.

Diphthongs used to cause problems when they had to be extended across 2 or more notes.[5] To work around this, Vocaloid2 users placed the symbols [-] and [/] in the lyrics, but the results were usually not smooth. This is no longer an issue in Vocaloid3, where the phoneme [-] allows a diphthong to be extended across several notes.

It is also important to remember that diphthongs, like the monophthongs and rhotic vowels, can vary in pronunciation depending on the dialect, the recording and the stress of the word.

  • Example: The diphthong [eI] can be pronounced with different degrees of stress and be realized as [eː], [eɪ], [ei] or [ej]. Big Al is known to vary the pronunciation of this phoneme noticeably according to context.[6]

The system also includes 6 r-colored (rhotacized) vowels, used mainly for vowel + R combinations. The vowel is modified by the R that follows it, and the two merge into a single unit, as with the diphthongs.

Like the diphthongs, these vowels tend to vary in pronunciation, especially depending on whether the voice provider has a rhotic accent.

  • Example: The phoneme [e@] may be realized as /ɛː/, /ɛə/, /ɛɚ/ or /ɛɹ/.

Consonants

The phonetic system also includes 31 consonant phonemes, counting the allophones of several consonants. It covers the English plosives along with their aspirated allophones, and the two English variants of L, the Clear L and the Dark L. Some voicebanks include the Rolling R as an additional phoneme.

Aspirated Plosives

English distinguishes between plain and aspirated consonants. Aspiration is the strong burst of air that accompanies the release of some obstruents. In the International Phonetic Alphabet, aspirated phonemes are marked with a small superscript ‹h›, as in [kʰ] for an aspirated [k].

In English, the consonants [b], [d], [g], [p], [t] and [k] become aspirated at the beginning of words. In the Vocaloid notation, the aspirated phonemes are distinguished from their plain versions by an added h, which stands for the IPA's small superscript ‹ʰ›.
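
The rule above can be sketched in a few lines of Python. This is only an illustration: the function name `aspirate_onset` and the idea of passing a syllable as a list of symbols are assumptions made here, not part of the Vocaloid editor.

```python
# Plain plosives and their aspirated Vocaloid symbols, as listed in this article.
ASPIRATED = {"b": "bh", "d": "dh", "g": "gh", "p": "ph", "t": "th", "k": "kh"}


def aspirate_onset(syllable):
    """Return a copy of the syllable with an initial plosive aspirated.

    `syllable` is a list of Vocaloid phoneme symbols, e.g. ["t", "Q", "p"].
    Only the first symbol is considered; the rest is left untouched.
    """
    if not syllable:
        return []
    first = ASPIRATED.get(syllable[0], syllable[0])
    return [first] + list(syllable[1:])


print(aspirate_onset(["t", "Q", "p"]))  # initial [t] becomes [th]
print(aspirate_onset(["s", "I", "t"]))  # no initial plosive, so unchanged
```

The final [p] or [t] is deliberately left plain, since the text only describes aspiration at the beginning.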


The English phonetic system includes 3 to 4 liquid consonants, among them both English allophones of L. The English R is usually used at the beginning of syllables, since an R after a vowel is covered by the R-colored vowels. A voicebank may additionally include a non-native phoneme, the Rolling R, which is mainly used for loan words, for singing in other languages, or for particular genres such as opera.

Dark L and Clear L

The system includes both English allophones of L: [l0], the alveolar lateral approximant or Clear L (used at the beginning of syllables), and [l], the velarized alveolar lateral approximant or Dark L (used at the end of syllables).

These phonemes are not designed to be encoded alone. However, [l0] seems to cope better than [l] with being reproduced without a vowel: the former results in an audio loop, while the latter generates electronic buzzing or produces no sound at all.

The only exception is Luka, whose [l] phoneme can be used alone and extended without distortion.[7]
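
As a rough illustration of the positional rule (Clear L before the vowel, Dark L after it), the following Python sketch normalizes the L symbols in a single syllable. The helper name `place_l` and the list-of-symbols input are assumptions for the example; the vowel set is taken from the phonetic list in this article.

```python
# Vowel symbols of the English phonetic system, as given in the phonetic list.
VOWELS = {
    "@", "V", "e", "I", "i:", "{", "O:", "Q", "U", "u:",
    "@r", "eI", "aI", "OI", "@U", "aU", "I@", "e@", "U@", "O@", "Q@",
}
L_SYMBOLS = {"l", "l0"}


def place_l(syllable):
    """Use [l0] before the syllable's vowel and [l] after it."""
    # Index of the first vowel (the nucleus), or None if there is no vowel.
    nucleus = next((i for i, p in enumerate(syllable) if p in VOWELS), None)
    if nucleus is None:
        return list(syllable)  # nothing to decide without a vowel
    result = []
    for i, phoneme in enumerate(syllable):
        if phoneme in L_SYMBOLS:
            result.append("l0" if i < nucleus else "l")
        else:
            result.append(phoneme)
    return result


print(place_l(["l", "I", "s", "t"]))  # "list": onset L becomes [l0]
print(place_l(["f", "i:", "l0"]))     # "feel": coda L becomes [l]
```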

Rolling R

Although it is not a phoneme of the English language, the alveolar trill or rolling R was added to the English phonetic system to improve Prima's opera singing capabilities. It then became a common phoneme in the Vocaloid2 English voicebanks released after Prima (with the exception of Luka).[8] It seems to be less common in Vocaloid3 English voicebanks.

The performance of this phoneme varies between English Vocaloids. For example, Big Al is known to handle it only at the end of words, and needs special techniques and further editing to use it at the beginning or middle of a word.

In the English phonetic system it is represented by the symbol [R].

Phonetic List

Special note: This list is based on Big Al's help file, complemented with the chart from Vocaloid-User.Net[9] and expanded to include the IPA symbols and names. However, the released list contained some incorrect entries. Entering some of the example words given here will not produce the phonemes shown in the list. In addition, the list did not indicate which letters each phoneme applied to; this section has underlined the relevant letters for the benefit of readers. Of the Japanese Vocaloids, only Luka is able to use this system properly.

Symbol | Classification | IPA symbol / name | Sample | Notes | Related phonemes
[@] | vowel | ə schwa | aware, synthesis, harmony, the | In the Vocaloid program, it is not actually used by itself but rather with other phonetics. However, Luka can use this phoneme to make the "a" sound in aline | [V] (stressed), [@r] (r-colored), [Q@]
[V] | vowel | ʌ open-mid back unrounded vowel | strut, unclean, cut, duck | Actually an /ɐ/ in most dialects. The notation /ʌ/ is still used by tradition and because some dialects retain the older pronunciation. | [@] (unstressed), [{] (fronted), [Q@] (r-colored)
[e] | vowel | ɛ open-mid front unrounded vowel | them, egg | Usually transcribed as /e/ by the AHD | [e@] (r-colored), [eI] (diphthongized)
[I] | vowel | ɪ near-close near-front unrounded vowel | kit, it, synthesis | | [i:] (tense), [I@] (r-colored)
[i:] | vowel | iː close front unrounded vowel | beef, eat, harmony | | [I] (lax), [I@] (r-colored)
[{] | vowel | æ near-open front unrounded vowel | trap, axe | | [aU] (diphthongized)
[O:] | vowel | ɔː open-mid back rounded vowel | taught, ought, ball | This vowel varies a lot by dialect. In US dialects it ranges from /ɑ/ for speakers with the cot–caught merger to /ɒ~ɔ/ for the rest. | [O@], [U@], [Q]
[Q] | vowel | ɒ open back rounded vowel | lot, off | | [O:], [OI]
[U] | vowel | ʊ near-close near-back rounded vowel | put, look | | [u:] (tense), [U@] (r-colored)
[u:] | vowel | uː close back rounded vowel | boot, view | | [w] (semivowel), [U] (lax), [U@] (r-colored)
[@r] | rhotic vowel | əɹ or ɚ | urge, bird, marker | r-colored schwa | [@] (non-rhotic), [V]
[eI] | diphthong | eɪ̯ | pay, age, date | j-colored /e/ | [e] (monophthong)
[aI] | diphthong | aɪ̯ | buy, eye, died | j-colored /a/ | [@], [V], [{]
[OI] | diphthong | ɔɪ̯ | boy, oil, choice | j-colored /ɔ/ | [Q], [O:], [O@]
[@U] | diphthong | oʊ̯ (UK), oʊ̯~o (US) | oat, soak, show | w-colored /o/. Usually transcribed as /əʊ̯/ or /oː/ | [@]
[aU] | diphthong | aʊ̯ | loud, out, cow | w-colored /a/ | [{]
[I@] | rhotic vowel | ɪə (UK), i(ə)ɹ (US) | beer, ear | r-colored /ɪ/ | [I] (uppercase i), [i:]
[e@] | rhotic vowel | ɛə (UK), ɛɹ (US) | bear, air, aware | r-colored /ɛ/ | [e] (non-rhotic)
[U@] | rhotic vowel | ʊə (UK), ʊɹ (US) | poor, surely | r-colored /ʊ/ | [U], [u:], [O:], [O@]
[O@] | rhotic vowel | ɔː(ɹ) (UK), ɔɹ~oɹ (US) | pour, sort | r-colored /ɔ/ | [O:], [Q]
[Q@] | rhotic vowel | ɑː(ɹ) (UK), ɑɹ (US) | star, are, harmony | r-colored /ɑ/ | [@], [V]
[w] | consonant | w labio-velar approximant | way | | [u:] (syllabic), [U]
[j] | consonant | j palatal approximant | yellow | | [i:] (syllabic), [I] (uppercase i)
[b] | consonant | b voiced bilabial plosive | cab | | [p] (voiceless), [bh] (aspirated)
[bh] | consonant | bʰ aspirated voiced bilabial plosive | big | At the beginning of a syllable, /b/ with aspiration | [ph] (voiceless), [b] (deaspirated)
[d] | consonant | d voiced alveolar plosive | bad | | [t] (voiceless), [dh] (aspirated), [D] (lenited, lowered)
[dh] | consonant | dʰ aspirated voiced alveolar plosive | dog | At the beginning of a syllable, /d/ with aspiration | [th] (voiceless), [d] (deaspirated), [D] (lenited, lowered)
[g] | consonant | g voiced velar plosive | bag | | [k] (voiceless), [gh] (aspirated), [N] (nasalized)
[gh] | consonant | gʰ aspirated voiced velar plosive | god | At the beginning of a syllable, /g/ with aspiration | [kh] (voiceless), [g] (deaspirated)
[dZ] | consonant | ʤ voiced postalveolar affricate | jeans | | [tS] (voiceless), [Z] (spirantized), [d] (deaffricated)
[v] | consonant | v voiced labiodental fricative | vote | | [f] (voiceless)
[D] | consonant | ð voiced dental fricative | their | | [T] (voiceless), [d] (fortited), [dh] (aspirated), [v] (Th-fronting)
[z] | consonant | z voiced alveolar fricative | resort | | [s] (voiceless), [Z] (palatalized)
[Z] | consonant | ʒ voiced postalveolar fricative | Asia | | [S] (voiceless), [z] (depalatalized), [dZ] (affricated)
[m] | consonant | m bilabial nasal | mind | | [n] (alveolarized)
[n] | consonant | n alveolar nasal | night | | [N] (velarized), [m] (labialized)
[N] | consonant | ŋ velar nasal | long | | [n] (develarized)
[r] | consonant | ɹ alveolar approximant | red | In IPA and X-SAMPA, /r/ is the symbol for the alveolar trill (rolling R); the symbol used here seems to be based on the AHD | [R] (rolled), [w] (gliding)
[l] | consonant | ɫ velarized alveolar lateral approximant | feel | Dark L, in syllable coda position | [l0] (develarized), [w] (L-vocalized), [u:] (L-vocalized), [U] (L-vocalized)
[l0] | consonant | l alveolar lateral approximant | list | Clear L, at the beginning of a syllable | [l] (velarized)
[p] | consonant | p voiceless bilabial plosive | dip | | [b] (voiced), [ph] (aspirated)
[ph] | consonant | pʰ aspirated voiceless bilabial plosive | peace | At the beginning of a syllable, /p/ with aspiration | [bh] (voiced), [p] (deaspirated)
[t] | consonant | t voiceless alveolar plosive | sit | | [d] (voiced), [th] (aspirated)
[th] | consonant | tʰ aspirated voiceless alveolar plosive | top | At the beginning of a syllable, /t/ with aspiration | [dh] (voiced), [t] (deaspirated)
[k] | consonant | k voiceless velar plosive | rock | | [g] (voiced), [kh] (aspirated)
[kh] | consonant | kʰ aspirated voiceless velar plosive | kiss | At the beginning of a syllable, /k/ with aspiration | [gh] (voiced), [k] (deaspirated)
[tS] | consonant | ʧ voiceless postalveolar affricate | touch | | [dZ] (voiced), [S] (spirantized), [t] (deaffricated)
[f] | consonant | f voiceless labiodental fricative | feel | | [v] (voiced)
[T] | consonant | θ voiceless dental fricative | think | | [D] (voiced), [s] (Th-alveolarization), [f] (Th-fronting)
[s] | consonant | s voiceless alveolar fricative | sea | | [z] (voiced), [S] (palatalized)
[S] | consonant | ʃ voiceless postalveolar fricative | share | | [Z] (voiced), [tS] (affricated), [s] (depalatalized)
[h] | consonant | h voiceless glottal fricative | hat | |
[R] | consonant | r alveolar trill | tierra (earth) | Rolling R. Generally used in non-English words | [r] (approximant)
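
To show how the list can be used as a lookup, here is a minimal Python sketch that converts a string of Vocaloid symbols into IPA. It covers only a subset of the list, and it assumes the symbols are separated by spaces; the names `VOCALOID_TO_IPA` and `to_ipa` are made up for this example.

```python
# Subset of the phonetic list: Vocaloid symbol -> IPA symbol.
# Diphthongs and rhotic vowels are left out because their IPA form
# depends on the dialect.
VOCALOID_TO_IPA = {
    "@": "ə", "V": "ʌ", "e": "ɛ", "I": "ɪ", "i:": "iː", "{": "æ",
    "O:": "ɔː", "Q": "ɒ", "U": "ʊ", "u:": "uː",
    "p": "p", "ph": "pʰ", "t": "t", "th": "tʰ", "k": "k", "kh": "kʰ",
    "s": "s", "S": "ʃ", "T": "θ", "D": "ð", "N": "ŋ",
    "r": "ɹ", "R": "r", "l": "ɫ", "l0": "l",
}


def to_ipa(phonemes):
    """Convert a space-separated string of Vocaloid symbols to IPA."""
    symbols = phonemes.split()
    unknown = [s for s in symbols if s not in VOCALOID_TO_IPA]
    if unknown:
        raise KeyError(f"symbols not in the subset table: {unknown}")
    return "".join(VOCALOID_TO_IPA[s] for s in symbols)


print(to_ipa("kh I s"))     # "kiss"
print(to_ipa("l0 I s th"))  # "list" with an aspirated final T
```

Note that [r] maps to ɹ and [R] maps to r, which is the reverse of what the letters suggest in IPA.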

Techniques

Phoneme Replacement

The large array of allophones and similar-sounding phonemes in English allows great flexibility in replacing phonemes. This has many applications, such as altering the emphasis or stress of a word, correcting a strange pronunciation found in a voicebank,[10] or altering the accent or general pronunciation of a particular Vocaloid.[11]

Together with some auxiliary phonemes, this allows a wide range of combinations to experiment with. However, results may vary between voicebanks because of individual differences such as accent, pronunciation and sample quality. The best approach is to treat these tips as a guide and experiment for yourself.

As said before, the user can replace plosives with their aspirated allophones without major issues, as they sound practically identical and differ only in stress and air release. If a consonant sounds too strident or too weak, it can be replaced with the corresponding allophone.

  • Example: Various Vocaloids have been reported to have overly strident T sounds at the end of a syllable. In these cases it is possible to replace the phoneme [t] with its aspirated counterpart [th].
It is also possible to swap a phoneme for its voiced or voiceless counterpart. This usually works at the end of a syllable, where the voicing contrast is minor and consonants are prone to voicing assimilation.
  • Examples:
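
The voiced/voiceless swap at the end of a syllable can be sketched as follows. The pairs come from the "Related Phonemes" column of the phonetic list; the function `swap_final_voicing` is a hypothetical helper, and whether a given swap sounds better depends on the voicebank.

```python
# Voiceless -> voiced pairs, taken from the phonetic list.
VOICED_OF = {
    "p": "b", "t": "d", "k": "g", "tS": "dZ",
    "f": "v", "T": "D", "s": "z", "S": "Z",
}
# Make the lookup work in both directions.
SWAP = {**VOICED_OF, **{voiced: plain for plain, voiced in VOICED_OF.items()}}


def swap_final_voicing(syllable):
    """Swap the last phoneme of a syllable for its (un)voiced counterpart."""
    if not syllable:
        return []
    result = list(syllable)
    result[-1] = SWAP.get(result[-1], result[-1])  # vowels etc. stay as they are
    return result


print(swap_final_voicing(["b", "{", "g"]))  # final [g] becomes [k]
print(swap_final_voicing(["s", "i:"]))      # ends in a vowel, so unchanged
```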

Liquid and L-Vocalization

The L has two allophones: the Clear L, used at the beginning of syllables, and the Dark L, used at the end of syllables.

The Dark L is prone to being replaced by a vowel in a process called L-vocalization. Because of its (labio)velar quality, it is usually replaced by a close back vowel such as /u/ or /ʊ/. A user who wants to imitate this process can therefore replace the phoneme [l] with a close back vowel such as [u:] or [U].

Alternatively, the user can add a short close back vowel to improve the sound of the consonant and even stretch it, as the consonant generally holds up better between two vowels (remember that the Dark L usually cannot stand alone on a note, with the exception of Luka's). This last tip works even better with Vocaloid3's devoiced vowels.
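
Both tips can be written as small list transformations. The sketch below is an assumption-laden illustration (the function names are invented here, and note lengths are not modelled); it only shows which symbols change.

```python
CLOSE_BACK_VOWELS = ("U", "u:")


def vocalize_dark_l(phonemes, vowel="U"):
    """Imitate L-vocalization: replace every Dark L [l] with a close back vowel."""
    if vowel not in CLOSE_BACK_VOWELS:
        raise ValueError("vowel must be [U] or [u:]")
    return [vowel if p == "l" else p for p in phonemes]


def support_dark_l(phonemes, vowel="U"):
    """Insert a close back vowel before each Dark L instead of replacing it."""
    if vowel not in CLOSE_BACK_VOWELS:
        raise ValueError("vowel must be [U] or [u:]")
    result = []
    for p in phonemes:
        if p == "l":
            result.append(vowel)  # the helper vowel would be kept short in the editor
        result.append(p)
    return result


print(vocalize_dark_l(["f", "i:", "l"]))  # Dark L replaced by [U]
print(support_dark_l(["f", "i:", "l"]))   # [U] inserted before the Dark L
```

The Clear L [l0] is left alone in both functions, since the text describes the process only for the Dark L.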

Vowel Replacement

The English phonetic system has one of the largest vowel inventories among the 5 languages currently available for Vocaloid (counting monophthongs, diphthongs and rhotic vowels).

The key is to judge how similar the vowel sounds are. For this reason, it is a good idea to review the IPA vowel chart and check how close the vowels are to each other.

  • Example: The vowels [V], [Q] and [{] are all fairly open vowels, so they sound similar and can replace one another if required.

[Figure: IPA vowel trapezoid showing the English monophthongs iː, uː, ɪ, ʊ, ɛ, ə, ɔː, æ, ʌ, ɒ and ɑ]

[Figure: the same vowel trapezoid labelled with the Vocaloid symbols [i:], [u:], [I], [U], [e], [@], [O:], [{], [V] and [Q]]


General Tips
  • The phonemes [V], [{] and [Q] are all fairly open vowels.
  • The left side of the IPA vowel chart holds the front vowels. In English all of them are unrounded, and they become more tense as they become more closed.
← more open (lax) more closed (tense) →
[{]     [e]  [I]  [i:] 
  • The right side of the IPA vowel chart holds the back vowels. Most of them are rounded.
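
One way to make "vowel proximity" concrete is to place each monophthong on a grid and sort the others by distance. The coordinates below are an assumption derived from the IPA names in the phonetic list (height from close to open, backness from front to back, plus a rounding flag); they are a rough model, not measured data.

```python
import math

# (height, backness, rounded)
# height: 0 = close, 1 = near-close, 3 = mid, 4 = open-mid, 5 = near-open, 6 = open
# backness: 0 = front, 1 = near-front, 2 = central, 3 = near-back, 4 = back
FEATURES = {
    "i:": (0, 0, 0),
    "I": (1, 1, 0),
    "e": (4, 0, 0),
    "{": (5, 0, 0),
    "@": (3, 2, 0),
    "V": (4, 4, 0),
    "O:": (4, 4, 1),
    "Q": (6, 4, 1),
    "U": (1, 3, 1),
    "u:": (0, 4, 1),
}


def closest_vowels(symbol):
    """Return the other monophthongs, nearest first, by chart distance."""
    origin = FEATURES[symbol]
    others = [s for s in FEATURES if s != symbol]
    return sorted(others, key=lambda s: math.dist(origin, FEATURES[s]))


print(closest_vowels("I")[:3])
print(closest_vowels("{")[:3])
```

The nearest candidates are only a starting point for experiments; as noted above, the actual result depends on the voicebank.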

The case of the diphthongs and

Diphone Replacement/Splitting

[12] [13]

Original diphone | Type | IPA notation | Replacement for first phoneme | Replacement for second phoneme
[aI] | Diphthong | aɪ̯ | [V], [{] or [Q] | [e], [I], [i:] or [j]
[eI] | Diphthong | eɪ̯ | [e] | [I], [i:] or [j]
[OI] | Diphthong | ɔɪ̯ | [Q] or [O:] | [I], [i:] or [j]
[aU] | Diphthong | aʊ̯ | [V], [{] or [Q] | [O:], [U], [u:] or [w]
[@U] | Diphthong | oʊ̯ | [Q] or [O:] | [O:], [U], [u:] or [w]
[@r] | Rhotic vowel | əɹ or ɚ | [@] | [r]
[Q@] | Rhotic vowel | ɑː(ɹ) (UK), ɑɹ (US) | [V], [{] or [Q] | [@], [@r] or [r]
[e@] | Rhotic vowel | ɛə (UK), ɛɹ (US) | [e] | [@], [@r] or [r]
[I@] | Rhotic vowel | ɪə (UK), i(ə)ɹ (US) | [I] or [i:] | [@], [@r] or [r]
[O@] | Rhotic vowel | ɔː(ɹ) (UK), ɔɹ~oɹ (US) | [Q] or [O:] | [@], [@r] or [r]
[U@] | Rhotic vowel | ʊə (UK), ʊɹ (US) | [U] or [u:] | [@], [@r] or [r]
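
The diphthong rows of the table can be turned into a small helper that spreads a split diphthong over several notes. This is a sketch under stated assumptions: the function `split_diphone` is invented here, and the choice to hold the first replacement on every note except the last is one possible reading of the technique, not something the table specifies.

```python
# Diphthong rows of the table: (options for first phoneme, options for second).
SPLITS = {
    "aI": (["V", "{", "Q"], ["e", "I", "i:", "j"]),
    "eI": (["e"], ["I", "i:", "j"]),
    "OI": (["Q", "O:"], ["I", "i:", "j"]),
    "aU": (["V", "{", "Q"], ["O:", "U", "u:", "w"]),
    "@U": (["Q", "O:"], ["O:", "U", "u:", "w"]),
}


def split_diphone(symbol, notes, first=None, second=None):
    """Return one phoneme per note for a diphthong stretched over `notes` notes."""
    if notes < 2:
        return [symbol]  # a single note can keep the original diphthong
    firsts, seconds = SPLITS[symbol]
    first = firsts[0] if first is None else first
    second = seconds[0] if second is None else second
    if first not in firsts or second not in seconds:
        raise ValueError("replacement is not listed in the table")
    # Assumption: hold the first vowel, then glide on the last note.
    return [first] * (notes - 1) + [second]


print(split_diphone("eI", 3))
print(split_diphone("aI", 2, first="{", second="I"))
```

By default the first option listed in each column is used; passing `first` and `second` selects another listed replacement when the default does not suit the voicebank.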

See also

Conversion Lists
Interwiki articles

References

  1. Wikipedia:English phonology
  2. Engloids Blog - Big Al’s Big Article
  3. Wikipedia:Rhotic and non-rhotic accents
  4. link
  5. VocaloidOtaku - Vocaloid Newbie requiring some assistance
  6. link
  7. link
  8. VocaloidOtaku - Rolling Tongue
  9. Vocaloid-User - English Phoneme Chart – Vocaloid1
  10. VocaloidOtaku - Sonika Tutorial
  11. Engloids Blog - Tips on Americanizing Sweet Ann
  12. http://www.vocaloidotaku.net/index.php?/topic/6096-how-can-i-make-vocaloid-2-sing-an-one-syllable-word-on-several-notes
  13. http://blogs.itmedia.co.jp/closebox/2010/01/big-al-d71d.html
