The Spanish language has only 5 vowel sounds and 18 consonants.[1] It also has 29 possible allophones and 841 theoretically possible combinations, of which only 521 are needed to cover more than 99.99% of the occurrences within the language.[2]
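
As a quick check of the figures above, the arithmetic can be sketched as follows. The source does not say how the 841 combinations were counted; this sketch assumes they are simply all ordered pairs of the 29 allophones.

```python
# Assumption: the 841 combinations are all ordered pairs of the 29 allophones.
ALLOPHONES = 29
REQUIRED = 521  # combinations said to cover more than 99.99% of occurrences

theoretical = ALLOPHONES ** 2          # 29 * 29 ordered pairs
share_needed = REQUIRED / theoretical  # fraction of the pairs actually needed

print(theoretical)                # 841
print(round(share_needed * 100))  # 62 (percent, rounded)
```

Under that assumption, roughly 62% of the theoretical pairs are enough for near-complete coverage.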

Spanish Vocaloids

The following is a list of Vocaloids that use Spanish.

Phonetic System's Characteristics


The system includes the 5 vowels of Spanish. Unlike the systems for other languages such as English or Korean, it doesn't include dedicated diphones for the diphthongs. Instead, it includes the glides or semivowels corresponding to the "weak" vowels ([i] and [u]), which form the diphthongs when combined with the appropriate vowels.


The system includes 4 glides, which make it possible to produce all the diphthongs of Spanish. There are 2 types of glides:

  • The semivowels [I] and [U], used for the falling diphthongs (vowel+glide).
  • The approximants [j] and [w], used for the rising diphthongs (glide+vowel).
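
The two glide types above can be sketched as a small lookup. This is only an illustration: the function name and the rule of treating a leading weak vowel as a rising diphthong are assumptions, and the input is assumed to be an unaccented two-letter sequence already known to be a diphthong (words such as "muy", which is listed with a falling glide, would need to be handled by hand).

```python
from typing import List

VOWELS = set("aeiou")
RISING_GLIDE = {"i": "j", "u": "w"}   # glide + vowel
FALLING_GLIDE = {"i": "I", "u": "U"}  # vowel + glide


def diphthong_to_phonemes(pair: str) -> List[str]:
    """Convert a two-letter diphthong such as 'ai' or 'ue' into system symbols."""
    if len(pair) != 2 or not set(pair) <= VOWELS or pair[0] == pair[1]:
        raise ValueError(f"not a two-vowel sequence: {pair!r}")
    first, second = pair
    if first in RISING_GLIDE:
        # Weak vowel first: rising diphthong, approximant + vowel.
        return [RISING_GLIDE[first], second]
    if second in FALLING_GLIDE:
        # Weak vowel last: falling diphthong, vowel + semivowel.
        return [first, FALLING_GLIDE[second]]
    # Two strong vowels form a hiatus, not a diphthong.
    raise ValueError(f"not a diphthong: {pair!r}")


print(diphthong_to_phonemes("ai"))  # ['a', 'I']
print(diphthong_to_phonemes("ue"))  # ['w', 'e']
```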


Weak Allophones

Lenition, or weakening, is a kind of sound change that alters consonants, making them "softer" in some way. Lenition occurs especially often intervocalically (between vowels). In this position, lenition can be seen as a type of assimilation of the consonant to the surrounding vowels, in which features of the consonant that are not present in the surrounding vowels (e.g. obstruction, voicelessness) are gradually eliminated.

In Spanish, lenition has been an important phenomenon since the language evolved from Latin, and it continues to affect some consonants, particularly the voiced plosives /b/, /d/ and /g/. In intervocalic position these are realized as "softer" voiced fricative or approximant allophones.

Voiced stop | Continuant (fricative) | Approximant (spirant)
[b] voiced bilabial plosive | [β] voiced bilabial fricative | [β̞] bilabial approximant
[d̪] voiced dental plosive | [ð] voiced dental fricative | [ð̞] dental approximant
[g] voiced velar plosive | [ɣ] voiced velar fricative | [ɣ˕] velar approximant

Because of this, the Spanish phonetic system includes separate symbols for the softer allophones. These are distinguished from their standard "stronger" counterparts by an uppercase symbol, which matches the X-SAMPA symbols for the corresponding fricatives.

The "harsher" plosives generally appear at the beginning of a word, after a nasal consonant such as [m] or [n], and after a pause, while their "softer" allophones appear in all other contexts, especially between vowels.

As with the aspirated allophones of English, both versions can be interchanged without altering the meaning of the word; they vary only in the degree of stress and emphasis. Slow speech tends to favor the "harsher" plosives while fast speech tends to favor their "softer" allophones, because the former has more pauses and silences that allow a full articulation of the plosives while the latter does not.[3]
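
The default choice between a plosive and its lenited counterpart can be sketched as a function of the preceding phoneme. The function name and the way a pause is represented (None) are assumptions made for illustration; the extra case for /d/ after /l/ follows the note in the phonetic list. Since both versions are interchangeable, the result is only a starting point that can be overridden for emphasis.

```python
from typing import Optional

NASALS = {"m", "n", "J", "N"}
LENITED = {"b": "B", "d": "D", "g": "G"}  # plosive -> softer allophone


def pick_plosive(prev: Optional[str], plosive: str) -> str:
    """Return the default allophone of a voiced plosive.

    prev is the preceding phoneme, or None at the start of a word
    or after a pause (hypothetical convention for this sketch).
    """
    if plosive not in LENITED:
        return plosive  # not a voiced plosive, leave untouched
    if prev is None or prev in NASALS:
        return plosive  # word-initial, post-pausal or post-nasal: keep it hard
    if plosive == "d" and prev == "l":
        return plosive  # /d/ also stays hard after /l/ (as in "aldaba")
    return LENITED[plosive]


# "dedo": hard [d] at the start, soft [D] between vowels.
print(pick_plosive(None, "d"), pick_plosive("e", "d"))  # d D
```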

Rhotic Consonants

Spanish is one of the few Indo-European languages that clearly distinguishes two rhotic consonants: the alveolar tap /ɾ/ (the "flapped D" of American English, known as "ere" in Spanish) and the alveolar trill /r/ (the "rolled R", known as "erre" in Spanish).

The alveolar trill and the alveolar tap are in phonemic contrast word-internally between vowels but are otherwise in complementary distribution. In Spanish orthography, an intervocalic alveolar trill is written with a double R ('rr'), while a single intervocalic R is always an alveolar tap. The Spanish phonetic system uses this orthographic notation instead of the usual X-SAMPA notation: the alveolar tap is represented as [r] and the alveolar trill as [rr] (not as [4] and [r] respectively, as they would be in X-SAMPA).
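
The spelling-to-symbol convention for the two R sounds can be sketched as below. The function name is hypothetical, and the trigger letters are an assumption based on the contexts given in the phonetic list (word-initial, or after a nasal, /l/, /s/ or /θ/), with the letter "z" standing in for /θ/.

```python
from typing import List

# Letters after which a single written R is treated as a trill (assumption).
TRILL_TRIGGERS = {"n", "l", "s", "z"}


def rhotics(word: str) -> List[str]:
    """Return the system symbol ('r' or 'rr') for each R sound in a word."""
    word = word.lower()
    result = []
    i = 0
    while i < len(word):
        if word[i] == "r":
            if word.startswith("rr", i):
                result.append("rr")  # double R is always the trill
                i += 2
                continue
            if i == 0 or word[i - 1] in TRILL_TRIGGERS:
                result.append("rr")  # word-initial or after n, l, s, z
            else:
                result.append("r")   # every other single R is the tap
        i += 1
    return result


print(rhotics("caro"), rhotics("carro"), rhotics("alrededor"))
# ['r'] ['rr'] ['rr', 'r']
```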


Phoneme Replacement

Spanish consonants contrast clearly at the beginning of a syllable, but at the end of a syllable (coda position) the contrast between some consonants is much weaker, which makes them prone to assimilation or merging. Knowing these processes, it is possible to replace some phonemes with the appropriate allophone, changing the stress and pronunciation without altering the meaning of the word.

Voicing Assimilation

Nasal Assimilation

In syllable-final position, nasal consonants tend to assimilate to the place of articulation of the following consonant, even across a word boundary. Knowing this, it is possible to replace a nasal consonant with another that is more appropriate for its context.

  • The word chancho ('pig') may be input as [tS a J][tS o] instead of [tS a n][tS o] in the VOCALOID Editor, because the /n/ is palatalized in that context under the influence of the following /tʃ/.
  • In the phrase Corazón Confundido ('Confused Heart'), it is possible to replace the [n] at the end of the first word with its velar counterpart [N] if the context allows the assimilation of the nasal consonant.
    [k o][r a][T o n][k o n][f u n][D i][D o] → [k o][r a][T o N][k o n][f u n][D i][D o]
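
The two replacements above can be sketched as a pass over the bracketed syllable notation. The helper names and the table of triggering consonants are assumptions for illustration; only bilabial, palatal and velar contexts are mapped, because the system has no separate symbol for a labiodental or dental nasal.

```python
import re
from typing import List

# Following consonant -> nasal symbol that replaces a coda [n] (assumed table).
NASAL_BEFORE = {
    "p": "m", "b": "m", "B": "m", "m": "m",    # bilabials
    "tS": "J", "L": "J", "j\\": "J",           # palatals
    "k": "N", "g": "N", "G": "N", "x": "N",    # velars
}


def parse(text: str) -> List[List[str]]:
    """Split '[tS a n][tS o]' into [['tS', 'a', 'n'], ['tS', 'o']]."""
    return [s.split() for s in re.findall(r"\[([^\]]+)\]", text)]


def fmt(syllables: List[List[str]]) -> str:
    """Inverse of parse(): join syllables back into bracket notation."""
    return "".join("[" + " ".join(s) + "]" for s in syllables)


def assimilate_nasals(syllables: List[List[str]]) -> List[List[str]]:
    """Replace a syllable-final [n] according to the next syllable's onset."""
    out = [list(s) for s in syllables]
    for cur, nxt in zip(out, out[1:]):
        if cur[-1] == "n" and nxt[0] in NASAL_BEFORE:
            cur[-1] = NASAL_BEFORE[nxt[0]]
    return out


print(fmt(assimilate_nasals(parse("[tS a n][tS o]"))))
# [tS a J][tS o]
```

Whether a replacement actually sounds better still depends on the voicebank and the tempo, so the output is best treated as a suggestion.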

Realization of the R

In coda (syllable-final) position the contrast between the two Spanish R sounds is neutralized, which means that the R can be realized as either a tap or a trill.
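
Because either realization is acceptable in coda position, every alternative input for a word can be listed mechanically. The following is a minimal sketch with a hypothetical function name; it assumes each syllable is a list of phoneme symbols and that an R in last position after at least one other phoneme is a coda.

```python
from itertools import product
from typing import List

Syllable = List[str]


def coda_r_variants(syllables: List[Syllable]) -> List[List[Syllable]]:
    """List every combination of tap [r] and trill [rr] for coda R sounds."""
    options = []
    for syl in syllables:
        if len(syl) > 1 and syl[-1] in ("r", "rr"):
            # Coda R: offer both the tap and the trill.
            options.append([syl[:-1] + ["r"], syl[:-1] + ["rr"]])
        else:
            options.append([list(syl)])
    return [list(combo) for combo in product(*options)]


# "amor": [a][m o r] or [a][m o rr]
for variant in coda_r_variants([["a"], ["m", "o", "r"]]):
    print(variant)
```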

Phonetic List

Symbol | Classification | IPA Symbol / Name | Sample | Notes | Related Phonemes
[a] | vowel | ä open central unrounded vowel | padre | |
[e] | vowel | e̞ mid front unrounded vowel | enero | | [i] (lowered)
[i] | vowel | i close front unrounded vowel | finca, mío | | [j] (glide); [I] (non-syllabic)
[o] | vowel | o̞ mid back rounded vowel | foco, oído | | [u] (lowered)
[u] | vowel | u close back rounded vowel | musa, dúo | | [w] (glide); [U] (non-syllabic)
[j] | semivowel | j palatal approximant | amplio, ciudad | Used in rising diphthongs (glide+vowel). | [i] (syllabic); [I] (non-syllabic); [j\] (fortited)
[w] | semivowel | w voiced labio-velar approximant | huevo, buitre | Used in rising diphthongs (glide+vowel). | [u] (syllabic); [U] (non-syllabic); [G] (unrounded)
[I] | semivowel | | aire, muy | Used in falling diphthongs (vowel+glide). | [i] (syllabic); [j] (glide)
[U] | semivowel | | pausa, neutro | Used in falling diphthongs (vowel+glide). | [u] (syllabic); [w] (glide)
[p] | consonant | p voiceless bilabial plosive | perro, apto | | [b] (voiced)
[t] | consonant | t̪ voiceless dental plosive | tuyo, traba | | [d] (voiced)
[k] | consonant | k voiceless velar plosive | caña, quise, kilo | | [g] (voiced)
[b] | consonant | b voiced bilabial plosive | bestia, embuste, vaca, envidia | At the beginning of a word, after a pause, or after a nasal consonant. | [p] (voiceless); [B] (lenited)
[B] | consonant | β~β̞ bilabial spirant | bebé, obtuso, vivir, curva | Lenited /b/. In the middle of a word, in all cases where [b] isn't used. | [b] (fortited)
[d] | consonant | d̪ voiced dental plosive | dedo, cuando, aldaba | At the beginning of a word, after a pause, or after a nasal consonant or /l/. | [t] (voiceless); [D] (lenited)
[D] | consonant | ð~ð̞ dental spirant | dedo, arder, admirar | Lenited /d/. In the middle of a word, in all cases where [d] isn't used. | [d] (fortited)
[g] | consonant | ɡ voiced velar plosive | gato, lengua, guerra | At the beginning of a word, after a pause, or after a nasal consonant. | [k] (voiceless); [G] (lenited)
[G] | consonant | ɣ~ɣ˕ or ɰ velar spirant | trigo, amargo, sigue | Lenited /g/. In the middle of a word, in all cases where [g] isn't used. | [g] (fortited); [w] (rounded)
[tS] | consonant | ʧ voiceless postalveolar affricate | chancho | | [t] (deaffricated)
[f] | consonant | f voiceless labiodental fricative | fase, café | |
[T] | consonant | θ voiceless dental fricative | cerro, cima, zumo, paz | | [D] (voiced); [s] (seseo or th-alveolarization); [t] (th-stopping); [f] (th-fronting)
[s] | consonant | s voiceless alveolar sibilant | casa, xilófono | | [T] (ceceo; dentalized or lisped)
[x] | consonant | x voiceless velar fricative | jamón, reloj, genero, México | |
[m] | consonant | m bilabial nasal | mamá, campo, invertir | Also an allophone of /n/ before labial consonants. | [n] (delabialized)
[n] | consonant | n alveolar nasal | nido, sin | Has several allophones: /n/ at the beginning of a word or after a pause; /ɲ/ or /nʲ/ before palatals such as /ʎ/, /ʝ/ or /ʧ/; /ŋ/ before velars such as /x/, /k/, /g/ or /ɣ/; /n̪/ before dentals such as /d̪/, /ð/ or /t̪/. | [J] (palatalized); [m] (labialized)
[J] | consonant | ɲ palatal nasal | ñandú, enyesar | Also an allophone of /n/ before palatals such as /ʎ/, /ʝ/ or /ʧ/. | [n] (depalatalized)
[l] | consonant | l alveolar lateral approximant | lana, principal | |
[r] | consonant | ɾ alveolar tap | caro, bravo, Amor eterno | | [rr] (trilled)
[rr] | consonant | r alveolar trill | rumbo, carro, honra, alrededor, disruptivo, Azrael | At the beginning of a word or after a nasal consonant, /l/, /s/ or /θ/. Between vowels only when written with a double R. | [r] (lenited)
[L] | consonant | ʎ palatal lateral approximant | llave, pollo | | [j\] (yeísmo)
[j\] | consonant | ʝ voiced palatal fricative | ayuno | | [L] (lleísmo); [j] (lenited)
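
The Related Phonemes column above names dialectal substitutions such as seseo ([T] replaced by [s]) and yeísmo ([L] replaced by [j\]). A minimal sketch of applying them to a phoneme sequence follows; the function name and the feature keys are hypothetical, and only these two substitutions are covered.

```python
from typing import Dict, Iterable, List

# Substitutions taken from the Related Phonemes column (subset only).
DIALECT_RULES: Dict[str, Dict[str, str]] = {
    "seseo": {"T": "s"},       # /θ/ merged into /s/
    "yeismo": {"L": "j\\"},    # /ʎ/ merged into /ʝ/
}


def apply_dialect(phonemes: Iterable[str], features: Iterable[str]) -> List[str]:
    """Replace phonemes according to the selected dialect features."""
    mapping: Dict[str, str] = {}
    for feature in features:
        mapping.update(DIALECT_RULES[feature])
    return [mapping.get(p, p) for p in phonemes]


# "cerro" with seseo
print(apply_dialect(["T", "e", "rr", "o"], ["seseo"]))  # ['s', 'e', 'rr', 'o']
```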

Additional Phonetics

The following is a list of additional phonemes available for MAIKA. Although this phonetic expansion is intended mainly for Catalan, Voctro Labs suggested that with the added phonemes she would be able to achieve a decent imitation of other languages such as English, Portuguese and Japanese, although they cautioned that she would not sound like a native speaker.

Aside from its potential for imitating other languages, it is important to point out that this phonetic extension can also be used to complement Spanish, as many of the additional sounds are allophones or variants that exist in other dialects or varieties of the language.

Symbol | Classification | IPA Symbol / Name | Sample | Notes | Related Phonemes
[@] | vowel | ə schwa | amb (CAT); the (ENG) | Reduced vowel. | [a] (fronted)
[E] | vowel | ɛ open-mid front unrounded vowel | mel (CAT); egg (ENG) | It may be considered a more open and lax counterpart of /e/. | [e] (tense)
[I0] | vowel | ɪ near-close near-front unrounded vowel | it (ENG) | English KIT vowel. It may be considered a more open and lax counterpart of /i/. | [i] (tense)
[Q] | vowel | ɒ open back rounded vowel | soc (CAT); lot (ENG) | It may be considered a more rounded and back counterpart of /a/. | [a] (open, centralized); [O] (closed)
[O] | vowel | ɔ open-mid back rounded vowel | iode (CAT); taught (ENG) | It may be considered a more open and lax counterpart of /o/. | [o] (tense); [Q] (open)
[r\] | consonant | ɹ alveolar approximant | red (ENG) | English R. | [r] (approximant)
[L0] | consonant | l̠ʲ, ʎ̟ or ȴ alveolo-palatal lateral approximant | ull (CAT) | A more lateralized variant of /ʎ/. | [L]
[N] | consonant | ŋ velar nasal | sang (CAT); king (ENG) | |
[ts] | consonant | ʦ voiceless alveolar affricate | potser (CAT); metsu (JAP) | | [dz] (voiced)
[dz] | consonant | ʣ voiced alveolar affricate | metzines (CAT); tsudzuku (JAP) | | [ts] (voiceless)
[dZ] | consonant | ʤ, ʥ voiced postalveolar affricate | metge (CAT); jeans (ENG); jishin (JAP) | Allophone of /ʝ/ and /ʎ/ in some dialects. | [tS] (voiceless); [j\], [L] (allophone)
[S] | consonant | ʃ, ɕ voiceless postalveolar sibilant | caixa (CAT); share (ENG); shio (JAP) | Deaffricated variant of /tʃ/ in some dialects. Allophone of /ʝ/ and /ʎ/ in Rioplatense dialects. | [tS] (affricated); [Z] (voiced); [j\], [L] (allophone)
[z] | consonant | z voiced alveolar sibilant | onze (CAT); zoo (ENG) | | [s] (voiceless)
[Z] | consonant | ʒ, ʑ voiced postalveolar sibilant | ajut (CAT); vision (ENG); kaji (JAP) | Allophone of /ʝ/ and /ʎ/ in Rioplatense dialects. | [S] (voiceless); [j\], [L] (allophone)
[v] | consonant | v voiced labiodental fricative | viu (CAT); vote (ENG) | | [f] (voiceless); [B] (bilabial)
[h] | consonant | h voiceless glottal fricative | hot (ENG) | Allophone of /s/ or /x/ in some dialects (debuccalization). |



