Question 1

What are diacritics?

Accepted Answer

Diacritics are marks added to letters, usually to indicate a different pronunciation. Examples include accents (é, à), umlauts (ü, ö), cedillas (ç), and tildes (ñ), which are common in French, German, Spanish, and Turkish.

Question 2

How do I remove diacritics from text?

Accepted Answer

Paste or type your text into the input panel. Diacritics are stripped instantly using Unicode NFD decomposition, which reduces each accented letter to its base letter (for example, é becomes e).

Question 3

How does Unicode NFD decomposition work to remove diacritics?

Accepted Answer

NFD (Canonical Decomposition) splits each precomposed character into its base letter followed by separate combining mark code points. For example, 'é' (U+00E9, a single code point) decomposes into 'e' (U+0065) followed by a combining acute accent (U+0301). After decomposition, this tool strips all combining characters in Unicode category Mn (Mark, Nonspacing), leaving only the base letters. Finally, the string is re-normalized to NFC to produce clean output. This technique strips accents from every Latin letter that has a canonical decomposition, which covers the accented letters of most Latin-script languages. Letters whose mark is built into the base character, such as 'ø' and 'ł', have no canonical decomposition and pass through unchanged.
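The three steps described above (decompose, drop category Mn, recompose) can be sketched in Python with the standard unicodedata module. This is a minimal illustration under the assumption that the tool behaves as the answer describes; the function name strip_diacritics is hypothetical and is not taken from the tool's own code.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove nonspacing combining marks (category Mn) from text."""
    # Step 1: canonical decomposition, e.g. 'é' -> 'e' + U+0301.
    decomposed = unicodedata.normalize("NFD", text)
    # Step 2: drop every code point whose general category is Mn.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Step 3: recompose whatever remains into NFC.
    return unicodedata.normalize("NFC", stripped)


print(strip_diacritics("Crème brûlée, señor, Zürich"))
# prints: Creme brulee, senor, Zurich

# 'ø' has no canonical decomposition, so it is left as-is.
print(strip_diacritics("smørrebrød"))
# prints: smørrebrød
```

The final NFC pass matters only when something other than Mn marks was decomposed in step 1; for plain accented Latin text the output is already in NFC.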

Question 4

Does this tool remove all diacritics, including Arabic vowel marks and Hebrew niqqud?

Accepted Answer

Yes, for every mark in Unicode category Mn (Mark, Nonspacing). That includes Latin accents and umlauts (é → e, ü → u, ñ → n), Arabic harakat (short vowel marks above or below letters), Hebrew niqqud (vowel pointing), Thai tone marks, the nonspacing Devanagari matras, and other combining diacritical marks across scripts. Spacing combining marks (category Mc), which include several Devanagari matras such as the vowel sign aa (U+093E), are not in Mn and are left in place. The base consonants and letters are preserved. This broad stripping is useful for normalization but should be used thoughtfully with non-Latin scripts, where diacritics may be structurally significant.
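To see how a category Mn filter behaves on non-Latin text, here is a small Python sketch. It assumes the tool filters strictly on category Mn as stated above; the function name strip_nonspacing_marks is hypothetical. Strings are written as escapes so the code points are unambiguous.

```python
import unicodedata


def strip_nonspacing_marks(text: str) -> str:
    """Drop code points in category Mn after NFD, then recompose to NFC."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(c for c in decomposed if unicodedata.category(c) != "Mn")
    return unicodedata.normalize("NFC", kept)


# Arabic: kaf, ta, ba, each followed by fatha (U+064E, category Mn).
arabic = "\u0643\u064E\u062A\u064E\u0628\u064E"
assert strip_nonspacing_marks(arabic) == "\u0643\u062A\u0628"

# Hebrew: shin + qamats + shin dot, lamed, vav + holam, final mem.
hebrew = "\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD"
assert strip_nonspacing_marks(hebrew) == "\u05E9\u05DC\u05D5\u05DD"

# Thai: the tone mark mai tho (U+0E49, Mn) is removed.
thai = "\u0E44\u0E21\u0E49"
assert strip_nonspacing_marks(thai) == "\u0E44\u0E21"

# Devanagari: vowel sign u (U+0941) is Mn and is removed,
# but vowel sign aa (U+093E) is Mc and survives.
assert unicodedata.category("\u0941") == "Mn"
assert unicodedata.category("\u093E") == "Mc"
assert strip_nonspacing_marks("\u0915\u0941") == "\u0915"
assert strip_nonspacing_marks("\u0915\u093E") == "\u0915\u093E"
```

Checking unicodedata.category on a specific mark is a quick way to predict whether a given character will be removed before running a whole document through the filter.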

Question 5

Why does diacritic removal sometimes produce unexpected results for ligatures?

Accepted Answer

Characters like 'æ' (ae ligature), 'œ' (oe ligature), and 'ß' (German sharp s) are single code points that are not diacritics; they are base characters that happen to look like combined letters. NFD does not split them because they have no canonical decomposition into base letter plus combining mark. NFKD does not split them either: compatibility decomposition expands only presentation-form ligatures such as 'ﬁ' (U+FB01 → fi). To expand æ → ae, œ → oe, and ß → ss, you need an explicit character mapping, which is what full transliteration libraries provide and is beyond what NFD decomposition alone handles.
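The difference can be checked directly in Python. The sketch below is an illustration only: the LIGATURE_MAP table and the expand_ligatures function are hypothetical names, and the table covers just the characters mentioned in this answer rather than everything a transliteration library would handle.

```python
import unicodedata

# Neither NFD nor NFKD changes these characters.
for form in ("NFD", "NFKD"):
    assert unicodedata.normalize(form, "æœß") == "æœß"

# NFKD does expand compatibility ligatures such as U+FB01 (fi ligature).
assert unicodedata.normalize("NFKD", "\uFB01") == "fi"

# Hypothetical explicit mapping for the characters discussed above.
LIGATURE_MAP = str.maketrans({
    "æ": "ae", "Æ": "AE",
    "œ": "oe", "Œ": "OE",
    "ß": "ss",
})


def expand_ligatures(text: str) -> str:
    """Replace ligature-like letters using the explicit mapping table."""
    return text.translate(LIGATURE_MAP)


print(expand_ligatures("Æther, cœur, Straße"))
# prints: AEther, coeur, Strasse
```

A mapping like this is language-dependent (for example, how capital Æ should be cased in the output), which is one reason it is usually left to a dedicated transliteration step rather than folded into diacritic removal.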
