Fixing Encoding Issues

Last updated: 2026-02-20 · Troubleshooting

What Are Encoding Issues?

Character encoding defines how text characters are stored as bytes. When the wrong encoding is used to read a file, special characters — accented letters (é, ñ, ü), CJK characters, or symbols — appear as garbled text (sometimes called "mojibake").
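As a small illustration of "stored as bytes", the Python sketch below encodes the same character with two encodings. The sample character and variable names are chosen only for illustration.

```python
# The same character becomes different bytes depending on the encoding.
text = "é"

utf8_bytes = text.encode("utf-8")         # two bytes in UTF-8
latin1_bytes = text.encode("iso-8859-1")  # one byte in ISO-8859-1

print(utf8_bytes.hex(" "))    # c3 a9
print(latin1_bytes.hex(" "))  # e9

# Decoding with the encoding that was used for writing restores the text.
round_trip_utf8 = utf8_bytes.decode("utf-8")
round_trip_latin1 = latin1_bytes.decode("iso-8859-1")
```

Because the byte sequences differ, software that assumes the wrong encoding interprets the bytes as different characters.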

Common Symptoms

Place names with accented characters appear as MÃ¼nchen instead of München
Attribute values show sequences such as ï¿½ or â€" (a UTF-8 replacement character or dash misread as Windows-1252)
Chinese, Japanese, or Korean text appears as question marks or boxes
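The symptoms above can be reproduced in a few lines of Python, which may help when deciding which mismatch you are looking at. This is a minimal sketch; the helper name misread is hypothetical and is not part of ConvertGeoData.

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode text one way and decode it another way, as a mismatched reader would."""
    return text.encode(written_as).decode(read_as, errors="replace")

# UTF-8 bytes read as Windows-1252: each accented letter turns into two characters.
munich = misread("München", "utf-8", "windows-1252")

# The Unicode replacement character (U+FFFD) stored as UTF-8, then read as Windows-1252.
replacement = misread("\ufffd", "utf-8", "windows-1252")

# CJK text forced into an encoding that cannot represent it becomes question marks.
question_marks = "東京".encode("ascii", errors="replace").decode("ascii")

# When no bytes were lost, the first kind of damage can be reversed.
repaired = munich.encode("windows-1252").decode("utf-8")

print(munich, replacement, question_marks, repaired)
```

Question marks and replacement characters usually mean the original bytes are already gone, so the fix is to re-convert from the source file with the correct encoding.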

How ConvertGeoData Handles Encoding

Auto-detection: We check for a .cpg file (Shapefiles), look for a byte order mark (BOM), and apply heuristic detection.
Manual override: In the conversion wizard, you can specify the source encoding if auto-detection gets it wrong.
UTF-8 output: All output files are written in UTF-8, which can represent every Unicode character.
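The three detection steps could be sketched roughly as follows. This is not ConvertGeoData's actual implementation: the function name guess_encoding, its parameters, the order of the checks, and the very simple heuristic (try strict UTF-8, otherwise fall back to a legacy encoding) are all assumptions made for illustration.

```python
import codecs
from typing import Optional

# BOM signatures in check order. UTF-32 LE comes before UTF-16 LE because
# its BOM begins with the same two bytes.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def guess_encoding(data: bytes,
                   cpg_text: Optional[str] = None,
                   fallback: str = "windows-1252") -> str:
    """Return a Python codec name for data (hypothetical helper)."""
    # 1. Trust a declared encoding (e.g. the contents of a .cpg file) if Python knows it.
    if cpg_text and cpg_text.strip():
        try:
            return codecs.lookup(cpg_text.strip()).name
        except LookupError:
            pass  # unknown label: continue with the other checks
    # 2. Look for a byte order mark at the start of the data.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 3. Heuristic: bytes that decode as strict UTF-8 are very likely UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return fallback

def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode with the chosen source encoding and re-encode as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")

legacy_bytes = "München".encode("windows-1252")
detected = guess_encoding(legacy_bytes)
utf8_output = to_utf8(legacy_bytes, detected)
```

A heuristic like this cannot tell single-byte legacy encodings apart, which is why a manual override remains useful.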

Common Encodings by Region

Region	Likely Encoding	IANA Name (first option)
Western Europe	Windows-1252 or ISO-8859-1	windows-1252
Central/Eastern Europe	Windows-1250 or ISO-8859-2	windows-1250
Japan	Shift_JIS or EUC-JP	shift_jis
China	GBK or GB2312	gbk
Korea	EUC-KR	euc-kr
Universal (modern)	UTF-8	utf-8
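When the region is unknown, one practical approach is to decode a sample of the data with each candidate from the table and inspect the results. The sketch below assumes Python's built-in codecs, which accept the names used in the table; the function name preview_decodings is hypothetical.

```python
from typing import Dict, Iterable, Optional

# Candidate names from the table above, as accepted by Python's codecs.
CANDIDATES = ["utf-8", "windows-1252", "windows-1250", "shift_jis", "gbk", "euc-kr"]

def preview_decodings(data: bytes,
                      candidates: Iterable[str] = CANDIDATES) -> Dict[str, Optional[str]]:
    """Map each candidate encoding to the decoded text, or None if decoding fails."""
    previews: Dict[str, Optional[str]] = {}
    for name in candidates:
        try:
            previews[name] = data.decode(name)
        except UnicodeDecodeError:
            previews[name] = None
    return previews

# Sample bytes: a Japanese place name stored as Shift_JIS.
sample = "東京".encode("shift_jis")
previews = preview_decodings(sample)

for name, text in previews.items():
    print(f"{name:14} {text!r}")
```

Several candidates may decode without raising an error, so a successful decode is not proof of the right encoding. The correct one is the candidate whose output reads as real text.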

Tips

If you created the data, always save in UTF-8 when possible.
For Shapefiles, include a .cpg file containing just the encoding name (e.g., UTF-8).
If auto-detection fails, try Windows-1252 first — it's the most common legacy encoding for Western European data.
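The first two tips can be scripted. The sketch below re-saves a legacy text file as UTF-8 and writes a .cpg file next to a Shapefile. The file names and both helper functions are hypothetical, and the demo runs in a temporary directory so it touches no real data.

```python
import tempfile
from pathlib import Path

def transcode_to_utf8(src: Path, dst: Path, source_encoding: str) -> None:
    """Read src with its known source encoding and write dst as UTF-8."""
    text = src.read_bytes().decode(source_encoding)
    dst.write_bytes(text.encode("utf-8"))

def write_cpg(shapefile: Path, encoding: str = "UTF-8") -> Path:
    """Write a .cpg file beside the Shapefile containing just the encoding name.

    Only declare an encoding that the attribute data is really stored in.
    """
    cpg_path = shapefile.with_suffix(".cpg")
    cpg_path.write_text(encoding, encoding="ascii")
    return cpg_path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)

    # A text file saved in a legacy Western European encoding.
    legacy = folder / "places_legacy.csv"
    legacy.write_bytes("München".encode("windows-1252"))

    converted = folder / "places_utf8.csv"
    transcode_to_utf8(legacy, converted, "windows-1252")
    converted_bytes = converted.read_bytes()

    cpg = write_cpg(folder / "places.shp")
    cpg_name = cpg.name
    cpg_content = cpg.read_text(encoding="ascii")
```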
