File encoding: typesetting accents and other non-ASCII characters with Texifier

Text encoding

A text document, such as a TeX source file, consists of a sequence of characters. There is a choice in how those characters are converted to bytes and written to disk. This conversion between the characters in an editor and the data on disk is known as text encoding. Historically there have been a large number of different text encodings, but current best practice has converged on a small number of Unicode encodings (such as UTF-16 and UTF-8). Recent versions of LaTeX (since the 2018 release) default to UTF-8, as does Texifier. We strongly recommend that all users use UTF-8 as their text encoding.

Sadly, there is no reliable way to deduce the encoding of a file from the file itself, because that information is not stored in the file. For example, the single byte 0xE9 means é in ISO Latin 1 but È in Mac OS Roman, and both readings are valid text. We wish we could sniff encodings automatically, but this is an imprecise deduction, and the results would be catastrophic if Texifier were to guess incorrectly. This means that the user must set Texifier to the correct encoding before opening a file. In almost all cases this is not a problem, as almost everybody in the TeX world uses UTF-8, and Texifier is set to this by default, as are recent versions of LaTeX.

"Cannot Save in this encoding" errors

Apart from compatibility with modern tools and users, we recommend Unicode encodings because they are the only encodings capable of representing all characters. If you are using a legacy encoding (i.e. something without UTF in the name), then there will be characters you can type into your document that Texifier is not able to save to disk. You will receive an error from Texifier, and you will have to remove those characters.
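
If you must stay in a legacy encoding and the character is needed in the output, one possible workaround is to write it with LaTeX's ASCII accent commands rather than typing it directly. The following is only a sketch, assuming pdfLaTeX and the standard LaTeX accent macros (\H, \v, \L and \'); the names in the body are purely illustrative.

\documentclass{article}
\begin{document}
  % Every character in this file is plain ASCII, so it can be saved in any encoding.
  Erd\H{o}s, \v{C}apek, \L{}\'od\'z
\end{document}

This keeps the source file pure ASCII, at the cost of readability. Converting the file to UTF-8 remains the simpler fix.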

Migrating to Texifier from other editors

A known problem for users migrating to Texifier from another editor is that LaTeX throws errors during typesetting because of non-ASCII characters in the source (for example é, ü and other accented letters).

This is usually because Texifier is set by default to use UTF-8 encoding for saving files, while users are often accustomed to older, obsolete encodings such as ISO Latin 1 or Mac OS Roman. To make LaTeX typeset correctly, please replace any old inputenc line (for example one with the latin1 or applemac option) with the UTF-8 line below. Font encoding (fontenc) lines are a separate matter and are covered under Font encodings.

\usepackage[utf8]{inputenc}

Please note that this is not required with more recent versions of MacTeX, but it is still good practice to add it.
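
As an illustration of the replacement described above, here is a minimal sketch of a migrated file. It assumes the file is saved as UTF-8 and typeset with pdfLaTeX; the commented-out lines show the standard inputenc options for the two legacy encodings mentioned, one of which your old preamble may contain.

\documentclass{article}
% Legacy input-encoding lines to remove, if present:
%   \usepackage[latin1]{inputenc}    % ISO Latin 1
%   \usepackage[applemac]{inputenc}  % Mac OS Roman
\usepackage[utf8]{inputenc}          % the file on disk is UTF-8
\begin{document}
  é ü ñ
\end{document}

Only one inputenc line should remain, and its option must match the encoding Texifier uses to save the file.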

Changing Texifier to a non-UTF-8 encoding

If you have a large collection of TeX files in a legacy encoding, then you may wish to switch Texifier to the older encoding rather than converting the files to the up-to-date UTF-8 encoding. To do this, close all files and navigate to the File > File Encoding menu on macOS, or Editor Preferences > Encoding on iOS, and choose the encoding corresponding to your files.

You will need to find the correct encoding from the preferences in your previous editor. Common choices are:

  • Mac OS Roman if you are migrating from a macOS LaTeX editor such as TeXShop
  • ISO Latin 1 if you are migrating from a Windows LaTeX editor

We accept that this is necessary when a collaborator is either unable, or unwilling, to modernise to UTF-8, but we do not recommend going this route. It is always better to bring the files up to date by converting them to UTF-8.
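
If you do stay with a legacy encoding, the inputenc option in the preamble has to name that same encoding, otherwise pdfLaTeX will misread the accented bytes. A minimal sketch, assuming a file saved as Mac OS Roman and typeset with pdfLaTeX (applemac and latin1 are the standard inputenc option names for Mac OS Roman and ISO Latin 1 respectively):

\documentclass{article}
\usepackage[applemac]{inputenc} % file on disk is Mac OS Roman; use latin1 for ISO Latin 1
\begin{document}
  café, Grüße
\end{document}

Texifier's File Encoding setting and this option need to agree. A mismatch typically shows up as wrong accented letters or as errors during typesetting.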

Font encodings

Please note that you often need to specify the font (output) encoding as well as the input encoding. This should only be done with older 8-bit typesetters such as pdfTeX. It is not required when using modern fonts (e.g. with fontspec) or with LuaTeX, XeTeX, or TexpadTeX running in Unicode mode; in those cases it is potentially harmful. An example for pdfTeX:

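\documentclass{article}
\usepackage[utf8]{inputenc} % input encoding: the source file is saved as UTF-8
\usepackage[T1]{fontenc}    % font (output) encoding: 8-bit T1 fonts with accented glyphs
\usepackage{german}         % German language support
\begin{document}
  äöüÄÖÜ Hallo
\end{document}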
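
By contrast, with a Unicode engine neither inputenc nor fontenc is loaded. The following is a minimal sketch of the same text, assuming the document is compiled with XeLaTeX or LuaLaTeX and that fontspec's default font (Latin Modern) is available, as it is in standard TeX distributions:

\documentclass{article}
\usepackage{fontspec} % Unicode font loading; used instead of fontenc on XeLaTeX/LuaLaTeX
\begin{document}
  äöüÄÖÜ Hallo
\end{document}

The source file is still saved as UTF-8. The engine reads it directly, so no encoding package is needed.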