UTF-8 INI Files

Introduction #

In my previous article Doing UTF-8 in Windows, I showed how you can work with UTF-8 using basically only two functions, utf8::narrow and utf8::widen. For general file I/O you just have to convert the file name from UTF-8 to UTF-16 and all the reading and writing functions remain unchanged:

  FILE *f = utf8::fopen (u8"ܐܪܡܝܐ.txt", "w");
  fputs (u8"This text is in Aramaic ܐܪܡܝܐ", f);
  fclose (f);

There is one case that is not covered by these rules: the INI files, also called “profile files” in Microsoft parlance. Although there are many other ways of storing application settings, INI files are still widely used either for compatibility reasons or because they are simple to work with.

The problem is that the basic Windows API calls for reading and writing INI files, GetPrivateProfileString and WritePrivateProfileString, combine both the file name and the information to be read or written in one API call. As an example, here is the signature of the GetPrivateProfileStringW function:

DWORD GetPrivateProfileStringW(
  LPCWSTR lpAppName,
  LPCWSTR lpKeyName,
  LPCWSTR lpDefault,
  LPWSTR  lpReturnedString,
  DWORD   nSize,
  LPCWSTR lpFileName
);

If we used the utf8::widen function to convert all our UTF-8 strings, the file name would be handled correctly, but we would end up with an INI file that contains UTF-16 characters.

The solution is to forget about the Windows API functions completely and roll our own implementation for accessing INI files. This is far from the only implementation of INI files that you can find out there. For a list of implementations you can check the Wikipedia page. Some of them might be a bit over-hyped; one such project claims to be “the ultimate and most consistent INI file parser library written in C”. The only claim I make is that my implementation strives to be as compatible as possible with the original Windows API.

As such, you will find no arbitrary extensions to the file format, and I’ve done a lot of testing to identify different corner cases. Here are the rules I discovered by trying different combinations of calls to the original Windows API:

  • the only comment lines are the ones starting with a semicolon (hashes are not considered comments by the Windows API)
  • there are no trailing comments; anything after the ‘=’ sign is part of the key value
  • leading and trailing spaces are removed both from returned strings and from parameters

The only changes compared to the Windows API are:

  • line length defaults to 1024 (the INI_BUFFER_SIZE value) while Windows limits it to 256 characters
  • files without a path are in the current directory while Windows places them in the Windows folder
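To make the parsing rules above more concrete, here is a minimal sketch of a line classifier. It is not the code from my library: the names LineKind, ParsedLine, trim and parse_line are made up for this illustration, and trimming the line before looking for the semicolon or the bracket is an assumption of the sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names for illustration only; not part of the utf8 library.
enum class LineKind { Blank, Comment, Section, KeyValue, Other };

struct ParsedLine {
  LineKind kind;
  std::string name;   // section name or key name
  std::string value;  // only meaningful for KeyValue lines
};

// Remove leading and trailing whitespace.
static std::string trim (const std::string& s)
{
  const char* ws = " \t\r\n";
  std::size_t b = s.find_first_not_of (ws);
  if (b == std::string::npos)
    return std::string ();
  std::size_t e = s.find_last_not_of (ws);
  return s.substr (b, e - b + 1);
}

ParsedLine parse_line (const std::string& raw)
{
  std::string line = trim (raw);  // assumption: the line is trimmed first
  if (line.empty ())
    return {LineKind::Blank, "", ""};
  if (line[0] == ';')             // only ';' starts a comment, '#' does not
    return {LineKind::Comment, "", ""};
  if (line[0] == '[') {
    std::size_t close = line.find (']');
    if (close != std::string::npos)
      return {LineKind::Section, trim (line.substr (1, close - 1)), ""};
    return {LineKind::Other, "", ""};
  }
  std::size_t eq = line.find ('=');
  if (eq == std::string::npos)
    return {LineKind::Other, "", ""};
  // Everything after '=' belongs to the value, including any ';' it contains
  return {LineKind::KeyValue, trim (line.substr (0, eq)),
          trim (line.substr (eq + 1))};
}
```

With this sketch, a line such as "key1 = value ; still value" yields the key "key1" and the value "value ; still value", because there are no trailing comments.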

Implementation #

An INI file is implemented as an IniFile object. The basic member functions IniFile::GetString and IniFile::PutString allow you to read or write settings in the INI file like in the code below:

utf8::IniFile test ("test.ini");
// Arguments are key, value, section
test.PutString ("key1", "value11", "section1");
// Arguments are key, section
std::string val = test.GetString ("key1", "section1");

The original Windows API handles only two data types for INI files: strings and integer numbers (GetPrivateProfileString and GetPrivateProfileInt functions). I thought it was useful to extend these functions to additional data types and also add some utility functions. This is not an extension of the file format; it is just an extension of the API for accessing these files. Here are some of these functions:

  • PutInt and GetInt for integer values
  • PutDouble and GetDouble for floating point values
  • PutBool and GetBool for Boolean variables (when reading, the code understands things like “on” or “0” or “OFF”)
  • PutColor and GetColor for RGB color representations
  • PutFont and GetFont to save and retrieve font settings
  • HasKey and HasSection to check if a key or a section exists in the INI file
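As an illustration of the kind of tolerant parsing GetBool does, here is a minimal sketch. The function parse_bool is hypothetical and not the library code. Only “on”, “0” and “OFF” come from the description above; the other accepted words and the fallback to a default value are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration; not the library's GetBool.
// Assumption: unrecognized text falls back to the supplied default.
bool parse_bool (const std::string& text, bool default_value)
{
  // Strip leading and trailing whitespace
  const char* ws = " \t\r\n";
  std::size_t b = text.find_first_not_of (ws);
  if (b == std::string::npos)
    return default_value;
  std::size_t e = text.find_last_not_of (ws);
  std::string s = text.substr (b, e - b + 1);

  // Compare case-insensitively by lowering every character
  std::transform (s.begin (), s.end (), s.begin (),
                  [] (unsigned char c) { return static_cast<char> (std::tolower (c)); });

  if (s == "on" || s == "true" || s == "yes" || s == "1")
    return true;
  if (s == "off" || s == "false" || s == "no" || s == "0")
    return false;
  return default_value;
}
```

The point of accepting several spellings is that INI files are often edited by hand, so the reader should not be too strict about how a Boolean was typed.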

Points of interest #

There is no in-memory buffering for the INI files. Everything is written out to disk as quickly as possible. This was a design decision because

  1. that’s what Windows does and I wanted to be as compatible as possible and
  2. it is quite annoying when parameters don’t get saved if the application has crashed or otherwise unexpectedly ended.

The drawback is that access to INI files becomes less efficient, but INI files are not meant to be general data files.

Moreover, every time a key is written in an INI file, the whole file gets re-written; as I was saying, efficiency was not a design goal.
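The read-modify-rewrite cycle described above can be sketched roughly as follows. This is not the actual IniFile code: read_lines and put_string are hypothetical free functions, and to keep the sketch short it assumes exact, case-sensitive matches with no spaces around section headers, keys or the ‘=’ sign.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical functions for illustration; not the IniFile implementation.

// Read the whole file into memory, one entry per line.
std::vector<std::string> read_lines (const std::string& path)
{
  std::vector<std::string> lines;
  std::ifstream in (path);
  std::string line;
  while (std::getline (in, line))
    lines.push_back (line);
  return lines;
}

// Set key=value in [section], then rewrite the entire file.
// Simplification: exact matches only, no trimming, case-sensitive names.
bool put_string (const std::string& path, const std::string& key,
                 const std::string& value, const std::string& section)
{
  std::vector<std::string> lines = read_lines (path);
  const std::string header = "[" + section + "]";
  const std::string prefix = key + "=";
  const std::string entry = prefix + value;

  std::size_t i = 0;
  while (i < lines.size () && lines[i] != header)
    ++i;
  if (i == lines.size ()) {
    // Section not found: append it together with the new key
    lines.push_back (header);
    lines.push_back (entry);
  } else {
    // Scan the section body for the key, stopping at the next section
    std::size_t j = i + 1;
    while (j < lines.size () && lines[j].compare (0, 1, "[") != 0
           && lines[j].compare (0, prefix.size (), prefix) != 0)
      ++j;
    if (j < lines.size () && lines[j].compare (0, 1, "[") != 0)
      lines[j] = entry;                          // replace the existing key
    else
      lines.insert (lines.begin () + j, entry);  // add the key to the section
  }

  // Write everything back right away; nothing stays cached in memory
  std::ofstream out (path, std::ios::trunc);
  for (const std::string& l : lines)
    out << l << '\n';
  return static_cast<bool> (out.flush ());
}
```

Every call pays for a full read and a full write of the file, which is acceptable for a handful of settings and is exactly why this approach is a poor fit for bulk data.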

The code shown in this article makes it easy to keep the application settings in an INI file using UTF-8 encoding.

History #

  • 02-Apr-2020 Initial version