Code Layout #
It is now very common for a new application to build on 10 to 20 open source projects. For example, if you want to do something related to geo-referenced images, you will most probably use a JPEG library, a PNG library and a TIFF library. Some of those may already depend on a compression library like zlib. Add to that a map projection library like PROJ, which uses a database like SQLite (not to mention TIFF and zlib), and some JSON parsing library. You are close to 10 packages before really diving in.
Each of these packages has its own code layout rules and build mechanisms. Making them work together is not child’s play: you’ll find yourself searching through support forums, asking questions and, in general, spending a lot of time that could be better spent on something productive.
I am not the only one who has suffered such frustrations. That is why people have come up with other code layout proposals, like the Pitchfork Layout.
The Elevator Pitch #
Here I propose a solution made of two parts:
- A code layout standard called “Uniform C/C++ Code Layout” (UCCL)
- A tool for managing code dependencies called “C/C++ Package Manager” (CPM)
If you use UCCL, different projects/modules are separate entities. Dependencies between projects are handled using symbolic links to their respective include folders and are described in a simple JSON file. The package manager ensures that the required branch of each repository is fetched and compiled.
Do I believe that this solution is a silver bullet that solves all problems? No, I don’t. It is, however, useful enough to warrant a close look.
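To make the symbolic-link idea more concrete, here is a minimal sketch using `std::filesystem` (C++17). The folder layout it assumes (a link named `include/<dependency>` inside the project, pointing at the dependency's own `include` folder) and the function name `link_dependency` are illustrative assumptions only. They are not taken from the UCCL specification or from CPM.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: makes <project>/include/<dep_name> a symbolic link
// to <dep_root>/include. Returns true if the link exists afterwards.
bool link_dependency(const fs::path& project,
                     const fs::path& dep_root,
                     const std::string& dep_name)
{
    std::error_code ec;
    const fs::path target = fs::absolute(dep_root / "include", ec);
    if (ec || !fs::is_directory(target, ec))
        return false;                       // dependency has no include folder

    const fs::path link = project / "include" / dep_name;
    fs::create_directories(link.parent_path(), ec);
    if (ec)
        return false;

    if (fs::is_symlink(link, ec))
        return true;                        // already linked, nothing to do

    fs::create_directory_symlink(target, link, ec);
    return !ec;
}
```

With such a link in place, the project can include the dependency's headers through its own `include` folder without copying any files. Note that on Windows creating symbolic links may require extra privileges.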
How should we link in the 21st century #
If we are to build our software from many pieces, we need to find a way to link those pieces together. Over the years there have been three ways to link these modules.
Static linking #
This was the first method, developed in the 1960s and 1970s. A linker program would take the compiled modules and adjust the call instructions according to the position of each module in the resulting program. Once created, this binary program could be loaded directly and had no external dependencies. The only problem was that software was growing faster than the available memory.
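The adjustment a linker performs can be illustrated with a toy model. The sketch below is not how any real linker is implemented. The `Module` and `PatchedCall` types and the `link_modules` function are invented for illustration, and "addresses" are plain byte offsets. It only shows the two steps described above: placing modules one after another, then resolving each call to the final address of its target.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy "object module": its size, the symbols it defines (offset inside the
// module) and its call sites (offset of the call, name of the callee).
struct Module {
    std::string name;
    std::size_t size;
    std::map<std::string, std::size_t> defines;
    std::vector<std::pair<std::size_t, std::string>> calls;
};

// A call after linking: absolute address of the call and of its target.
struct PatchedCall {
    std::size_t site;
    std::size_t target;
};

std::vector<PatchedCall> link_modules(const std::vector<Module>& modules)
{
    // Pass 1: lay the modules out back to back and record where every
    // symbol ends up in the final program.
    std::map<std::string, std::size_t> symbols;
    std::vector<std::size_t> bases;
    std::size_t next = 0;
    for (const auto& m : modules) {
        bases.push_back(next);
        for (const auto& [sym, offset] : m.defines)
            symbols[sym] = next + offset;
        next += m.size;
    }

    // Pass 2: resolve every call site. symbols.at() throws std::out_of_range
    // for an undefined callee, the toy version of "unresolved external".
    std::vector<PatchedCall> patched;
    for (std::size_t i = 0; i < modules.size(); ++i)
        for (const auto& [offset, callee] : modules[i].calls)
            patched.push_back({bases[i] + offset, symbols.at(callee)});
    return patched;
}
```

For two modules of 100 and 50 bytes, a call at offset 40 of the first module to a function at offset 10 of the second resolves to address 110 in the linked program.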
Dynamic linking #
It was quickly observed that some modules, like the compiler’s run-time library, were used by almost all programs and that there could be important memory savings if these modules were shared among all programs. The solution is called dynamic linking: the job of adjusting the function calls between modules is passed to the loader program. When a program is loaded in memory, if one of the dynamically linked modules is already loaded, the loader maps the loaded module into the address space of the new program. The problem is that the different modules have to be really well matched. If one program loads version 3 of a library and another program needs version 5 of the same library, it is easy to get things mixed up and land in what is popularly known as “Dependency Hell” or, more specifically for Windows programmers, “DLL Hell”.
Another problem is that, unless special precautions are taken, all modules need to use the same version, or compatible versions, of the run-time library. Imagine a situation where a dynamic library allocates some memory and returns it to the calling program. If the calling program uses a different memory manager and releases that memory, you end up with heap corruption[1].
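One common precaution is for the library to export its own release function, so that memory is always freed by the same run-time that allocated it. The sketch below puts both sides in one file to keep it self-contained. In a real project the `geo_*` functions would live in the dynamic library. All names are hypothetical, and the counter exists only so the sketch can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// ---- Library side (hypothetical API, would be compiled into the DLL) ----

// Number of buffers handed out and not yet released (for checking only).
static int geo_live_buffers = 0;

// Allocates a copy of `text` using the library's own run-time.
extern "C" char* geo_make_label(const char* text)
{
    const std::size_t len = std::strlen(text) + 1;
    char* copy = static_cast<char*>(std::malloc(len));
    if (copy) {
        std::memcpy(copy, text, len);
        ++geo_live_buffers;
    }
    return copy;
}

// Releases a buffer with the same run-time that allocated it.
extern "C" void geo_free_label(char* label)
{
    if (label) {
        std::free(label);
        --geo_live_buffers;
    }
}

// ---- Caller side ----

bool use_label()
{
    char* label = geo_make_label("N45 W73");
    if (!label)
        return false;
    const bool ok = std::strcmp(label, "N45 W73") == 0;
    // Hand the buffer back to the library instead of calling free() here:
    // the caller's heap may not be the one the library allocated from.
    geo_free_label(label);
    return ok;
}
```

The caller never needs to know which memory manager the library uses. It only has to pair every `geo_make_label` with a `geo_free_label`.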
Header-only libraries #
Some C and C++ programmers got so frustrated with “Dependency Hell” that they started creating libraries that are completely contained in header files. The work of assembling the program is thus passed to the compiler. This is not a scalable solution, as compilation times get longer and longer.
Back to static linking #
Out of the three linking solutions, static linking seems to be the one with the fewest disadvantages[2]. My strong recommendation is to use static linking whenever possible and keep dynamic linking only for truly ubiquitous modules like the compiler run-time libraries. Header-only libraries should be limited to really small modules or to those where template usage doesn’t allow you to create a static library.
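As a small illustration of the template case, consider the sketch below. The header name and both functions are hypothetical. A function template cannot be compiled into a static library ahead of time unless it is explicitly instantiated for a fixed set of types, so its body normally has to stay in the header.

```cpp
#include <cassert>
#include <vector>

// Hypothetical contents of a small header such as "clamp_all.h".

// The body must be visible to every translation unit that instantiates it
// with a new type T, so it cannot simply be hidden inside a .lib/.a file.
template <typename T>
void clamp_all(std::vector<T>& values, const T& lo, const T& hi)
{
    for (auto& v : values) {
        if (v < lo)
            v = lo;
        else if (hi < v)
            v = hi;
    }
}

// A non-template function defined in a header needs `inline`; otherwise
// including the header from several source files causes duplicate-definition
// errors at link time.
inline int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (hi < v ? hi : v);
}
```

Anything larger than helpers of this kind is usually better off compiled once into a static library.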
What is here #
This is a series of articles about UCCL - Uniform C Code Layout - a code layout method for C/C++ projects. These articles were written at different points in time and show how these ideas evolved.
I have used symbolic links for bringing together different projects for quite some time. The way to do it is described in Eat Your Own Dogfood. As the number of projects grew, maintaining and updating them became more and more challenging. That’s how the need for a package manager appeared.
Not long after, I discovered the Pitchfork Layout and thought it was a good match for my ideas. Their combination is described in The Dogfood and the Pitchfork.
[1] You can avoid this situation by adding an API function to the dynamic library that releases the memory, or by passing the dynamic library the memory manager functions that it should use.
[2] This can also be a mixed solution, where static linking is used for most modules and dynamic linking is reserved for the compiler run-time libraries.