A Quickstart Guide
We’re going to look at how to create and use libraries on Linux and try to gain some insight on how libraries work behind the scenes.
Decisions Decisions!
Often when working with 3rd party code you may be limited in the options available. Some well known open-source projects have dual-licensed binaries that dictate different terms for static or dynamic linking.
Writing a library is a good way to provide an interface to customers and get code reuse, but it can also be a major source of headaches! To understand what’s best for your use case it’s worth looking at what each type provides.
Static libraries (.a files) are precompiled object code which is linked into other executables at build time and becomes part of the final application. These libraries load quickly, have less indirection and don’t run the risk of the dependency hell which can beset their dynamic peers.
Static libraries incur an overhead of space and memory whenever they are used due to their nature of being part of the executable but, due to their inclusion at build time, unused code can be optimised out.
On the downside, if you want to upgrade a part of your interface you will need to ask all your customers to rebuild their executables against the updated library whereas dynamic libraries push this to a load/runtime issue.
Dynamically linked shared object libraries (.so files) work alongside the link loader to allow external symbols referenced in executables to be resolved at load time and can be used in one of two ways:
- Dynamically linked: loaded by the dynamic linker when the program starts. The library must also be available during the compile/link phase for symbol checking.
- Dynamically loaded by dlopen() – used by plugins and on-demand situations.
Using dynamic linking is encouraged on Linux systems to reduce the number of copies of code and to allow the different libraries to be managed centrally (often by a package manager).
To illustrate some of our later points and to make it clear what’s happening we’ll use the trivial code snippets as follows:
/* lib1.c */
#include <stdio.h>

void f1(void)
{
    printf("In library 1\n");
}

/* main.c */
#include <stdio.h>

void f1(void);  /* provided by our library */

int main(void)
{
    f1();
    printf("In main\n");
    return 0;
}
Only our main.c will have the main function, as this is our C program’s entry point. This holds regardless of how many libraries we have; otherwise you’ll see the following error when you link:
multiple definition of `main'
Static Libraries
Static libraries are object files that are later combined with another object to form a final executable. By convention they have the prefix lib and the suffix .a – for example, libpthread.a
To create a static library using GCC we first need to compile our library code into an object file, which we tell GCC to do using -c. (The -static flag only affects the link step, so it isn’t needed here.)
$ gcc -c -o lib1.o lib1.c
Once we have an object file (or files! we could have many that we wish to combine into a single library, such as a second object lib2.o built the same way) we use the GNU ar command to create our final library/archive:
$ ar rcs libfoo.a lib1.o lib2.o
This tells ar to create an archive (c), insert the objects, replacing older files where needed (r) and to write out an index (s).
To use this library in future executables you use something like the following:
$ gcc main.c -o test -lfoo
$ ./test
In library 1
In main
Note how the lib prefix and .a suffix are omitted. If you had the library files outside of the standard library search path (we’ll talk about this later) you could use -L /path/to/other/libs to make the linker aware of them; for a library sitting in the current directory, -L. is enough.
Dynamic Libraries
Dynamic libraries are slightly more interesting from the perspective of symbol resolution and actual loading so we’ll look at that once we have some binaries to work with.
Dynamic, or shared, libraries have the same lib prefix as static libraries but the suffix becomes .so indicating they are shared objects.
$ gcc -shared -fPIC -o lib1.so lib1.c
The -shared is used to indicate it’s a shared object and the -fPIC is used to tell GCC to produce position independent code. The concept of position independent code is fundamental for dynamic libraries as they could be loaded into memory at any location, so things like jumps in code are altered to use relative offsets rather than absolute addresses.
To link against our library lib1.so we could use the following snippet. Again we omit the lib prefix and the suffix.
$ gcc -L$(pwd) -o test main.c -l1
The $(pwd) expansion simply allows the linker to search the current working directory for the shared object library.
This probably won’t result in a runnable executable off the bat though…
$ ./test
./test: error while loading shared libraries: lib1.so: cannot open shared object file: No such file or directory
Let’s look at why this doesn’t quite work and learn more about what happens behind the scenes!
The role of ld-linux.so
Linux uses ELF binaries for executables, libraries and coredumps and has done so since around 1999. Modern systems use the program /lib/ld-linux.so.2 (this is the 32-bit x86 name; 64-bit x86 systems use /lib64/ld-linux-x86-64.so.2) to locate, load and map all the necessary dynamic libraries into your application’s address space on behalf of your executable.
ld-linux.so.2 will search for library files in the following ways:
- Using the directories specified in DT_RPATH (unless DT_RUNPATH is specified)
- Using the directories listed in the environment variable LD_LIBRARY_PATH
- Using the directories specified as DT_RUNPATH in the dynamic section of the binary if present. (see The One True Path section later).
- From information in the cache file /etc/ld.so.cache which is generated by ldconfig and the contents of /etc/ld.so.conf and /etc/ld.so.conf.d/*
- Searching the default path of /lib and /usr/lib (unless you do something crazy like using -z nodeflib to prevent default library usage)
To find out which libraries your program depends on we can use the tool readelf, which is provided as part of the binutils package, to examine the dynamic section of the binary using the -d/--dynamic flag.
$ readelf -d test

Dynamic section at offset 0xf0c contains 25 entries:
  Tag        Type                 Name/Value
 0x00000001 (NEEDED)             Shared library: [lib1.so]
 0x00000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000c (INIT)               0x80483f0
 0x0000000d (FINI)               0x8048614
 0x00000019 (INIT_ARRAY)         0x8049f00
 0x0000001b (INIT_ARRAYSZ)       4 (bytes)
We can see that our executable depends on libc and our library lib1.so. We can use the -l/--segments flag to examine the program headers and see our binary’s call out to the link loader:
$ readelf -l test

Elf file type is EXEC (Executable file)
Entry point 0x8048470
There are 9 program headers, starting at offset 52

Program Headers:
  Type      Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR      0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
  INTERP    0x000154 0x08048154 0x08048154 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  LOAD      0x000000 0x08048000 0x08048000 0x00718 0x00718 R E 0x1000
  LOAD      0x000f00 0x08049f00 0x08049f00 0x00120 0x00124 RW  0x1000
  DYNAMIC   0x000f0c 0x08049f0c 0x08049f0c 0x000f0 0x000f0 RW  0x4
  ...
If we’re really curious we can dig deeper and use the -r/--relocs flag to examine the relocation section of our binary. We can use this to find out what functions need to be provided to the executable for it to function correctly; libc.so provides most of the requests in this table but we can see our call out to f1, which is why we’re seeing that error.
$ readelf -r test

Relocation section '.rel.dyn' at offset 0x3c8 contains 1 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
08049ffc  00000306 R_386_GLOB_DAT    00000000   __gmon_start__

Relocation section '.rel.plt' at offset 0x3d0 contains 4 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0804a00c  00000207 R_386_JUMP_SLOT   00000000   puts
0804a010  00000307 R_386_JUMP_SLOT   00000000   __gmon_start__
0804a014  00000407 R_386_JUMP_SLOT   00000000   f1
0804a018  00000507 R_386_JUMP_SLOT   00000000   __libc_start_main
So as expected, everything has worked and our library is being requested but it’s not currently being found. The easy way to test your project works is simply to override the LD_LIBRARY_PATH to point it to our libraries, a dot (.) will suffice to include the present working directory:
$ LD_LIBRARY_PATH="." ./test
In library 1
In main
At this point you would typically package up your library to install to /lib or /usr/lib and voila! Alternatively, you could add an entry for its directory to the files under /etc/ld.so.conf.d/ and run ldconfig to regenerate the cache based on your new settings. If you’re doing things “properly” then that’s pretty much the end of your journey. Congratulations! However, it’s never quite that simple…
The One True Path
There are times in shipping software where you need to deviate a little from the standard. Perhaps you want to ship a self-contained package that doesn’t depend on ldconfig, or to ensure your provided library gets used before the system tries to locate it elsewhere. To that end we can leverage the rpath and the runpath.
rpath vs runpath
DT_RPATH came first and, as you’ll note from the list above, it has precedence over all other options.
This made it impossible to override the library search path, so the powers that be decided to implement a newer parameter known as DT_RUNPATH, which has lower precedence than LD_LIBRARY_PATH; as such, you can override the former with the latter.
rpath and runpath are conceptually the same thing: a list of alternate locations embedded within the executable that will be searched by the dynamic linker at runtime. Lacking a runpath, the older rpath will be used.
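To specify both an rpath and a runpath, you must tell the linker the list of alternate locations using -Wl,-rpath=/my/location and should also pass -Wl,--enable-new-dtags (this is a linker option, so it needs the -Wl, prefix to get through GCC). This last parameter causes the value to be embedded as the runpath as well as the rpath; on the toolchain used here only the rpath is written by default, which causes the no-override problem mentioned above. Newer binutils releases may write only the runpath when the flag is given, and some distributions enable it by default, so it’s worth checking what your own toolchain produces.
$ gcc -L$(pwd) -Wl,-rpath=$(pwd) -Wl,--enable-new-dtags -o test main.c -l1 -l2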
To view the embedded paths we can use trusty old readelf again to examine the dynamic section of our binary and verify they match expectations:
$ readelf -d test

Dynamic section at offset 0xefc contains 27 entries:
  Tag        Type                 Name/Value
 0x00000001 (NEEDED)             Shared library: [lib1.so]
 0x00000001 (NEEDED)             Shared library: [lib2.so]
 0x00000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000f (RPATH)              Library rpath: [/home/nick/Development/libraryTutorial]
 0x0000001d (RUNPATH)            Library runpath: [/home/nick/Development/libraryTutorial]
 0x0000000c (INIT)               0x804843c
 0x0000000d (FINI)               0x8048674
 0x00000019 (INIT_ARRAY)         0x8049ef0
 0x0000001b (INIT_ARRAYSZ)       4 (bytes)
 0x0000001a (FINI_ARRAY)         0x8049ef4
 0x0000001c (FINI_ARRAYSZ)       4 (bytes)
 ...
Libraries on Linux… nothing to it eh?
Co-Founder and Director of Feabhas since 1995.
Niall has been designing and programming embedded systems for over 30 years. He has worked in different sectors, including aerospace, telecomms, government and banking.
His current interests lie in IoT Security and Agile for Embedded Systems.