Static and Dynamic Libraries on Linux

A Quickstart Guide

We’re going to look at how to create and use libraries on Linux and try to gain some insight into how libraries work behind the scenes.

Decisions Decisions!

Often when working with 3rd party code you may be limited in the options available. Some well known open-source projects have dual-licensed binaries that dictate different terms for static or dynamic linking.

Writing a library is a good way to provide an interface to customers and to get code reuse, but it can also be a major source of headaches! To understand what’s best for your use case it’s worth looking at what each type provides.
Static libraries (.a files) are archives of precompiled object code that is linked into an executable at build time and becomes part of the final application. These libraries load quickly, have less indirection and don’t run the risk of the dependency hell which can beset their dynamic peers.

Static libraries incur an overhead in disk space and memory wherever they are used, because their code becomes part of each executable. On the other hand, because they are included at build time, unused code can be left out.
On the downside, if you want to upgrade a part of your interface you will need to ask all your customers to rebuild their executables against the updated library, whereas dynamic libraries turn this into a load-time/run-time issue.

Dynamically linked shared object libraries (.so files) work alongside the link loader to allow external symbols referenced in executables to be resolved at load time. They can be used in one of two ways:

  1. Loaded at program start-up by the dynamic linker; the library must also be available at the compile/link phase for symbol checking.
  2. Dynamically loaded by dlopen() – used by plugins and in on-demand situations.
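
The second mode can be sketched in C as follows. This is only a minimal illustration, assuming a glibc system where the maths library is installed under the name libm.so.6; the helper name cos_via_dlopen is made up for this example and is not part of any library. On older glibc versions you may also need to link with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: loads the maths library at run time, looks up
 * cos() by name and calls it. Returns 0 on success, -1 on failure. */
int cos_via_dlopen(double x, double *result)
{
    /* Assumption: libm is installed under this name (true on glibc). */
    void *handle = dlopen("libm.so.6", RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    dlerror(); /* clear any stale error before calling dlsym */
    void *sym = dlsym(handle, "cos");
    const char *err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err);
        dlclose(handle);
        return -1;
    }

    /* Conversion from object pointer to function pointer, written the
     * way the POSIX dlsym documentation suggests. */
    double (*cos_fn)(double);
    *(void **)(&cos_fn) = sym;

    *result = cos_fn(x);
    dlclose(handle);
    return 0;
}
```

Nothing about libm is recorded in the executable here: the name is just a string handed to dlopen() while the program runs, which is what makes this approach a good fit for plugins.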

Using dynamic linking is encouraged on Linux systems to reduce the number of copies of code and to allow the different libraries to be managed centrally (often by a package manager).

To illustrate some of our later points and to make it clear what’s happening we’ll use the trivial code snippets as follows:

/* lib1.c */
#include <stdio.h>
void f1(void)
{
  printf("In library 1\n");
}

/* main.c */
#include <stdio.h>
void f1(void);
int main(void)
{
  f1();
  printf("In main\n");
  return 0;
}

Only main.c defines the main function, as this is our C program’s entry point, regardless of how many libraries we have. If a library defined main as well, you’d see the following error when you link:

	multiple definition of `main'

Static Libraries

Static libraries are object files that are later combined with another object to form a final executable. By convention they have the prefix lib and the suffix .a – for example, libpthread.a

To create a static library using GCC we first need to compile our library code into an object file, so we tell GCC to compile without linking using -c:

	$ gcc -c -o lib1.o lib1.c

Once we have an object file (or several, as we may wish to combine many into a single library) we use the GNU ar command to create our final library/archive:

	$ ar rcs libfoo.a lib1.o lib2.o

This tells ar to create an archive (c), insert the objects, replacing older files where needed (r) and to write out an index (s).

To use this library in future executables you use something like the following:

	$ gcc main.c -o test -L. -lfoo
	$ ./test
	In library 1
	In main

Note how the lib prefix and .a suffix are omitted. The -L. tells the linker to also look in the current directory, which isn’t searched by default; for library files in any other location outside of the standard library search path (we’ll talk about this later) you would use -L /path/to/other/libs to make the linker aware.

Dynamic Libraries

Dynamic libraries are slightly more interesting from the perspective of symbol resolution and actual loading so we’ll look at that once we have some binaries to work with.
Dynamic, or shared, libraries have the same lib prefix as static libraries but the suffix becomes .so indicating they are shared objects.

	$ gcc -shared -fPIC -o lib1.so lib1.c

The -shared is used to indicate it’s a shared object and the -fPIC is used to tell GCC to produce position independent code. The concept of position independent code is fundamental for dynamic libraries, as they could be loaded into memory at any location, so things like jumps in code are altered to use relative offsets rather than absolute addresses.

To link against our library lib1.so we could use the following snippet; again we omit the lib prefix and the suffix.

	$ gcc -L$(pwd) -o test main.c -l1

The $(pwd) expansion simply allows the linker to search the current working directory for the shared object library.
This probably won’t result in a runnable executable off the bat though…

	$ ./test 
	./test: error while loading shared libraries: lib1.so: 
	cannot open shared object file: No such file or directory

Let’s look at why this doesn’t quite work and learn more about what happens behind the scenes!

The role of ld-linux.so

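Linux uses the ELF format for executables, libraries and core dumps, and has done so since the mid-1990s. On modern systems the dynamic linker (/lib/ld-linux.so.2 on 32-bit x86, as in the examples here; other architectures use a different file name, such as /lib64/ld-linux-x86-64.so.2 on x86-64) locates, loads and maps all the necessary dynamic libraries into your application’s address space on behalf of your executable.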

ld-linux.so.2 will search for library files in the following ways:

  • Using the directories specified in DT_RPATH in the dynamic section of the binary (unless DT_RUNPATH is specified)
  • Using the directories listed in the environment variable LD_LIBRARY_PATH
  • Using the directories specified as DT_RUNPATH in the dynamic section of the binary, if present (see The One True Path section later)
  • From information in the cache file /etc/ld.so.cache, which is generated by ldconfig from the contents of /etc/ld.so.conf and /etc/ld.so.conf.d/*
  • Searching the default paths of /lib and /usr/lib (unless you do something crazy like using -z nodeflib to prevent default library usage)
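
The precedence rules in this list can be modelled with a few lines of C. This is only a sketch of the ordering described above, not the loader’s real implementation: the function name library_search_order and its parameters are made up for illustration, and each argument is treated as a single opaque entry rather than being split on colons.

```c
#include <stddef.h>

/* A value counts as "present" if it is neither NULL nor empty. */
static int is_set(const char *s)
{
    return s != NULL && s[0] != '\0';
}

/* Appends entry to out if there is room; returns the new count. */
static size_t push(const char *out[], size_t cap, size_t n, const char *entry)
{
    if (n < cap)
        out[n++] = entry;
    return n;
}

/* Hypothetical model of the search order listed above. Fills out with
 * the locations in the order they would be consulted and returns how
 * many entries were written. */
size_t library_search_order(const char *rpath, const char *ld_library_path,
                            const char *runpath,
                            const char *out[], size_t cap)
{
    size_t n = 0;

    /* DT_RPATH is only honoured when no DT_RUNPATH is present. */
    if (is_set(rpath) && !is_set(runpath))
        n = push(out, cap, n, rpath);
    if (is_set(ld_library_path))
        n = push(out, cap, n, ld_library_path);
    if (is_set(runpath))
        n = push(out, cap, n, runpath);

    n = push(out, cap, n, "/etc/ld.so.cache"); /* ldconfig cache */
    n = push(out, cap, n, "/lib");             /* default paths  */
    n = push(out, cap, n, "/usr/lib");
    return n;
}
```

Calling it with an rpath and LD_LIBRARY_PATH but no runpath puts the rpath first; adding a runpath drops the rpath entirely and moves LD_LIBRARY_PATH to the front.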

To find out which libraries your program depends on we can use the tool readelf, which is provided as part of the binutils package, to examine the dynamic section of the binary using the -d/--dynamic flag.

	$ readelf -d test
	Dynamic section at offset 0xf0c contains 25 entries:
	Tag Type Name/Value
	0x00000001 (NEEDED) Shared library: [lib1.so]
	0x00000001 (NEEDED) Shared library: [libc.so.6]
	0x0000000c (INIT) 0x80483f0
	0x0000000d (FINI) 0x8048614
	0x00000019 (INIT_ARRAY) 0x8049f00
	0x0000001b (INIT_ARRAYSZ) 4 (bytes)

We can see that our executable depends on libc and on our library lib1.so. We can use the -l/--segments flag to examine the program headers and see our binary’s request for the link loader:

	$ readelf -l test
	Elf file type is EXEC (Executable file)
	Entry point 0x8048470
	There are 9 program headers, starting at offset 52
	Program Headers:
	Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
	PHDR 0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
	INTERP 0x000154 0x08048154 0x08048154 0x00013 0x00013 R 0x1
	[Requesting program interpreter: /lib/ld-linux.so.2]
	LOAD 0x000000 0x08048000 0x08048000 0x00718 0x00718 R E 0x1000
	LOAD 0x000f00 0x08049f00 0x08049f00 0x00120 0x00124 RW 0x1000
	DYNAMIC 0x000f0c 0x08049f0c 0x08049f0c 0x000f0 0x000f0 RW 0x4
	...
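
Once the link loader has done its work, a process can also ask which objects ended up mapped into its address space. The sketch below uses dl_iterate_phdr(), which glibc provides as an extension in <link.h>; the helper names print_object and count_loaded_objects are made up for this example.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>

/* Callback invoked once per loaded object (the executable included). */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    (void)size; /* unused in this sketch */
    int *count = data;

    /* The main executable is usually reported with an empty name. */
    const char *name = "(main executable)";
    if (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
        name = info->dlpi_name;

    printf("%s mapped at base 0x%lx\n", name,
           (unsigned long)info->dlpi_addr);
    (*count)++;
    return 0; /* a non-zero return would stop the iteration */
}

/* Hypothetical helper: prints every loaded object, returns the total. */
int count_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

In a dynamically linked program the list would normally include the program interpreter itself alongside libc and any other NEEDED libraries.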

If we’re really curious we can dig deeper and use the -r/--relocs flag to examine the relocation section of our binary. We can use this to find out what functions need to be provided to the executable for it to function correctly; libc.so provides most of the entries in this table, but we can also see our call out to f1, which cannot be satisfied until lib1.so is found, hence the error.

	$ readelf -r test
	Relocation section '.rel.dyn' at offset 0x3c8 contains 1 entries:
	Offset Info Type Sym.Value Sym. Name
	08049ffc 00000306 R_386_GLOB_DAT 00000000 __gmon_start__
	Relocation section '.rel.plt' at offset 0x3d0 contains 4 entries:
	Offset Info Type Sym.Value Sym. Name
	0804a00c 00000207 R_386_JUMP_SLOT 00000000 puts
	0804a010 00000307 R_386_JUMP_SLOT 00000000 __gmon_start__
	0804a014 00000407 R_386_JUMP_SLOT 00000000 f1
	0804a018 00000507 R_386_JUMP_SLOT 00000000 __libc_start_main

So as expected, everything has worked and our library is being requested but it’s not currently being found. The easy way to test your project works is simply to override the LD_LIBRARY_PATH to point it to our libraries, a dot (.) will suffice to include the present working directory:

	$ LD_LIBRARY_PATH="." ./test
	In library 1
	In main

At this point you would typically package up your library to install to /lib or /usr/lib and voilà. Alternatively, install it elsewhere, add that directory to a file under /etc/ld.so.conf.d/ and run ldconfig to regenerate the cache based on your new settings. If you’re doing things “properly” then that’s pretty much the end of your journey. Congratulations! However, it’s never quite that simple…

The One True Path

There are times in shipping software when you need to deviate a little from the standard. Perhaps you want to ship a self-contained package that doesn’t depend on ldconfig, or to ensure your provided library gets used before the system tries to locate it elsewhere. To that end we can leverage the rpath and the runpath.

rpath vs runpath

As you’ll note from the list above, DT_RPATH takes precedence over all the other options.

This made it impossible to override the library search path, so the powers that be decided to implement a newer attribute known as DT_RUNPATH, which has lower precedence than LD_LIBRARY_PATH; as such, you can override a runpath with LD_LIBRARY_PATH.

rpath and runpath are conceptually the same thing: a list of alternate locations embedded within the executable that will be searched by the dynamic linker at run time. Lacking a runpath, the older rpath will be used.

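To specify both an rpath and a runpath, you must tell the linker the list of alternate locations using -Wl,-rpath=/my/location and should also pass it --enable-new-dtags (through -Wl, as it is a linker option rather than a GCC one). This last parameter causes it to embed the value as the runpath as well as the rpath; without it, a linker that defaults to the old tags writes only the rpath, which causes the no-override problem mentioned above. Note that newer binutils releases may emit only RUNPATH when the new tags are enabled, and some distributions enable them by default.

	$ gcc -L$(pwd) -Wl,-rpath=$(pwd),--enable-new-dtags -o test main.c -l1 -l2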

To view the embedded paths we can use trusty old readelf again to examine the dynamic section of our binary and verify they match expectations:

	$ readelf -d test
	Dynamic section at offset 0xefc contains 27 entries:
	  Tag        Type                         Name/Value
	 0x00000001 (NEEDED)                     Shared library: [lib1.so]
	 0x00000001 (NEEDED)                     Shared library: [lib2.so]
	 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
	 0x0000000f (RPATH)                      Library rpath: [/home/nick/Development/libraryTutorial]
	 0x0000001d (RUNPATH)                    Library runpath: [/home/nick/Development/libraryTutorial]
	 0x0000000c (INIT)                       0x804843c
	 0x0000000d (FINI)                       0x8048674
	 0x00000019 (INIT_ARRAY)                 0x8049ef0
	 0x0000001b (INIT_ARRAYSZ)               4 (bytes)
	 0x0000001a (FINI_ARRAY)                 0x8049ef4
	 0x0000001c (FINI_ARRAYSZ)               4 (bytes)
	...

Libraries on Linux… nothing to it eh?
