Modules and Libraries in C

For small programs, we usually write a single source file, compile it and run it. But for a big codebase, a modular approach works better: we group code that does related work into separate files and include those files in the main source file. This is called modular programming. According to Wikipedia:

Modular programming is a software design technique that emphasizes separating the functionality of a program into independent, interchangeable modules, such that each contains everything necessary to execute only one aspect of the desired functionality.

What are the advantages?

  1. Easy to understand. The entire code which was initially in one single file is now separated among different modules each having its own logic.
  2. Easier to test and debug. (Anything to get rid of those segfaults XD)
  3. Easier to reuse code.
  4. Easier to make changes.
  5. Easier to maintain code.

But how to bring those files together?

So, when we divide the program into separate source files, some questions arise:

  1. How can a function in one file call a function that’s defined in another file?
  2. How can a function access an external variable in another file?
  3. How can those files share the same macro definitions and type definitions?

The solution is the #include preprocessor directive. What does it do? In simple words, it opens the specified file and inserts its contents into the current file. Write a simple C program like this and save it:

//test.c
#include<stdio.h>
int main(){
	printf("Hello World!");
	return 0;
}

Now run this command:

$ gcc -E test.c

This runs only the preprocessor, which expands all the macros and included header files and prints the (very large) result to the terminal. The extra lines come from the stdio.h header that we included. Similarly, if we write separate modules and pull them in using the #include directive, they will be expanded automatically for us during compilation. Files meant to be included this way are called header files; in this case, the header file is stdio.h. By convention their extension is .h rather than .c.

Now there are two ways to use the #include directive.

  1. #include <filename>
  2. #include "filename"

The only difference is where the compiler looks for these header files.

In case 1: it searches the system include directories where the standard header files reside. On Linux this is generally /usr/include.

In case 2: it first searches the directory of the source file that contains the #include, and then falls back to the same directories as case 1.

TIP: You can use a relative path in the #include directive to point at the exact location of a module. Why a relative path? Because the project keeps building when it is moved or cloned somewhere else.

In the same way we can import macros and typedefs as well.

Let’s try our hands at a simple project.

Sample Project 1: Bool Library

Before C99, C had no built-in bool type, so programs often defined their own true and false values. Instead of repeating those definitions in every file, we can take a modular approach and define them once.

  1. Create a file boolean.h and write the definitions for bool:
#define TRUE 1
#define FALSE 0

typedef int BOOL; // BOOL is nothing but an int value
  2. Now create a C program that checks whether a number is even, and include boolean.h in it:
#include <stdio.h>
#include "boolean.h"

BOOL isEven(int number) {
    return (number % 2 == 0);
}
int main() {
    int num;
    printf("Enter a number: ");
    scanf("%d", &num); // scanf's format string describes the input, it does not print a prompt
    if (isEven(num) == TRUE) {
        printf("EVEN!\n");
    } else {
        printf("ODD!\n");
    }
    return 0;
}

Now compile and run the code. You can see the benefits of using a modular approach now:

  1. Let’s say you have used these macros in many files and need to change something. Previously you would have to make the change by hand in every place. With the modular approach you change only the header file, and the change is picked up everywhere the next time you compile.
  2. Code reusability is another big advantage.

NOTE: In C99 and later versions we have stdbool.h, which does pretty much the same thing. Here is the source: stdbool.h. Looking into the code, you will notice it contains just a handful of preprocessor directives, mostly #define lines.

But how to separate and share functions?

Sharing function prototypes

Let’s say you have a file foo.c which has a function isEven(), and you want to use it in the file bar.c. It is important that bar.c sees the function’s prototype. Otherwise, an older (C89) compiler is forced to assume that the function returns int and to guess the number and types of its parameters from the call, while C99 and later compilers warn about the call or reject it. The compiler’s guess might not match the real definition and can lead to errors. So what to do in this situation?

Put the function’s prototype in a separate header file foo.h, then include the header file everywhere isEven() is called. Besides including the header in bar.c (the caller), we also need to include it in foo.c, so that the compiler can check the prototype against the definition.

Sometimes we need to protect a header file using conditional preprocessor directives. More about this is discussed below.

NOTE: If both foo.c and bar.c need a common library header, include it in foo.h rather than in both foo.c and bar.c.

Okay now you can compile these files in this way:

$ gcc foo.c bar.c -o bar

This will generate an executable binary named bar.

But there is a catch!

Let’s say foo.c is really big. With the previous approach, every time we compile bar.c we also recompile foo.c. This can make the build really slow, and if nothing in foo.c has changed, compiling it again makes no sense. What we can do instead is create a static library from foo.c, and every time we compile bar.c we just link against the library. This saves us from recompiling foo.c.

Creating Static Libraries

Static libraries, also called “archives”, are just collections of object files that contain functions. Basically, all the functions within the library are organised and indexed by symbol name. Static libraries are joined to the main module of a program during the linking stage, before the binary executable file is created. After a static library has been linked successfully, the executable file contains both the main program and the library code.

Creating static library

To create the library we use ar. It is just an archiving tool: it takes object files, bundles them together (without compressing them) and generates an archive file (a file that ends with .a), which is the static library.

But before that we need an object file (one that ends with .o). We can use the gcc compiler to generate one, but we need it to stop after the assembling stage, before linking. So we run it with the -c flag:

$ gcc -c foo.c # will produce a foo.o object file

Now we can use the ar command.

$ ar -rc libfoo.a foo.o # will produce a libfoo.a library

Once the static library is ready, you can check the symbols and the index table using nm like this:

$ nm -Cs libfoo.a

Or you can use the ar -t command to list the contents of the archive:

$ ar -t libfoo.a

NOTE: The name of the library should start with “lib” and end with the “.a” extension. The reason is explained below.

Compiling the source and linking the generated library

Now, we want to compile bar.c and link it with the libfoo.a library. So, we need to run this command:

$ gcc bar.c -L. -lfoo -o bar

Let’s break it down:

  • -L. adds the current directory (.) to the list of directories searched for libraries
  • -lfoo links the library “foo”. The linker automatically adds lib before the name and .a after it and then searches for that file. In this case it looks for libfoo.a, which explains why we named it that way before. (If a shared library libfoo.so is also on the search path, it is normally preferred.)
  • -o sets the name and path of the final output binary

This command will compile bar.c, link it with the static library libfoo.a and generate the binary bar.

Protecting Header files using conditional Macros

In C, the preprocessor has conditional directives like #ifdef, #ifndef, #else, #elif and #endif (python vibes ikr!). But why do we need these?

Apart from helping with conditional compilation, they are needed to protect header files from multiple inclusion. If a header file is included more than once, everything in it is declared more than once as well. Now, including the same header file more than once doesn’t always cause a compilation error (repeating identical macros or prototypes is fine, but repeating a type definition such as a struct can cause an error). Still, it is always good practice to protect the header file. It also saves the compiler from processing the same header contents again.

The standard way to prevent this is to enclose the entire real contents of the file in a conditional:

/* File foo.h  */
#ifndef FOO_H
#define FOO_H

//the
//code
//body
//here

#endif /* FOO_H */

This construct is commonly known as a wrapper #ifndef. When the header is included again, the conditional will be false, because FOO_H is defined. The preprocessor will skip over the entire contents of the file, and the compiler will not see it twice.

The comment following the #endif is not required, but it is a good practice if there is a lot of controlled text, because it helps people match the #endif to the corresponding #ifndef.

Trying to understand why this works!

| Statement | Equivalent | Description |
|---|---|---|
| #ifndef FOO_H | if not defined FOO_H | It is checking if the macro FOO_H is defined or not |
| #define FOO_H | define FOO_H | Since it was prev not defined it will enter the wrapper, and then execute this step which is basically defining a macro |
| code-body | code-body | this will be the entire body of the code |
| #endif /* FOO_H */ | end | this is the ending of the wrapper |

Now, if the same header file is included again, the preprocessor will skip everything inside the wrapper because the macro is already defined. Thus the contents will be declared only once.