Windows Library Code
Intro
I decided to write a guide about Windows library code. The target audience is beginners who want to understand more about Windows reverse engineering, development and compilation. I tried to keep this guide as simple as possible.
A “library” is a computer science term for a collection of pre-written code and variables. Libraries are useful to developers because they save development time.
There are 2 types of libraries:
1) Static Libraries - Library code that is added to the client executable at link time.
2) Dynamic Libraries (DLLs) - Library code that is loaded from another file at runtime.
In this guide, I will explain both types, how they work, and how we can reverse engineer them.
The Empty Binary
Let’s look at the simplest example. This code is an empty main() function:
// SimplestExecutable.c
int main() {
return 0;
}
This code does not depend on any other function. Using Visual Studio, we can remove the dependency on the standard library (the C Runtime Library), for example by linking with /NODEFAULTLIB and setting the entry point with /ENTRY:main. The resulting binary does not have an import table. This is the only function in the binary:
.text:0000000140001000 main proc near
.text:0000000140001000 xor eax, eax
.text:0000000140001002 retn
.text:0000000140001002 main endp
Although this binary does not have an import table, several DLLs will be loaded:
- ntdll.dll: This DLL is loaded into every process in the system (excluding WSL processes, because they are Pico processes).
- kernel32.dll: This DLL is loaded automatically into every Win32 subsystem process.
- kernelbase.dll: This DLL contains some functions imported by kernel32.dll
The Bare Program: Many Object Files
Let’s look at another simple example. This example contains 2 object files:
// SimpleCalculator.h
// Header file of the SimpleCalculator functions
//
#pragma once
int add(int x, int y);
int sub(int x, int y);
.
// SimpleCalculator.c
// The implementation of the SimpleCalculator functions
#include "SimpleCalculator.h"
int add(int x, int y) {
return x + y;
}
int sub(int x, int y) {
return x - y;
}
.
// SimpleCalculatorMain.c
#include "SimpleCalculator.h"
int main() {
int x = add(10, 20);
int y = sub(30, 40);
return add(x, y);
}
This simple program still does not use any external functions. Each C file is compiled into an object file that contains the code from that C source file. Object files use the COFF format. PE is based on COFF: it wraps COFF with the MS-DOS header and the PE signature, and “abuses” the COFF optional header (_IMAGE_OPTIONAL_HEADER).
To disassemble object files we can use the dumpbin tool or IDA Pro.
Let’s look at the contents of the object files:
The functions in SimpleCalculator.c are compiled into SimpleCalculator.obj as expected:
>dumpbin /disasm SimpleCalculator.obj
Dump of file SimpleCalculator.obj
File Type: COFF OBJECT
add:
0000000000000000: 89 54 24 10 mov dword ptr [rsp+10h],edx
0000000000000004: 89 4C 24 08 mov dword ptr [rsp+8],ecx
0000000000000008: 55 push rbp
0000000000000009: 48 83 EC 40 sub rsp,40h
000000000000000D: 48 8B EC mov rbp,rsp
0000000000000010: 8B 45 58 mov eax,dword ptr [rbp+58h]
0000000000000013: 8B 4D 50 mov ecx,dword ptr [rbp+50h]
0000000000000016: 03 C8 add ecx,eax
0000000000000018: 8B C1 mov eax,ecx
000000000000001A: 48 8D 65 40 lea rsp,[rbp+40h]
000000000000001E: 5D pop rbp
000000000000001F: C3 ret
sub:
0000000000000000: 89 54 24 10 mov dword ptr [rsp+10h],edx
0000000000000004: 89 4C 24 08 mov dword ptr [rsp+8],ecx
0000000000000008: 55 push rbp
0000000000000009: 48 83 EC 40 sub rsp,40h
000000000000000D: 48 8B EC mov rbp,rsp
0000000000000010: 8B 45 58 mov eax,dword ptr [rbp+58h]
0000000000000013: 8B 4D 50 mov ecx,dword ptr [rbp+50h]
0000000000000016: 2B C8 sub ecx,eax
0000000000000018: 8B C1 mov eax,ecx
000000000000001A: 48 8D 65 40 lea rsp,[rbp+40h]
000000000000001E: 5D pop rbp
000000000000001F: C3 ret
Let’s look at SimpleCalculatorMain.obj:
>dumpbin /disasm SimpleCalculatorMain.obj
Dump of file SimpleCalculatorMain.obj
File Type: COFF OBJECT
main:
0000000000000000: 40 55 push rbp
0000000000000002: 48 83 EC 70 sub rsp,70h
0000000000000006: 48 8D 6C 24 20 lea rbp,[rsp+20h]
000000000000000B: BA 14 00 00 00 mov edx,14h
0000000000000010: B9 0A 00 00 00 mov ecx,0Ah
0000000000000015: E8 00 00 00 00 call add ; <----------
000000000000001A: 89 45 00 mov dword ptr [rbp],eax
000000000000001D: BA 28 00 00 00 mov edx,28h
0000000000000022: B9 1E 00 00 00 mov ecx,1Eh
0000000000000027: E8 00 00 00 00 call sub ; <-------------
000000000000002C: 89 45 04 mov dword ptr [rbp+4],eax
000000000000002F: 8B 55 04 mov edx,dword ptr [rbp+4]
0000000000000032: 8B 4D 00 mov ecx,dword ptr [rbp]
0000000000000035: E8 00 00 00 00 call add ; <-------------
000000000000003A: 48 8D 65 50 lea rsp,[rbp+50h]
000000000000003E: 5D pop rbp
000000000000003F: C3 ret
Ok, we can see the implementation of the main() function in this object file. As you can see - there are references to the add and sub functions. Sharp readers will notice something weird about these call instructions:
0000000000000015: E8 00 00 00 00 call add
The “E8” opcode is a relative call instruction: its 4-byte offset is relative to the address of the next instruction. Here the offset is 0 because the compiler does not know where the “add” function will be located. This is only known later, at link time, when the linker replaces the zeros with the actual offsets.
How does the linker know which bytes should be replaced?
Each object file has a symbol table that lists the symbols it defines (its “exports”) and the symbols it references from elsewhere (its “imports”).
Let’s examine this table inside SimpleCalculatorMain.obj:
>dumpbin /symbols SimpleCalculatorMain.obj
Dump of file SimpleCalculatorMain.obj
File Type: COFF OBJECT
COFF SYMBOL TABLE
..... (truncated)
.....
00E 00000000 UNDEF notype () External | add
00F 00000000 UNDEF notype () External | sub
010 00000000 SECT3 notype () External | main
....
.... (truncated)
We can see that the table contains “add”, “sub” and “main”. The “main” function is located in SECTION 3. The “add” and “sub” functions are declared as “UNDEF” - this means these symbols will be looked up in the global namespace later during linkage.
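To make the “UNDEF” idea a bit more concrete, here is a minimal sketch of how one record of the COFF symbol table could be inspected. The struct is a hand-written copy of the 18-byte record (the real definition is IMAGE_SYMBOL in winnt.h), and the names coff_symbol and needs_resolution are made up for this example.

```c
#include <stdint.h>

/* Hand-written layout of one 18-byte COFF symbol record.
   Illustration only; the real definition is IMAGE_SYMBOL in winnt.h. */
#pragma pack(push, 1)
typedef struct {
    char     name[8];        /* short name (long names use the string table) */
    uint32_t value;          /* offset inside the section, 0 for UNDEF */
    int16_t  section_number; /* 0 = undefined, 1..n = section index */
    uint16_t type;           /* 0x20 marks a function, shown as "()" by dumpbin */
    uint8_t  storage_class;  /* 2 = external */
    uint8_t  aux_count;      /* number of auxiliary records that follow */
} coff_symbol;
#pragma pack(pop)

#define SYM_UNDEFINED      0 /* IMAGE_SYM_UNDEFINED */
#define SYM_CLASS_EXTERNAL 2 /* IMAGE_SYM_CLASS_EXTERNAL */

/* Returns 1 when the linker has to find this symbol in another object file.
   (An undefined external with a non-zero value is a "common" symbol instead.) */
int needs_resolution(const coff_symbol *sym) {
    return sym->storage_class == SYM_CLASS_EXTERNAL &&
           sym->section_number == SYM_UNDEFINED &&
           sym->value == 0;
}
```

With the table above, needs_resolution would return 1 for “add” and “sub” and 0 for “main”, which lives in section 3.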
After finding the functions in the global namespace, the linker needs to update all the references to these symbols (the calls we saw before) with the updated offsets. This is why the object file also contains a relocation table:
>dumpbin /relocations SimpleCalculatorMain.obj
Dump of file SimpleCalculatorMain.obj
File Type: COFF OBJECT
RELOCATIONS #3
Symbol Symbol
Offset Type Applied To Index Name
-------- ---------------- ----------------- -------- ------
00000016 REL32 00000000 E add
00000028 REL32 00000000 F sub
00000036 REL32 00000000 E add
...
...
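As you can see, each reference to a symbol is added to the relocation table so the linker will be able to fix the offset to the actual location of the function. Note that each relocation offset points one byte past the E8 opcode, at the 4 bytes that hold the call offset.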
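Here is a rough sketch of what the linker does with one REL32 entry. The function name apply_rel32 and its parameters are made up for this example, and the code assumes a little-endian host (like x64). It is not how link.exe is actually implemented, only the arithmetic behind it.

```c
#include <stdint.h>
#include <string.h>

/* Patch one REL32 fixup (sketch, assumes a little-endian host).
   section      - bytes of the code section being linked
   section_va   - address the section gets in the final image
   reloc_offset - the "Offset" column of the relocation entry
   target_va    - final address of the referenced symbol */
void apply_rel32(uint8_t *section, uint64_t section_va,
                 uint32_t reloc_offset, uint64_t target_va) {
    /* The CPU adds the displacement to the address of the NEXT instruction,
       which starts right after the 4 displacement bytes. */
    uint64_t next_instruction = section_va + reloc_offset + 4;

    /* The bytes already stored there are an addend ("Applied To" column). */
    int32_t addend;
    memcpy(&addend, section + reloc_offset, sizeof addend);

    int64_t delta = (int64_t)(target_va - next_instruction);
    int32_t displacement = (int32_t)delta + addend;
    memcpy(section + reloc_offset, &displacement, sizeof displacement);
}
```

For example, if main is placed at 0x140001040 and add at 0x140001000, the fixup at offset 0x16 becomes 0x140001000 - (0x140001040 + 0x16 + 4) = -0x5A, which is stored as the bytes A6 FF FF FF.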
After the linker fixes the offsets, the main function looks like this:
>dumpbin /disasm SimpleCalculator.exe
Dump of file SimpleCalculator.exe
File Type: EXECUTABLE IMAGE
add:
...
... (truncated)
...
sub:
...
... (truncated)
...
main:
0000000140001040: 40 55 push rbp
0000000140001042: 48 83 EC 70 sub rsp,70h
0000000140001046: 48 8D 6C 24 20 lea rbp,[rsp+20h]
000000014000104B: BA 14 00 00 00 mov edx,14h
0000000140001050: B9 0A 00 00 00 mov ecx,0Ah
0000000140001055: E8 A6 FF FF FF call add ; <-------
000000014000105A: 89 45 00 mov dword ptr [rbp],eax
000000014000105D: BA 28 00 00 00 mov edx,28h
0000000140001062: B9 1E 00 00 00 mov ecx,1Eh
0000000140001067: E8 B4 FF FF FF call sub ; <-------
000000014000106C: 89 45 04 mov dword ptr [rbp+4],eax
000000014000106F: 8B 55 04 mov edx,dword ptr [rbp+4]
0000000140001072: 8B 4D 00 mov ecx,dword ptr [rbp]
0000000140001075: E8 86 FF FF FF call add ; <-------
000000014000107A: 48 8D 65 50 lea rsp,[rbp+50h]
000000014000107E: 5D pop rbp
000000014000107F: C3 ret
As you can see, the function looks exactly the same, but the call offsets are now filled in.
Whole Program Optimization
The “add” and “sub” functions are pretty small, why not inline them?
Inlining small functions has runtime benefits because it does not require the CPU to perform a CALL instruction.
In C and C++, “link time optimization” lets the linker perform all sorts of optimizations across object files. The issue with normal object files is that the linker does not have enough information to perform these optimizations. Optimization is typically performed on the compiler’s internal intermediate representation of the program (think of it as the code between your source code and the machine code), so it is not possible with normal object files, which contain only machine code. The linker is not clever enough to modify machine code after it has been generated.
We can achieve link time optimization by using the /GL compiler flag (together with the /LTCG linker flag). /GL instructs the compiler to add more information to the object file. This extra information is highly dependent on the compiler version, because it typically means that the compiler does not emit machine code but its intermediate representation. As a result, tools like dumpbin and IDA Pro can no longer disassemble the object file. This flag should only be used when the code is compiled and linked with the same toolset, typically on the same computer.
In this case, link time optimization can even evaluate these functions at build time: add(10, 20) is 30, sub(30, 40) is -10, and their sum is 20 (0x14). This results in this binary:
public main
main proc near
mov eax, 0x14
retn
main endp
This is simply beautiful.
Static Library Development
Let’s start by exploring the simpler type of library: the “static library”. As we saw earlier, the linker is responsible for taking the object files produced by the compiler and combining them into the final executable file. Each object file represents a compiled C file. Let’s say I wrote a library that implements a calculator (just like SimpleCalculator from before..) and I want to give this library to my friend. Theoretically, if the implementation were a single file, I could compile it and give my friend the object file so he could add it as a linker input. This would add the code of the library to his executable, just like in the previous example. The problem is: what if it’s more than 1 file?
Let’s say I have this “library”:
// Add.c
int add(int x, int y) {
return x + y;
}
.
// Sub.c
int sub(int x, int y) {
return x - y;
}
After compilation there are 2 object files: add.obj, sub.obj. I could give them to my friend, but it’s not scalable - What if I had 100 files?
Maybe there is a way to merge these object files into 1 file somehow..
It turns out there is - it’s called a static library (.LIB file in Windows). Static libraries allow the developer to add a bunch of symbols to the linker namespace easily. This can be done by changing the Configuration Type in Visual Studio to “Static Library”. The build will then produce a “.LIB” file from the object files (using the librarian, lib.exe) instead of generating an executable file.
The .LIB file is simply an archive of object files. The archive uses the AR format. On Unix, linkers can read AR archives and extract object files from them. On Windows the idea is similar: we add the LIB file as a linker input and the linker pulls the object files it needs out of it. (On Windows the format is a bit different, though..)
To reverse engineer LIB files, you can use “dumpbin /disasm” or load the file into IDA Pro. Here is example output of dumpbin /disasm:
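To show how simple the archive container is, here is a minimal sketch that walks the member headers of an archive that was already read into memory. It relies on the classic AR layout (an 8-byte “!<arch>\n” signature followed by 60-byte member headers). The function name count_archive_members is made up for this example, and the sketch does not interpret the special linker members that Microsoft puts at the start of a LIB file; it just counts them like any other member.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define AR_MAGIC      "!<arch>\n"
#define AR_MAGIC_LEN  8
#define AR_HEADER_LEN 60 /* name[16] date[12] uid[6] gid[6] mode[8] size[10] end[2] */

/* Count the members of an in-memory archive, or return -1 if it is malformed. */
int count_archive_members(const unsigned char *buf, size_t len) {
    if (len < AR_MAGIC_LEN || memcmp(buf, AR_MAGIC, AR_MAGIC_LEN) != 0)
        return -1;

    size_t pos = AR_MAGIC_LEN;
    int count = 0;
    while (pos + AR_HEADER_LEN <= len) {
        const unsigned char *hdr = buf + pos;
        if (hdr[58] != '`' || hdr[59] != '\n') /* end-of-header marker */
            return -1;

        /* The member size is stored as space-padded decimal text. */
        char size_text[11];
        memcpy(size_text, hdr + 48, 10);
        size_text[10] = '\0';
        size_t size = (size_t)strtoul(size_text, NULL, 10);

        pos += AR_HEADER_LEN;
        if (size > len - pos)
            return -1;
        pos += size;
        if (pos & 1)
            pos++; /* members start on even offsets */
        count++;
    }
    return count;
}
```

Each member body that follows a header is (apart from the special members) a plain COFF object file, which is why dumpbin can disassemble a LIB file directly.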
>dumpbin /disasm StaticCalculator.lib
Microsoft (R) COFF/PE Dumper Version 14.22.27905.0
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file StaticCalculator.lib
File Type: LIBRARY
sub:
0000000000000000: 89 54 24 10 mov dword ptr [rsp+10h],edx
0000000000000004: 89 4C 24 08 mov dword ptr [rsp+8],ecx
0000000000000008: 57 push rdi
0000000000000009: 8B 44 24 18 mov eax,dword ptr [rsp+18h]
000000000000000D: 8B 4C 24 10 mov ecx,dword ptr [rsp+10h]
0000000000000011: 2B C8 sub ecx,eax
0000000000000013: 8B C1 mov eax,ecx
0000000000000015: 5F pop rdi
0000000000000016: C3 ret
add:
0000000000000000: 89 54 24 10 mov dword ptr [rsp+10h],edx
0000000000000004: 89 4C 24 08 mov dword ptr [rsp+8],ecx
0000000000000008: 57 push rdi
0000000000000009: 8B 44 24 18 mov eax,dword ptr [rsp+18h]
000000000000000D: 8B 4C 24 10 mov ecx,dword ptr [rsp+10h]
0000000000000011: 03 C8 add ecx,eax
0000000000000013: 8B C1 mov eax,ecx
0000000000000015: 5F pop rdi
0000000000000016: C3 ret
LIB files have one more use: they can be used to add imports to the import table (“import library”). We will introduce this concept later, after explaining dynamic libraries.
Do not use “Whole Program Optimization” with LIB files unless you are compiling and linking with the same toolset. If you compile something with the /GL flag and give the LIB file to someone with a different MSVC version, the object files inside may not be usable by their linker, so expect trouble.
Dynamic Library Development
As we said before, dynamic libraries are libraries that are loaded from another file (typically the DLL file). Dynamic libraries have a couple of advantages over static libraries:
- Save space in disk for shared libraries - Many executables can use the same DLL file (for example, kernel32.dll)
- Save space in memory for shared libraries - The virtual memory manager can share the physical pages of a DLL between processes, and uses the copy-on-write mechanism to give a process its own copy of a page only when it writes to it.
- Minimize load time for shared libraries - Because pages can be reused (if they are already in memory), the load time may be better than a bigger executable.
- Allow updating the library code without recompilation of the client executables - simply replace the DLL on disk
The main disadvantages of dynamic libraries:
- DLL Hell - could cause “DLL not found” errors and compatibility issues
- Runtime speed (in certain cases) - In static libraries, the linker handles the linking of the library. Because the code of the library is embedded inside the executable, the linker can optimize stuff - for example: inline functions.
In Windows, DLLs can be loaded in 2 ways:
- Import Table - The PE file format has an “import table”. This is a table that contains references to dynamic libraries that will be loaded at runtime by the Windows loader.
- Calling LoadLibrary() with a path - The LoadLibrary function can be called to load a DLL file from the file system.
Creating a dynamic library
Let’s create a simple DLL. This DLL implements the Calculator interface (ahhh again..)
// DynamicCalculator.c
#include <Windows.h>
BOOL APIENTRY DllMain( HMODULE hModule,
DWORD ul_reason_for_call,
LPVOID lpReserved
)
{
switch (ul_reason_for_call)
{
case DLL_PROCESS_ATTACH:
case DLL_THREAD_ATTACH:
case DLL_THREAD_DETACH:
case DLL_PROCESS_DETACH:
break;
}
return TRUE;
}
__declspec(dllexport) int add(int x, int y) {
return x + y;
}
__declspec(dllexport) int sub(int x, int y) {
return x - y;
}
This is a simple implementation of our DLL. The DLL has a DllMain function, which is called by the Windows loader in several cases:
- DLL_PROCESS_ATTACH: The DLL is loaded
- DLL_PROCESS_DETACH: The DLL is unloaded
- DLL_THREAD_ATTACH: A new thread is created. The call is made in the context of the new thread
- DLL_THREAD_DETACH: A thread is exiting cleanly.
It’s funny, but the initialization of a DLL is not always performed inside the DllMain function. The reason is that there are many restrictions on what DllMain may do, because it runs while the loader lock is held. The easiest thing to do is to expose an “Initialize” function that will perform the actual initialization of the state of the DLL.
Each DLL has an export table. The export table contains all the exported functions of the DLL: the name and location of each function. (Functions can also be exported by an “ordinal” number, but we won’t talk about this.)
“__declspec” is a Microsoft-specific attribute that allows the developer to specify a “storage class” for declarations. “dllexport” is a storage class that adds the function to the export table of a DLL. The developer can also use a “.def” file to define the names of the exports, but using dllexport is sometimes easier.
Compiling this code results in a DLL file that contains:
- C runtime library code
- “add” export
- “sub” export
A DLL file is in the PE format. The only difference between an executable PE and a DLL is a single flag (IMAGE_FILE_DLL in the Characteristics field of the file header). Most of the tools that can be used to explore an executable file can also be used to explore DLL files, including:
- IDA
- CFF Explorer
- pestudio
- dumpbin
- ….
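If you want to check the DLL flag by hand instead of using one of these tools, the following sketch shows roughly where it lives. It walks the layers described earlier (MS-DOS header, PE signature, COFF file header) in a buffer that holds the beginning of the file. The function names pe_is_dll and read_le are made up for this example, and 0x2000 is the value of IMAGE_FILE_DLL from winnt.h.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILE_CHARACTERISTIC_DLL 0x2000 /* IMAGE_FILE_DLL in winnt.h */

/* Read a little-endian integer without assuming host alignment or endianness. */
static uint32_t read_le(const unsigned char *p, int bytes) {
    uint32_t v = 0;
    for (int i = bytes - 1; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Returns 1 for a DLL, 0 for another PE image, -1 if this is not a PE file. */
int pe_is_dll(const unsigned char *image, size_t len) {
    if (len < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return -1;                                 /* MS-DOS header */
    uint32_t pe_offset = read_le(image + 0x3C, 4); /* e_lfanew */
    if (pe_offset > len || len - pe_offset < 4 + 20)
        return -1;
    if (memcmp(image + pe_offset, "PE\0\0", 4) != 0)
        return -1;                                 /* PE signature */
    /* The 20-byte COFF file header follows; Characteristics is its last field. */
    uint32_t characteristics = read_le(image + pe_offset + 4 + 18, 2);
    return (characteristics & FILE_CHARACTERISTIC_DLL) ? 1 : 0;
}
```

Reading the first few hundred bytes of DynamicCalculator.dll into a buffer and passing it to pe_is_dll should return 1, while the same call on a client executable should return 0.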
Let’s see how we can use the DynamicCalculator DLL.
Using a DLL with LoadLibrary() and GetProcAddress()
The Win32 API provides 2 useful functions to clients of DLLs:
- HMODULE LoadLibraryW(LPCWSTR lpLibFileName) - Loads a library from the file system (kernelbase!LoadLibraryW -> kernelbase!LoadLibraryExW -> ntdll!LdrLoadDll). If the DLL is not loaded yet, calling this function also invokes the “DllMain” procedure of the DLL we are loading.
- FARPROC GetProcAddress(HMODULE hModule, LPCSTR lpProcName) - Looks up the export table of a module and returns the address of an exported function.
Example Client Program (Ignoring errors for simplicity..)
// DynamicCalculatorClient.c
#include <windows.h>
// define pointer types
typedef int (*ptr_add)(int x, int y);
typedef int (*ptr_sub)(int x, int y);
// pointers to export functions
ptr_add add;
ptr_sub sub;
HMODULE dynamicCalculator;
int main() {
// Load the library
dynamicCalculator = LoadLibraryA("DynamicCalculator.dll");
add = (ptr_add)GetProcAddress(dynamicCalculator, "add");
sub = (ptr_sub)GetProcAddress(dynamicCalculator, "sub");
// Actual logic of the client program
int x = add(10, 20);
int y = sub(30, 20);
int sum = add(x, y);
// Free the library
FreeLibrary(dynamicCalculator);
return sum;
}
Notice that this method requires us to:
- Declare the prototypes of the exported functions
- Call “LoadLibrary” with the name of the DLL
- Call GetProcAddress() with the function names we are about to use and save the function pointers in a variable (global or local)
- Call the functions through these pointers
Let’s look at the disassembly of our program:
>dumpbin DynamicCalculatorClient.exe /disasm
...
...
main:
00000001400116C0: 40 55 push rbp
00000001400116C2: 57 push rdi
00000001400116C3: 48 81 EC 48 01 00 sub rsp,148h
00
00000001400116CA: 48 8D 6C 24 20 lea rbp,[rsp+20h]
00000001400116CF: 48 8B FC mov rdi,rsp
00000001400116D2: B9 52 00 00 00 mov ecx,52h
00000001400116D7: B8 CC CC CC CC mov eax,0CCCCCCCCh
00000001400116DC: F3 AB rep stos dword ptr [rdi]
00000001400116DE: 48 8D 0D CB 84 00 lea rcx,[??_C@_0BG@JAKNIHCK@DynamicCalculator?4dll@] ; "DynamicCalculator.dll"
00
00000001400116E5: FF 15 25 D9 00 00 call qword ptr [__imp_LoadLibraryA]
00000001400116EB: 48 89 05 D6 B1 00 mov qword ptr [dynamicCalculator],rax
00
00000001400116F2: 48 8D 15 D3 84 00 lea rdx,[??_C@_03BDGOHNNK@add@] ; "add"
00
00000001400116F9: 48 8B 0D C8 B1 00 mov rcx,qword ptr [dynamicCalculator]
00
0000000140011700: FF 15 02 D9 00 00 call qword ptr [__imp_GetProcAddress]
0000000140011706: 48 89 05 DB B1 00 mov qword ptr [add],rax
00
000000014001170D: 48 8D 15 BC 84 00 lea rdx,[??_C@_03KCMAIMAP@sub@] ; "sub"
00
0000000140011714: 48 8B 0D AD B1 00 mov rcx,qword ptr [dynamicCalculator]
00
000000014001171B: FF 15 E7 D8 00 00 call qword ptr [__imp_GetProcAddress]
0000000140011721: 48 89 05 B8 B1 00 mov qword ptr [sub],rax
00
0000000140011728: BA 14 00 00 00 mov edx,14h
000000014001172D: B9 0A 00 00 00 mov ecx,0Ah
0000000140011732: FF 15 B0 B1 00 00 call qword ptr [add]
0000000140011738: 89 45 04 mov dword ptr [rbp+4],eax
000000014001173B: BA 14 00 00 00 mov edx,14h
0000000140011740: B9 1E 00 00 00 mov ecx,1Eh
0000000140011745: FF 15 95 B1 00 00 call qword ptr [sub]
000000014001174B: 89 45 24 mov dword ptr [rbp+24h],eax
000000014001174E: 8B 55 24 mov edx,dword ptr [rbp+24h]
0000000140011751: 8B 4D 04 mov ecx,dword ptr [rbp+4]
0000000140011754: FF 15 8E B1 00 00 call qword ptr [add]
000000014001175A: 89 45 44 mov dword ptr [rbp+44h],eax
000000014001175D: 48 8B 0D 64 B1 00 mov rcx,qword ptr [dynamicCalculator]
00
0000000140011764: FF 15 96 D8 00 00 call qword ptr [__imp_FreeLibrary]
000000014001176A: 8B 45 44 mov eax,dword ptr [rbp+44h]
000000014001176D: 48 8D A5 28 01 00 lea rsp,[rbp+128h]
00
0000000140011774: 5F pop rdi
0000000140011775: 5D pop rbp
0000000140011776: C3 ret
...
...
As you can see in this code, the calls to functions from other DLLs are made through pointers:
0000000140011732: FF 15 B0 B1 00 00 call qword ptr [add]
This makes sense because the compiler (and even the linker) cannot know the address of the “add” function, since the DLL is loaded at runtime. That’s why calls to DLL functions have to be made through pointers. The opcode “FF 15 <32-bit offset>” means: load the value from *(RIP+offset), where RIP is the address of the next instruction, then perform a “call” to that address.
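The RIP-relative arithmetic is easy to get wrong by hand, so here is a tiny sketch of it. The function name rip_relative_slot is made up for this example, and it only covers the 6-byte “FF 15” form.

```c
#include <stdint.h>

/* Address of the pointer slot read by a 6-byte "FF 15 disp32" call.
   instruction_va - address of the FF byte
   disp32         - the 4 bytes after FF 15, read as a little-endian signed value */
uint64_t rip_relative_slot(uint64_t instruction_va, int32_t disp32) {
    /* RIP already points at the next instruction when the operand is
       evaluated, so the base is instruction_va + 6. */
    return instruction_va + 6 + (int64_t)disp32;
}
```

For the call at 0x140011732 with the bytes B0 B1 00 00, this gives 0x140011738 + 0xB1B0 = 0x14001C8E8, which is the address of the global “add” pointer in this particular build.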
Using the import table to load DLLs
All this work with LoadLibrary() and GetProcAddress() can be very annoying to maintain in real systems. This is why the PE format has a structure called the import table. The import table instructs the Windows loader to load certain DLLs. For each DLL, there’s a list of names (or ordinals) of functions that need to be imported. After the Windows loader loads these DLLs, it goes over the list of imported functions and saves the function pointers in a global location (it actually patches the import table with the function pointers). The linker knows the offset of the import table from the beginning of the image in memory (the RVA), so it can fix up the offsets of call instructions to point at the imported function pointers.
So, how can we add new imports to the import table?
If you compile the DynamicCalculator project (the Dynamic Library we have created before) you will see that it creates the DynamicCalculator.dll file (as expected) BUT then you will see another file called DynamicCalculator.lib - this may look pretty weird because we created a dynamic library (not a static library).
The DynamicCalculator.lib file is an “import library”. This .LIB file does not contain the code of the library (you can unpack it and look). It simply allows developers to reference imported functions as linker symbols and have the linker resolve the addresses conveniently. After adding DynamicCalculator.lib as a linker input, a developer can write the following code:
// DynamicCalculatorClientImportTable.c
#include <windows.h>
__declspec(dllimport) int add(int x, int y);
__declspec(dllimport) int sub(int x, int y);
int main() {
int x = add(10, 20);
int y = sub(30, 20);
int sum = add(x, y);
return sum;
}
Looking at the import table of the generated executable, we see the following:
>dumpbin /imports DynamicCalculatorClientImportTable.exe
Dump of file DynamicCalculatorClientImportTable.exe
File Type: EXECUTABLE IMAGE
Section contains the following imports:
DynamicCalculator.dll
140016000 Import Address Table
140016090 Import Name Table
0 time date stamp
0 Index of first forwarder reference
1 sub
0 add
This means the only DLL that is imported is DynamicCalculator.dll and the imported functions are “sub” and “add”. This is the generated assembly code:
main:
0000000140011030: 40 55 push rbp
0000000140011032: 48 83 EC 70 sub rsp,70h
0000000140011036: 48 8D 6C 24 20 lea rbp,[rsp+20h]
000000014001103B: BA 14 00 00 00 mov edx,14h
0000000140011040: B9 0A 00 00 00 mov ecx,0Ah
0000000140011045: FF 15 BD 4F 00 00 call qword ptr [__imp_add] ; <-----
000000014001104B: 89 45 00 mov dword ptr [rbp],eax
000000014001104E: BA 14 00 00 00 mov edx,14h
0000000140011053: B9 1E 00 00 00 mov ecx,1Eh
0000000140011058: FF 15 A2 4F 00 00 call qword ptr [__imp_sub] ; <----
000000014001105E: 89 45 04 mov dword ptr [rbp+4],eax
0000000140011061: 8B 55 04 mov edx,dword ptr [rbp+4]
0000000140011064: 8B 4D 00 mov ecx,dword ptr [rbp]
0000000140011067: FF 15 9B 4F 00 00 call qword ptr [__imp_add] ; <----
000000014001106D: 89 45 08 mov dword ptr [rbp+8],eax
0000000140011070: 8B 45 08 mov eax,dword ptr [rbp+8]
0000000140011073: 48 8D 65 50 lea rsp,[rbp+50h]
0000000140011077: 5D pop rbp
0000000140011078: C3 ret
As you can see, the calls are made through pointers:
call qword ptr [__imp_add]
“__imp_add” is a linker symbol that refers to the function pointer in the import table. When the Windows loader loads the DLL, it stores the address of the function in that import table entry.
This is equivalent to using GetProcAddress() and storing the address in a global variable, then using the global variable as a function pointer.
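To illustrate this equivalence, here is a small simulation in plain C. Nothing in it is a real Windows API: dll_exports stands in for the export table of the DLL, import_slots for the pointers in the import table, and bind_imports for the work the loader does before main() runs. All of these names are made up for this example.

```c
#include <stddef.h>
#include <string.h>

typedef int (*binary_fn)(int, int);

/* Stand-in for the DLL's code and its export table (name -> address). */
static int dll_add(int x, int y) { return x + y; }
static int dll_sub(int x, int y) { return x - y; }

typedef struct { const char *name; binary_fn address; } export_entry;
static const export_entry dll_exports[] = {
    { "add", dll_add },
    { "sub", dll_sub },
};

/* Stand-in for the client's import table: the names it wants, and one
   pointer slot per name (the role played by __imp_sub and __imp_add). */
static const char *import_names[] = { "sub", "add" };
static binary_fn import_slots[2]; /* empty until the "loader" runs */

/* What GetProcAddress does conceptually: search the export table by name. */
static binary_fn find_export(const char *name) {
    for (size_t i = 0; i < sizeof dll_exports / sizeof dll_exports[0]; i++)
        if (strcmp(dll_exports[i].name, name) == 0)
            return dll_exports[i].address;
    return NULL;
}

/* What the loader does conceptually: fill every slot. Returns 0 on
   success, -1 if an import is missing from the DLL. */
int bind_imports(void) {
    for (size_t i = 0; i < sizeof import_slots / sizeof import_slots[0]; i++) {
        import_slots[i] = find_export(import_names[i]);
        if (import_slots[i] == NULL)
            return -1;
    }
    return 0;
}

/* Client code: each call is the equivalent of "call qword ptr [__imp_add]". */
int client_logic(void) {
    int x = import_slots[1](10, 20);
    int y = import_slots[0](30, 20);
    return import_slots[1](x, y);
}
```

After bind_imports() succeeds, client_logic() computes the same sum as the main() function above. The only difference from the LoadLibrary() version is who fills the pointers and when.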
The Windows loader does not use the LoadLibrary() function directly; it uses a lower level function to perform the load. Eventually both end up mapping the DLL with ntdll!NtMapViewOfSection, from a section that was created with the SEC_IMAGE flag. I will probably explain this sometime..
What does __declspec(dllimport) do?
To call the ‘add’ / ‘sub’ function, the call has to be made through a pointer that resides in the import table. Let’s see what I mean:
Say I declare the add / sub functions this way:
// DynamicCalculatorClientStub.c
int add(int x, int y);
int sub(int x, int y);
int main() {
int x = add(10, 20);
int y = sub(30, 20);
int sum = add(x, y);
return sum;
}
My object file will look like this:
>dumpbin /disasm DynamicCalculatorClientStub.obj
Microsoft (R) COFF/PE Dumper Version 14.22.27905.0
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file DynamicCalculatorClientStub.obj
File Type: COFF OBJECT
main:
0000000000000000: 40 55 push rbp
0000000000000002: 48 83 EC 70 sub rsp,70h
0000000000000006: 48 8D 6C 24 20 lea rbp,[rsp+20h]
000000000000000B: BA 14 00 00 00 mov edx,14h
0000000000000010: B9 0A 00 00 00 mov ecx,0Ah
0000000000000015: E8 00 00 00 00 call add ; <-----
000000000000001A: 89 45 00 mov dword ptr [rbp],eax
000000000000001D: BA 14 00 00 00 mov edx,14h
0000000000000022: B9 1E 00 00 00 mov ecx,1Eh
0000000000000027: E8 00 00 00 00 call sub ; <-----
000000000000002C: 89 45 04 mov dword ptr [rbp+4],eax
000000000000002F: 8B 55 04 mov edx,dword ptr [rbp+4]
0000000000000032: 8B 4D 00 mov ecx,dword ptr [rbp]
0000000000000035: E8 00 00 00 00 call add ; <-----
000000000000003A: 89 45 08 mov dword ptr [rbp+8],eax
000000000000003D: 8B 45 08 mov eax,dword ptr [rbp+8]
0000000000000040: 48 8D 65 50 lea rsp,[rbp+50h]
0000000000000044: 5D pop rbp
0000000000000045: C3 ret
As you can see, the calls are made with the E8 opcode, which, as we said before, expects a relative offset to the function. The linker cannot know the offset to the actual function because it is known only at runtime. The linker does know the offset to the import table, which contains a pointer to the imported function at runtime, but it cannot change the instruction from a relative call to an indirect call because the two instructions have different lengths, as you can see here:
E8 00 00 00 00 call add ; 5 bytes
FF 15 00 00 00 00 call qword ptr [__imp_add] ; 6 bytes
Oh man what a mess. The linker cannot move all other instructions and fix that much stuff by itself..
Declaring a function with “__declspec(dllimport)” instructs the compiler to generate the second option: a call through a function pointer.
BUT typically in real libraries we want to share the header files with the clients of the library. As a reminder, our header file looks like this:
#pragma once
int add(int x, int y);
int sub(int x, int y);
This means the declarations won’t have any __declspec(dllimport) in the client’s code. So, how can the linker deal with this situation?
Stubs! Let’s see what happens after the linkage of the last example:
>dumpbin /disasm DynamicCalculatorClientStub.exe
Microsoft (R) COFF/PE Dumper Version 14.22.27905.0
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file DynamicCalculatorClientStub.exe
File Type: EXECUTABLE IMAGE
main:
0000000140001000: 40 55 push rbp
0000000140001002: 48 83 EC 70 sub rsp,70h
0000000140001006: 48 8D 6C 24 20 lea rbp,[rsp+20h]
000000014000100B: BA 14 00 00 00 mov edx,14h
0000000140001010: B9 0A 00 00 00 mov ecx,0Ah
0000000140001015: E8 2C 00 00 00 call add ; <----- A call to the stub below
000000014000101A: 89 45 00 mov dword ptr [rbp],eax
000000014000101D: BA 14 00 00 00 mov edx,14h
0000000140001022: B9 1E 00 00 00 mov ecx,1Eh
0000000140001027: E8 20 00 00 00 call sub ; <----- A call to the stub below
000000014000102C: 89 45 04 mov dword ptr [rbp+4],eax
000000014000102F: 8B 55 04 mov edx,dword ptr [rbp+4]
0000000140001032: 8B 4D 00 mov ecx,dword ptr [rbp]
0000000140001035: E8 0C 00 00 00 call add ; <----- A call to the stub below
000000014000103A: 89 45 08 mov dword ptr [rbp+8],eax
000000014000103D: 8B 45 08 mov eax,dword ptr [rbp+8]
0000000140001040: 48 8D 65 50 lea rsp,[rbp+50h]
0000000140001044: 5D pop rbp
0000000140001045: C3 ret
add:
0000000140001046: FF 25 BC 0F 00 00 jmp qword ptr [__imp_add] ; the stub for add
sub:
000000014000104C: FF 25 AE 0F 00 00 jmp qword ptr [__imp_sub] ; the stub for sub
So the DynamicCalculator.lib import library contains the following symbols:
- __imp_add / __imp_sub: Pointers in the import table that point to the actual library code at runtime.
- add / sub: Stubs that contain “jmp” instructions through the import table pointers. If a client does not use __declspec(dllimport), the relative offset of the call will be resolved to these stubs.
One of the main advantages of not using __declspec(dllimport) is that we can replace the dynamic library with a static library without changing our code! (Even without changing the object file, actually.)
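In C terms, the stub behaves roughly like the sketch below. This is only a model: the real stub is a single “jmp” instruction, not a C function, and the names dll_add and imp_add are stand-ins made up for this example (imp_add plays the role of __imp_add and is filled in statically here instead of by the loader).

```c
typedef int (*binary_fn)(int, int);

/* Stand-in for the code that lives inside the DLL. */
static int dll_add(int x, int y) { return x + y; }

/* Stand-in for the import table slot (__imp_add). */
static binary_fn imp_add = dll_add;

/* The stub: an ordinary symbol named add that forwards through the slot. */
int add(int x, int y) {
    return imp_add(x, y);
}

/* A client compiled against the plain header only knows "int add(int, int)",
   so its relative call lands on the stub. */
int client_without_dllimport(void) { return add(10, 20); }

/* A client compiled with __declspec(dllimport) calls through the slot
   directly and skips the stub. */
int client_with_dllimport(void) { return imp_add(10, 20); }
```

Both clients get the same result. The first one pays for one extra jump, and in exchange its object file does not care whether “add” ends up coming from a DLL or from a static library.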
Revisiting the whole program optimization
Remember whole program optimization? It is a link time optimization. If we turn it on, the call to the stub can be converted into an indirect call through the import table:
>dumpbin /disasm DynamicCalculatorClientStub.exe
Microsoft (R) COFF/PE Dumper Version 14.22.27905.0
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file DynamicCalculatorClientStub.exe
File Type: EXECUTABLE IMAGE
add:
0000000140001000: FF 25 02 10 00 00 jmp qword ptr [__imp_add]
sub:
0000000140001006: FF 25 F4 0F 00 00 jmp qword ptr [__imp_sub]
000000014000100C: CC CC CC CC ÌÌÌÌ
main:
0000000140001010: 48 83 EC 38 sub rsp,38h
0000000140001014: BA 14 00 00 00 mov edx,14h
0000000140001019: B9 0A 00 00 00 mov ecx,0Ah
000000014000101E: FF 15 E4 0F 00 00 call qword ptr [__imp_add]
0000000140001024: 89 44 24 24 mov dword ptr [rsp+24h],eax
0000000140001028: BA 14 00 00 00 mov edx,14h
000000014000102D: B9 1E 00 00 00 mov ecx,1Eh
0000000140001032: FF 15 C8 0F 00 00 call qword ptr [__imp_sub]
0000000140001038: 89 44 24 20 mov dword ptr [rsp+20h],eax
000000014000103C: 8B 54 24 20 mov edx,dword ptr [rsp+20h]
0000000140001040: 8B 4C 24 24 mov ecx,dword ptr [rsp+24h]
0000000140001044: FF 15 BE 0F 00 00 call qword ptr [__imp_add]
000000014000104A: 89 44 24 28 mov dword ptr [rsp+28h],eax
000000014000104E: 8B 44 24 28 mov eax,dword ptr [rsp+28h]
0000000140001052: 48 83 C4 38 add rsp,38h
0000000140001056: C3 ret
That’s it! I hope you learned something about the compilation model of Windows libraries. If you have any questions or found a mistake in the article, send me a message on Twitter: @0xrepnz