Project 2: Custom system call and signal handling for insane memory read; and ELF analysis

Due date

Nov 15, 2017, 11:59pm (please demo to me 3-5pm Nov 16).

Goals

The goals of this project are (1) to practice basic kernel programming, (2) to practice signal handling, and (3) to enhance the understanding of the ELF file format and virtual memory.

Details

Implement a user-space function int readAddr(void *p, int *v). Given a user-space virtual address p, it reads the word at that address and stores the value in *v; it returns -1 if the address cannot be read. Your test driver should probe the whole user address space, from 0 up to TASK_SIZE, at an interval of PAGE_SIZE * 1024:

char *p = 0;
unsigned long validPages = 0, invalidPages = 0;
for( ; (unsigned long)p < TASK_SIZE; p += PAGE_SIZE * 1024 ) {
	...
	int a = 0;
	int r = readAddr(p, &a);
	if( r == -1 ) // return value -1 means the read was invalid
		invalidPages++;
	else {
		validPages++;
		printf("%p: %d\n", p, a);
	}	
	...
}
printf("%lu out of %lu pages are valid\n", validPages, validPages + invalidPages);

The first two sub-projects should implement the readAddr above in the following two ways, respectively. The third sub-project is about static and dynamic ELF file analysis.

  1. Implement a system call isReadable(char*) and use it to implement the user-space function readAddr for safe reading. The system call isReadable(char* p) reports whether the address passed as p falls into a readable memory area: it should return 0 if it does and -1 if it does not. You should make use of the mm field in task_struct. Specifically, mm is of type mm_struct, which contains a field mmap pointing to a list of nodes of type vm_area_struct, each describing a virtual memory area (VMA). A VMA describes a range of virtual address space and the access operations allowed on it in user space (VM_READ, VM_WRITE, VM_EXEC, etc.). Note that a VMA marked VM_EXEC is also readable. Also note that a VMA marked VM_IO is not safe to read, as such a region maps a device's I/O space.
  2. Use signal handling to handle invalid reads. Reading a memory cell at an arbitrary address may be just fine or may trigger a SIGSEGV signal. To deal with the latter case, install a signal handler using sigaction. Whenever a SIGSEGV signal occurs, capture it and recover your program execution using sigsetjmp/siglongjmp. Specifically, in readAddr, call sigsetjmp before each (un)safe read; if the subsequent read triggers a SIGSEGV, the signal handler calls siglongjmp to restore the control flow and tell readAddr to return -1.
  3. For the third sub-project, you only need to write a piece of code and then write a report describing how different program elements (e.g., functions, global variables, local variables, string literals) in your code are stored in different sections (such as .text, .data, .bss) of the ELF file, and how they are stored in different segments (such as code, data, call stack) of the memory address space. Your report should include at least five different sections and five different segments. For example, in the report you can point out that the global variable int g = 100; is stored in the .data section in the ELF file and in the data segment during execution. You should use various tools (such as readelf, objdump, gdb, nm, etc.) to collect evidence (in the form of screenshots) to support your description. Assuming your report covers X different sections and Y different segments, you will get (X+Y-10) bonus/penalty points.

Tips

To obtain PAGE_SIZE and TASK_SIZE, please refer to the code below:

#include <unistd.h>
unsigned long PAGE_SIZE = 0, TASK_SIZE = 0;

// Run the following once at startup, e.g., at the top of main()
PAGE_SIZE = sysconf(_SC_PAGESIZE);
if(sizeof (void*) == sizeof (int)) // 32-bit system
	TASK_SIZE = 0xc0000000UL;
else // 64-bit system
	TASK_SIZE = (1UL << 47) - PAGE_SIZE;
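
To get a feel for how long the scan runs, the number of addresses the loop visits can be worked out from these two values. The helper below is a small illustrative sketch (probeCount is a hypothetical name, not part of the assignment); it simply counts the iterations of a loop that starts at 0 and advances by PAGE_SIZE * 1024 while staying below TASK_SIZE.

```c
#include <assert.h>
#include <stdint.h>

/* Number of addresses visited by:
 *   for (p = 0; p < taskSize; p += pageSize * 1024)
 */
uint64_t probeCount(uint64_t taskSize, uint64_t pageSize)
{
	uint64_t step = pageSize * 1024;
	assert(step != 0);
	return (taskSize + step - 1) / step;   /* ceiling division */
}
```

Assuming 4096-byte pages (check the value sysconf actually returns on your machine), the step is 4 MiB, which gives 768 probes on a 32-bit system and 33,554,432 probes on a 64-bit system. In sub-project 1 each probe costs one system call, so the 64-bit scan takes noticeably longer.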

Submission

Your submission should include the code (the kernel code modification should be submitted as a kernel patch), a readme file describing your design, how to compile and use your code, and each member's contribution in the case of group programming, and a report which consists of the following parts:

Environment

Linux (any kernel version >= 2.6 is fine) and C/C++.

How to create a kernel patch

Assume the original kernel code linux-2.6.18-old and the modified code linux-2.6.18-new are in the same directory src. Pay attention to the current working directory shown in each prompt (the cd commands are omitted), and make sure you back up your code before running the commands below.

// Remove all intermediate files, such as the object files and configuration files
src/linux-2.6.18-new$ make distclean 
// Generate the patch. -r: recurse into subdirectories, -u: unified diff output, -N: treat absent files as empty (so new files are included)
src$ diff -urN linux-2.6.18-old linux-2.6.18-new >patchfile
// Now you can verify the patch by applying it to the original code
// "-p1" makes the "patch" command ignore "linux-2.6.18-old/" inside the generated patch
src/linux-2.6.18-old$ patch -p1 <../patchfile
// You should see no difference
src$ diff -r linux-2.6.18-old linux-2.6.18-new
