I’m trying to make the code segment execute-only (not readable), but I failed even after trying everything the manual told me to. Here is what I did to make the code segment unreadable.
    > uname -a
    Linux Emmet-VM 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 24 21:18:00 UTC 2015 i686 i686 i686 GNU/Linux
    > lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 14.04.3 LTS
    Release:        14.04
    Codename:       trusty
First, I found this in the “Intel(R) 64 and IA-32 Architectures Software Developer’s Manual (Combined Volumes 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C and 3D)”: “Set read-enable bit to enable read” and “Segment Types”. (Sorry, I’m still not allowed to embed pictures in my posts, so these are links instead.)
So I guess that if I change %CS to point to a segment descriptor whose read-enable bit is 0, the code segment should become unreadable.
Then I used the code below to insert a new segment into LDT entry 2, setting the code segment type to 8 (1000B), which means “Execute-Only” according to the “Segment Types” link above:
    typedef struct user_desc UserDesc;

    UserDesc *seg = (UserDesc*)malloc(sizeof(UserDesc));
    seg->entry_number    = 0x2;
    seg->base_addr       = 0x00000000;
    seg->limit           = 0xffffffff;
    seg->seg_32bit       = 0x1;
    seg->contents        = 0x02;
    seg->read_exec_only  = 0x1;
    seg->limit_in_pages  = 0x1;
    seg->seg_not_present = 0x0;
    seg->useable         = 0x0;
    int ret = modify_ldt(1, (void*)seg, sizeof(UserDesc));
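To double-check that these fields really produce type 8, the type nibble can be worked out by hand. The sketch below assumes the packing used by the Linux kernel's `fill_ldt()` helper (`contents` in bits 3:2, the inverted `read_exec_only` in bit 1) and leaves the accessed bit (bit 0) clear, although newer kernels set it. The function names are made up for illustration.

```c
/* Hypothetical helper: build the descriptor type nibble from user_desc fields.
 * Assumed packing: bits 3:2 = contents, bit 1 = !read_exec_only, bit 0 = accessed (left 0). */
unsigned ldt_type(unsigned contents, unsigned read_exec_only)
{
    return ((contents & 3u) << 2) | (((read_exec_only & 1u) ^ 1u) << 1);
}

/* Bit 3 of the type nibble distinguishes code (1) from data (0). */
int is_code(unsigned type)
{
    return (type & 8u) != 0;
}

/* For a code segment, bit 1 is the read-enable bit. */
int is_readable_code(unsigned type)
{
    return is_code(type) && (type & 2u) != 0;
}
```

With `contents = 2` and `read_exec_only = 1`, `ldt_type` gives 8 (1000B), a non-conforming code segment with the read-enable bit clear, so the descriptor itself looks right.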
After that, I change %CS to 0x17 (00010111B, meaning entry 2 in the LDT) with ljmp:
    asm("ljmp $0x17, $reload_cs\n"
        "reload_cs:");
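As a sanity check on the selector value, here is a small sketch that splits a selector into its three fields (index in bits 15:3, table indicator in bit 2, requested privilege level in bits 1:0). The struct and function names are invented for this example.

```c
#include <stdint.h>

/* Fields of an x86 segment selector. */
struct selector_fields {
    unsigned index;        /* descriptor index, bits 15:3 */
    unsigned table_is_ldt; /* TI bit: 0 = GDT, 1 = LDT */
    unsigned rpl;          /* requested privilege level, bits 1:0 */
};

/* Split a 16-bit selector into its fields. */
struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = sel >> 3;
    f.table_is_ldt = (sel >> 2) & 1u;
    f.rpl = sel & 3u;
    return f;
}

/* Build a selector from its fields. */
uint16_t make_selector(unsigned index, unsigned table_is_ldt, unsigned rpl)
{
    return (uint16_t)((index << 3) | ((table_is_ldt & 1u) << 2) | (rpl & 3u));
}
```

Decoding 0x17 gives index 2, TI = 1 (LDT) and RPL = 3, which matches the intent of the ljmp.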
But even with this, I can still read the bytes in the code segment:
    void foo() { printf("foo\n"); }

    void test() {
        char* a = (char*)foo;
        printf("0x%x\n", (unsigned int)a[0]); // This prints 0x55
    }
If the code segment were unreadable, the code above should raise a segmentation fault. But it prints 0x55 successfully.
So I wonder: did I make a mistake somewhere in my test, or is this a mistake in Intel’s manual?
Answer
You are still accessing the code through DS when doing (unsigned int)a[0].
Write-only segments don’t exist, so DS cannot be made unreadable (and if they did exist, it would be a bad idea to make DS write-only).
If you did everything correctly, mov eax, [cs:...] (NASM syntax) will fail (but mov eax, [ds:...] won’t).
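In C, the difference can be sketched with a bit of GCC-style inline assembly (AT&T syntax here, not NASM). This is a minimal sketch, assuming GCC or Clang on x86; the function names are made up, and on other architectures the CS variant simply falls back to a plain read.

```c
/* Ordinary read: the compiler emits a load that goes through DS. */
unsigned read_byte_via_ds(const void *addr)
{
    return *(const volatile unsigned char *)addr;
}

/* Read one byte with an explicit CS segment override.
 * With an execute-only CS loaded, this load is expected to fault. */
unsigned read_byte_via_cs(const void *addr)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned value;
    __asm__ volatile("movzbl %%cs:(%1), %0"
                     : "=r"(value)
                     : "r"(addr)
                     : "memory");
    return value;
#else
    return read_byte_via_ds(addr); /* no segment registers on this target */
#endif
}
```

Under the default flat, readable CS both functions return the same byte. After the ljmp to the execute-only segment from the question, `read_byte_via_cs((void*)foo)` is the access that should raise a general-protection fault (reported as a segmentation fault on Linux), while `read_byte_via_ds` keeps working. Note that in 64-bit mode the CS override is ignored, so the experiment only makes sense in a 32-bit process.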
From a quick glance at the Intel manual, execute-only pages do not exist (at least not directly), so using mprotect with only PROT_EXEC may be of limited use (the code would still be readable).
Worth a shot, though.
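The mprotect idea is easy to try. The following is a rough sketch, assuming Linux/POSIX: it maps a page, drops it to PROT_EXEC only, and lets a forked child attempt the read so that a fault does not kill the caller. The function name is invented; the result depends on the CPU and kernel (some newer kernels implement PROT_EXEC-only mappings with protection keys when the hardware has them), so no particular outcome is claimed here.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a PROT_EXEC-only page can still be read, 0 if the read
 * raises SIGSEGV, and -1 if the experiment could not be set up. */
int exec_only_page_is_readable(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    p[0] = 0xC3; /* arbitrary marker byte (an x86 'ret') */

    if (mprotect(p, page, PROT_EXEC) != 0) { /* may be refused by policy */
        munmap(p, page);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(p, page);
        return -1;
    }
    if (pid == 0) {
        /* Child: attempt a data read of the execute-only page. */
        unsigned char byte = *(volatile unsigned char *)p;
        (void)byte;
        _exit(0);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap(p, page);
    if (waited != pid)
        return -1;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV)
        return 0;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 1;
    return -1;
}
```

A return value of 1 means the page is still readable, which is what plain x86 paging without protection keys would be expected to give.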
There are three ways around this, though none of them can be implemented without the aid of the OS, so they are more theoretical than practical.
Protection keys
If the CPU supports them (see section 4.6.2 of the Intel manual, Volume 3), they introduce an asymmetry in how code and data are read.
Reading data is subject to the key protection. Fetching, however, is not:
How a linear address’s protection key controls access to the address depends on the mode of a linear address:
- A linear address’s protection key controls only data accesses to the address. It does not in any way affect instruction fetches from the address.
So it’s possible to tag the code pages with a protection key that your application’s PKRU register does not grant access to.
You would still be allowed to execute the code, but not to read it.
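Where the OS does offer this, the idea might look roughly like the sketch below. It assumes a Linux kernel with protection-key support (the `pkey_alloc` and `pkey_mprotect` system calls, 4.9 or later) and a CPU that implements the feature; raw `syscall()` is used so the example does not depend on a particular glibc version. `make_execute_only` is a hypothetical name, and `PKEY_DISABLE_ACCESS` is defined by hand in case the headers lack it.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef PKEY_DISABLE_ACCESS
#define PKEY_DISABLE_ACCESS 0x1 /* value used by the Linux UAPI headers */
#endif

/* Tag [addr, addr+len) with a fresh protection key whose data access is
 * disabled in PKRU. Returns the key on success, -1 if protection keys are
 * unavailable or the call fails. addr must be page-aligned. */
int make_execute_only(void *addr, size_t len)
{
#if defined(SYS_pkey_alloc) && defined(SYS_pkey_mprotect) && defined(SYS_pkey_free)
    /* Allocate a key and start with data access disabled for this thread. */
    long pkey = syscall(SYS_pkey_alloc, 0UL, (unsigned long)PKEY_DISABLE_ACCESS);
    if (pkey < 0)
        return -1;

    /* The page tables still say read+execute; the key blocks data reads. */
    if (syscall(SYS_pkey_mprotect, addr, len, PROT_READ | PROT_EXEC, (int)pkey) != 0) {
        syscall(SYS_pkey_free, (int)pkey);
        return -1;
    }
    return (int)pkey;
#else
    (void)addr;
    (void)len;
    return -1; /* headers too old to know about protection keys */
#endif
}
```

PKRU is a per-thread register that user code can rewrite with WRPKRU, so this hides code from accidental or injected reads only as long as the process cannot be tricked into executing that instruction.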
Desync the TLBs
If your application has never touched the code pages for reading, they will occupy some entries in the ITLB but none in the DTLB.
If the OS then maps them as supervisor-only without flushing the TLBs, data accesses to them are prevented (no DTLB entries for those pages are present, which forces a page walk that sees the new permissions), but thanks to the ITLB the code can still be fetched.
This is more involved in practice, as code spans multiple pages and is actually read as data by the OS.
EPT
The Extended Page Tables are used during virtualization to translate guest physical addresses to host physical addresses.
Though they seem like just another level of indirection, they have separate read, write and execute control bits.
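To make the difference from ordinary page tables concrete, here is a tiny sketch of those three permission bits. The bit positions (0 = read, 1 = write, 2 = execute) are as I recall them from the EPT entry format in the manual, and the macro and function names are made up. Only a hypervisor can actually program such entries, and execute-only mappings additionally require that the CPU advertises support for them.

```c
#include <stdint.h>

/* Assumed low permission bits of an EPT paging-structure entry. */
#define EPT_READ  (UINT64_C(1) << 0)
#define EPT_WRITE (UINT64_C(1) << 1)
#define EPT_EXEC  (UINT64_C(1) << 2)

/* An execute-only mapping has X set and both R and W clear, a combination
 * that ordinary IA-32 page-table entries cannot express. */
int ept_is_execute_only(uint64_t entry)
{
    return (entry & (EPT_READ | EPT_WRITE | EPT_EXEC)) == EPT_EXEC;
}
```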
A paper has been written about preventing the leakage of the kernel code (to counteract dynamic Return Oriented Programming).