1.10 Introducing SHELF Loading (Part 2)
4. SHELF Loading
This generated ELF flavor gives us some interesting capabilities that other ELF types are not able to provide. For the sake of simplicity, we have labelled this type of ELF binary as SHELF, and we will be referencing it throughout the rest of this paper. The following is an updated diagram of the loading stages needed for SHELF loading:
As we can see in the diagram above, loading a SHELF file is considerably less complex than conventional ELF loading schemes.
To illustrate the reduced set of constraints to load these types of files, a snippet of a minimalistic SHELF User-Land-Exec approach is as follows:
By using this approach, a subject SHELF file would look as follows in memory and on disk:
As we can observe, the ELF header and Program Headers are missing from the process image. This is a feature that this flavor of ELF enables us to implement and is discussed in the following section.
4.1 Anti-Forensic Capabilities
This new approach to User-Land-Exec also has two optional stages useful for anti-forensic purposes. Since the _dl_relocate_static_pie function will obtain all of the required fields for relocation from the Auxiliary Vector, this leaves us room to play with how the subject SHELF file structure may look in memory and on disk.
Removing the ELF header directly impacts reconstruction capabilities, because most Linux-based scanners look for ELF images in process memory by first identifying ELF headers. Once parsed, the ELF header tells the scanner where to locate the Program Header Table and, consequently, the rest of the mapped artifacts of the file.
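To make the scanner's side of this concrete, the following is a minimal sketch of a header-first scan over a memory dump. It assumes a little-endian ELF64 image mapped at a page boundary; the function name find_elf_headers and the specific sanity checks are illustrative assumptions, not taken from any particular tool.

```python
import struct

ELF_MAGIC = b"\x7fELF"
EHDR_SIZE = 64          # sizeof(Elf64_Ehdr)
PHENT_SIZE = 56         # sizeof(Elf64_Phdr)


def find_elf_headers(memory: bytes, page_size: int = 4096):
    """Return (offset, e_phoff, e_phnum) for every plausible ELF64 header.

    Only page-aligned offsets are probed, since a mapped image starts
    on a page boundary.
    """
    hits = []
    for off in range(0, len(memory) - EHDR_SIZE + 1, page_size):
        if memory[off:off + 4] != ELF_MAGIC:
            continue
        ei_class = memory[off + 4]                      # 2 == ELFCLASS64
        (e_phoff,) = struct.unpack_from("<Q", memory, off + 32)
        e_phentsize, e_phnum = struct.unpack_from("<HH", memory, off + 54)
        if ei_class == 2 and e_phentsize == PHENT_SIZE and e_phnum > 0:
            hits.append((off, e_phoff, e_phnum))
    return hits
```

A scanner built this way has no starting point once the first 64 bytes of the image are wiped: the magic check fails and the Program Header Table offset is never read.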
Removal of the ELF header is trivial since this artifact is not really needed by the loader – all required information in the subject file will be retrieved from the Auxiliary Vector as previously mentioned.
An additional artifact that can be hidden is the Program Header Table. This is a slightly different case when compared with the ELF Header. The Auxiliary Vector needs to locate the Program Header Table in order for the RTLD to successfully load the file by applying the needed runtime relocations. Regardless, there are many approaches to obfuscating the PHT. The simplest approach is to remove the Program Header Table from its original location and relocate it somewhere in the file that is known only to the Auxiliary Vector.
We can precompute the location of each of the Auxiliary Vector entries and define each entry as a macro in an include file, tailoring our loader to every subject SHELF file at compile-time. The following is an example of how these macros can be generated:
As we can observe, we have parsed the subject SHELF file for its e_entry and e_phnum fields, creating corresponding macros to hold those values. We also have to choose a random base address at which to load the image. Finally, we locate the PHT, convert it to an array, and remove it from its original location. Applying these modifications allows us to completely remove the ELF header and change the default location of the subject SHELF file's PHT, both on disk and in memory(!)
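The steps just described can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' generator: the macro names (SHELF_BASE, SHELF_ENTRY, SHELF_PHNUM, SHELF_PHT) and both function names are hypothetical, the input is assumed to be a little-endian ELF64 file, and the headers are zeroed in place rather than physically cut out so that file offsets stay unchanged.

```python
import struct


def make_loader_macros(elf: bytes, base: int) -> str:
    """Emit C macros describing the subject file for a tailored loader."""
    e_entry, e_phoff = struct.unpack_from("<QQ", elf, 24)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 54)
    pht = elf[e_phoff:e_phoff + e_phentsize * e_phnum]
    pht_bytes = ", ".join(f"0x{b:02x}" for b in pht)
    return "\n".join([
        f"#define SHELF_BASE 0x{base:x}",
        f"#define SHELF_ENTRY 0x{e_entry:x}",
        f"#define SHELF_PHNUM {e_phnum}",
        f"#define SHELF_PHT {{ {pht_bytes} }}",
    ]) + "\n"


def strip_headers(elf: bytes) -> bytes:
    """Return a copy with the ELF header and the original PHT zeroed."""
    (e_phoff,) = struct.unpack_from("<Q", elf, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 54)
    pht_len = e_phentsize * e_phnum
    out = bytearray(elf)
    out[0:64] = bytes(64)                         # wipe Elf64_Ehdr
    out[e_phoff:e_phoff + pht_len] = bytes(pht_len)  # wipe the PHT
    return bytes(out)
```

The caller supplies the base address, so a randomly chosen page-aligned value can be passed in at generation time. The macros must be generated before the headers are stripped, since they are the only remaining record of the wiped fields.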
Without successful retrieval of the Program Header Table, reconstruction capabilities may be strictly limited and further heuristics will have to be applied for successful process image reconstruction.
An additional way to make reconstruction of the Program Header Table much harder is to instrument the way glibc implements the resolution of the Auxiliary Vector fields.
4.2 Obscuring SHELF features by PT_TLS patching
Even after modifying the default location of the Program Header Table by choosing a new arbitrary location when crafting the Auxiliary Vector, the Program Header Table would still reside in memory and could be found with some effort. To obscure ourselves even further we can cover how the startup code reads the Auxiliary Vector fields.
The code that does this is in elf/dl-support.c, in the function _dl_aux_init. In short, the code iterates over all the auxv_t entries, and each of these entries initializes an internal glibc variable:
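As a rough model of that loop (not the glibc source itself), the sketch below walks a raw Auxiliary Vector, laid out as 64-bit (a_type, a_val) pairs terminated by AT_NULL, and records only the three entries discussed in this section. The function name aux_init and the dictionary standing in for the glibc globals are assumptions made for illustration; the AT_* constants are the standard Linux values.

```python
import struct

AT_NULL, AT_PHDR, AT_PHNUM, AT_RANDOM = 0, 3, 5, 25

# Which internal variable each auxv entry ends up initializing.
AUX_TO_VAR = {
    AT_PHDR: "_dl_phdr",
    AT_PHNUM: "_dl_phnum",
    AT_RANDOM: "_dl_random",
}


def aux_init(auxv: bytes) -> dict:
    """Model of the startup loop: copy selected auxv values into variables."""
    state = {}
    for a_type, a_val in struct.iter_unpack("<QQ", auxv):
        if a_type == AT_NULL:          # terminator, stop walking
            break
        name = AUX_TO_VAR.get(a_type)
        if name is not None:
            state[name] = a_val
    return state
```

On a 64-bit little-endian Linux system, the contents of /proc/self/auxv have this layout and can be passed to aux_init directly.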
The only reason the Auxiliary Vector is required is to initialize internal _dl_* variables. Knowing this, we can bypass the creation of the Auxiliary Vector entirely and do the same job that _dl_aux_init would do before passing control of execution to the subject SHELF file.
The only critical entries are AT_PHDR, AT_PHNUM, and AT_RANDOM. Therefore, we only need to patch the respective _dl_* variables that depend on these fields. As an example of how to retrieve these values, we can use the following one-liner to generate an include file with precomputed macros holding the offset to every _dl_* variable:
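One way to produce such an include file, sketched here in Python rather than as a shell one-liner, is to filter the textual output of nm run against the subject file. The macro naming scheme (OFF_DL_PHDR and so on) and the function name offsets_header are hypothetical, and the addresses in the usage example are made up.

```python
WANTED = ("_dl_phdr", "_dl_phnum", "_dl_random")


def offsets_header(nm_output: str, wanted=WANTED) -> str:
    """Turn `nm` lines of the form '<hex addr> <type> <name>' into macros."""
    lines = []
    for row in nm_output.splitlines():
        parts = row.split()
        # Undefined symbols have no address column, so they are skipped.
        if len(parts) == 3 and parts[2] in wanted:
            addr, _kind, name = parts
            lines.append(f"#define OFF{name.upper()} 0x{int(addr, 16):x}")
    return "\n".join(lines) + "\n"


# Example with invented addresses:
sample = (
    "00000000000c5a10 B _dl_phdr\n"
    "00000000000c5a18 B _dl_phnum\n"
    "00000000000c4f00 D _dl_random\n"
)
print(offsets_header(sample), end="")
```

Because the subject binary is a static-pie, the values reported by nm are offsets from the image base, which is why they can be precomputed before the load address is chosen.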
With the offset to these variables located, we only need to patch them in the same way the original startup code would do so using the Auxiliary Vector. As a way to illustrate this technique, the following code will initialize the addresses of the Program Headers to new_address, and the number of program headers to the correct number:
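The patching step can be modelled as follows. This is only a simulation under stated assumptions: the mapped image is represented by a bytearray, whereas a real loader would store through pointers at base plus offset, and both variables are assumed to be 8 bytes wide, as on x86-64. The names patch_u64 and patch_dl_vars are hypothetical.

```python
import struct


def patch_u64(image: bytearray, offset: int, value: int) -> None:
    """Overwrite one little-endian 64-bit slot inside the image."""
    struct.pack_into("<Q", image, offset, value)


def patch_dl_vars(image: bytearray, off_phdr: int, off_phnum: int,
                  new_address: int, phnum: int) -> None:
    """Do by hand what the startup code would do from AT_PHDR/AT_PHNUM."""
    patch_u64(image, off_phdr, new_address)   # pointer to the hidden PHT
    patch_u64(image, off_phnum, phnum)        # number of program headers
```

The two offsets are the precomputed values for the program header pointer and count, and new_address is wherever the loader placed its private copy of the Program Header Table.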
At this point we have a working program without supplying the Auxiliary Vector. Because the subject binary is statically linked, and the code that will load the SHELF file is our loader, we can neglect every other segment reachable through the Auxiliary Vector's AT_PHDR and AT_PHNUM (or through _dl_phdr and _dl_phnum, respectively). The one exception is the PT_TLS segment, the interface through which Thread Local Storage is implemented in the ELF file format.
The following code, which resides in the function __libc_setup_tls in csu/libc-tls.c, shows the type of information that gets retrieved from the PT_TLS segment:
In the code snippet above, we can see that TLS initialization relies on the presence of the PT_TLS segment. We have several approaches that can obfuscate this artifact, such as patching the __libc_setup_tls function to just return and then initialize the TLS with our own code. Here, we'll choose to implement a quick patch to glibc instead as a PoC.
To avoid the need for the PT_TLS Program Header, we added a global variable that holds the values from PT_TLS, and we changed __libc_setup_tls to read from this global variable instead of the subject SHELF file's Program Header Table. With this small change we can finally strip all the program headers:
Using the following script to generate _phdr.h:
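A generator of this kind could look like the sketch below, which pulls the PT_TLS fields out of a raw ELF64 Program Header Table and emits them as macros. The macro names (TLS_VADDR, TLS_FILESZ, TLS_MEMSZ, TLS_ALIGN) and the function name tls_macros are assumptions for illustration; the actual contents of _phdr.h are not shown in this text.

```python
import struct

PT_TLS = 7
# Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr,
#             p_filesz, p_memsz, p_align
PHDR = struct.Struct("<IIQQQQQQ")


def tls_macros(pht: bytes) -> str:
    """Emit the PT_TLS values that TLS setup would otherwise read."""
    for fields in PHDR.iter_unpack(pht):
        p_type, _flags, _off, p_vaddr, _paddr, p_filesz, p_memsz, p_align = fields
        if p_type == PT_TLS:
            return (
                f"#define TLS_VADDR 0x{p_vaddr:x}\n"
                f"#define TLS_FILESZ 0x{p_filesz:x}\n"
                f"#define TLS_MEMSZ 0x{p_memsz:x}\n"
                f"#define TLS_ALIGN 0x{p_align:x}\n"
            )
    raise ValueError("no PT_TLS segment in program header table")
```

The input is the byte range of the Program Header Table as read from the subject file before the table is stripped.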
We can apply our patches in the following way after including _phdr.h:
Applying the methodology shown above, we gain a high level of evasiveness by loading and executing our SHELF file without an ELF header, Program Header Table, and Auxiliary Vector – just as shellcode gets loaded. The following diagram illustrates how straightforward the loading process of SHELF files is:
5. Conclusion
We have covered the internals of Reflective Loading of ELF files, explaining previous implementations of User-Land-Exec, along with its benefits and drawbacks. We then explained the latest patches in the GCC code base that implemented support for static-pie binaries, discussing our desired outcome, and the approaches we followed to achieve the generation of static-pie ELF files with one single PT_LOAD segment. Finally, we discussed the anti-forensic features that SHELF loading can provide, which we think to be a considerable enhancement when compared with previous versions of ELF Reflective Loading.
We think this could be the next generation of ELF Reflective Loading, and it may benefit readers to understand the extent of offensive capabilities that the ELF file format can provide. If you would like access to the source code, contact @sblip or @ulexec.
6. References
[1] (support static pie) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81498
[2] (first patch gcc) https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00638.html
[3] (gcc patch) https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=252034
[4] (glibc --enable-static-pie) https://sourceware.org/git/?p=glibc.git;a=commit;h=9d7a3741c9e59eba87fb3ca6b9f979befce07826
[5] (ldscript doc) https://sourceware.org/binutils/docs/ld/PHDRS.html#PHDRS
[6] https://sourceware.org/binutils/docs/ld/Output-Section-Phdr.html#Output-Section-Phdr
[7] https://www.akkadia.org/drepper/tls.pdf
[8] (why ld doesn't allow -static -pie -N) https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=ld/ldmain.c;h=c4af10f4e9121949b1b66df6428e95e66ce3eed4;hb=HEAD#l345
[9] (grugq ul_exec paper) https://grugq.github.io/docs/ul_exec.txt
[10] (ELF UPX internals) https://ulexec.github.io/ulexec.github.io/article/2017/11/17/UnPacking_a_Linux_Tsunami_Sample.html