PECompact v2

THIS DOCUMENT IS BEING REPLACED BY A NEW ONE, WHICH IS A WORK IN PROGRESS.

PECompact v2.0 Anti-Virus Interoperability Technical Document
Revision 1 04/25/05
Author: Bitsum Technologies / Jeremy Collake
http://www.bitsum.com
Email: support@bitsum.com

PECompact is a Win32 Portable Executable compressor.

PECompact has been entirely redeveloped for version 2.0, so any code or signatures for PECompact 1.x will no longer work.

The purpose of this document is to aid anti-virus companies in achieving interoperability with PECompact v2.x. Typical problems are false alarms and an inability to scan inside compressed modules.

For testing, a freeware version of PECompact is available here: http://www.bitsum.com/files/pec2student.zip.

I.) Attachment of decompression stub (Loader) to host modules
II.) Identification of PECompact'd modules
III.) Scanning inside compressed modules

I.) Attachment of decompression stub (we call 'Loader') to host modules

PECompact's loader is stored in three components. The sections below describe each in detail, but it may be helpful to define them now:

Component 1 (SEH Entry): This code resides at the entry point of compressed modules. It transfers control to component 2, the Loader Decoder. It is called 'SEH Entry' for reasons described below.
Component 2 (Loader Decoder): This code has the responsibility of decompressing (decoding) and invoking component 3, the Primary Loader.
Component 3 (Primary Loader): Resides in the last section in its compressed form. Its uncompressed form exists at runtime in dynamically allocated memory.

Where is the entry point of a compressed module?

It can be in any one of three possible locations:

      1.) If the original entry point of the host module falls within the bounds of the compressed physical image, PECompact overwrites the code there with the first component of its loader and does not change the entry point address.
      2.) If the original entry point no longer exists in the compressed image and the module is NOT a DLL, PECompact overwrites the first few bytes of the code section with the first component of its loader and changes the entry point to that address.
      3.) If the original entry point no longer exists in the compressed image and the module IS a DLL, PECompact places the first component of its loader in a newly created section (named .rsrc).

We have found that the third case is the most likely to cause false alarms with anti-virus heuristics, since many heuristic algorithms seem to rely heavily on whether the entry point lies outside the first code section (or inside the last section).

SEH Entry?

Since most anti-virus software seems to emulate the code at the entry point of modules, we found that we needed a way to hide the transfer of control to the last section (or second to last when base relocs are preserved), where the other two components of the loader code commonly reside. To accomplish this, we set up an SEH frame and cause an exception. The exception handler then modifies the code at the exception address to a JMP (short) and continues execution.

Loader components:

1.) SEH Entry (new entry point)

This is the code that resides at the entry point address of a compressed module. It has the single responsibility of transferring control to the Loader Decoder in a way that most anti-virus products are unable to emulate or trace through. This is done by setting up an SEH frame and causing an exception. The exception handler then changes the code at the exception address to a JMP (0xE9) whose destination is the start of the Loader Decoder. The exception handler then continues execution at the exception location (invoking the JMP).

This component is stored in one of three possible locations:

      1.) At original entry point (unchanged). Code at original entry point is replaced and then restored at runtime.
      2.) At beginning of first code section. Code at beginning of first code section is replaced and then restored at runtime.
      3.) In newly created last section (or second to last section in the case of relocations preserved).

x86 asm at this location:

; --------------------------------------------
; move address of exception handler into EAX.
; This register may, in some variations,
; receive instead an address that requires addition
; or subtraction by the instructions following.
; This is done to obfuscate the address itself.
; The operand here has a base relocation entry
; if the base relocation directory exists.
; --------------------------------------------
mov eax,xxxxxxxx

; --------------------------------------------
; WARNING: variable code here
; This location may contain any number of instructions
; (or none) prior to the next listed instruction.
; As described above, in some variations, code
; here will correct the address loaded into EAX.
; --------------------------------------------
; e.g. add eax,12345678h
;

; --------------------------------------------
; Set up SEH frame on stack. Typical code.
; --------------------------------------------
push eax
;assume fs:nothing
push dword ptr fs:[0]
mov fs:[0],esp

; --------------------------------------------
; The following instructions cause an exception.
; The address of the exception is modified
; by the exception handler so that it is a JMP
; to the Loader Decoder.
; These instruction(s) may vary when modified
; by end users or Loader Plug-in authors, but
; will always cause some sort of exception. We
; will keep these next two instructions as
; they are shown here in the default PECompact2
; loader as well as all other loaders authored
; by Bitsum Technologies.
; --------------------------------------------
xor eax,eax
mov [eax],ecx

; --------------------------------------------
; PECompact2 signature (may be 'PEC2' in some loaders)
; --------------------------------------------
db 'PECompact2',0 ; or db 'PEC2',0

2.) Loader Decoder

Responsible for decoding the primary loader into dynamically allocated memory (VirtualAlloc). This component calls the Primary Loader after it decodes it. The Primary Loader returns the RVA of the original host entry point, which is then jumped to. This component is stored in the last section, or second to last section if relocations were preserved.

This code and its location may vary.

The exception handler used above is at the beginning of this component.
Immediately before the exception handler is a LOADER_DECODER_INFO object.

3.) Primary Loader (compressed)

Responsible for decoding and reconstructing the host module. This component is stored in the last section, or second to last section if relocations were preserved.

Physical image is compressed. Uncompressed image (only in dynamically allocated memory) is preceded by a PEC_HOST_INFO object.

Sections:

PECompact creates one or two new sections during compression, and may delete several sections. The first added section is named '.rsrc' and has the attributes 0xE0000020. The second added section is for DLL(s) and contains the new fixup table. It is named '.reloc' and has the attributes 0xC0000040.
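The two attribute values above can be broken into their individual flags. The following is a minimal sketch, assuming the standard PE section characteristic bits (the constant names mirror the winnt.h definitions); the helper describeCharacteristics is hypothetical and only reports the five flags listed.

```cpp
#include <cstdint>
#include <string>

// Flag values as defined for PE section headers (winnt.h names in comments).
constexpr std::uint32_t kCntCode            = 0x00000020u; // IMAGE_SCN_CNT_CODE
constexpr std::uint32_t kCntInitializedData = 0x00000040u; // IMAGE_SCN_CNT_INITIALIZED_DATA
constexpr std::uint32_t kMemExecute         = 0x20000000u; // IMAGE_SCN_MEM_EXECUTE
constexpr std::uint32_t kMemRead            = 0x40000000u; // IMAGE_SCN_MEM_READ
constexpr std::uint32_t kMemWrite           = 0x80000000u; // IMAGE_SCN_MEM_WRITE

// Hypothetical helper: lists which of the five flags above are set in c.
// Any other bits in c are ignored.
std::string describeCharacteristics(std::uint32_t c) {
    std::string out;
    auto add = [&](std::uint32_t flag, const char* name) {
        if (c & flag) {
            if (!out.empty()) out += '|';
            out += name;
        }
    };
    add(kCntCode, "CODE");
    add(kCntInitializedData, "INITIALIZED_DATA");
    add(kMemExecute, "EXECUTE");
    add(kMemRead, "READ");
    add(kMemWrite, "WRITE");
    return out;
}
```

With this helper, 0xE0000020 yields "CODE|EXECUTE|READ|WRITE" and 0xC0000040 yields "INITIALIZED_DATA|READ|WRITE". In other words, the first added section is writable, executable code, and the second is writable data.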

II.) Identification of PECompact'd modules

PECompact itself identifies modules it has compressed by checking whether the PointerToRelocations field of the first section's header is set to '2CEP' (which displays as 'PEC2' when the stored bytes are read in file order). Of course, this is not sufficient for most anti-virus purposes. Before we give our recommended identification procedure, we must stress the following:

PECompact allows for alternate Loaders to be used via a Loader plug-in. Since these loaders may differ in code and function, our recommended identification procedure targets parts not likely to be substantially changed.
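For reference, the PointerToRelocations marker that PECompact itself checks can be tested with a few lines of code. This is a minimal sketch, not a recommended detection method. It assumes the standard 40-byte section header layout, in which PointerToRelocations sits at offset 24, and it assumes the marker bytes appear in file order as 'P','E','C','2'. The function name hasPec2Marker is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Layout of a standard PE section header (IMAGE_SECTION_HEADER).
constexpr std::size_t kSectionHeaderSize = 40;
constexpr std::size_t kPointerToRelocationsOffset = 24;

// firstSectionHeader: raw bytes of the first section header, as read
// from the file. Assumption: the marker is stored so that the bytes
// read 'P','E','C','2' in file order.
bool hasPec2Marker(const std::vector<std::uint8_t>& firstSectionHeader) {
    if (firstSectionHeader.size() < kSectionHeaderSize) return false;
    return std::memcmp(firstSectionHeader.data() + kPointerToRelocationsOffset,
                       "PEC2", 4) == 0;
}
```

A positive result only says that the field carries the marker; as noted above, it should not be treated as sufficient identification on its own.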

Identification procedure (basically, find the SEH Entry exception handler and see if a LOADER_DECODER_INFO object precedes it):

1.) Check code at entry point for SEH frame creation on the stack (allow for any variation of the code if possible):
    a.) If not found, assume the module is not compressed with PECompact2. Exit procedure.
2.) Get exception handler address set by SEH frame.
    a.) Is the first instruction MOV REG32, XXXXXXXX (e.g. mov eax,12038192h)? This is the calculated delta offset.
        i.) If not, assume the module is not compressed with PECompact2. Exit procedure.
    b.) Does a LOADER_DECODER_INFO object appear to precede the exception handler?
        i.) If not, assume the module is not compressed with PECompact2. Exit procedure.
3.) PECompact2 Module Identified!
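Steps 1 and 2a of the procedure above can be sketched as simple byte-pattern checks. This is only an illustration under stated assumptions, not a definitive detector. It uses the standard 32-bit x86 encodings of the instructions shown in the SEH Entry listing, and it performs a plain byte scan rather than real instruction decoding, so a production scanner would want to disassemble or emulate instead. The handler address is recovered statically only when no address-adjusting instructions sit between the MOV and the PUSH. All function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Reads a little-endian DWORD; the caller must have checked the bounds.
static std::uint32_t readLE32(const Bytes& b, std::size_t off) {
    assert(off + 4 <= b.size());
    return static_cast<std::uint32_t>(b[off]) |
           static_cast<std::uint32_t>(b[off + 1]) << 8 |
           static_cast<std::uint32_t>(b[off + 2]) << 16 |
           static_cast<std::uint32_t>(b[off + 3]) << 24;
}

// Step 1: find "push r32 / push dword ptr fs:[0] / mov fs:[0],esp" within
// the first `window` bytes of the entry point code. On success, stores the
// offset of the "push r32" byte in *pushOffset.
bool findSehFrameSetup(const Bytes& code, std::size_t* pushOffset,
                       std::size_t window = 64) {
    static const std::uint8_t kFrame[] = {
        0x64, 0xFF, 0x35, 0x00, 0x00, 0x00, 0x00,   // push dword ptr fs:[0]
        0x64, 0x89, 0x25, 0x00, 0x00, 0x00, 0x00};  // mov fs:[0],esp
    const std::size_t need = 1 + sizeof(kFrame);
    for (std::size_t i = 0; i < window && i + need <= code.size(); ++i) {
        if ((code[i] & 0xF8) != 0x50) continue;  // push eax..edi is 50h..57h
        if (std::equal(kFrame, kFrame + sizeof(kFrame), code.data() + i + 1)) {
            *pushOffset = i;
            return true;
        }
    }
    return false;
}

// Step 2 (easy case only): when "mov eax,imm32" (B8h) is immediately
// followed by "push eax" (50h), imm32 is the handler's virtual address.
// If adjusting instructions sit in between, emulation is needed instead.
bool handlerVaIfUnobfuscated(const Bytes& code, std::uint32_t* handlerVa) {
    if (code.size() < 6 || code[0] != 0xB8 || code[5] != 0x50) return false;
    *handlerVa = readLE32(code, 1);
    return true;
}

// Step 2a: the handler should begin with "mov r32,imm32" (B8h..BFh).
bool handlerStartsWithMovImm32(const Bytes& handler) {
    return handler.size() >= 5 && (handler[0] & 0xF8) == 0xB8;
}
```

The value returned by handlerVaIfUnobfuscated is a virtual address (the operand carries a base relocation when relocations exist), so the image base has to be subtracted before it is used as an RVA. Step 2b, the LOADER_DECODER_INFO check, is not covered here.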

III.) Scanning inside compressed modules

Tracing through the loader to get the decompressed image:

Get the address of the exception handler set up by the SEH frame. This exception handler's code ends with 'XOR EAX,EAX' then 'RET'. The byte following the 'RET' instruction is the beginning of the Loader Decoder. Therefore, a breakpoint can be placed there. An exception will be generated as part of the jump mechanism described above, so be prepared to ignore that exception (pass it on to the next handler) if possible and wait for your breakpoint. The next non-API call out of the current section will be to the Primary Loader in dynamically allocated memory. Remember that the important PEC_HOST_INFO object immediately precedes the Primary Loader's start address in dynamically allocated memory. When EIP returns from the dynamically allocated memory (and not before), the next time it leaves the current section (other than for API calls) will be the jump to the original host entry point.
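Locating the breakpoint address described above can be sketched as a search for the 'XOR EAX,EAX' / 'RET' pair at the end of the handler. This is a rough illustration that assumes the standard encodings (33 C0 or 31 C0 for the XOR, C3 for the RET) and uses a naive byte scan; the same byte values could in principle occur inside an earlier instruction's operand, so a real tool should decode instructions. The function name is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Scans the exception handler's bytes for "xor eax,eax" followed by "ret".
// On success, stores in *decoderOffset the offset (relative to the first
// handler byte) of the byte after the "ret", which is where the text above
// places the start of the Loader Decoder.
bool findLoaderDecoderStart(const std::vector<std::uint8_t>& handler,
                            std::size_t* decoderOffset) {
    for (std::size_t i = 0; i + 3 <= handler.size(); ++i) {
        const bool xorEaxEax =
            (handler[i] == 0x33 || handler[i] == 0x31) && handler[i + 1] == 0xC0;
        if (xorEaxEax && handler[i + 2] == 0xC3) {  // C3h is a near RET
            *decoderOffset = i + 3;
            return true;
        }
    }
    return false;
}
```

Adding the returned offset to the handler's address gives a candidate breakpoint location.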

Obtaining Information About The Module (i.e. original import directory RVA):

A LOADER_DECODER_INFO object resides just before the exception handler used by the SEH Entry component of the loader (@EP):
C++:

typedef struct _PEC2_DECODER_EXTRA
{
    DWORD m_dwLoadLibraryA;    // rva in LOADER_DECODER_INFO, pointer else -- image based
    DWORD m_dwGetProcAddressA; // rva in LOADER_DECODER_INFO, pointer else -- added to at runtime
} PEC2_DECODER_EXTRA, *PPEC2_DECODER_EXTRA;

typedef struct _LOADER_DECODER_INFO
{
    DWORD rvaCompressedLoader;
    DWORD dwUncompressedLoaderSize;
    DWORD dwOffsetToLoaderEntry;
    DWORD rvaDecoder;
    DWORD rvaVirtualAlloc;
    DWORD rvaVirtualFree;
    DWORD dwRealImageBase;
    PEC2_DECODER_EXTRA extraDecoderInfo;
} LOADER_DECODER_INFO, *PLOADER_DECODER_INFO;

ASM:

PEC2_DECODER_EXTRA struct
pLoadLibraryA dd ?
pGetProcAddress dd ?
PEC2_DECODER_EXTRA ends

LOADER_DECODER_INFO struct
rvaCompressedLoader dd ?
dwUncompressedLoaderSize dd ?
dwOffsetOfLoaderEntry dd ?
rvaDecoder dd ?
rvaVirtualAlloc dd ?
rvaVirtualFree dd ?
dwRealImageBase dd ?
extraDecoderInfo PEC2_DECODER_EXTRA <?,?>
LOADER_DECODER_INFO ends
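Reading the LOADER_DECODER_INFO object from a mapped image could look like the sketch below. It assumes the structure is exactly the nine DWORDs listed (36 bytes, no padding) and that it ends where the exception handler begins. The plausibility test is a heuristic of our own invention for illustration, not something defined by PECompact; the names LoaderDecoderInfo, readLoaderDecoderInfo and looksPlausible are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Reads a little-endian DWORD; the caller must have checked the bounds.
static std::uint32_t readLE32(const Bytes& b, std::size_t off) {
    assert(off + 4 <= b.size());
    return static_cast<std::uint32_t>(b[off]) |
           static_cast<std::uint32_t>(b[off + 1]) << 8 |
           static_cast<std::uint32_t>(b[off + 2]) << 16 |
           static_cast<std::uint32_t>(b[off + 3]) << 24;
}

// Portable mirror of LOADER_DECODER_INFO with the nested struct flattened.
struct LoaderDecoderInfo {
    std::uint32_t rvaCompressedLoader;
    std::uint32_t dwUncompressedLoaderSize;
    std::uint32_t dwOffsetToLoaderEntry;
    std::uint32_t rvaDecoder;
    std::uint32_t rvaVirtualAlloc;
    std::uint32_t rvaVirtualFree;
    std::uint32_t dwRealImageBase;
    std::uint32_t loadLibraryA;    // extraDecoderInfo, first DWORD
    std::uint32_t getProcAddress;  // extraDecoderInfo, second DWORD
};

constexpr std::size_t kLoaderDecoderInfoSize = 9 * 4;  // nine DWORDs

// image: the module as mapped in memory, so that RVAs index it directly.
// handlerOffset: RVA of the exception handler within that image.
bool readLoaderDecoderInfo(const Bytes& image, std::size_t handlerOffset,
                           LoaderDecoderInfo* out) {
    if (handlerOffset < kLoaderDecoderInfoSize || handlerOffset > image.size())
        return false;
    const std::size_t base = handlerOffset - kLoaderDecoderInfoSize;
    std::uint32_t f[9];
    for (std::size_t k = 0; k < 9; ++k) f[k] = readLE32(image, base + 4 * k);
    *out = LoaderDecoderInfo{f[0], f[1], f[2], f[3], f[4], f[5], f[6], f[7], f[8]};
    return true;
}

// Heuristic (an assumption, not from PECompact): the RVAs fall inside the
// image and the loader entry offset falls inside the uncompressed loader.
bool looksPlausible(const LoaderDecoderInfo& i, std::uint32_t sizeOfImage) {
    return i.rvaCompressedLoader < sizeOfImage && i.rvaDecoder < sizeOfImage &&
           i.rvaVirtualAlloc < sizeOfImage && i.rvaVirtualFree < sizeOfImage &&
           i.dwUncompressedLoaderSize != 0 &&
           i.dwOffsetToLoaderEntry < i.dwUncompressedLoaderSize;
}
```

A check of this kind is one possible way to answer the question of whether a LOADER_DECODER_INFO object "appears to" sit before the handler. The thresholds are deliberately loose and may need tuning against real samples.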

After the Primary Loader has been decompressed by the Loader Decoder into dynamically allocated memory (via VirtualAlloc), an (implied negative) offset from the start of the Primary Loader to a PEC_HOST_INFO object is stored as a DWORD immediately preceding the Primary Loader's code/entry. That is to say:

Offset to PEC_HOST_INFO = [ Primary Loader Address - sizeof(DWORD) ]
Primary Loader Address - Offset To PEC_HOST_INFO = Pointer to PEC_HOST_INFO

You can identify the Primary Loader by the 'call eax' instruction in the Loader Decoder that invokes it.
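The pointer arithmetic above can be sketched over a snapshot of the allocated block. This is a minimal illustration, assuming the snapshot starts at or before the PEC_HOST_INFO object and that loaderOffset is the Primary Loader's entry (the 'call eax' target) expressed as an offset into the snapshot. The function name hostInfoOffset is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Reads a little-endian DWORD; the caller must have checked the bounds.
static std::uint32_t readLE32(const Bytes& b, std::size_t off) {
    assert(off + 4 <= b.size());
    return static_cast<std::uint32_t>(b[off]) |
           static_cast<std::uint32_t>(b[off + 1]) << 8 |
           static_cast<std::uint32_t>(b[off + 2]) << 16 |
           static_cast<std::uint32_t>(b[off + 3]) << 24;
}

// mem: snapshot of the dynamically allocated memory.
// loaderOffset: offset of the Primary Loader's entry within mem.
// On success, *hostInfo receives the offset of PEC_HOST_INFO within mem.
bool hostInfoOffset(const Bytes& mem, std::size_t loaderOffset,
                    std::size_t* hostInfo) {
    if (loaderOffset < 4 || loaderOffset > mem.size()) return false;
    // The DWORD just before the loader entry is the distance to walk back.
    const std::uint32_t back = readLE32(mem, loaderOffset - 4);
    if (back == 0 || back > loaderOffset) return false;  // outside the snapshot
    *hostInfo = loaderOffset - back;
    return true;
}
```

For example, if the loader entry sits at offset 100 and the DWORD at offset 96 holds 84, PEC_HOST_INFO begins at offset 16.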

C++:

typedef struct _PEC_HOST_INFO
{
    WORD  m_wStructSize;                  // size of this structure
    WORD  m_wTotalDecoders;               // total decoders in decoder array
    DWORD m_dwDefaultImageBase;           // the default image base of the module
    DWORD m_dwActualImageBase;            // fixup entry for this..
    DWORD m_dwOriginalEntryPoint;         // the original entry point RVA
    DWORD m_dwReserved0;
    DWORD m_RVALoaderDecoder;             // the RVA of the decoder for the loader
    DWORD m_dwReserved1;
    DWORD m_ppLoadLibrary;                // **LoadLibraryA
    DWORD m_ppGetProcAddress;             // **GetProcAddress
    DWORD m_ppVirtualAlloc;               // **VirtualAlloc
    DWORD m_ppVirtualFree;                // **VirtualFree
    DWORD m_dwWorkingMemoryRequired;      // reqd size of temporary working memory for reconstruction
    DWORD m_dwDecodeFuncArrayOffset;      // offset from beginning of PEC_HOST_INFO to decoder array
    DWORD m_dwRVAOriginalImportDirectory; // RVA of original import dir. may have been modified to proprietary structs ! see sample code
    DWORD m_dwRVAOriginalRelocDirectory;  // RVA of original base reloc dir. may have been modified to proprietary structs ! see sample code
    DWORD m_wNumberOfPecBlocks;           // number of PEC_BLOCK descriptors in block array.
    DWORD m_dwStubRVA;                    // RVA of loader stub 0 (SEH entry)
    DWORD m_dwRVAOriginalBytes;           // RVA of original bytes overwritten by loader stub 0
    DWORD m_dwNewEntryInLastSection;      //
    DWORD m_dwExtraBlockDataArrayOffset;  // offset to array of relocated data (overkill/extra data) descriptors (expanded by encoders beyond section limits)
} PEC_HOST_INFO, *PPEC_HOST_INFO;

ASM:

PEC_HOST_INFO struct 1
m_wStructSize dw ?
m_wTotalDecoders dw ?
m_dwDefaultImageBase dd ?
m_dwActualImageBase dd ?
m_dwOriginalEntryPoint dd ?
m_dwReserved0 dd ?
m_RVALoaderDecoder dd ?
m_dwReserved1 dd ?
m_rvapLoadLibrary dd ?
m_rvapGetProcAddress dd ?
m_rvapVirtualAlloc dd ?
m_rvapVirtualFree dd ?
m_dwWorkingMemoryRequired dd ?
m_dwDecodeFuncArrayOffset dd ?
m_dwRVAOriginalImportDirectory dd ?
m_dwRVAOriginalRelocDirectory dd ?
m_wNumberOfPecBlocks dd ?
m_dwStubRVA dd ?
m_dwRVAOldBytes dd ?
m_dwNewEntryInLastSection dd ?
m_dwExtraBlockDataArrayOffset dd ?
PEC_HOST_INFO ends
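Once PEC_HOST_INFO has been located, the fields most useful to a scanner can be read at fixed offsets. The sketch below assumes the structure is byte-packed exactly as listed (two WORDs followed by nineteen DWORDs, 80 bytes in total), which gives offsets 12, 52 and 56 for the original entry point, import directory and relocation directory RVAs. It does not enforce a particular m_wStructSize, since that value may differ between loader versions. The names HostInfoSummary and readHostInfoSummary are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Little-endian readers; the caller must have checked the bounds.
static std::uint16_t readLE16(const Bytes& b, std::size_t off) {
    assert(off + 2 <= b.size());
    return static_cast<std::uint16_t>(b[off] | (b[off + 1] << 8));
}
static std::uint32_t readLE32(const Bytes& b, std::size_t off) {
    assert(off + 4 <= b.size());
    return static_cast<std::uint32_t>(readLE16(b, off)) |
           static_cast<std::uint32_t>(readLE16(b, off + 2)) << 16;
}

// Offsets derived by summing the field sizes in the listing above.
constexpr std::size_t kOffStructSize         = 0;
constexpr std::size_t kOffTotalDecoders      = 2;
constexpr std::size_t kOffOriginalEntryPoint = 12;
constexpr std::size_t kOffOriginalImportDir  = 52;
constexpr std::size_t kOffOriginalRelocDir   = 56;
constexpr std::size_t kListedStructSize      = 80;

// Subset of PEC_HOST_INFO that a scanner is likely to need first.
struct HostInfoSummary {
    std::uint16_t structSize;
    std::uint16_t totalDecoders;
    std::uint32_t originalEntryPointRva;
    std::uint32_t originalImportDirRva;
    std::uint32_t originalRelocDirRva;
};

// mem: snapshot of the allocated block; off: offset of PEC_HOST_INFO in mem.
bool readHostInfoSummary(const Bytes& mem, std::size_t off, HostInfoSummary* out) {
    if (off > mem.size() || mem.size() - off < kListedStructSize) return false;
    out->structSize            = readLE16(mem, off + kOffStructSize);
    out->totalDecoders         = readLE16(mem, off + kOffTotalDecoders);
    out->originalEntryPointRva = readLE32(mem, off + kOffOriginalEntryPoint);
    out->originalImportDirRva  = readLE32(mem, off + kOffOriginalImportDir);
    out->originalRelocDirRva   = readLE32(mem, off + kOffOriginalRelocDir);
    return true;
}
```

As the field comments note, the import and relocation directories may have been converted to proprietary structures, so the two directory RVAs are starting points rather than guaranteed standard PE directories.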