THIS DOCUMENT IS BEING REPLACED BY A NEW ONE. THE NEW ONE IS WORK IN PROGRESS, BUT CAN BE VIEWED HERE
PECompact v2.0 Anti-Virus Interoperability Technical Document
Revision 1 04/25/05
Author: Bitsum Technologies / Jeremy Collake
PECompact is a Win32 Portable Executable compressor.
PECompact has been entirely redeveloped for version 2.0, so any code or signatures for PECompact 1.x will no longer work.
The purpose of this document is to aid anti-virus companies in achieving interoperability with PECompact v2.x. Typical problems are false alarms and an inability to scan inside compressed modules.
For testing, a freeware version of PECompact is available here: http://www.bitsum.com/files/pec2student.zip.
I.) Attachment of decompression stub (Loader) to host modules
II.) Identification of PECompact'd modules
III.) Scanning inside compressed modules
I.) Attachment of decompression stub (which we call the 'Loader') to host modules
PECompact's loader is stored in 3 components. The table below describes each in detail, but it may be helpful if they are defined now:
Component 1 (SEH Entry): This code resides at the entry point of compressed modules. It transfers control to component 2, the Loader Decoder. It is called 'SEH Entry' for reasons described below.
Component 2 (Loader Decoder): This code has the responsibility of decompressing (decoding) and invoking component 3, the Primary Loader.
Component 3 (Primary Loader): Resides in the last section in its compressed form. Its uncompressed form exists at runtime in dynamically allocated memory.
Where is the entry point of a compressed module?
It can be in any one of three possible locations. If the original entry point of the host module falls within the bounds of the compressed physical image, PECompact overwrites the code there with the first component of its loader and does not change the entry point address. If the original entry point no longer exists in the compressed image and the module is NOT a DLL, PECompact overwrites the first few bytes of the code section with the first component of its loader and changes the entry point to that address. If the original entry point no longer exists in the compressed image and the module IS a DLL, PECompact places the first component of its loader in a newly created section (named .rsrc). We have found that this last case is the most likely to cause false alarms with anti-virus heuristics, since many heuristic algorithms seem to rely heavily on whether the entry point lies outside the first code section (or inside the last section).
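To see which of these cases a given sample falls into, or to reproduce what an entry-point heuristic observes, a scanner can map the entry point RVA to the section that contains it. The following is a minimal sketch in C, not code taken from PECompact. The `section_range` type and the function name are hypothetical, and the caller is assumed to have filled the array from the VirtualAddress and VirtualSize fields of the module's section table.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one section table entry. */
typedef struct {
    uint32_t virtual_address; /* RVA at which the section starts */
    uint32_t virtual_size;    /* size of the section in memory   */
} section_range;

/* Returns the index of the section that contains rva, or -1 if none does. */
static int section_index_of_rva(const section_range *sections, size_t count,
                                uint32_t rva)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t start = sections[i].virtual_address;
        /* Written as a subtraction so the range test cannot overflow. */
        if (rva >= start && rva - start < sections[i].virtual_size)
            return (int)i;
    }
    return -1;
}
```

Comparing the result with 0 and with count - 1 gives the "first section" and "last section" tests that the heuristics mentioned above appear to rely on.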
Since most anti-virus software seems to emulate the code at the entry point of modules, we found that we needed a way to hide the transfer of control to the last section (or second to last when base relocs are preserved), where the other two components of the loader code commonly reside. To accomplish this, we set up an SEH frame and cause an exception. The exception handler then overwrites the code at the exception address with a JMP (opcode 0xE9, as noted in the table below) and continues execution there.
|1.) SEH Entry (new entry point)||This is the code that resides at the entry point address of a compressed module. It has the single responsibility of transferring control to the Loader Decoder in a way that most anti-virus products are unable to emulate/trace through. This is done by setting up an SEH frame and causing an exception. The exception handler then changes the code at the exception address to a JMP (0xE9) whose destination is the start of the Loader Decoder. The exception handler then continues execution at the exception location (invokes the JMP).
This component is stored in one of three possible locations, as described above.
|2.) Loader Decoder||Responsible for decoding the Primary Loader into dynamically allocated memory (VirtualAlloc). This component calls the Primary Loader after it decodes it. The Primary Loader returns the RVA of the original host entry point, which is then jumped to. This component is stored in the last section, or second to last section if relocations were preserved.
This code and its location may vary.
|3.) Primary Loader (compressed)||Responsible for decoding and reconstructing the host module. This component is stored in the last section, or second to last section if relocations were preserved.
Physical image is compressed. Uncompressed image (only in dynamically allocated memory) is preceded by a PEC_HOST_INFO object.
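For emulator authors, the patch performed by the SEH Entry's exception handler (row 1 of the table) can be reproduced in a few lines. The sketch below is an illustration only and is not PECompact's handler code. It assumes the 5-byte near form of JMP (opcode 0xE9 followed by a 32-bit little-endian displacement measured from the end of the instruction), and the function name and the use of RVAs into a mapped image buffer are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write "JMP rel32" (0xE9) at image[fault_rva] so that
   it lands on target_rva. Returns 0 on success, -1 if the patch would not
   fit inside the buffer. */
static int patch_jmp_rel32(uint8_t *image, size_t image_len,
                           uint32_t fault_rva, uint32_t target_rva)
{
    if (image_len < 5 || fault_rva > image_len - 5)
        return -1;

    /* The displacement is relative to the byte after the 5-byte JMP.
       Unsigned wrap-around gives the right encoding for backward jumps. */
    uint32_t rel = target_rva - (fault_rva + 5u);

    image[fault_rva + 0] = 0xE9;
    image[fault_rva + 1] = (uint8_t)(rel & 0xFFu);
    image[fault_rva + 2] = (uint8_t)((rel >> 8) & 0xFFu);
    image[fault_rva + 3] = (uint8_t)((rel >> 16) & 0xFFu);
    image[fault_rva + 4] = (uint8_t)((rel >> 24) & 0xFFu);
    return 0;
}
```

An emulator that does not model SEH could apply such a patch itself once it has located the handler and the Loader Decoder, then resume at the faulting address.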
PECompact creates one or two new sections during compression, and may delete several sections. The first added section is named '.rsrc' and has the attributes 0xE0000020. The second added section is for DLL(s) and contains the new fixup table. It is named '.reloc' and has the attributes 0xC0000040.
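The two attribute values can be decomposed into the standard PE section characteristic flags, which makes the unusual combination on the added '.rsrc' section easier to see. The sketch below is a hedged illustration: the flag values are the standard IMAGE_SCN_* values (renamed here so the example compiles without windows.h), while the two helper functions are hypothetical. A match is only a hint, since ordinary modules also contain sections named '.rsrc' and '.reloc'.

```c
#include <stdint.h>
#include <string.h>

/* Standard PE section characteristic flags (IMAGE_SCN_* values). */
#define SCN_CNT_CODE             0x00000020u
#define SCN_CNT_INITIALIZED_DATA 0x00000040u
#define SCN_MEM_EXECUTE          0x20000000u
#define SCN_MEM_READ             0x40000000u
#define SCN_MEM_WRITE            0x80000000u

/* Attribute values quoted in the text above. */
#define PEC2_ADDED_RSRC_ATTRS  0xE0000020u /* code, execute, read, write */
#define PEC2_ADDED_RELOC_ATTRS 0xC0000040u /* initialized data, read, write */

/* name is the 8-byte, zero-padded Name field of a section header. */
static int looks_like_added_rsrc(const char name[8], uint32_t characteristics)
{
    static const char expected[8] = { '.', 'r', 's', 'r', 'c', 0, 0, 0 };
    return memcmp(name, expected, 8) == 0 &&
           characteristics == PEC2_ADDED_RSRC_ATTRS;
}

static int looks_like_added_reloc(const char name[8], uint32_t characteristics)
{
    static const char expected[8] = { '.', 'r', 'e', 'l', 'o', 'c', 0, 0 };
    return memcmp(name, expected, 8) == 0 &&
           characteristics == PEC2_ADDED_RELOC_ATTRS;
}
```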
II.) Identification of PECompact'd modules
PECompact itself identifies modules it has compressed by checking whether the PointerToRelocations field of the first section's header is set to '2CEP' (a little-endian DWORD, so the bytes read 'PEC2' when the file is viewed in a hex dump). Of course, this is not sufficient for most anti-virus purposes. Before we give our recommended identification procedure, we must stress the following:
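Returning to the marker check that PECompact itself performs: it can be sketched against a raw file buffer as shown here. This is an illustration under stated assumptions, not a recommended detection routine. The header offsets are the standard PE/COFF ones (e_lfanew at 0x3C, a 24-byte signature plus file header, 40-byte section headers with PointerToRelocations at offset 24). The function name is hypothetical, and the code assumes the marker bytes on disk read 'P','E','C','2', following the remark above that the value displays as 'PEC2'.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((uint16_t)p[0] | ((uint16_t)p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the first section header carries the PEC2 marker, else 0. */
static int pec2_has_marker(const uint8_t *file, size_t len)
{
    if (len < 0x40 || file[0] != 'M' || file[1] != 'Z')
        return 0;

    uint32_t pe = rd32(file + 0x3C);           /* e_lfanew */
    if (pe > len || len - pe < 24)
        return 0;
    if (memcmp(file + pe, "PE\0\0", 4) != 0)
        return 0;

    uint16_t section_count = rd16(file + pe + 6);
    uint16_t optional_size = rd16(file + pe + 20);
    size_t first_section = (size_t)pe + 24 + optional_size;
    if (section_count == 0 || first_section > len || len - first_section < 40)
        return 0;

    /* PointerToRelocations is at offset 24 within a section header. */
    return memcmp(file + first_section + 24, "PEC2", 4) == 0;
}
```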
III.) Scanning inside compressed modules
Tracing through the loader to get the decompressed image:
Get the address of the exception handler set up by the SEH frame. This exception handler's code ends with 'XOR EAX,EAX' then 'RET'. The byte following the 'RET' instruction is the beginning of the Loader Decoder. Therefore, a breakpoint can be placed there. An exception will be generated as part of the jump mechanism described above, so be prepared to ignore (pass on to next handler) that exception if possible and wait for your breakpoint. The next non-API call out of the current section will be to the Primary Loader in dynamically allocated memory. Remember that the important PEC_HOST_INFO object immediately precedes the Primary Loader's start address in dynamically allocated memory. When EIP returns back from the dynamically allocated memory (and not before), the next time it leaves the current section (other than API call(s)) will be the jump to the original host entry point.
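The first step of this procedure, finding the byte that follows the handler's final 'XOR EAX,EAX' and 'RET', can be sketched as a simple byte scan. This is a rough illustration rather than a robust method: it assumes 'XOR EAX,EAX' is encoded as 33 C0 or 31 C0 and 'RET' as C3, and a plain byte scan (unlike a real disassembler) could match inside the operand bytes of an earlier instruction. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Scans the handler's bytes for "XOR EAX,EAX ; RET" and returns the offset
   (relative to the handler start) of the byte following the RET, which the
   text identifies as the start of the Loader Decoder. Returns -1 if the
   pattern is not found within max_len bytes. */
static long find_loader_decoder_offset(const uint8_t *handler, size_t max_len)
{
    for (size_t i = 0; i + 3 <= max_len; i++) {
        int is_xor = (handler[i] == 0x33 || handler[i] == 0x31) &&
                     handler[i + 1] == 0xC0;
        if (is_xor && handler[i + 2] == 0xC3)
            return (long)(i + 3);
    }
    return -1;
}
```

Adding the returned offset to the handler's address gives the location for the breakpoint described above.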
Obtaining Information About The Module (i.e. original import directory RVA):
A LOADER_DECODER_INFO object resides just before the exception handler used by the SEH Entry component of the loader (@EP):
typedef struct _PEC2_DECODER_EXTRA
{
    DWORD m_dwLoadLibraryA; // rva in LOADER_DECODER_INFO, pointer else -- image based
    DWORD m_dwGetProcAddressA; // rva in LOADER_DECODER_INFO, pointer else -- added to at runtime
} PEC2_DECODER_EXTRA, *PPEC2_DECODER_EXTRA;
typedef struct _LOADER_DECODER_INFO
{
    // fields as listed in the assembly layout that follows
} LOADER_DECODER_INFO, *PLOADER_DECODER_INFO;
pLoadLibraryA dd ?
pGetProcAddress dd ?
rvaCompressedLoader dd ?
dwUncompressedLoaderSize dd ?
dwOffsetOfLoaderEntry dd ?
rvaDecoder dd ?
rvaVirtualAlloc dd ?
rvaVirtualFree dd ?
dwRealImageBase dd ?
extraDecoderInfo PEC2_DECODER_EXTRA <?,?>
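For scanners written in C, the assembly layout above can be transcribed as shown in the following sketch. The field names are copied from the listing; the `_SKETCH` type names and the helper function are hypothetical. The sketch assumes every 'dd' is a 32-bit value with no padding, and that "just before the exception handler" means the object ends exactly where the handler begins. Both assumptions should be checked against real samples.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t m_dwLoadLibraryA;
    uint32_t m_dwGetProcAddressA;
} PEC2_DECODER_EXTRA_SKETCH;

/* Transcription of the assembly layout: 9 DWORDs plus the 2-DWORD extra. */
typedef struct {
    uint32_t pLoadLibraryA;
    uint32_t pGetProcAddress;
    uint32_t rvaCompressedLoader;
    uint32_t dwUncompressedLoaderSize;
    uint32_t dwOffsetOfLoaderEntry;
    uint32_t rvaDecoder;
    uint32_t rvaVirtualAlloc;
    uint32_t rvaVirtualFree;
    uint32_t dwRealImageBase;
    PEC2_DECODER_EXTRA_SKETCH extraDecoderInfo;
} LOADER_DECODER_INFO_SKETCH;

/* Given the handler's offset within a buffer, returns the offset at which
   the info object would start, or -1 if it cannot fit before the handler. */
static long decoder_info_offset(size_t handler_off)
{
    if (handler_off < sizeof(LOADER_DECODER_INFO_SKETCH))
        return -1;
    return (long)(handler_off - sizeof(LOADER_DECODER_INFO_SKETCH));
}
```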
After the Primary Loader has been decompressed by the Loader Decoder into dynamically allocated memory (via VirtualAlloc), an (implied negative) offset from the start of the Primary Loader to a PEC_HOST_INFO object is stored as a DWORD immediately preceding the Primary Loader's code/entry. That is to say:
Offset to PEC_HOST_INFO = [ Primary Loader Address - sizeof(DWORD) ]
Primary Loader Address - Offset To PEC_HOST_INFO = Pointer to PEC_HOST_INFO
You can identify the Primary Loader by the 'call eax' instruction in the Loader Decoder that invokes it.
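The two relations above translate directly into code. The sketch below works on offsets into a captured copy of the dynamically allocated block rather than on live pointers. The function name and the bounds checks are assumptions added for the example, and the DWORD is read as little-endian.

```c
#include <stddef.h>
#include <stdint.h>

/* mem/mem_len: captured copy of the allocated block.
   loader_off:  offset of the Primary Loader's entry within that block.
   Returns the offset of the PEC_HOST_INFO object, or -1 on bad input. */
static long host_info_offset(const uint8_t *mem, size_t mem_len,
                             size_t loader_off)
{
    if (loader_off < 4 || loader_off > mem_len)
        return -1;

    /* Offset to PEC_HOST_INFO = [ Primary Loader Address - sizeof(DWORD) ] */
    const uint8_t *p = mem + loader_off - 4;
    uint32_t back = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);

    /* Primary Loader Address - offset = pointer to PEC_HOST_INFO */
    if (back > loader_off)
        return -1;
    return (long)(loader_off - back);
}
```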
typedef struct _PEC_HOST_INFO
{
    WORD m_wStructSize; // size of this structure
    WORD m_wTotalDecoders; // total decoders in decoder array
    DWORD m_dwDefaultImageBase; // the default image base of the module
    DWORD m_dwActualImageBase; // fixup entry for this..
    DWORD m_dwOriginalEntryPoint; // the original entry point RVA
    DWORD m_RVALoaderDecoder; // the RVA of the decoder for the loader
    DWORD m_ppLoadLibrary; // **LoadLibraryA
    DWORD m_ppGetProcAddress; // **GetProcAddress
    DWORD m_ppVirtualAlloc; // **VirtualAlloc
    DWORD m_ppVirtualFree; // **VirtualFree
    DWORD m_dwWorkingMemoryRequired; // reqd size of temporary working memory for reconstruction
    DWORD m_dwDecodeFuncArrayOffset; // offset from beginning of PEC_HOST_INFO to decoder array
    DWORD m_dwRVAOriginalImportDirectory; // RVA of original import dir. may have been modified to proprietary structs ! see sample code
    DWORD m_dwRVAOriginalRelocDirectory; // RVA of original base reloc dir. may have been modified to proprietary structs ! see sample code
    DWORD m_wNumberOfPecBlocks; // number of PEC_BLOCK descriptors in block array.
    DWORD m_dwStubRVA; // RVA of loader stub 0 (SEH entry)
    DWORD m_dwRVAOriginalBytes; // RVA of original bytes overwritten by loader stub 0
    DWORD m_dwNewEntryInLastSection; //
    DWORD m_dwExtraBlockDataArrayOffset;// offset to array of relocated data (overkill/extra data) descriptors (expanded by encoders beyond section limits)
} PEC_HOST_INFO, *PPEC_HOST_INFO;
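Note that the C definition above and the assembly definition that follows do not list the same fields: the assembly version contains m_dwReserved0 and m_dwReserved1, which the C version lacks, so the two imply different offsets for every field after m_dwOriginalEntryPoint. The sketch below transcribes the assembly layout into C so the difference can be checked in code. It is an illustration only. The `_SKETCH` type name and the helper are hypothetical, and it is an untested assumption that m_wStructSize in a real module equals the size of either layout.

```c
#include <stddef.h>
#include <stdint.h>

/* C transcription of the assembly layout (with the two reserved fields). */
typedef struct {
    uint16_t m_wStructSize;
    uint16_t m_wTotalDecoders;
    uint32_t m_dwDefaultImageBase;
    uint32_t m_dwActualImageBase;
    uint32_t m_dwOriginalEntryPoint;
    uint32_t m_dwReserved0;
    uint32_t m_RVALoaderDecoder;
    uint32_t m_dwReserved1;
    uint32_t m_rvapLoadLibrary;
    uint32_t m_rvapGetProcAddress;
    uint32_t m_rvapVirtualAlloc;
    uint32_t m_rvapVirtualFree;
    uint32_t m_dwWorkingMemoryRequired;
    uint32_t m_dwDecodeFuncArrayOffset;
    uint32_t m_dwRVAOriginalImportDirectory;
    uint32_t m_dwRVAOriginalRelocDirectory;
    uint32_t m_wNumberOfPecBlocks;
    uint32_t m_dwStubRVA;
    uint32_t m_dwRVAOldBytes;
    uint32_t m_dwNewEntryInLastSection;
    uint32_t m_dwExtraBlockDataArrayOffset;
} PEC_HOST_INFO_SKETCH;

/* Returns 1 if the leading m_wStructSize word (little-endian) equals the
   size of the layout above, which would suggest this layout applies. */
static int host_info_size_matches(const uint8_t *p, size_t avail)
{
    if (avail < 2)
        return 0;
    uint16_t struct_size = (uint16_t)((uint16_t)p[0] | ((uint16_t)p[1] << 8));
    return struct_size == sizeof(PEC_HOST_INFO_SKETCH);
}
```

With this layout the structure is 80 bytes and m_dwRVAOriginalImportDirectory sits at offset 52; with the C definition above it would be 72 bytes and offset 44.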
PEC_HOST_INFO struct 1
m_wStructSize dw ?
m_wTotalDecoders dw ?
m_dwDefaultImageBase dd ?
m_dwActualImageBase dd ?
m_dwOriginalEntryPoint dd ?
m_dwReserved0 dd ?
m_RVALoaderDecoder dd ?
m_dwReserved1 dd ?
m_rvapLoadLibrary dd ?
m_rvapGetProcAddress dd ?
m_rvapVirtualAlloc dd ?
m_rvapVirtualFree dd ?
m_dwWorkingMemoryRequired dd ?
m_dwDecodeFuncArrayOffset dd ?
m_dwRVAOriginalImportDirectory dd ?
m_dwRVAOriginalRelocDirectory dd ?
m_wNumberOfPecBlocks dd ?
m_dwStubRVA dd ?
m_dwRVAOldBytes dd ?
m_dwNewEntryInLastSection dd ?
m_dwExtraBlockDataArrayOffset dd ?