THIS DOCUMENT IS BEING REPLACED BY A NEW ONE. THE NEW ONE IS WORK IN PROGRESS, BUT CAN BE VIEWED HERE
PECompact v2.0 Anti-Virus Interoperability Technical Document
Revision 1 04/25/05
Author: Bitsum Technologies / Jeremy Collake
http://www.bitsum.com
Email: support@bitsum.com
PECompact is a win32 Portable Executable compressor.
PECompact has been entirely redeveloped for version 2.0, so any code or
signatures for PECompact 1.x will no longer work.
The purpose of this document is to aid anti-virus companies in achieving
interoperability with PECompact v2.x. Typical problems are false alarms and an
inability to scan inside compressed modules.
For testing, a freeware version of PECompact is available here:
http://www.bitsum.com/files/pec2student.zip.
I.) Attachment of decompression stub (Loader) to host modules
II.) Identification of PECompact'd modules
III.) Scanning inside compressed modules
I.) Attachment of decompression stub (which we call the 'Loader') to host modules
PECompact's loader is divided into three components. The table below describes
each in detail, but it may be helpful to define them now:
Component 1 (SEH Entry): This code resides at the entry point of compressed
modules. It transfers control to component 2, the Loader Decoder. It is called
'SEH Entry' for reasons described below.
Component 2 (Loader Decoder): This code has the responsibility of decompressing
(decoding) and invoking component 3, the Primary Loader.
Component 3 (Primary Loader): Resides in the last section in its compressed
form. Its uncompressed form exists at runtime in dynamically allocated memory.
Where is the entry point of a compressed module?
It can be in any one of three possible locations. If the original entry point of
the host module falls within the bounds of the compressed physical image,
PECompact overwrites the code there with the first
component of its loader and does not change the entry point address. If the
original entry point no longer exists in the compressed image and the module is
NOT a DLL, PECompact overwrites the first few bytes of the code section with the
first component of its loader and changes the entry point to that address. If
the original entry point no longer exists in the compressed image and the
module IS a DLL, PECompact places the first component of its loader in a newly
created section (named .rsrc). We have found that this case is the most likely
to cause false alarms, since many heuristic algorithms seem to rely heavily on
whether the entry point lies outside the first code section (or in the last
section).
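Since the three cases differ mainly in which section ends up holding the entry point, a scanner may want to determine that section before applying any heuristic. The following is a minimal sketch, not PECompact code: the `Section` type and the function name are hypothetical, and a real scanner would take these two fields from each IMAGE_SECTION_HEADER.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of a PE section header. A real scanner
// would read these two fields from IMAGE_SECTION_HEADER.
struct Section {
    std::uint32_t virtualAddress;  // RVA of the section start
    std::uint32_t virtualSize;     // size of the section in memory
};

// Returns the index of the section containing entryRva, or -1 if the
// entry point lies outside every section (for example, in the headers).
int SectionIndexOfEntryPoint(const std::vector<Section>& sections,
                             std::uint32_t entryRva) {
    for (std::size_t i = 0; i < sections.size(); ++i) {
        const Section& s = sections[i];
        // Subtract instead of adding so the range check cannot overflow.
        if (entryRva >= s.virtualAddress &&
            entryRva - s.virtualAddress < s.virtualSize) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

For a compressed DLL whose original entry point no longer exists, the returned index is expected to point at the added section rather than the first code section, so that result alone should not be treated as evidence of infection.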
SEH Entry?
Since most anti-virus software seems to emulate the code at the entry point of
modules, we found that we needed a way to hide the transfer of control to the
last section (or second to last when base relocs are preserved), where the other
two components of the loader code commonly reside. To accomplish this, we set up
an SEH frame and cause an exception. The exception handler then rewrites the
code at the exception address into a JMP (opcode 0xE9, a near relative jump) and
continues execution there.
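The loader components table gives the opcode of the patched instruction as 0xE9, which is the 5-byte relative JMP. As an illustration of what the handler's patch amounts to, and of how an emulator could recover the destination from the patched bytes, here is a small sketch. The function names are hypothetical; only the instruction encoding itself comes from the x86 instruction set.

```cpp
#include <array>
#include <cstdint>

// Encodes "JMP rel32" (opcode 0xE9) placed at address `from` so that it
// lands on `to`. The displacement is measured from the end of the 5-byte
// instruction. Addresses are 32-bit, matching a win32 image.
std::array<std::uint8_t, 5> EncodeJmpRel32(std::uint32_t from, std::uint32_t to) {
    // Unsigned arithmetic wraps, which yields the right two's complement
    // displacement for backward jumps as well.
    const std::uint32_t disp = to - (from + 5);
    std::array<std::uint8_t, 5> code = {{
        0xE9,
        static_cast<std::uint8_t>(disp),
        static_cast<std::uint8_t>(disp >> 8),
        static_cast<std::uint8_t>(disp >> 16),
        static_cast<std::uint8_t>(disp >> 24)}};
    return code;
}

// Inverse operation: given the address and bytes of a patched JMP,
// recover the jump target (the start of the Loader Decoder in this case).
std::uint32_t DecodeJmpRel32Target(std::uint32_t from,
                                   const std::array<std::uint8_t, 5>& code) {
    const std::uint32_t disp = static_cast<std::uint32_t>(code[1]) |
                               (static_cast<std::uint32_t>(code[2]) << 8) |
                               (static_cast<std::uint32_t>(code[3]) << 16) |
                               (static_cast<std::uint32_t>(code[4]) << 24);
    return from + 5 + disp;
}
```

An emulator that does not execute the exception handler will never see these bytes at the exception address, which is why the transfer is hidden from entry point emulation.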
Loader components:
1.) SEH Entry (new entry point) | This is the code that resides at the entry point address of a
compressed module. It has the single responsibility of transferring
control to the Loader Decoder in a way that most anti-virus products are
unable to emulate or trace through. This is done by setting up an SEH frame
and causing an exception. The exception handler then changes the code at
the exception address to a JMP (0xE9) whose destination is the start of
the Loader Decoder. The exception handler then continues execution at
the exception location (invokes the JMP). This component is stored in one
of the three possible entry point locations described above. |
2.) Loader Decoder | Responsible for decoding the primary loader into dynamically
allocated memory (VirtualAlloc). This component calls the Primary Loader
after it decodes it. The Primary Loader returns the RVA of the original
host entry point, which is then jumped to. This component is stored in
the last section, or second to last section if relocations were
preserved. This code and its location may vary. |
3.) Primary Loader (compressed) | Responsible for decoding and reconstructing the host module. This
component is stored in the last section, or second to last section if
relocations were preserved. The physical image is compressed. The
uncompressed image (which exists only in dynamically allocated memory) is
preceded by a PEC_HOST_INFO object. |
Sections:
PECompact creates one or two new sections during compression, and may delete
several sections. The first added section is named '.rsrc' and has the
attributes 0xE0000020. The second added section is for DLL(s) and contains the
new fixup table. It is named '.reloc' and has the attributes 0xC0000040.
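The two attribute values are combinations of standard section characteristic flags from the PE/COFF specification. The sketch below decodes them into readable names. The flag values are the standard ones; the constant and function names are hypothetical and chosen only for this illustration.

```cpp
#include <cstdint>
#include <string>

// Section characteristic flags as defined by the PE/COFF specification.
constexpr std::uint32_t kCntCode            = 0x00000020;
constexpr std::uint32_t kCntInitializedData = 0x00000040;
constexpr std::uint32_t kMemExecute         = 0x20000000;
constexpr std::uint32_t kMemRead            = 0x40000000;
constexpr std::uint32_t kMemWrite           = 0x80000000;

// Builds a '|'-separated list of the flags set in `characteristics`.
// Bits other than the five above are ignored in this sketch.
std::string DescribeCharacteristics(std::uint32_t characteristics) {
    std::string out;
    auto add = [&](std::uint32_t flag, const char* name) {
        if (characteristics & flag) {
            if (!out.empty()) out += '|';
            out += name;
        }
    };
    add(kCntCode, "CODE");
    add(kCntInitializedData, "INITIALIZED_DATA");
    add(kMemExecute, "EXECUTE");
    add(kMemRead, "READ");
    add(kMemWrite, "WRITE");
    return out;
}
```

With this helper, 0xE0000020 decodes to CODE|EXECUTE|READ|WRITE and 0xC0000040 decodes to INITIALIZED_DATA|READ|WRITE. A section that is both writable and executable is a common heuristic trigger, so it is worth knowing that the added '.rsrc' section legitimately carries both flags.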
II.) Identification of PECompact'd modules
PECompact itself identifies modules it has compressed by checking to see if the
first section's header field PointerToRelocations is set to '2CEP' (which
displays as 'PEC2' when the bytes are viewed in file order). Of course, this is
not sufficient for most anti-virus
purposes. Before we give our recommended identification procedure, we must
stress the following:
III.) Scanning inside compressed modules
Tracing through the loader to get the decompressed image:
Get the address of the exception handler set up by the SEH frame. This
exception handler's code ends with 'XOR EAX,EAX' then 'RET'. The byte following
the 'RET' instruction is the beginning of the Loader
Decoder. Therefore, a breakpoint can be placed there. An exception will be
generated as part of the jump mechanism described above,
so be prepared to ignore (pass on to next handler) that exception if possible
and wait for your breakpoint. The next non-API call out of the current section
will be to the Primary Loader in dynamically allocated memory. Remember that the
important PEC_HOST_INFO object precedes the Primary Loader's start address in
dynamically allocated memory (it is located through the offset DWORD described
in the next subsection). When EIP returns from the dynamically allocated memory
(and not before), the next time it leaves the current section (other than for
API calls) will be the jump to the original host entry point.
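The first step above, finding the byte after the handler's closing 'XOR EAX,EAX' / 'RET', can be sketched as a simple byte scan. This is an illustration under stated assumptions, not the method PECompact or any scanner is known to use: it assumes 'XOR EAX,EAX' is encoded as 33 C0 or 31 C0 and 'RET' as C3, and the function name is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Scans `code` (the bytes starting at the exception handler) for
// "XOR EAX,EAX" immediately followed by "RET". Returns the offset of the
// byte after RET, which the text above identifies as the start of the
// Loader Decoder, or -1 if the pattern is not found.
// Assumed encodings: XOR EAX,EAX = 33 C0 or 31 C0, RET = C3.
std::ptrdiff_t OffsetAfterHandlerEnd(const std::uint8_t* code, std::size_t size) {
    for (std::size_t i = 0; i + 3 <= size; ++i) {
        const bool isXorEaxEax =
            (code[i] == 0x33 || code[i] == 0x31) && code[i + 1] == 0xC0;
        if (isXorEaxEax && code[i + 2] == 0xC3) {
            return static_cast<std::ptrdiff_t>(i + 3);
        }
    }
    return -1;
}
```

A raw byte scan can match in the middle of a longer instruction, so a scanner with a disassembler available would do better to walk the handler instruction by instruction and apply the same test to decoded instructions.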
Obtaining Information About The Module (i.e. original import directory RVA):
A LOADER_DECODER_INFO object resides just before the exception handler
used by the SEH Entry component of the loader (@EP):
C++:
typedef struct _PEC2_DECODER_EXTRA
{
    DWORD m_dwLoadLibraryA;    // rva in LOADER_DECODER_INFO, pointer else -- image based
    DWORD m_dwGetProcAddressA; // rva in LOADER_DECODER_INFO, pointer else -- added to at runtime
} PEC2_DECODER_EXTRA, *PPEC2_DECODER_EXTRA;
typedef struct _LOADER_DECODER_INFO
{
    DWORD rvaCompressedLoader;
    DWORD dwUncompressedLoaderSize;
    DWORD dwOffsetToLoaderEntry;
    DWORD rvaDecoder;
    DWORD rvaVirtualAlloc;
    DWORD rvaVirtualFree;
    DWORD dwRealImageBase;
    PEC2_DECODER_EXTRA extraDecoderInfo;
} LOADER_DECODER_INFO, *PLOADER_DECODER_INFO;
ASM:
PEC2_DECODER_EXTRA struct
pLoadLibraryA dd ?
pGetProcAddress dd ?
PEC2_DECODER_EXTRA ends
LOADER_DECODER_INFO struct
rvaCompressedLoader dd ?
dwUncompressedLoaderSize dd ?
dwOffsetOfLoaderEntry dd ?
rvaDecoder dd ?
rvaVirtualAlloc dd ?
rvaVirtualFree dd ?
dwRealImageBase dd ?
extraDecoderInfo PEC2_DECODER_EXTRA <?,?>
LOADER_DECODER_INFO ends
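Given the RVA of the exception handler, the location of the LOADER_DECODER_INFO object follows from the size of the structure. The sketch below mirrors the layout above with fixed-width types so it compiles without windows.h. It assumes that "just before" means the structure ends exactly where the handler begins, with no padding in between; the type and function names are hypothetical.

```cpp
#include <cstdint>

#pragma pack(push, 1)
// Mirrors PEC2_DECODER_EXTRA with fixed-width types.
struct Pec2DecoderExtra {
    std::uint32_t loadLibraryA;
    std::uint32_t getProcAddress;
};
// Mirrors LOADER_DECODER_INFO with fixed-width types.
struct LoaderDecoderInfo {
    std::uint32_t rvaCompressedLoader;
    std::uint32_t uncompressedLoaderSize;
    std::uint32_t offsetToLoaderEntry;
    std::uint32_t rvaDecoder;
    std::uint32_t rvaVirtualAlloc;
    std::uint32_t rvaVirtualFree;
    std::uint32_t realImageBase;
    Pec2DecoderExtra extraDecoderInfo;
};
#pragma pack(pop)

static_assert(sizeof(LoaderDecoderInfo) == 36, "nine DWORDs expected");

// Computes the RVA of LOADER_DECODER_INFO from the exception handler RVA.
// Assumption: the structure is directly adjacent to the handler.
// Returns false if the handler RVA is too small for that to be possible.
bool LoaderDecoderInfoRva(std::uint32_t handlerRva, std::uint32_t* infoRva) {
    const std::uint32_t size = static_cast<std::uint32_t>(sizeof(LoaderDecoderInfo));
    if (handlerRva < size) return false;
    *infoRva = handlerRva - size;
    return true;
}
```

For example, a handler at RVA 0x1024 would place the structure at RVA 0x1000 under this assumption. The result should be validated, for instance by checking that rvaDecoder and rvaCompressedLoader fall inside the image, before any field is trusted.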
After the Primary Loader has been decompressed by the Loader Decoder into
dynamically allocated memory (via VirtualAlloc), an (implied negative) offset
from the start of the Primary Loader to a PEC_HOST_INFO object is stored as a
DWORD immediately preceding the Primary Loader's code/entry. That is to say:
Offset to PEC_HOST_INFO = [ Primary Loader Address - sizeof(DWORD) ]
Pointer to PEC_HOST_INFO = Primary Loader Address - Offset to PEC_HOST_INFO
You can identify the Primary Loader by the 'call eax' instruction in the Loader
Decoder that invokes it.
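The two formulas above translate into a few lines of code. The sketch below works on a captured copy of the dynamically allocated region rather than on live memory; the function name and the buffer-plus-offset interface are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// buffer:       copy of the dynamically allocated region holding the
//               decoded Primary Loader
// size:         number of valid bytes in buffer
// loaderOffset: offset of the Primary Loader's entry inside buffer
// Returns the offset of PEC_HOST_INFO inside buffer, or -1 if the stored
// value is inconsistent with the buffer bounds.
std::ptrdiff_t HostInfoOffset(const std::uint8_t* buffer, std::size_t size,
                              std::size_t loaderOffset) {
    if (loaderOffset < 4 || loaderOffset > size) return -1;
    // Read the little-endian DWORD stored immediately before the loader.
    const std::uint8_t* p = buffer + loaderOffset - 4;
    const std::uint32_t back = static_cast<std::uint32_t>(p[0]) |
                               (static_cast<std::uint32_t>(p[1]) << 8) |
                               (static_cast<std::uint32_t>(p[2]) << 16) |
                               (static_cast<std::uint32_t>(p[3]) << 24);
    // The offset is an implied negative one: subtract it from the loader.
    if (back > loaderOffset) return -1;
    return static_cast<std::ptrdiff_t>(loaderOffset - back);
}
```

If the Primary Loader starts at offset 0x30 and the DWORD at offset 0x2C holds 0x20, the PEC_HOST_INFO object starts at offset 0x10.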
C++:
typedef struct _PEC_HOST_INFO
{
    WORD m_wStructSize;                   // size of this structure
    WORD m_wTotalDecoders;                // total decoders in decoder array
    DWORD m_dwDefaultImageBase;           // the default image base of the module
    DWORD m_dwActualImageBase;            // fixup entry for this..
    DWORD m_dwOriginalEntryPoint;         // the original entry point RVA
    DWORD m_dwReserved0;
    DWORD m_RVALoaderDecoder;             // the RVA of the decoder for the loader
    DWORD m_dwReserved1;
    DWORD m_ppLoadLibrary;                // **LoadLibraryA
    DWORD m_ppGetProcAddress;             // **GetProcAddress
    DWORD m_ppVirtualAlloc;               // **VirtualAlloc
    DWORD m_ppVirtualFree;                // **VirtualFree
    DWORD m_dwWorkingMemoryRequired;      // reqd size of temporary working memory for reconstruction
    DWORD m_dwDecodeFuncArrayOffset;      // offset from beginning of PEC_HOST_INFO to decoder array
    DWORD m_dwRVAOriginalImportDirectory; // RVA of original import dir. may have been modified to proprietary structs ! see sample code
    DWORD m_dwRVAOriginalRelocDirectory;  // RVA of original base reloc dir. may have been modified to proprietary structs ! see sample code
    DWORD m_wNumberOfPecBlocks;           // number of PEC_BLOCK descriptors in block array.
    DWORD m_dwStubRVA;                    // RVA of loader stub 0 (SEH entry)
    DWORD m_dwRVAOriginalBytes;           // RVA of original bytes overwritten by loader stub 0
    DWORD m_dwNewEntryInLastSection;
    DWORD m_dwExtraBlockDataArrayOffset;  // offset to array of relocated data (overkill/extra data) descriptors (expanded by encoders beyond section limits)
} PEC_HOST_INFO, *PPEC_HOST_INFO;
ASM:
PEC_HOST_INFO struct 1
m_wStructSize dw ?
m_wTotalDecoders dw ?
m_dwDefaultImageBase dd ?
m_dwActualImageBase dd ?
m_dwOriginalEntryPoint dd ?
m_dwReserved0 dd ?
m_RVALoaderDecoder dd ?
m_dwReserved1 dd ?
m_rvapLoadLibrary dd ?
m_rvapGetProcAddress dd ?
m_rvapVirtualAlloc dd ?
m_rvapVirtualFree dd ?
m_dwWorkingMemoryRequired dd ?
m_dwDecodeFuncArrayOffset dd ?
m_dwRVAOriginalImportDirectory dd ?
m_dwRVAOriginalRelocDirectory dd ?
m_wNumberOfPecBlocks dd ?
m_dwStubRVA dd ?
m_dwRVAOldBytes dd ?
m_dwNewEntryInLastSection dd ?
m_dwExtraBlockDataArrayOffset dd ?
PEC_HOST_INFO ends
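Once a PEC_HOST_INFO object has been located, the field of most immediate use to a scanner is the original entry point. The sketch below mirrors the structure with fixed-width types, checks the packed size, and converts the entry point RVA into an address. It assumes that m_dwActualImageBase holds the module's load address once its fixup has been applied; that reading is inferred from the field comment, not confirmed. The type and function names are hypothetical.

```cpp
#include <cstdint>

#pragma pack(push, 1)  // matches the byte alignment of the ASM definition
struct PecHostInfo {
    std::uint16_t m_wStructSize;
    std::uint16_t m_wTotalDecoders;
    std::uint32_t m_dwDefaultImageBase;
    std::uint32_t m_dwActualImageBase;
    std::uint32_t m_dwOriginalEntryPoint;
    std::uint32_t m_dwReserved0;
    std::uint32_t m_RVALoaderDecoder;
    std::uint32_t m_dwReserved1;
    std::uint32_t m_ppLoadLibrary;
    std::uint32_t m_ppGetProcAddress;
    std::uint32_t m_ppVirtualAlloc;
    std::uint32_t m_ppVirtualFree;
    std::uint32_t m_dwWorkingMemoryRequired;
    std::uint32_t m_dwDecodeFuncArrayOffset;
    std::uint32_t m_dwRVAOriginalImportDirectory;
    std::uint32_t m_dwRVAOriginalRelocDirectory;
    std::uint32_t m_wNumberOfPecBlocks;
    std::uint32_t m_dwStubRVA;
    std::uint32_t m_dwRVAOriginalBytes;
    std::uint32_t m_dwNewEntryInLastSection;
    std::uint32_t m_dwExtraBlockDataArrayOffset;
};
#pragma pack(pop)

// Two WORDs followed by nineteen DWORDs.
static_assert(sizeof(PecHostInfo) == 80, "unexpected PEC_HOST_INFO layout");

// Converts the original entry point RVA into an address.
// Assumption: m_dwActualImageBase already contains the load address.
std::uint32_t OriginalEntryPointVa(const PecHostInfo& info) {
    return info.m_dwActualImageBase + info.m_dwOriginalEntryPoint;
}
```

The import and relocation directory RVAs can be read from the same object, but as the field comments warn, the data they point to may have been converted to proprietary structures.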