Malware Analysis Lesson 8; The final boss of the assembly.

Last lesson was pretty heavy, covering even more instruction but also seeing how they come together to create the standard instruction sets that form the structures we expect to see from “regular” programming language like if statements, for and while loops. This time we are going to go into arrays, switch statements, windows API calls and some other bits and pieces as we try to round off these notes.

Switch statements are similar to IF statements only when we assign the variable a value and that value is used to select from a large number of switch options. It sounds abit weird but generally any menu we see where we chose from a selection of options uses switches. There are two common ways to implement switches, “if style” and “jump tables”.

If Style

If style switch statements are the standard one we think about where a variable is received, usually from the user, and the program goes through each switch case until a match is found and the code in that case is executed. If no match is found generally there is a default case that is used.

The above image shows the standard syntax for a high level switch statement. In Assembly this looks like;

Here we can see the variable i which is moved to the register EAX. We then use CMP command to try and find a match. If a match is found we execute the code, if its not we JMP to the next bit of assembly. On a side note this is actually quite a cool moment for me, reading through the assembly for this switch statement is now super easy. After the first 3 RE blog posts I am now able to recognize and understand the assembly commands used and the logic behind the flow of the code. If your getting lost when I mention EAX, JMP and CMP read my previous blog posts as they go through it all in detail. 🙂 Back on track now though. The JE command is new and means jump if equal. There is a good thread on it here. What JE does is it jumps to a given point if the cmp values match/are true. If they are not true than the code moves to the next line, in this case our JMP. At the end of this code block is an unconditional JMP to break out of the switch statement. By using obfuscated variables to match and a large number of switch cases this could be another area to obfuscate malware.

Jump Table

If styles work well with a limited number of options but as the number of cases grow performance can degrade. To prevent this and optimize the code we create jump tables. The jump tables define the offsets to the memory location of each separate case statements, acting as an index table with the switch variable as the search term. This allows for less comparisons needed by the code.

Switch jump table
High level vies of jump tables

In this example we can see a few things happening. Firstly the variable is subtracted by 1. This is because jump table index starts at 0. So in this case we see jump table cases 0-3 and not 1-4. Next we can see the code cmp each case label to see if it is out of range using JA, which is Jump if variable is above/greater than (description of this is here.).

Arrays

Arrays are made up similarly to code we have seen before. The main thing about arrays are how the values are assigned to the indices. In assembly Arrays are accessed using a base address as a starting point. Since these are both arrays of integers, each element is of size 4 and so each subsequent element can be accessed by a multiple of 4. This is because each entry is 4 bytes in size.

Arrays are simple data structures used to store similar data. In the example above we can see the square brackets used in the assembly to reference an array value. They are important for us to know though as malware is sometimes written to use and array of pointers to strings that contain multiple host names that are used as options for connections.

Calling conventions

Calling conventions are how we call functions. These function calls can appear differently in assembly code and calling conventions govern the way the function call occurs. Conventions include the order parameters are placed on the stacks and in the registers, as well as who is responsible for cleaning up the stack when the function is complete. There are different conventions and the convention used is specific to the compiler. Despite these differences we need to use certain universal conventions when using the Windows API.

CDECL

CDECL is one of the most common function call conventions and pushes parameters onto the stack from right to left. It is the responsibility of the caller to clean the stack after the function completes and the return value is stored in EAX. This is the convention thats been used in this blog. The way it cleans the stack is by using an ADD instruction to dereference the parameters pushed into stack by adding 8 bytes to the stack pointer.

STDCALL

This convention is similar to CDECL except it requires the callee to clean the stack after the function complete. The way this works is that the ADD instruction is not needed to clean the stack. STDCALL is used for Windows API calls. Any code calling these API functions will not need to clean up the stack after as the DLLs do this instead.

Windows

With that last bit done we are mostly finished with Assembly but there are a few tidbits on malware analysis I am going to include here. These bits are important as to understand malware functionality we also need to know the key components of the target OS outside of the assembly code that’s run. Lets focus on Windows APIs, The Registry and Networking API’s. The windows API is a broad set of functionality that governs the way that programs, including malware, can interact with the Microsoft libraries.

Hungarian Notation

Windows uses its own names to represent c value types, and some types like INT, SHORT and LONG are not used at all. We see this best in the windows registry where we have types like DWORD (32bit) and WORD(16bit) unsigned integers. Windows also uses Hungarian notation for API function identifiers. What this means is that a prefex is added to help identify the variable type such as dwSize as a parameter where the dw stands for DWORD.

Filesystem functions

One of the most common IoC’s for malware is when they create or modify files. When we are looking at an infection the file system and changes to the file system should be monitored and reviewed. We can uses some tools like Process Monitor for this. Some of the functions that can be called are;

  • CreateFile – This function is used to open files, pipes, streams and I/O devices as well as to create new files. The parameter dwCreationDisposition controls whether the call creats a new file or opens an existing one. Remember the dw prefix means ins a 32bit unsigned integer.
  • ReadFile and WriteFile – These functions are used for reading from and writing to files and operate on files as a stream. Both functions contain a parameter that signifies the number of bytes to read or write. That parameter controls the size of the data chunk that is read or written.
  • CreateFileMapping and MapViewOfFile – File mapping are commonly used by malware as they allow a file to be loaded into memory and manipulated. Creating a mapping loads a file from disk into memory while viewing a mapping returns a pointer to the base address of the mapping which can be used to access the file in memory. The program calling these function can use the pointer to read and write anywhere in the file.

The registry

The windows registry is used to store OS and program configuration information like settings or options. You are possibly familiar with it if you have used the regedit application. The registry is a hierarchical database of information and most windows configuration information is stored there including networking information, driver configuration, startup settings and useraccounts. The amount stored is quite vast and anyone looking for a career in IT should spend time going through it and learning the different categories. Knowing it is particularly helpful to malware analysts and malware uses the registry to gain persistance or to access or modify configuration data. The malware often adds entries into the registry that will allow it to run automatically when the computer boots.

The hierarchy

  • Root key – the registry has 5 top level sections called root keys or HKEY’s. Each root has a specific purpose or target.
  • Subkey – is a subfolder of the root key.
  • Key – is a folder in the registry that can contain additional folders or values. Root keys and sub keys are both “keys”.
  • Value entry – is an ordered name-value pair.
  • value or data – is the data stored in a registry entry.

Root keys

  • HKEY_LOCAL_MACHINE (HKLM) Stores settings that are global to the local machine.
  • HKEY_CURRENT_USER (HKCU) Stores settings specific to the current user.
  • HKEY_CLASSES_ROOT Stores information defining types.
  • HKEY_CURRENT_CONFIG Stores settings about the current hardware configuration, specifically differences between the current and standard configuration.
  • HKEY_USERS Defines setting s for the default user, new users and current users.

Common registry functions

Malware often uses registry functions that are part of the windows API in order to modify the registry to run automatically when the system boots. The most common registry functions used are;

  • RegOpenKeyEx – Opens a registry for editing and querying. There are functions that allow you to query and edit a registry key without opening it first, but most use RegOpenKeyEx.
  • RegSetValueEx – Adds a new value to the registry and sets it data.
  • RegGetValue – returns the data for a value entry in the registry.

These functions are used by malware and when we see them we should investigate the keys they are trying to access. There are also keys that will allow the malware to run at startup but many more deal with a systems security and settings so persistence may not be the only reason the malware is accessing the registry.

Networking API

Malware can also make use of network functions using API calls. There are many potential functions malware could take advantage of but of these option malware most commonly uses Berkeley Compatible Sockets. This allows malware to work across both windows and unix/linux systems. In windows this functionality is implemented in the Winsock Libraries, primarily in ws2_32.dll. Common functions include;

  • Socket – creates a socket.
  • Bind – Attaches a socket to a particular port, prior to accepting a call.
  • Listen – indicates that a socket will listen for incoming connections.
  • Accept – Opens a connection to a remote socket and accepts a connection
  • Connect – Opens a connection to a remote socket; the remote socket must be waiting for the connection.
  • Recv – Recieves data from the remote socket.
  • Send – Sends data to the remote socket.

The WSAStartup function must be called before any other networking functions in order to allocate resources for the networking libraries. When investigating network activity we need to consider both local and remote sides of the connection. The remote side usually maintains an open socket that is listening for incoming connections while the local side connects to that waiting port. Malware can use either side’s functionality and can act as a client(sending information to a C2) or a server (receiving instructions from a C2).

For a client side/local side application that connects to a remote socket we will see the socket call followed by the connect call, followed by send and recv.

For a server application/remote side that listens for incoming connections we will see a socket call, followed by a bind call, followed by a listen call and finally an accept call in this order. After this accept we will see send and recv calls.

Dynamic Link Libraries (DLL)

DLL’s are windows code libraries used by multiple applications. A DLL is an executable file that does not run alone but exports functions that can be used by other application. Static libraries still exist in windows but DLL’s are more common as they allow for code reuse and sharing. Sharing is possible as the single instance of the DLL loaded in memory can be accessed by multiple processes. #Optimized!

Unfortunatly Malware authors take advantage of DLLs in 3 primary ways;

  1. They use the basic Windows DLLs found on every system to interact with the OS. As malware analysts we can see what DLLs are being used by malware to gain an understanding for what is being achieved. With the static analysis we did previously we could get the functions called using strings.exe.
  2. They store malicious code in the DLL. Similar to how trojans work this can allow the author to attach their malware to multiple processes.
  3. Using third party DLLs the malware can also interact with other programs. We can see this when malware imports functions from a third party DLL and this can help us identify what the malwares goal is.

Processes

Malware can also execute code outside of whichever program it is in by creating new processes or modifying existing ones. In the past malware has been a stand alone process but now adays malware tries to avoid detection by running as part of another process, a process we trust. Windows uses processes as containers to manage resources and keep separate programs from interfering with each other. Malware can use CreateProcess to spawn a new process. With CreateProcess malware has a high degree of control over how the new process is created. For example the malware could use this to create a process to execute malicious code, or to create an instance of Internet Explorer(They still exist!) and direct the browser to access malicious sites and content. The most common thing for malware to do with a new process is to create a simple remote shell that the malware author can use to gain access to the machine without any other tools. Scary.

From endgame.com

Summary

It has taken us a few months but our series on Malware Analysis has finally come to an end. Its been an incredible journey for me growing from having no knowledge to where i am now. Through these blog posts we should now understand what IoC’s can suggest a malware infection, how to safely carry out static and dynamic analysis(including the tools to use!), how malware can hide from detection(and why signature based antivirus’ are no longer sufficient) and finally we spent a lot of time going through reverse engineering, expanding our under standing of assembly and the most common instructions we will see.

At this point if you have been following the blog you, like me, should now have strong entry level understanding of malware analysis and how malware analysts carry out their day to day jobs. From here the next step is practice. Grab malware samples, put them onto your lab and start analysis them and trying to reverse engineer them to understand what the code is doing. This is the only way to get familiar with the concepts and who knows, you might end up like Marcus Hutchins and stop the spread of the next WannaCry.

Best of luck!

Malware Analysis Lesson 7; More assembly!

Last week we made good headway into assembly. This week I am going to go through variables, a few more assembly operations and finally start looking at code constructs; loops, and branch statements. One of the biggest challenges to reverse engineers is that it can be impossible to step through an executables disassembled files due to the sheer amount of assembly instructions we would need to read through. I used to work with a bank and often spoke with their CTI team. On one occasion they gave me advice on how to handle navigating the amount of assembly instructions is to keep in mind the overall picture, the high level understanding of what the code does by looking at the groups of instructions, rather that panicking and trying to figure out and trace what malicious action “mov eax, ebx” is actually part of unless it is needed.

Most malware is written in C or C++ and we can see the coding constructs like loops, if statements, arrays, goto statements, switch statements and more in the assembly code as well as in the high level code itself. This blog is going to look at some more assembly instructions but also how these standard constructs look in assembly.

We already spoke about some of the registers in a previous blog but we still have some more to review;

  • ECX – Counter for string and loop operations
  • EDX – I/O pointer
  • ESI – Source pointer for stream/string operations
  • EDI – Destination pointer for stream/string operations
  • EAX (AX, AH, AL)
  • EBX (BX, BH, BL)
  • ECX (CX, CH, CL)
  • EDX (DX, DH, DL)

Global vs local variables

Like with high level languages global variables can be accessed by any function in a program while local variables can be accessed only by the function where its defined. While the declaration of each is similar in C, in assembly they look completely different;

Global and local variables – how it looks in C

We can see the main difference is where the line “uint8_t global_1;” is called. But with assembly global variables are referenced by memory addresses while local variables are referenced by stack addresses.

Global and local variables – how it looks in Assembly

Arithmetic

Aside from the ADD and SUB operations we looked at the other arithmetic operations are;

  • INC – increment a destination value eg INC EAX
  • DEC – decrement a destination value eg. DEC EAX
  • MUL – multiply EAX register by a value eg. MUL $VALUE. The result is stored as a 64 bit value across EDX and EAX.
  • DIV – Divide 64 bits across EDX and EAX by value. The result is stored in EAX and the remainder is stored in EDX.

Logical operators

Logical operators consist of OR, AND, NOT and XOR can all be used in x86 architecture. These instructions operate similarly to the ADD and SUB instructions with the syntax XOR SRC,DEST with the result stored in the destination. The XOR instruction is frequently encountered in disassembly. For example XOR EAX,EAX is a quick way to set the EAX register to zero. This is done for optimization as the instruction requires less bytes and cpu cycle than MOV.

AND, OR, XOR and NOR logical bitwise operations.
  • AND – Destination operand can be r/m32 or a register. The source operand can be r/m32 or a register too, or even an immediate value(i.e. no source and destination as r/m32’s)
  • OR – Destination operand can be r/m32 or a register. The source operand can be r/m32 or a register too, or even an immediate value(i.e. no source and destination as r/m32’s)
  • XOR – Destination operand can be r/m32 or a register. The source operand can be r/m32 or a register too, or even an immediate value(i.e. no source and destination as r/m32’s)
  • NOT – Ones complement Negation(remember that?). The sing source/destination operand can be r/m32.

Shifting

In order to shift registers we use the SHR and SHL instructions; “SHR/SHL destination, count”. These instructions shift the bits in the destination to the right or left and the number of shifts is the “count” field. If bits are shifted beyond the destinations boundary they are first shifted in the CF Flag. Zero Bits are filled in during the shift. At the end of the shift instruction the CF flag contains the last bit shifted out of the destination operand. Shifting is often used in place of multiplication as an optimisation. Shifting is simplet and faster than multipication as you dont need to mess around with registers or moving data around. The way we use shifting in place of multiplication is “shl eax, 1” is the same as multiplying EAX by 2. To figure out what you are multiplying by remember CCNA subnetting; https://www.9tut.com/subnetting-tutorial/2

Rotation

Rotation is similar to shifting only instead of the bits disappearing when they fall off the edge of the destination the bits reappear on the opposite side, like a conveyor belt. ROR allows us to Rotate Right, while ROL allows us to rotate left.

Some of the items we discussed; Shifting, Rotation, and XOR/OR/AND are all encountered by analysts when we encounter encryption or compression. They will often look random and be in repeated a large number of times. Its one of the reasons we try and gain and overview of what the code does rather than investigating individual functions. When we do find and encrypted function we make a note of this and move on.

Branch statements

Branch statement, like if-else, are conditionally executed depending on the flow of the program. The most popular way of seeing this in assembly is through jump or JMP instructions. The format is “jmp location” and causes the next instruction executed to be the one specified by the jump. This is known as unconditional jumping as the execution will always execute such as with procedure calls, GoTo statements, exceptions and interrupts.

An always on jmp doesnt always fulfil our needs however. If-else isnt possible with jmp. We need some way to add conditions and this comes to us through conditional jumps usings flags to decide when to jump. There are more than 30 different jumps that can be used;

Before we can do a conditional jump we need to set the condition flags first. Typically this is done with CMP, TEST or whatever we have the sets flags.

CMP

CMP Compares two operands by subtracting the second operand from the first. It differers from the SUB operand in that the result is not stored. CMP computes the result, sets the flag then discards the result. This way it only sets the flag without impacting registers.

TEST

Like CMP, TEST sets flags and discards the results. It computes the AND of value 1 and value 2, then sets the SF, ZF and PF flags according to the result.

If statements

If statements alter the programs execution based on a set condition (ie if (1=1)). Most languages have these but we will see how basics and nested if statements look in assembly. It is good to know that all if statements need a conditional jump, but not all conditional jumps are if statements. We can see an example of an if statement here;


The If statement itself is seen in the “Mov[move], Cmp[compare], jne[conditional jump if ZF flag is 0/FALSE] and jmp to L2 to skip the else execution.

For Nested if statements to code is the same as the above only additional if statements have been included within the initial if statements. This should be understandable if you do any coding, if not play with python! 😀 Make a game, its great fun! But in assembly the code looks more complicated and difficult to follow.

We can see there are 3 conditional checks in this “x==y”, “z==0” and “X=!y”. We can see reading through the code it can get complicated fast, and this is before we encountered a malware authors intentional obfuscation, htis is why its important to focus on the overall flow of what the program is doing rather that identifying what is happening at each step.

Loops

Loops, like the for loop, are ways of repeating the same piece of code according to some parameters. In assembly this is achieved through the use of conditional jmps such as JGE. Having trouble finding a For loop example online but the basic principle would be similar to;

CLR a // clear register and start at 0
~some action~ // carry out whatever action we want
INC a // increment a
cjne a, b, $jump-address // compares the first two operands and branches to $jumpaddress if their values are not equal, giving us our loop.

Its interesting that in assembly we seem to be looking for an exact match. In the above example if a is > than b.. what would happen? Must remember to test this later. Its probably the case that the next relevent instruction in the code is executed(ie the instruction after the for loop).

While loops are similar but have the condition set at the beginning of the loop and in order to execute the loop this condition must be true. To avoid an infinite loop occuring we must make sure there is some change to the condition(such as an increment of the condition value) within the loop. Malware authors tend to use while loops to monitor for some action before executing malicious code, such as recieving a connection from a C2 server. This allows the malware to continuously listen for this.

We can see in this sample how the jmp at the end keeps bring the code back to the cmp instructions at the start. Once x is greater than or equal to 10 the jge instruction kicks in the let us skip the loop and go to the xor at the end.

Summary

I need a break after all this assembly. Theres alot of information and ive found going through the actual code, and code samples to be the best way to figure out what is happening. I think there will be one more lesson in Malware and then we will take a random piece of malware that has not been analysed yet and start putting it all together to come out with an awesome analysis report. 🙂

Malware Analysis Lesson 6; Intel x86 Architecture and Assembly

At this stage in our series we know what malware is, we know how to use Dynamic and Static analysis tools, we know how malware tries to avoid detection and we have edited the obfuscation methods ourselves. The next step in our path to becoming malware analyst’s is to gain an understanding of Reverse Engineering. The first thing we must know for this is the basics of assembly and Intel x86. By reverse engineering malicious code we can delve deeper into the structure and behavior of suspect files and gain a greater understanding into the codes authors.

Generally a computer system can be represented as several layers of abstraction in order to allow cross layer integration. A good example of the reason for this is how Windows or Linux OS can run off many different types of hardware. Similarly malware authors tend to create the program in a high level language like C, C++, C# or Python, which is then compiled into machine code to be executed by the CPU. As analysts we generally dont have access to the source code, though we can certainly try to decompile the application we usually need to rely on our under standing of low level languages like Assembly to figure out how a program operates.

Sounds though right? Fortunately assembly isn’t as bad as you might think. Its estimated that 14 assembly instructions account for 90% of all code. With the top 5 instruction accounting for 64% of the total code. Here we can see that the number of assembly instructions we need to know is very accessible. For the curious, those 5 instructions are; MOV, PUSH, POP, CALL and CMP.

x86 Architecture

Malware is usually stored in binary, when we disassemble malware we take the binary is an input into our disassembled to output the assembly language code that we can review. This can be difficult as Assembly is a category of different languages depending on the processor in use. x86 Architecture and its associated language are the most common and what we will learn about here but others include; x64, SPARC, PowerPC, MIPS and ARM.

Most modern computer architectures including x86 use the Von Neumann architecture which has 3 components – the CPU that executes the code, fast and volatile Main Memory that stores all data and code that has been called and an I/O system that interfaces with hard drives, monitors and peripherals.

  • The Control Unit get instructions from the ram using one of the CPU’s Register’s, which act as an instruction pointer to store the address of the instruction to execute.
  • The registers act as basic data storage units and are very fast compared to RAM. It allows the CPU to fetch and store instructions faster.
  • The Arithmetic Logic Unit executes the instructions and store the results.

Main memory can be divided into 4 main sections; DATA – which holds alues that are put in place when the program is initially loaded such as static values or global variables; CODE – which includes the instructions fetched by the CPU to be executed, this controls what the program does; HEAP which is used for dynamic memory allocation and elimination where the contents change frequently during execution; STACK – which is used for local variables, parameters and is used to control the program flow.

CISC vs RISC

CISC and RISC are two types of processor. Intel uses a software centric ISA called CISC, Complex Instruction Set Computer which has many special purpose instructions and a given compiler we may never use. We just need to know how to use the manual. It has variable length instructions between 1 and 15 bytes long. RISC ISA’s such as ARM on the other hand is Hardware Centric with more registers and fewer, fixed-size instructions.

Endian

Endianess comes in two flavours, Little Endian where bytes are stored with the little end first. This can be seen with the byte 0x12345678 which would be stored 0x78563412. Intel uses Little Endian. Big Endian on the other hand would stored 0x12345678 as is. This can be important to be aware of as malware changes from Big to little Endian during its life time as over the network Big Endian tends to be used and on the OS(Intel), little Endian is used.

Registers

Registers are small memory storage areas built into processors. They are faster than ram and volatile. We have 8 general purpose registers and an instruction pointer which points at the next instruction to execute. On x86-32, registers are 32 bits long and on x86-64, they are 64 bits long. While the registers are general purpose Intel has a suggested convention to follow for compiler developers and assembly coders. While this convention does not have to be used in general it is followed;

  • EAX – Stores function return values
  • EBX – Base pointer to the data section
  • ECX – Counter for string and loop operations
  • ESP – Stack pointer
  • EBP – Stack frame base pointer
  • EIP – Pointer to next instruction to execute (“instruction pointer”)
  • Caller-save registers – eax, edx, ecx
  • Callee-save registers – ebp, ebx, esi, edi

EFLAGS

EFLAGS is a register that holds many single bit flags. There are two we need to be aware of; ZERO FLAG(ZF) – Set if the result of some instruction is zero, and SIGN FLAG (SF) – Set equal to the most significant bit of a result. There is a good rundown on flags here; https://en.wikipedia.org/wiki/FLAGS_register

So what instructions can we use?

First: The Stack

The stack is a conceptual area of main memory which is designated by the OS when a program is started. By general convention different OS start the stack at different addresses. Generally stacks follow a Last-In-First-Out (LIFO/FILO) data structure where data is pushed on to the top of the stack and popped off the top (we will talk more about these operations shortly). The stack is used normally for temporary storage space. By convention the stack grows towards lower memory addresses so that by adding something to the stack the top of the stack is now at a lower memory address. The ESP points to the top of the stack, which is the lowest address in use. Data that exists at addresses beyond the top of the stack are considered as being undefined. The stack keeps track of which functions were called before the current one, holds local variables and is frequently used to pass arguments to the next function to be called. We need to keep in mind what is happening on the stack in order to understand any programs operation.

The stack’s LIFO

NOP

NOP, or No Operation, indicates to registers and no values. It existance is to pad/align bytes, delay time or, as we discussed in Lesson 5 obfuscation and to confuse malware detectors. A one-byte NOP instruction is an alias mnemonic for “XCHG EAX, EAX” instruction.

PUSH

Push is the simplest instruction that lets us add something to the stack. This can be a Word, Double/Dword or QuadWord, but usually a Dword. It can be an immediate value(a numeric constant), hte value in a register or a register segment. The push instruction automatically decrements the stack pointer ESP by 4.

POP

To then remove a value from the stack we must use the POP instruction which takes the DWORD off the stack, puts it in a register and increments the ESP by 4.

CALL

The call procedures job is to transfer control to a different function in a way that control can later be resumed where it left off. This allows separate programmers to share code and develop libraries for use by many programs What this means is the value of the instruction pointer is pushed into the stack, which at that time points to the instruction following the CALL instruction. First it pushes the address of the next instruction onto the stack for use by the RET (which is discussed next) for when the procedure is done. Then it changes the EIP to the address in the instruction.

RET

There are two forms of the RET function;
1. It pops the top of the stack into the EIP, which also increments the ESP. In this form it is written as “RET”
2. It pops the top of the stack and EIP and add a constant number of bytes to ESP. In this form it is written as “ret 0x8”, “ret 0x20” and so on.

MOV

The move instruction can move a register value to another register, a memory value to a register, a register value to memory, an immediate value to a register and an immediate value to memory. BUT it can never move a memory value to memory.

r/m32 Addressing Forms

Anywhere we see r/m32 it means the code could be taking a value from a register or a memory address. In Intel processors most of the time square brackets [] tell us to treat the value within as a memory address, and fetch the value at the address.

LEA – Load Effectiveness Address

Frequently used with pointer arithmetic, and sometimes for arithmetic in general. It uses the r/m32 form but is more the exception to the rule that the square brackets [] syntax means dereference. For example that in a piece of arithmetic the resulting value stored is the values address, and not the value itself. This can be useful when passing the address of an array element to a subroutine. It may also be a slightly sneaky way of doing more calculations than normal in one instruction This is where its confusing me, we will have to do some examples later to get clarification.

ADD and SUB

These commands do what you think, they add and subtract values. The destination operand can be r/m32 or a register and the source operand can be r/m32 or a register or an immediate. It evaluates the operation and sets flags as appropriate. Instructions modify OF, SF, ZF, AF, PF and CF flags.

Summary

With the basics of assembly and its instructions under our belt the outputs of many tools like IDA pro and Ghidra are more clear and easier to understand but we will need more time, and one more blog post, before we fully digest assembly but already the output of our malware lesson 5 the debugger output is much more clear.

Until next time,

Malware Analysis Lesson 5; Malware obfuscation techniques

Malware, which can be any type of malicious code[4][, and detectors or anti-virus tools are in a continual arms race. With malware developing ever more advanced and sophisticated obfuscation techniques and detectors researching more complex detection mechanisms to identify the malware. This arms race has been ongoing for decades and traditionally detectors have relied on large databases of known signatures [3], or hashes, of the malware. However, malware often uses a range of techniques to change this signature from one infection (or generation) to the next. This makes it more challenging for detectors and malware analysts to identify the behaviour of malware in a timely manner, or even to identify the code as being malicious at all. We are going to look at some obfuscation techniques and describe how detection can be carried out for those techniques. Some methods can be highly complex, such as the malware being interwoven into a targeted host file, while others are simpler like changing the packer used. In all cases it makes detection more time consuming and resource intensive to isolate and identify the malware signature. To compound the woes of signature-based detector’s, this method is not effective against new malware using unknown vulnerabilities (think zero days). This relentless development of new malware variants has made signature-based detection less effective. However, with the reduction in the effectiveness of signatures, behavioural, heuristic and sandbox-based detection have been developed. To understand how all this works we first need to understand how the different types of malware and how they obfuscate themselves.


4 Categories of malware

Due to the diverse methods malware uses to obfuscate itself, it is necessary to categorize them and there are four main types of obfuscated malware; Encrypted, Oligomorphic, Polymorphic and Metamorphic. Let’s go through these now.

Encrypted malware

There are two types of encrypted malware, malware using encryption and malware using packers. With encryption the malware uses encryption to conceal itself from detection. This type of malware is usually composed of the decryptor and its encrypted main body.[1] This method is effective for two reasons, firstly by encrypting the malicious code it executes the malware cannot identify the payloads signature; secondly by changing the encryption key it uses the signature of the encrypted code itself changes.[4] To ensure the malware remains obfuscated throughout multiple generations, and in order to avoid its encrypted signature from being static – and thus identifiable by signature based detectors; Every time the malware is run it can generate and uses a new encryption key to keep its signature unique. For best effect this new key should be generated in a random and unpredictable manner. However, the decryptor portion of the malware cannot be encrypted as it needs to be executed and it retains a static signature. Due to this detection methods that focus on the malware decryptor signature are usually successful. [1]

Packers[3]

Packers are usually legitimate tools to decrease the size of an application while it is stored or transported, like compressing documents but in a way that still lets the application be executed. Even small changes to the underlying application can drastically change the signature of the resulting packed executable. There are multiple packing applications and research on which packers are most effective for evading detectors. One example of this is “Jon Oberheide and his colleagues at the University of Michigan wrote PolyPack, a Web-based application that supports 10 packers and 10 malware detection engines (like virus total)”[3]. This research and similar applications can help malware authors identify which packer would be best for their malware to avoid detection.

One-way packed malware can be detected is by having a database of all possible signatures a packed malware can produce. This is very inefficient, and a better option is to use what is called “Entropy Analysis”[3] to identify the packed malware. This can detect packed files but cannot detect the packer used, which can cause difficulties for deeper analysis. PHAD, PE-Probe and MRC all use Entropy analysis. Without unpacking the file, it can be difficult to know if its malware or a legitimate application, especially as we need to identify the right packer to unpack the file. This scan be difficult, packers are commonly used to spread malware.

Oligomorphic and Polimorphic

Malware that can mutate their decryptor’s from one generation to the next have been designed to fix the shortcomings of purely encrypted malware. The first example of this was the oligomorphic malware which was able to change its decryptor. [1] However oligomorphic malware was initially very limited in the maximum number of decryptor versions it could produce, allowing the signatures of all possibilities to eventually be calculated. This catalogue of signatures allowed detectors to identify all variants of the malware.

Fig. 1 Oligomorphic malware

Polymorphic malware is an encryption method that mutates its static binary code. [3] It was developed to attempt to take the ideas of Oligomorphic malware and further improve them by being able to generate an incalculable number of potential decryptor variants so that no single signature sequence will match all possible variants of this malware. It achieves this by using several very cool obfuscation methods we will talk about later including dead code insertion, register reassignment, Code Transposition and Instruction Substitution [2]. Each time the code is run it mutates itself by using a different key. To make things even more challenging for malware analysts, there are many tools out there such as The Mutation engine that automates the process; allowing regular, non-obfuscated malware to be converted into polymorphic malware.

To detect these types of malware the detectors make use of tools like sandboxing. With sandboxing the detector executes the malware in a secure emulator. We then execute the malware and wait for its constant body (the payload) to be decrypted in RAM after execution and try to match a signature. [1] This works as the polymorphic engine does not significantly change the native opcode that runs in memory. [3] Another way to detect polymorphic malware is by using Neural Pattern Recognition, which has shown a high detection rate, based on a small sample set. [3]

Malware obfuscation is a fast-paced arms race that continuously results in more dangerous malware that is harder to detect. Malware authors attempt to counter sandboxed execution by creating malware that detects when it is running in a virtualised environment and not decrypt it payload. Other malware authors create malware that may wait for some event that does not usually occur when executed in a sandbox, before decrypting it payload. Detectors are improving all the time and are incorporating features to defeat this type of malware with advanced techniques. [1] The decrypted code is essentially the same in each case, thus RAM/memory-based signature detection is possible. Block hashing can also be effective in identifying memory-based remnants.

Metamorphic malware

With the previous class of malware, we discussed how the decryptor was changed with each generation of the malware to avoid detection. Metamorphic malware takes this approach and builds on it by incorporating multiple obfuscation techniques into its payload rather than, or as well as, its decryptor. This way it may not need to use encryption or packing and still can be difficult to detect due to its ever-changing signature. It can maintain its behaviour without ever needed to repeat the same set of native opcodes in memory. [3] It needs to be able to recognize, parse and mutate its own body whenever it propagates. [1]  

There are two types of metamorphic malware, open-world and close-world. Open-world, as shown in the Conficker Worm, leverages a command and control structure – with the malware connecting to its controlling master server to download updates and functionality after the initial infection. Closed-world malware from each generation to the next uses self-mutating code via a binary transformer which modifies the binary code itself to avoid detection. [3]. Win32/Apparition was the first example to demonstrate these techniques. [3] The methods used to achieve this level of obfuscation are discussed below.

fig 2. Metamorphic malware

Obfuscation techniques

Polymorphic and Metamorphic malware take advantage of several techniques to obfuscate their code. We are going to go through several methods now .

Garbage/Dead Code Insertion; Dead code insertion pads out the code in some way with garbage, to change the files signature. This garbage could be randomly generated strings; or it could be new instruction sets that don’t do anything, or just don’t change the malicious operation of the code. NOP or CLC instructions can be used to fill out the code no operation instructions. Using Push and Pop operations on registers is another way. These garbage insertions can be defeated by modern detectors which identify the garbage, such as operations that do nothing, and then deletes it from the code before analysing and comparing the malwares signature. [8]

Register Reassignment/Swapping; In assembly, all programs work from a limited set of instructions and have a limited set of memory space for storing and fetching values. These memory spaces are known as CPU Registers. The number of registers a CPU has can vary. i386, for example, has 4 main registers; EAX, EBX, ECX and EDX. Malware can take advantage of these multiple registers for obfuscation. By switching the registers called and used the malware can change its code, such as from EAX to EBX and vice versa, from generation to generation while keeping the behaviour the same. [5][1]

Changing flow control/Subroutine Reordering; By changing the order of the program’s subroutines malware can produce an exponential number of potential variations. This involves changing jumps in the assembly code and reordering the call sequence by adding subroutines.[3] By changing the order of these jumps, and the order in which different functions are called – combined with other obfuscation methods, such as dead code function insertions, we not only change the signature and make it difficult for automated detectors to identify the malware, but we also increase the challenge of identifying what the malware does through static analysis. Block hashing and heuristic analysis can be the best ways for detectors to identify malware of this type.

Code/Instruction Substitution; Malware, like all code, is made up of a sequence of functions. With most programming languages there are multiple functions that can carry out the same behaviour. In x86, for example, XOR can be replaced by SUB and MOV can be replaced with PUSH. [1] This change results in a new generation of malware, with its own signature that is difficult for detectors to pick up on, even when detecting the instruction set used by the malware. Heuristic and behavioural detection are best placed to identify malware using this form of obfuscation. [8]

Code Transposition; Code transposition is reordering the code in a way the does not impact functionality. This can be through shuffling the order of the instructions and then calling them when needed in the main body. with unconditional branching statements or jumps [1]. The original malware can still be recovered by removing those statements and jumps. This obfuscation, because the malware is so complex, can be difficult and time consuming to both create it, and to detect it. Block hashing is one way to detect this form of malware, where the detectors hash segments, or blocks, of the malicious code are hashed and then checked by an algorithm for similarities with known malware.

Code Integration/Insertion; This is one of the most difficult malware obfuscation techniques to both implement and to detection or analyse. it involves the malware inserting it code within a legitimate program. It does this by decompiling the target executables into manageable objects and inserting itself in between those objects and finally reassembling the entire executable. Once reassembled we see the new generation of the malware. This changes the target programs signature and makes the malware difficult to detect. The best way to detect this malware is by keeping a database of legitimate/white-listed applications and their corresponding baseline signature and treat any applications that deviate from this baseline as malicious. Block hashing and heuristics detection can also be used.

Fileless malware; A new trend in malware obfuscation that has come to the fore over the past 2 years is fileless malware. This obfuscation technique has the malware forgo having a copy of itself stored on the target machines HDD or SDD completely and lives entirely in the RAM. Detectors can have a hard time detecting the malicious function, especially if it combines some armouring techniques, such as relying on external events before acting maliciously, and even when it is detected it can be difficult for to analyse as once the machine is shut down the malware is gone. A live image of the ram is needed to analyse it.

Lets try to obfuscate some malware!

let’s put this theory into practice. We are taking the sample malware from Das Malwerk http://dasmalwerk.eu we have chosen Filename: 25786c51-414b-11e8-a472-80e65024849a.file as we will obfuscate. This malware has a hash of 36E79238CF645F38FA9CE671A850CC3E29338B65 with a detection rate of 50 / 63 engines picking it up.

Initial output from Virus Total

Here we can see 50 engines detect our malware, so we are going to now try a few ways to reduce the detection rate. From static analysis we could identify that this file is written in .NET. By using a .NET packer, Netshrink, we get the new hash of; 87755627F18616749F257524152B1C60F036C6EF when checking this hash in VirusTotal, success! It does not exist.

Virus Total doesnt have the files hash value

This is good, but next let’s upload the file to virus total. For the hash to be detected the hash must be in the VirusTotal hash database, by uploading the malware we can check

So just by changing the packer we can reduce the detection rate from 50 to 25! The detectors that did identify the malware we can see their comments like “Behaves Like”, “Heuristic”, and “Suspicious” this suggests that some form of dynamic analysis was used to identify the file as malicious. Let’s try now to play with the source code. We will decompile this .net application with dotPEEK. This gives us the source code in an exported visual basic file. Opening this in VB Studio we can see the complexity of the malware we selected. First we are going to add a function that will add two numbers, then recompile and get a hash, then compare results.

Decompiling the code with DotPeek

Opening the file in Visual studio then we see we cannot compile it again. DotPEEK seems to have decompiled it with errors such as “base.\u002Ector();” instead of “base.ctor();” 261 errors different errors to fix in all. With this fixed and compiling successfully we have a full understanding of what this malware – Orcus does. Complete with allowing partial Remote Code Execution, setting up FTP servers, allowing DDOS, stealing password and logging keystrokes, we must proceed with the utmost caution. Like a big game hunter about to take out his first sealion. Unfortunately after fixing these 261 errors we get an additional 400 errors, such as “The type or namespace name ‘Shared’ does not exist in the namespace ‘Orcus’ (are you missing an assembly reference?) -using Orcus.Shared.Communication;” which is beyond our understanding of computer programming. This could be the result of the decompile not catching all of the source code

Debugging and trying to obfuscate the decompiled code in Visual Studios

Instead of a decompiler lets try using a debugger to walk through the assembly and see if there are some changes we can make at that level. We can see the malware author has done extensive obfuscation already. We saw this when investigating the source code above where we found functions that did nothing. Here at assembly level we see dead code insertion via nop and padding at the base of the file;

NOP Dead code insertion in Orcus
End of file padding

One thing we could do for an easy demonstration is change the register from the padding at the end from EAX to EBX but let us try something more challenging. One thing I am worried about is damaging the code functionality so I am going to replace some of the NOP commands with push EAX; pop EAX which should serve the same function, this demonstrates dead code insertion and instruction substitution. This was done for multiple pairs of NOP commands found.

Replaced NOP with Push/Pop obfuscation

After this small change we have a hash of; 5AED9A880DB19E1EC35E8A63C09EEF45EC50A2C7 lets see if this, before packing it, makes a difference to out detection rate. As expected this file hash has nothing found on Virus Total. When uploading the file itself we get 42/70 detection rate. This is somewhat better than the initial 50/68 we got initially.

After our obfuscation

If we pack this malware after the changes we get a hash of 2AB951E7904EBBF355954C5501E6D5EE356120AF and this hash still has no matches on Virus Total. Interestingly when we upload this file to Virus Total we get 32/68 detections. Which is higher than our initial packed file. This could be due to the frequency of uploads we have done and the time the detectors have had to analyse our files.

As one final test lets try register swapping at the end of file padding to see if there is any difference. We will also change 1 registry used for an actual instruction. As this is a complex code and to avoid breaking it we will change the registry used at the beginning of the binary.

If we swap the EAX registers used to EBX we can then assess the results.

With this process complete and the resulting file dumped into an exe, we pack it with NetShrink again and have a hash of 5AED9A880DB19E1EC35E8A63C09EEF45EC50A2C7. The result here is unexpected with 42 detectors identifying the malware;

When we upload the file, itself we get the same result. Three conclusions that we could draw from this are; VirusTotal and its detectors are learning from our uploads each time we obfuscate the malware to become more accurate at detecting its malicious nature. The packer we used, NetShrink could be relatively obscure and the detectors had to spend time analysing it(in this case over a 2 week period). Finally, it could be the obfuscation methods we used towards the end, where we focused on changing small segments of the code in the debugger were insufficient to fool the detectors – in this case it is highly possible block hashing was used. While our obfuscation efforts gave us mixed results, we were able to go through several obfuscation

References and bibliography

[1] Ilsun You, Kangbin Yim. (2010) ‘Malware Obfuscation Techniques: A Brief Survey‘, BWCCA 2010

[2] Philip Okane, Sakir Sezer, Kieran Mclaughlin. (2011) ‘Obfuscation: The Hidden Malware‘, IEEE Security & Privacy

[3] Jian Li, Jun Xu, Ming Xu, HengLi Zhao, Ning Zheng. (2009) ‘Malware Obfuscation Measuring via Evolutionary Similarity (2009)’,First International Conference on Future Information Networks

[4] Lysne O. (2018) ‘Static Detection of Malware’, The Huawei and Snowden Questions. Simula Springer Briefs on Computing, vol 4. Springer

[5] Kristian Iliev (2017) ‘Top 6 Advanced Obfuscation Techniques Hiding Malware on Your Device’,https://sensorstechforum.com/advanced-obfuscation-techniques-malware/ (accessed: 13/03/2019)

[6] Dr. Amit Kumar Bindal, Navroop Kaur. (2016) ‘A complete dynamic malware analysis’, International Journal of Computer Applications

[7] Mario Luca Bernardi, Marta Cimitile, Francesco Mercaldo, Damiano Distante. (2016) ‘A constraint-driven approach for dynamic malware detection’, 14th Annual Conference on Privacy, Security and Trust

Figures 1, 2 & 3: Camouflage In Malware: From Encryption To Metamorphism (2012)         ; Babak Bashari Rad, Maslin Masrom, Suhaimi Ibrahim



Malware Analysis Lesson 4; Dynamic Analysis

Last week we looked at static analysis, investigating malware without running it. We looked at some problems with this, the obfuscated source code, calling libraries dynamically and a multitude of methods malware authors can use to frustrate analysts. When this is successful and we have all the information we can get from static analysis we can go deeper in trying to identify what the malware is doing. The way we do this is with Dynamic Analysis, where we run the malware in a sandbox and monitor what changes it makes to the system, what network calls are made, what it looks like in the RAM etc. By monitoring the malware while it is running we are able to identify its functionality and behavior. For example the function names we found in out static analysis strings dump may not all be called when its run, dynamic analysis lets us confirm what is going on.

By launching the malware in a controlled and monitored sandbox we can observe and document its effects on a system. This is especially useful when its an encrypted or packed file. Once the contents are loaded into memory we can read the contents and observe what it does.

What do we need for our sandbox?

First thing, read Malware Unicorns Reverse Engineering 101 course. Its amazing and gives you the ISO’s to run to setup your sandbox! Link is here. We have to run the file in this sandbox as if the file turns out to be malicious we can’t risk the malware infecting all the machines on our network.

A good malware analysis environment achieves 3 things:

  • It allows us to monitor as much activity from the executed program as possible.
  • It performs this monitoring in such a way that the malware does not detect it – which is important as some malware will not execute some functions if it detects it is being run in a sandbox.
  • Ideally it should be scalable so we can run many samples repeatedly in an isolated and automated way.

What kind of information are we looking for?

We should aim to capture everything the malware does so we can build up a picture of what is going on. We need;

  • All traces of function calls made,
  • What files are created, deleted, downloaded or modified
  • Memory dumps so we get all that juicy information stored in ram (volatility)
  • Network activity and traffic
  • Registry activity for windows.

What are some ways we can create this environment.

Air gapped

Air gapped networks are very simple but difficult to maintain. The first is to have physical machines that are isolated from your network (possibly completed disconnected from any network). Military, power plants, avionics and malware analysts all make heavy use of air gapped networks. This physical isolation is a literal gap of air between the sandbox and the Internet. By using this type of setup we rely on malware(and our sandbox) never needing access to the internet, which is often not the case. It also present difficulties with moving files on and off the machine safely – which makes a breach possible.

There have been a few attacks on air gapped networks. The one I’m going to chat about always gets me excited – its like the plot of a bond movie! STUXNET! This was a piece of military grade malware designed by the US and Israeli to infect and disrupt the Iranian Nuclear program. The Iranian site was air gapped and from what i can tell very well protected. The way the malware was introduced was by dropping infected USB keys where the scientists working on the site were likely to see them. Eventually the malware was able to worm its way into the centrifuges and cause alot of disruptions. Wired has an awesome article on it i recommend; https://www.wired.com/2014/11/countdown-to-zero-day-stuxnet/

The attack propagated and infiltrated the air gapped network by the removable media used to transfer files.

Virtualisation

Other than network access another issue with air gapped sandboxes is the need to install the operating system after every malware execution. Such a pain. Visualization is how we can have that isolation without the manual pain of reinstalling. By using Virtual Box or VMware we can snapshot what the sandbox is like before executing the malware, then rollback quickly and easily after to this known good state. Problem with visualization is that the malware can easily detect that its being run on a VM. Some malware once it knows its being run on a VM will not execute, or will not execute in the way it would on a regular device. This is to make it more difficult for us analysts to monitor what it does. Oh how malware authors hate us!

How malware identifies if its in a sandbox can include;

  • Guest additions(VMware tools and the like)
  • Device drives installed
  • Well known malware monitoring tools running in the background.
  • The MAC of the VM.
  • Traces in the registry for windows machines.
  • Process execution times.
  • Lack of host activity like downloaded apps, temporary files, browsing history etc.
  • The presence, or lack there of, of anti-virus applications.
  • The type of system its run on (CPU, Ram, HDD etc)

What are the problems with dynamic analysis?

There are a few problems, or challenges for the optimist, we need to be aware of with dynamic analysis. When we first run the file we don’t know what it will do. It may execute all its malicious code at once, or it may not. It might wait for a trigger event; like a particular time of day, a connection to a network, or for the user to do something. For example if we are analyzing a backdoor Trojan, it may just open a port and wait for the Remote Server to try to connect to it before doing anything else. We may also have to execute the program many times to find out all it does or how it affects the systems.

Dynamic Analysis Tools

File System and Registry monitoring: These tools, like process Monitor, allows us to see how processes read, write and delete registry entries and files.
Process monitoring: Process Explorer and Process Hacker help you observe processes started by an executable, including network ports opened!
Network Monitoring: Like wireshare allows us to observe network traffic for anything malicious the malware is trying to do. Including DNS resolution requests, bot traffic and downloads.
Change Detection: Regshot is a small program that lets us compare the systems state before and after the execution of a file – including registry changes. FCIV lets us compare file hashes of files before and after. Its also possible that puppet could be used for this.
DNS Spoofing: Inetsim is a tool that simulates a network so that malware interacting with a remote host continues to run, allowing is to observe its behavior further.

That said if anyone has some cool alternatives to these let me know! 🙂

Where possible always establish a baseline of what your machine should look like before the malware is executed so we can investigate changes.

Whats next?

The next step is to practice with the tools we have learned. This wont be something ill document as i find its always best to learn by doing.
The Zoo (https://github.com/ytisf/theZoo/tree/master/malwares/Binaries) is where i will get my malware. I have already setup my sandbox from our static analysis session. Next step will be running through all the monitoring tools one by one to explore how they work. Using snapshots build into Virtual Box i am going to assess the tools one at a team and after im finished i will rollback my snapshot and try an new tool.