This blag post describes my though-process during identification of the string deobfuscation method in a sample belonging to the Zloader malware family. Specifically, I wanted to identify the function or functions responsible for string deobfuscation only using static analysis and Ghidra, understand the algorithm, emulate it in Java and implement a Ghidra script to deobfuscate all strings in a binary of this family. The target audience of this post are people that have some experience with static reverse engineering and Ghidra but who always asked themselves how the f those reversing wizards identify specific functionality within a binary without wasting hours, days and weeks. Show me what you got!
This post will explain, how to identify a function responsible for string deobfuscation in a native-PE malware sample. We will use a KpotStealer sample as a concrete example. KpotStealer (aka Khalesi or just Kpot) is a commodity malware family probably circulated in the shadowy parts of the internet since 2018. It got its name from a string publicly present on the Admin-Panel. After we found the function we will understand the data structure it uses and emulate the decryption of a string with CyberChef and Binary Refinery. An interesting detail here is that Ghidra currently does not guess the function signature correctly. Finally, we will develop a Java script (hehe) for Ghidra to automatically deobfuscate all strings given the corresponding obfuscation function. Show me what you got!
This post describes the memory layout as well as the method used by the Sodinokibi (or REvil) ransomware to protect its strings. It will then list a few Java snippets to interact with the Ghidra scripting API and finally explain a working script to deobfuscate all strings within a REvil sample. If you don't care about the explaination, you can find the most recent version of the script you can simply import into Ghidra on github. I want it all.
Since the temperature of scripting in Ghidra is so high at the current point in time, I want to tell you that scripting it in Java is so much better than scripting it in Python. After that I'll randomly motivate why one wants to get the "original bytes" from a sample and how to do it. Show me what you got!
Found another rabbit hole! While reading my daily digest of tubes [an article about the Mozi bot](https://blog.netlab.360.com/mozi-another-botnet-using-dht/) sparked my interest. Peer-to-peer (P2P) botnets are always cool and this one has some worm-like capabilities and seems to hide its traffic within bittorrent communications. Naturally I wanted to take a look at the sample. But you will not believe, what happened next!
While understanding existing code during software development or reverse engineering, it is quite useful to be able to quickly see other instances of the same variable or function in the current code view. To enable this feature in Ghidra, I suggest you perform the following two configuration changes: Under "Edit" → "Tool Options..." 1. select "Listing Fields" → "Cursor Text Highlight" in the tree view on the left and change "Mouse Button To Activate" to "LEFT" 2. select "Key Bindings" in the tree view on the left and assign a key you can easily press to "Highlight Defined Use" ("SPACE" for example) Happy understanding! Update (2019-11-22): Actually "Highlight Defined Use" refered to in item 2. of the above list is not the same as the highlighted parts from item 1 :sadkeanu:.
This post is written for aspiring reverse engineers and will talk about a technique called _API hashing_. The technique is used by malware authors to hinder reverse engineering. We will first discuss the reasons a malware author may even consider using API hashing. Then we will cover the necessary technical details around resolving dynamic imports at load-time and at runtime and finally, will described API hashing and show a Python script that emulates the hashing method used in Sodinokibi/REvil ransomware. Read on
Earlier this year, I was thrilled to hear that my submission for a talk at this year's [FrOSCon](https://www.froscon.de/) (Free and Open Source Software Conference) was accepted. [The talk](https://programm.froscon.de/2019/events/2350.html) is about Ghidra, the reverse engineering tool which was recently release into open source by the NSA. Since I expected a very heterogeneous audience with people from all kinds of industries with all kinds of backgrounds, I decided to give a long introduction with a lot of motivation for reverse engineering and only use the last quarter or so of the talk to actually show Ghidra's capabilities. You can find the [slides here](https://blag.nullteilerfrei.de/wp-content/uploads/2019/08/FrOSConTalk2019-Ghidra.pdf), the source [of the slides on github](https://github.com/larsborn/FrOSCon2019-Ghidra-Talk) and a recording at [media.ccc](https://media.ccc.de/v/froscon2019-2350-ghidra_-_an_open_source_reverse_engineering_tool). Based on feedback after and during the talk, I added a bullet point under Motivation: a lot of people at FrOSCon seemed to be in the position where a wild binary blob appeared and they had to deal with it. Some because they found an old service running with source code not available (or readable) anymore and some because they want to re-implement a protocol that is not documented.
Say you want to write a C program, but you want to avoid including plain strings within the binary. This is something often done by malware authors, for example, to avoid easy extraction of so called indicators of compromise. I can also imagine a legitimate business that uses string obfuscation to make reverse engineering of their software harder to protect their intellectual property. This is often called string obfuscation. I want to use this knowledge to make the world a better place!
a.k.a. "My Bank does not support CSVs". When I asked my bank for "machine readable" versions of my bank statements, they where like: Their website has a CSV-export function. But only data from the last three months can be exported. Of course, it would have been smart to have performed this export every two months or so, but let's talk about something else. Sure!
I started to play around with ArangoDB and used Python to get some data into my first database. Long story short: if you want to set your own key for the documents, do it on the document, not on the initialization data. EDIT: this is only true for the most recent version 1.3.1 release on pypi by the time of writing1. Read the longer story!