# Summary
Some malware uses InnoSetup and hides the installer password in obfuscated, compiled PascalScript, but this article is not about that malware.
I abandoned all sanity and wrote an emulator for this arcane language to recover such passwords.
This article is about the process of reverse engineering the runtime:
While IFPS has an open-source interpreter, I found it easier to recover the logic from existing code by assuming that it works.
If you found this because you are interested in Inno Setup and IFPS: The emulator and the archive parser are open source and [implemented in Binary Refinery](https://binref.github.io/lib/inno/index.html).
Feel free to open an issue on GitHub for bug reports, questions, or feature requests.
# Background
It all starts with malware.
Some malware uses [InnoSetup][], and Inno uses [PascalScript][] for scripting the installers. Some of the malicious logic is implemented in that compiled PascalScript code, which is why I have implemented a unit in [Binary Refinery][BR] for disassembling this code, called [ifps](https://binref.github.io/#refinery.ifps), and one for dumping embedded strings, called [ifpsstr](https://binref.github.io/#refinery.ifpsstr). The pioneering work on disassembling IFPS was done by [IFPSTools][], which my code is also based on. For context, PascalScript is also referred to as IFPS because
1. that is still the magic 4-byte sequence at the start of every compiled script, and
2. it was named **I**nner**F**use **P**ascal**S**cript at some point before RemObjects took over the project.
Now some malware also uses the PascalScript to protect payloads from static extraction:
The Inno Setup archive can be password-protected, which means that the archived files are encrypted using a derived key.
A static unpacking tool will, by default, not be able to recover the files without knowing the password.
The embedded PascalScript is then utilized to construct and enter the correct password at runtime so that payloads are unpacked and executed without user interaction.
In many cases, dumping the IFPS strings (via [ifpsstr](https://binref.github.io/#refinery.ifpsstr)) will help you find the password, but those are just the easy ones.
Around August 2024, a fellow researcher started digging deeper into this type of malware and used refinery in the process.
I did notice a sequence of issues ([#56][BR56], [#58][BR58], [#70][BR70], [#74][BR74]) being opened about IFPS by that person,
but at the time I was completely unaware of the [full scope of their project][CrackBlog].
In early 2025, I independently started adding InnoSetup unpacking support to refinery at the suggestion of another fellow researcher
(shout out to [@squiblydoo][squibbly] for the idea and to [Malcat][] for the code seed)
and they connected me to [@gdesmar][gdesmar], the author of the aforementioned blog article.
This is how I learned the full story, and the abyss of Inno malware that is out there.
This is when I decided that my next 8 weekends or so would be spent writing an emulator for InnoSetup/IFPS,
in order to recover hidden and obfuscated passwords from even the most obscure sample.
# The Samples
I was working with the following installer executables;
I list them here mainly for reference.
They are all available on [MalShare][], and each hash in the table below links to the corresponding sample:
| SHA-256 Hash | Nr | Password |
|-------------------------------------------------------------------:|:--:|----------|
| [15f9eca216f9eb92dd70f86b65e4f6b19081113c5675d04c6008d216d54ed7e0](https://malshare.com/sample.php?action=detail&hash=15f9eca216f9eb92dd70f86b65e4f6b19081113c5675d04c6008d216d54ed7e0) | 1 | 02b60c12469a674bf |
| [3785065d6ba8a07f248ed63deadbf04f5e35918224d414afe19f0de1bfcb0e84](https://malshare.com/sample.php?action=detail&hash=3785065d6ba8a07f248ed63deadbf04f5e35918224d414afe19f0de1bfcb0e84) | 2 | NFfB5qf2o1EJOkmBRrMvFcmj4QmKfwyNE5yoMOMRmFE4yfEEEMImIyyEYIQYFiO0 |
| [73f5eee95f0d5250f5d2f7a29702700537ebe6c08861d4ddfefc09d485f0f65e](https://malshare.com/sample.php?action=detail&hash=73f5eee95f0d5250f5d2f7a29702700537ebe6c08861d4ddfefc09d485f0f65e) | 3 | Zet0 |
| [aeac18c433de1a62b6b9106a9424028d4c2731d3f7b378088e7b305213432a42](https://malshare.com/sample.php?action=detail&hash=aeac18c433de1a62b6b9106a9424028d4c2731d3f7b378088e7b305213432a42) | 4 | Zet0 |
| [f09c25c1b868baf93b77a7cbb3d57a2848355e495bca470db6dab70adcf73273](https://malshare.com/sample.php?action=detail&hash=f09c25c1b868baf93b77a7cbb3d57a2848355e495bca470db6dab70adcf73273) | 5 | 5D97BF9AEC584AAF4B7C0AEDF46CF882A5B1645392F958545AB2A7FC8FF8963F8 |
| [f72106284904a0033fa877df21151bbb84b632163fab55789422916ed85b43a1](https://malshare.com/sample.php?action=detail&hash=f72106284904a0033fa877df21151bbb84b632163fab55789422916ed85b43a1) | 6 | kj2678лоkjfv89цкs75345в00р\5(*&Y&&^^^%##832984ол1мвырам~`ёЁ<>xhvрлджэ^(UJ<: |
| [fabd429204db75e2ff9fe7fae5dc981b8c392be42a936273c99dcc41eeb0730d](https://malshare.com/sample.php?action=detail&hash=fabd429204db75e2ff9fe7fae5dc981b8c392be42a936273c99dcc41eeb0730d) | 7 | 7@8#3%5819((4f-=/72\~c0d``a221fdefc6bd&208fe1808d49606<:c89f |
I did not originally know all the passwords; they have now been recovered using the [innopwd](https://binref.github.io/#refinery.innopwd) unit in refinery.
I also don't know whether these samples are malware or not, and I have not looked at the payloads at all.
These installers simply represent interesting techniques to compute the Inno Setup password at runtime.
# A Paradigm Shift
As I mentioned earlier, [the runtime for PascalScript is open source][PascalScriptGH].
However, [the runtime code][PSRuntime] is something I consider extremely hard to read.
Simply put, it is 13.2 kLOC of spaghetti code that somehow manages to manifest the nauseating smell of old tobacco.
I took a step back and reconsidered my approach.
Implementing an emulator from this runtime represents the paradigm:
> **Correct first; complete later.**

My code would literally follow the only specification in existence, so everything I implement would be correct.
However, getting it to be complete seemed arduous:
I would not be able to actually emulate any code and test my emulator until I had finished labouring through that runtime from start to finish.
The dual paradigm is also viable, though:
> **Complete first; correct later.**

I instead decided to implement each opcode of the IFPS VM by first guessing how it worked, sometimes just based on its name.
I would then start running the emulator against existing PascalScript code.
This would then reveal mistakes in my implementation and lead me closer to a correct one.
The feedback loop for this approach was much faster and I am convinced that it was the more efficient choice;
the downside, of course, is that the emulator as it is implemented today is likely *still incorrect*.
That said, it is correct enough to handle all samples I know, and that's better than having no emulator at all.
I think these two paradigms often apply when solving a problem that is subject to constraints:
The most natural approach might seem to work within the confines of the constraints and just solve the problem,
but sometimes you can instead consider all solutions to the problem,
regardless of whether they satisfy the constraints or not,
and then move within this space towards a solution that does.
The first time I encountered a duality like this was in a combinatorial optimization class when discussing the [Ford-Fulkerson][FordFulkerson] and [Push-Relabel][PushRelabel] algorithms for solving the $s$-$t$-Flow problem.
The former always maintains a correct flow and alters it until it is optimal,
but the latter can be seen as always maintaining a maximum but invalid flow, and altering it until it becomes valid.
Ok, so those were my deep thoughts about paradigm shifts in solving constrained problems.
You can stop here unless you deeply care about IFPS.
I will now start talking about IFPS.
You have been warned.
# The InnoSetup API
Emulating the PascalScript VM itself seemed like the hard part at first, but in retrospect it really wasn't.
What ended up eating most of my time was re-implementing large portions of the [InnoSetup API][PSAPI] in Python.
Here are just some examples:
- Nearly all samples use functions like `SetArrayLength`, `WStrSet`, `WStrGet`, etcetera. These are absolutely necessary to implement; otherwise emulation is almost pointless.
- Samples 3 and 4 use `GetDateTimeString` to compute the current second twice and subtract the two values. The result is converted to a string and appended to the password: It has to be the character `0`.
- Sample 1 uses `user32::GetSystemMetrics` with an argument of `44` (`SM_SECURE`), which [must return zero][GSM] to trigger a division by zero, which in turn is required to get to the password.
I shed a lot of tears on this part so I did not want to let it go unmentioned,
but in the end this is not really a reverse engineering topic,
and not something that made me think deep thoughts about problem solving strategies either.
You implement the functions according to their documentation, and that's essentially it.
When a new sample needs more API than you are emulating, you implement the missing stuff. Rinse, repeat.
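
To illustrate the "implement what is missing" loop, here is a minimal sketch of a dispatch table for external functions. It is not the code from Binary Refinery: the names `API`, `external`, and `call_external` are hypothetical, and filling new array slots with `0` is a simplification I chose for brevity.

```python
from typing import Callable, Dict

# Hypothetical registry that maps external symbol names to Python handlers.
API: Dict[str, Callable[..., object]] = {}

SM_SECURE = 44


def external(name: str):
    """Register the decorated function as the handler for an external symbol."""
    def decorator(function):
        # Pascal identifiers are case-insensitive, so normalise the key.
        API[name.casefold()] = function
        return function
    return decorator


@external('user32::GetSystemMetrics')
def get_system_metrics(index: int) -> int:
    # Sample 1 depends on SM_SECURE being reported as zero; other metrics
    # are deliberately left unimplemented in this sketch.
    if index == SM_SECURE:
        return 0
    raise NotImplementedError(f'GetSystemMetrics({index})')


@external('SetArrayLength')
def set_array_length(array: list, length: int) -> None:
    # Resize in place so the caller's array variable observes the change.
    # Filling new slots with 0 is a simplification; a real emulator would
    # use the default value of the element type.
    del array[length:]
    array.extend(0 for _ in range(length - len(array)))


def call_external(name: str, *args):
    """Dispatch a call and fail loudly when the API function is missing."""
    handler = API.get(name.casefold())
    if handler is None:
        raise NotImplementedError(f'missing API function: {name}')
    return handler(*args)
```

Failing loudly on an unknown name is the useful part: the error message tells you exactly which function the next sample needs.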
# Local Variables
The first thing that was important to figure out was how local variables work.
Looking at the disassembly of almost any piece of code quickly reveals how it works, but let's pick a really simple one:
```
function MAKELANGID(Argument1: U08, Argument2: U08): U16
begin
  0x000  0  PushType   U08
  0x005  1  Assign     LocalVar1 := Argument2
  0x010  1  Calculate  LocalVar1 <<= 10
  0x020  1  Calculate  LocalVar1 |= Argument1
  0x02C  1  Assign     ReturnValue := LocalVar1
  0x037  1  Pop
  0x038  0  Ret
end;
```
Variables are created on the stack via the `PushType` instruction, which creates a variable of the given type at the very top of the stack.
Variables can then be referenced later by their offset from the stack *base*, i.e. if another variable were created in the above example, it would be `LocalVar2`.
My emulator therefore implements a stack of objects, each of which represents a variable.
Each variable object knows its own type and handles assignments of other variables or immediate values to itself based on this type information.
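
As a rough sketch of that design, the following models a stack of typed variable objects. The class names `Variable` and `Frame` are hypothetical, only a few integer types are covered, and wrapping values to the declared bit width is an assumption of this sketch rather than verified runtime behaviour.

```python
from dataclasses import dataclass, field
from typing import List

# (bit width, signed) for a few integer types named as in the disassembly.
INTEGER_TYPES = {
    'U08': (8, False),
    'U16': (16, False),
    'U32': (32, False),
    'S32': (32, True),
}


@dataclass
class Variable:
    type: str
    value: int = 0

    def assign(self, value: int) -> None:
        # Assumption of this sketch: values wrap to the declared width.
        bits, signed = INTEGER_TYPES[self.type]
        value &= (1 << bits) - 1
        if signed and value >> (bits - 1):
            value -= 1 << bits
        self.value = value


@dataclass
class Frame:
    stack: List[Variable] = field(default_factory=list)

    def push_type(self, type: str) -> Variable:
        # PushType: create a fresh variable at the top of the stack.
        variable = Variable(type)
        self.stack.append(variable)
        return variable

    def pop(self) -> Variable:
        return self.stack.pop()

    def local(self, index: int) -> Variable:
        # LocalVar1 sits at the stack base, LocalVar2 right after it, etc.
        return self.stack[index - 1]


frame = Frame()
frame.push_type('U08')             # becomes LocalVar1
frame.push_type('S32')             # becomes LocalVar2
frame.local(1).assign(0x1FF)       # wraps to 0xFF in an 8-bit variable
frame.local(2).assign(0xFFFFFFFF)  # wraps to -1 in a signed 32-bit variable
```

Keeping the type logic inside the variable object means the opcode handlers never have to care what they are assigning to.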
# Function Calls
Implementing function calls was a big one for the VM.
The main questions are the following:
- How are arguments passed to a function when called?
- Where and how are return values passed back to the caller?
- Since all of this happens on the stack, who is responsible for cleaning up which parts of it after a call?
To answer these questions, I implemented stack tracking in my IFPS disassembler.
I assumed that, regardless of the execution path taken to reach a given instruction,
the stack depth at that offset always has to be the same.
In other words, the stack cannot grow or shrink uncontrollably across multiple loop iterations, for example.
As this assumption was valid across several analysed scripts,
I was confident that I had correctly mapped how much the stack pointer is modified by each opcode.
I also assumed that within a given function, the stack pointer does not decrease below its initial value,
and it made sense to track the stack depth within a function relative to that base pointer.
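
The consistency check can be sketched as a small worklist pass over the control flow. This is an illustration under my own simplifications, not the disassembler's code: instructions are reduced to an opcode plus an optional jump target, list indices stand in for offsets, and the `STACK_DELTA` table only covers the opcodes that appear in this article. The test input is the opening of the `DEBUGPROMPT` procedure from sample 2, which is listed further down.

```python
from typing import Dict, List, Optional, Tuple

# Assumed stack effect per opcode; anything not listed leaves the depth alone.
STACK_DELTA = {'PushType': +1, 'PushVar': +1, 'Pop': -1}

# (opcode, index of the jump target or None); indices stand in for offsets.
Instruction = Tuple[str, Optional[int]]


def stack_depths(code: List[Instruction]) -> Dict[int, int]:
    """Return the stack depth before each reachable instruction."""
    depths: Dict[int, int] = {}
    pending = [(0, 0)]
    while pending:
        index, depth = pending.pop()
        if index >= len(code):
            continue  # the listing is abridged; stop at its end
        if index in depths:
            if depths[index] != depth:
                raise ValueError(f'conflicting stack depth at instruction {index}')
            continue
        depths[index] = depth
        opcode, target = code[index]
        depth += STACK_DELTA.get(opcode, 0)
        if depth < 0:
            raise ValueError(f'stack drops below the base at instruction {index}')
        if opcode == 'Ret':
            continue
        if opcode in ('Jump', 'JumpFalse'):
            pending.append((target, depth))
        if opcode != 'Jump':
            pending.append((index + 1, depth))
    return depths


# The start of DEBUGPROMPT from sample 2, reduced to opcodes.
DEBUGPROMPT = [
    ('PushType', None), ('PushType', None), ('Assign', None),
    ('Calculate', None), ('Compare', None), ('Pop', None),
    ('JumpFalse', 16), ('PushType', None), ('PushType', None),
    ('Assign', None), ('PushVar', None), ('Call', None),
    ('Pop', None), ('Pop', None), ('Calculate', None), ('Pop', None),
    ('SetFlag', None), ('Pop', None),
]
depths = stack_depths(DEBUGPROMPT)
```

If a guessed stack effect is wrong, two paths reach the same instruction with different depths and the pass raises, which is exactly the feedback I was after.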
The next observation then was that functions don't always return with an empty stack;
some functions hit the `Ret` instruction with a stack pointer that is larger than their base pointer.
However, under my first assumption it was evident that the `Call` instruction cannot modify the stack pointer of the caller,
so I reached the following conclusion:
- When a function is called, the current top of the stack becomes its base pointer.
- When the function returns, the stack is reset to that base and all excess data on the stack is removed.
For example, consider the following functions from sample **2**.
The first number in each line of disassembly is the instruction offset, the second number (in decimal) is the computed stack depth:
```
function WITHINXDAYSOFBUILD(Argument1: Integer): Boolean
begin
  0x0000  0  PushType   Integer
  0x0005  1  PushType   Integer
  0x000A  2  PushType   Integer
  0x000F  3  PushType   Integer
  0x0014  4  PushVar    LocalVar1
  0x001A  5  Call       GETJULIANDAYNUMBERTODAY
  0x001F  5  Pop
  0x0020  4  PushVar    LocalVar2
  0x0026  5  Call       GETMINJULIANDAYNUMBERSYSTEM
  0x002B  5  Pop
  0x002C  4  PushVar    LocalVar3
  0x0032  5  Call       GETJULIANDAYNUMBERBUILD
  0x0037  5  Pop
  0x0038  4  Calculate  LocalVar3 -= 1
  0x0048  4  Assign     LocalVar4 := LocalVar3
  0x0053  4  Calculate  LocalVar4 += Argument1
  0x005F  4  Calculate  LocalVar4 += 1
  0x006F  4  Compare    ReturnValue := LocalVar1 >= LocalVar3
  0x0080  4  JumpFalse  JumpDestination01, ReturnValue
  0x008A  4  PushType   Boolean
  0x008F  5  Compare    LocalVar5 := LocalVar1 <= LocalVar4
  0x00A0  5  Calculate  ReturnValue &= LocalVar5
  0x00AC  5  Pop
JumpDestination01:
  0x00AD  4  JumpFalse  JumpDestination02, ReturnValue
  0x00B7  4  PushType   Boolean
  0x00BC  5  Compare    LocalVar5 := LocalVar2 <= LocalVar4
  0x00CD  5  Calculate  ReturnValue &= LocalVar5
  0x00D9  5  Pop
JumpDestination02:
  0x00DA  4  Ret
end;
```
The function `WITHINXDAYSOFBUILD` leaves 4 variables on the stack, but it is called later in `DEBUGPROMPT`:
```
procedure DEBUGPROMPT(Argument1: String)
begin
  0x0000  0  PushType   Boolean
  0x0005  1  PushType   U32
  0x000A  2  Assign     LocalVar2 := GlobalVar34
  0x0015  2  Calculate  LocalVar2 &= 16
  0x0025  2  Compare    LocalVar1 := LocalVar2 > 0
  0x003A  2  Pop
  0x003B  1  JumpFalse  JumpDestination01, LocalVar1
  0x0045  1  PushType   Boolean
  0x004A  2  PushType   Integer
  0x004F  3  Assign     LocalVar3 := 3
  0x005E  3  PushVar    LocalVar2
  0x0064  4  Call       WITHINXDAYSOFBUILD
  0x0069  4  Pop
  0x006A  3  Pop
  0x006B  2  Calculate  LocalVar1 &= LocalVar2
  0x0077  2  Pop
JumpDestination01:
  0x0078  1  SetFlag    !LocalVar1
  0x007F  1  Pop
  .....  ..  ...
```
We can see that the call to `WITHINXDAYSOFBUILD` must leave the stack depth in the caller unchanged,
since otherwise we would get conflicting stack depth information for the instruction at offset `0x0078`, marked `JumpDestination01`.
Now we turn to understanding how values are passed from and to function calls.
The parsing in [IFPSTools][] already helps understand this to some degree:
Variables inside functions can either reference an argument or a local variable.
If it references an argument,
the index `0` represents the return value while indices `1` and up represent the arguments in the order in which they appear in the declaration.
Again by studying disassembled code,
I could easily deduce that the stack layout for a function call is as follows,
drawing the stack as growing down:
```
[ Caller LocalVar1 ]  <-- Caller Stack Base
[ Caller LocalVar2 ]
        ...
[ Caller LocalVarK ]
[ Call ArgumentN   ]
        ...
[ Call Argument1   ]
[ Call ReturnVal   ]
```
In other words: Function arguments are referenced with negative indices relative to the function's stack base while local variables are referenced with positive indices.
For procedures (routines without a return value), there is no variable pushed to the stack to store the return.
A good illustration is the above code in `WITHINXDAYSOFBUILD`.
With line breaks and comments to make the code easier to read:
```
  0x0000  0  PushType   Integer  ; LocalVar1: will receive result from GETJULIANDAYNUMBERTODAY
  0x0005  1  PushType   Integer  ; LocalVar2: will receive result from GETMINJULIANDAYNUMBERSYSTEM
  0x000A  2  PushType   Integer  ; LocalVar3: will receive result from GETJULIANDAYNUMBERBUILD
  0x000F  3  PushType   Integer  ; LocalVar4: computation result

  0x0014  4  PushVar    LocalVar1
  0x001A  5  Call       GETJULIANDAYNUMBERTODAY
  0x001F  5  Pop

  0x0020  4  PushVar    LocalVar2
  0x0026  5  Call       GETMINJULIANDAYNUMBERSYSTEM
  0x002B  5  Pop

  0x002C  4  PushVar    LocalVar3
  0x0032  5  Call       GETJULIANDAYNUMBERBUILD
  0x0037  5  Pop

  0x0038  4  Calculate  LocalVar3 -= 1
  0x0048  4  Assign     LocalVar4 := LocalVar3
  0x0053  4  Calculate  LocalVar4 += Argument1
  0x005F  4  Calculate  LocalVar4 += 1
```
The function first creates 4 local variables.
These are also exactly the variables that are left on the stack when the function returns.
It then begins calling functions and assigning their return values to these variables:
First, a reference to a local variable is pushed onto the stack to store the return value.
Then, the function is called (these functions take no arguments but return a result), populating this variable with its result.
The caller then pops this reference from the stack and retains the return value in the destination variable.
This means that while the stack is cleared of local variables created by a called function,
the call arguments and return value are left on the stack for the caller to clean up manually.
This is now sufficient information to implement handling of `Call` instructions without destroying the stack!
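
The calling convention described above can be sketched in a few lines. This is a simplified model, not the emulator's code: `Cell` and `Machine` are hypothetical names, callees are plain Python functions instead of bytecode, and the handling of procedures (arguments shifting up by one slot because there is no return slot) is my reading of the layout rather than something I can point to in a specification.

```python
class Cell:
    """A mutable slot on the VM stack; pushing the same Cell twice models
    the reference that PushVar creates."""

    def __init__(self, value=0):
        self.value = value


class Machine:
    def __init__(self):
        self.stack = []
        self.base = 0
        self.has_result = False

    def local(self, index):
        # LocalVar1 is the first slot at or above the current base.
        return self.stack[self.base + index - 1]

    def argument(self, index):
        # Index 0 is the return value, which sits directly below the base;
        # procedures have no such slot, so their arguments move up by one.
        shift = index if self.has_result else index - 1
        return self.stack[self.base - 1 - shift]

    def call(self, body, has_result=True):
        saved = self.base, self.has_result
        self.base, self.has_result = len(self.stack), has_result
        try:
            body(self)
        finally:
            # Ret: drop everything the callee left above its base. Arguments
            # and the return slot stay for the caller to pop.
            del self.stack[self.base:]
            self.base, self.has_result = saved


def add_one(vm):
    vm.stack.append(Cell())                   # PushType: callee LocalVar1
    vm.local(1).value = vm.argument(1).value + 1
    vm.argument(0).value = vm.local(1).value  # assign to ReturnValue
    # The callee returns without popping its LocalVar1.


vm = Machine()
vm.stack.append(Cell())        # caller LocalVar1, receives the result
vm.stack.append(Cell(41))      # temporary that serves as Argument1
vm.stack.append(vm.local(1))   # PushVar LocalVar1 as the return slot
depth = len(vm.stack)
vm.call(add_one)
after = len(vm.stack)
vm.stack.pop()                 # caller pops the return slot
vm.stack.pop()                 # caller pops the argument
```

The caller ends up with the same stack depth right after the call as right before it, and the result lands in its own `LocalVar1` through the pushed reference.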
# Exception Handling
IFPS implements try/catch/finally handlers.
These were fairly tricky to figure out, and there is one aspect of them that I haven't fully grasped; it has not come up in practice yet, so until it does I'll just keep the guessed implementation that I have.
This is not purely academic either; sample **1** deliberately triggers a floating point division by zero exception:
```
function InitializeSetup(): Boolean
begin
  ...    ..  ...
  0x0DB  4  PushEH     End:0x338 CatchAt:0x2ED
  ...    ..  ...
  0x26B  4  PushType   U32
  0x270  5  Assign     LocalVar5 := LocalVar1
  0x27B  5  Calculate  LocalVar5 -= LocalVar2
  0x287  5  PushType   S32
  0x28C  6  PushType   S32
  0x291  7  Assign     LocalVar7 := LocalVar3
  0x29C  7  PushVar    LocalVar6
  0x2A2  8  Call       User32::GetSystemMetrics
  0x2A7  8  Pop
  0x2A8  7  Pop
  0x2A9  6  Calculate  LocalVar5 /= LocalVar6
  ...    ..  ...
  0x2EB  4  PopEH      EndTry
  0x2ED  4  PushType   S32
  0x2F2  5  PushVar    LocalVar5
  0x2F8  6  Call       kernel32::GetTickCount
  0x2FD  6  Pop
  0x2FE  5  Assign     LocalVar3 := LocalVar5
  0x309  5  Pop
  0x30A  4  Assign     GlobalVar1 := 1
  0x316  4  Assign     GlobalVar0 := '02b60c12469a674bf'
  0x336  4  PopEH      EndCatch
JumpDestination04:
  0x338  4  PushType   Boolean
  ...    ..  ...
  0x378  4  Ret
end;
```
At offset `0x0DB`, an exception handler is registered. The call to `GetSystemMetrics` at `0x2A2` returns `0`, which triggers a divide-by-zero later at `0x2A9`.
The exception handling code begins at offset `0x2ED` and is responsible for assigning the password `02b60c12469a674bf` to a global variable which is later used in the `InitializeWizard` callback to automatically populate the password prompt.
The `PushEH` opcode has 4 operands, named `Finally1`, `Catch`, `Finally2`, and `End`.
Each of these operands references an instruction offset in the current function.
Its counterpart is the `PopEH` opcode, which has one operand specifying the type of block (`Try`/`Catch`/`Finally1`/`Finally2`) that is ending at this point.
I do not understand the push/pop semantics of the "exception stack" at all;
in the emulator, I treat `PushEH` as the beginning of a try block, and each `PopEH` instruction as ending a specific type of block (i.e. ending a `Try`, `Catch`, or `Finally` block).
The emulator then directs the control flow in the same way as you would expect these constructs to work in any procedural language that has them.
From example code, it looks like `Finally2` is used when a `Catch` handler is specified, and `Finally1` is used when there is no `Catch`.
The details remain unclear to me, though, and the emulator simply executes each `Finally` handler that is present.
True to my plan, since the implementation works well in practice, I've had no reason to change the approach so far.
It does work well enough to extract the password from sample **1** ... ¯\\_(ツ)_/¯.
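
To make the try/catch part of this concrete, here is a toy interpreter that mirrors the shape of the sample **1** listing. Everything about it is a simplification under stated assumptions: `Handler`, `run`, and `program` are hypothetical names, the `Assign` and `Divide` opcodes are stand-ins for the real instructions, list indices replace offsets, `Finally` blocks and stack unwinding are left out entirely, and the `in_catch` flag is my own guess at how to avoid re-entering a catch block.

```python
from dataclasses import dataclass


@dataclass
class Handler:
    catch: int              # index of the first instruction of the catch block
    end: int                # index of the first instruction after the construct
    in_catch: bool = False  # set once control has entered the catch block


def run(code):
    """Interpret a toy instruction list; list indices stand in for offsets."""
    variables, handlers, pc = {}, [], 0
    while pc < len(code):
        opcode, *args = code[pc]
        try:
            if opcode == 'PushEH':
                handlers.append(Handler(*args))
            elif opcode == 'PopEH':
                handler = handlers.pop()
                if args[0] == 'EndTry':
                    # No exception was raised: skip the catch block.
                    pc = handler.end
                    continue
            elif opcode == 'Assign':
                variables[args[0]] = args[1]
            elif opcode == 'Divide':
                variables[args[0]] //= variables[args[1]]
            elif opcode == 'Ret':
                break
        except ZeroDivisionError:
            # A handler that is already running its catch block cannot
            # catch again; discard it and look further out.
            while handlers and handlers[-1].in_catch:
                handlers.pop()
            if not handlers:
                raise
            handlers[-1].in_catch = True
            pc = handlers[-1].catch
            continue
        pc += 1
    return variables


def program(metric):
    return [
        ('PushEH', 5, 8),                               # 0: CatchAt 5, End 8
        ('Assign', 'LocalVar5', 1000),                  # 1
        ('Assign', 'LocalVar6', metric),                # 2: GetSystemMetrics result
        ('Divide', 'LocalVar5', 'LocalVar6'),           # 3
        ('PopEH', 'EndTry'),                            # 4
        ('Assign', 'GlobalVar1', 1),                    # 5: catch block starts
        ('Assign', 'GlobalVar0', '02b60c12469a674bf'),  # 6
        ('PopEH', 'EndCatch'),                          # 7
        ('Ret',),                                       # 8
    ]


with_exception = run(program(0))
without_exception = run(program(1))
```

Only the run where the metric is zero reaches the catch block and assigns the password, which is why the emulated `GetSystemMetrics` has to get that return value right.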
# Variable Assignment Edge Cases
The last major issue I ran into was a set of edge cases for the `Assign` opcode, which is used to assign a value to a variable.
A value can be encoded in an operand in 3 different ways: It can be an immediate, a reference to a variable, or a reference to an element of a container variable.
In the last case, the index into the container can either be encoded as an integer immediate or as another variable reference.
Now there are two kinds of variables that require special attention:
Variables of type `Pointer` and containers (`Array`, `StaticArray`, or `Record`).
A pointer is a variable that references another variable, and container types contain variables that can be accessed via an index.
Note that containers can be nested, i.e. you can have a `Record` that contains a `Record` which contains a `StaticArray`.
Based on testing against the listed reference samples, the following algorithm works for implementing `Assign`:
- Obtain the value of the source operand. For a `Pointer`, its value is the value of the referenced variable. For a container, its value is the list of the values of all its members.
- If the target operand is a pointer, assign the new value to the referenced variable.
- If the target operand is a container, assert that the new value is a list and recursively assign the elements of that list to each member of the target container.
Again, I cannot be certain that this matches the reference implementation exactly:
It works well enough in practice, though, and I did have to revise the procedure several times before arriving at the above sequence of steps, which now works with all referenced samples.
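
The steps above can be sketched as a pair of recursive helpers. The classes `Variable`, `Pointer`, and `Container` are hypothetical stand-ins for the emulator's variable objects, a single `Container` class covers `Array`, `StaticArray`, and `Record` alike, and requiring source and target containers to have the same length is an assumption of this sketch.

```python
class Variable:
    def __init__(self, value=None):
        self.value = value


class Pointer:
    def __init__(self, target):
        self.target = target


class Container:
    """Stands in for Array, StaticArray, and Record alike."""

    def __init__(self, *members):
        self.members = list(members)


def read(source):
    # A pointer yields the value of what it references, a container yields
    # the list of its members' values, and both may nest.
    if isinstance(source, Pointer):
        return read(source.target)
    if isinstance(source, Container):
        return [read(member) for member in source.members]
    if isinstance(source, Variable):
        return source.value
    return source  # anything else is treated as an immediate


def write(target, value):
    if isinstance(target, Pointer):
        write(target.target, value)
    elif isinstance(target, Container):
        if not isinstance(value, list) or len(value) != len(target.members):
            raise TypeError('container assignment needs a list of equal length')
        for member, item in zip(target.members, value):
            write(member, item)
    else:
        target.value = value


def assign(target, source):
    write(target, read(source))


record = Container(Variable(0), Container(Variable(0), Variable(0)))
source = Container(Variable(7), Container(Variable(8), Variable(9)))
assign(Pointer(record), source)  # assignment through a pointer to a record
```

After the call, reading `record` yields `[7, [8, 9]]`, and the same `assign` also accepts a plain immediate such as `assign(record.members[0], 5)`.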
# Conclusion
IFPS is horrible.
And fun.
But more importantly:
When you have to solve a difficult problem,
it is sometimes worth taking a step back and wondering whether there is a duality in the problem space that you can exploit to find an approach that has better qualities (e.g. shorter feedback loops).
For myself, I am quite sure there are way more opportunities like this that I miss than those that I take.
[BR]: https://github.com/binref/refinery/
[InnoSetup]: https://jrsoftware.org/isinfo.php
[IFPSTools]: https://github.com/Wack0/IFPSTools.NET
[PascalScript]: https://www.remobjects.com/ps.aspx
[PascalScriptGH]: https://github.com/remobjects/pascalscript
[PSAPI]: https://jrsoftware.org/ishelp/index.php?topic=scriptfunctions
[PSRuntime]: https://github.com/remobjects/pascalscript/blob/master/Source/uPSRuntime.pas
[BR56]: https://github.com/binref/refinery/issues/56
[BR58]: https://github.com/binref/refinery/issues/58
[BR70]: https://github.com/binref/refinery/issues/70
[BR74]: https://github.com/binref/refinery/issues/74
[GSM]: https://learn.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-getsystemmetrics
[CrackBlog]: https://medium.com/@gdesmar/crack-any-password-protected-innosetup-installer-5daabb52dbfb
[squibbly]: https://github.com/Squiblydoo/
[gdesmar]: https://github.com/gdesmar
[Malcat]: https://malcat.fr/
[MalShare]: https://www.malshare.com/
[FordFulkerson]: https://en.wikipedia.org/wiki/Ford%E2%80%93Fulkerson_algorithm
[PushRelabel]: https://en.wikipedia.org/wiki/Push%E2%80%93relabel_maximum_flow_algorithm