Hexacorn

I love ROI-driven solutions and this post is about one of them. My personal cybersecurity consulting practice has exposed me to many different types of ‘IT security’ jobs over the last 13 years, and today I will describe one of them…

Nearly a decade ago one of my clients contacted me saying that they had received a USB key that belonged to their client, and that their client was interested in regaining access to the device’s content after forgetting the password.

Hmm interesting…

This was not your random USB key, but a removable device that was specifically designed to encrypt its data by default. As input, I got a forensic ‘image’ of the USB key, plus some basic info about its vendor, and that was it – so I quickly googled around, and immediately realized the company that produced it had been out of business for a while…

Before I could even begin I was shot down.

To access the content of the device one needed to run the vendor’s software (luckily present on the key in an unencrypted form) and provide the password; the actual content of the key would then be decrypted and mounted as a separate Windows device. I may not be remembering everything as it was, but the bottom line was that I got an image of an encrypted USB key and had to find a way to crack its password.

The software handling the decryption process was a mess. It was on the complexity level of today’s Rust, Go, Nim binaries – written in a language that was not very commonly used, very high-level, with lots of dependencies, and hard to analyze statically – and there were definitely no dedicated tools to support the analysis (I know I am vague, but it was a long time ago – it could have been Visual FoxPro or something like this, I really don’t remember!).

After a few hours of static analysis in IDA I threw in the towel and decided to take a different approach. I was hoping that the person who had been using the encrypted key had picked a simple password that is easy to remember.

So, I built a dictionary of popular English words, ran that weird decryption software, and finally wrote a very rudimentary AutoIt script that would fetch words from the dictionary text file one by one, save each word to a log file, push it to the UI control that handled the password input, and then send a key press simulating someone hitting ENTER…
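The original AutoIt script is not reproduced here. As a rough illustration of the same loop, here is a minimal Python sketch. The `try_password` callback is hypothetical: it stands in for the part that pushed the word into the password control and pressed ENTER. The `fake_unlock` function and the word list are made up for the demo.

```python
import io


def dictionary_attack(words, try_password, log):
    """Try each candidate in order; return the first accepted one, or None."""
    for word in words:
        candidate = word.strip()
        if not candidate:
            continue
        # Record the candidate before trying it, so the log shows how far
        # the run got if the target application crashes or is interrupted.
        log.write(candidate + "\n")
        log.flush()
        if try_password(candidate):
            return candidate
    return None


def fake_unlock(candidate):
    """Hypothetical stand-in for driving the real password dialog."""
    return candidate == "tiger"


log = io.StringIO()  # a real run would use an open text file instead
found = dictionary_attack(["apple\n", "", "tiger\n", "zebra\n"], fake_unlock, log)
print(found)
```

Writing the candidate to the log before submitting it is the detail that matters for an unattended overnight run: the last line of the log is always the word that was being tried.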

Luckily, the software didn’t have any anti-brute-force mechanisms built in, so I just let it run overnight. To my surprise, the next morning I discovered the password was cracked!

It was a simple 5- or 6-character-long English word, if I remember correctly, and once I found it I was immediately ecstatic! I quickly relayed the message to my client, they did so to theirs, and we all ended up being happier and richer that day…

Is there a lesson there for us?

YES!

Sometimes stupid solutions work. You don’t need to understand everything. It’s good to be driven by ROI principles. The art of ‘hacking’ is elusive.

As many of you know, I am a big fan of the Frida framework and I love its intuitiveness and flexibility, especially when it comes to auto-generating handlers for hooked functions, even if they are randomly chosen.

In my older Frida Delphi project I focused on functions that I could define. Today, I will focus on functions that are unknown.

How?

We are going to write an IDAPython script that will generate simple logging/tracing function stubs for all the subroutines that IDA ‘sees’ inside the executable.

When you load any executable into IDA it parses the analyzed program’s segments, recognizes the code, and… in it – many functions. We don’t really know or care what they do, other than being aware that they exist. FLIRT signatures help in recognizing some of them, but that is non-trivial as well.

So, the value-proposition here is that we will try to use Frida to run the program and log calls to every subroutine ‘discovered’ or ‘recognized’ by IDA, and print out the strings that subroutine arguments may point to when the function is executed — for this exercise we will try to log ANSI and WIDE strings potentially passed to these functions, and strings delivered in their output.

Why?

This may help us to quickly understand the inner workings of the program, in some lucky cases extract IOCs, and overall help in reverse engineering efforts, especially for samples that are written in modern languages like Rust, Go, Nim.

The idea sounds great, but there is a problem. One that I don’t know how to solve, but by publishing my partial research, I hope someone more knowledgeable will help me to address it… The problem is that any error in your onEnter or onLeave Frida handler function forces the script to bail out.

It’s a pity.

My ‘original’ code for this exercise looked like this:

import os
import shutil
import idautils
import idaapi
import idc
import re

idf = idc.get_idb_path()

print ("Original IDA File: %s" % idf)

m = re.search(r"\.idb$", idf)  # re.match would only test the start of the path

arch = 0
if m:
   arch = 32
   print ("- 32-bit")
else:
   arch = 64
   print ("- 64-bit")

if arch == 32:
   idf = idf.replace('.idb','.frida')
else:
   idf = idf.replace('.i64','.frida')

print ("Output idf: %s" % idf)

filename=re.sub(r"\.frida", "", re.sub(r"^.+[\\/]", "", idf))
handlers=re.sub(r"[^\\/]+$", "", idf) + "__handlers__" + "/" + filename + "/"

if os.path.isdir(handlers):
   print ("Deleting old handlers directory: %s" % handlers)
   shutil.rmtree(handlers)

# makedirs also creates the parent "__handlers__" directory if it is missing
os.makedirs(handlers)

print ("Saving frida input file to '%s'" % idf)
print ("Saving '%s' handlers to '%s'" % (filename, handlers) )
g = open(idf, 'w')
base = idaapi.get_imagebase()
for f in idautils.Functions():
    dism_addr = list(idautils.FuncItems(f))
    ofs = "%X"%(dism_addr[0]-base)
    g.write ("-a %s!0x%s\n" % (filename, ofs))
    h = open(handlers + "/" + "sub_"+ofs+".js", 'w')
    h.write("""
{

  onEnter(log, args, state) {
    out = 'onenter: """+ofs+"""\\n'

    log(out)

    this.args = []
    for (var i = 0; i < 4; i++)
    {
       // Keep the raw pointer so onLeave can re-read it after the call
       this.args [i] = args [i]
       if (!args[i].isNull())
       {
          console.log(args[i].readUtf8String());
          console.log(args[i].readUtf16String());
          var a = args[i].readUtf8String(256)
          if (a !== null && a.length > 0)
          {
             out = out + ' [' + i + ']a ' + JSON.stringify(a) + '\\n'
          }
          var w = args[i].readUtf16String(256)
          if (w !== null && w.length > 0)
          {
             out = out + ' [' + i + ']w ' + JSON.stringify(w) + '\\n'
          }
       }
    }

    if (typeof state ['log_file'] === 'undefined' || state ['log_file'] === null)
    {
        state ['log_file']=new File('logfile.bin', 'wb');
    }

    if (! (typeof state ['log_file'] === 'undefined' || state ['log_file'] === null) )
    {
        state ['log_file'].write(out);
        state ['log_file'].flush();
    }

  },

  onLeave(log, retval, state) {
    out = 'onleave: """+ofs+"""\\n'

    log(out)

    for (var i = 0; i < 4; i++)
    {
       if (!this.args[i].isNull())
       {
          console.log(this.args[i].readUtf8String());
          console.log(this.args[i].readUtf16String());
          var a = this.args[i].readUtf8String(256)
          if (a !== null && a.length > 0)
          {
             out = out + ' [' + i + ']a ' + JSON.stringify(a) + '\\n'
          }
          var w = this.args[i].readUtf16String(256)
          if (w !== null && w.length > 0)
          {
             out = out + ' [' + i + ']w ' + JSON.stringify(w) + '\\n'
          }
       }
    }

    if (typeof state ['log_file'] === 'undefined' || state ['log_file'] === null)
    {
        state ['log_file']=new File('logfile.bin', 'wb');
    }

    if (! (typeof state ['log_file'] === 'undefined' || state ['log_file'] === null) )
    {
        state ['log_file'].write(out);
        state ['log_file'].flush();
    }
  }
}
    """)
    h.close()


g.close()

When executed in IDA on Windows, the code generates:

a .frida file with a list of RVA addresses for frida-trace to intercept
a set of generic handlers, one per subroutine, that simply try to log the first 4 arguments passed to these functions – both at the entry point and at the function return.
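To make the first of these outputs concrete, here is a small sketch of how the lines are formed, runnable outside IDA. The module name, image base and function addresses are invented for illustration; in the script above they come from the IDB path, `idaapi.get_imagebase()` and `idautils.Functions()`.

```python
def frida_trace_lines(module_name, image_base, function_starts):
    """Turn absolute function addresses into '-a module!0xRVA' option lines."""
    lines = []
    for ea in function_starts:
        rva = ea - image_base  # offset relative to the module's load address
        lines.append("-a %s!0x%X" % (module_name, rva))
    return lines


def handler_filename(image_base, ea):
    """Name of the per-function handler file, following the script's sub_<RVA>.js scheme."""
    return "sub_%X.js" % (ea - image_base)


# Hypothetical values, for illustration only
base = 0x400000
starts = [0x401000, 0x401A2C]
for line in frida_trace_lines("sample.exe", base, starts):
    print(line)
print(handler_filename(base, starts[1]))
```

Because the offsets are relative to the image base, the same file should remain usable when the module is loaded at a different address.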

Unfortunately, Frida is very sensitive and any error during processing of these handlers forces a bail out :(.

So, after toying around with different variations of this, and similar code, I came up with this dumb script:

import os
import shutil
import idautils
import idaapi
import idc
import re

idf = idc.get_idb_path()

print ("Original IDA File: %s" % idf)

m = re.search(r"\.idb$", idf)  # re.match would only test the start of the path

arch = 0
if m:
   arch = 32
   print ("- 32-bit")
else:
   arch = 64
   print ("- 64-bit")

if arch == 32:
   idf = idf.replace('.idb','.frida')
else:
   idf = idf.replace('.i64','.frida')

print ("Output idf: %s" % idf)

filename=re.sub(r"\.frida", "", re.sub(r"^.+[\\/]", "", idf))
handlers=re.sub(r"[^\\/]+$", "", idf) + "__handlers__" + "/" + filename + "/"

if os.path.isdir(handlers):
   print ("Deleting old handlers directory: %s" % handlers)
   shutil.rmtree(handlers)

# makedirs also creates the parent "__handlers__" directory if it is missing
os.makedirs(handlers)

print ("Saving frida input file to '%s'" % idf)
print ("Saving '%s' handlers to '%s'" % (filename, handlers) )
g = open(idf, 'w')
base = idaapi.get_imagebase()
for f in idautils.Functions():
    dism_addr = list(idautils.FuncItems(f))
    ofs = "%X"%(dism_addr[0]-base)
    g.write ("-a %s!0x%s\n" % (filename, ofs))
    h = open(handlers + "/" + "sub_"+ofs+".js", 'w')
    h.write("""
{

  onEnter(log, args, state) {
    out = 'onenter: """+ofs+"""\\n'
    log(out)

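    this.args = []
    for (var i = 0; i < 4; i++)
    {
       // Keep the raw pointer first, so onLeave can still see it if a read below fails
       this.args [i] = args [i]
       console.log(' - '+ args[i] + 'a->' + args[i].readUtf8String()+'\\n');
       console.log(' - '+ args[i] + 'w->' + args[i].readUtf16String()+'\\n');
    }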
  },

  onLeave(log, retval, state) {
    out = 'onleave: """+ofs+"""\\n'
    log(out)
    for (i = 0; i < 4; i++)
    {
       console.log(' - '+ this.args[i] + 'a->' + this.args[i].readUtf8String()+'\\n');
       console.log(' - '+ this.args[i] + 'w->' + this.args[i].readUtf16String()+'\\n');
    }

  }
}
    """)
    h.close()


g.close()

It at least populates the console output with anything that may be of interest, and we can grep or rg it to our liking…
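One direction that may help with the bail-out problem, offered as an unverified sketch rather than a confirmed fix: wrap every string read in its own try/catch, so that a bad pointer only skips that one argument. The generator below is plain Python and runs outside IDA. The function names `read_block` and `make_handler` are made up for this example, and the generated JavaScript has not been tested against real samples.

```python
def read_block(pointer_expr):
    """JavaScript that tries both string reads on one pointer, swallowing read errors."""
    return """
      for (const kind of ['a', 'w']) {
        try {
          const p = %s;
          const s = kind === 'a' ? p.readUtf8String(256) : p.readUtf16String(256);
          if (s !== null && s.length > 0) {
            log(' [' + i + ']' + kind + ' ' + JSON.stringify(s));
          }
        } catch (e) {
          // Not a readable pointer or not a valid string: skip this read.
        }
      }""" % pointer_expr


def make_handler(ofs, nargs=4):
    """Build the text of one frida-trace handler file for the function at RVA `ofs`."""
    return """{
  onEnter(log, args, state) {
    log('onenter: %(ofs)s');
    this.args = [];
    for (let i = 0; i < %(nargs)d; i++) {
      this.args[i] = args[i];%(enter)s
    }
  },

  onLeave(log, retval, state) {
    log('onleave: %(ofs)s');
    for (let i = 0; i < %(nargs)d; i++) {%(leave)s
    }
  }
}
""" % {
        "ofs": ofs,
        "nargs": nargs,
        "enter": read_block("args[i]"),
        "leave": read_block("this.args[i]"),
    }


handler = make_handler("1A2B")  # hypothetical RVA
print(handler)
```

In the IDAPython scripts above, the result of `make_handler(ofs)` would replace the inline template passed to `h.write(...)`.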

The art of cutting corners

Subfrida v0.1