One goal we had for the OFRAK Graphical User Interface (GUI) was to accelerate users’ exploration of firmware binaries – to save them time and help them gain actionable insights into the binary structure.
Embedded firmware and binary formats are diverse, so being able to get a quick bird’s eye view of the contours of a file helps an engineer situate themselves and accelerates analysis. This is what the “minimap view” (“minimap” for short) does.
On the far right of the GUI (see figure below), the minimap visually represents all the data of a selected binary. Each pixel in the minimap corresponds to one byte of the resource’s binary data, with the color of each pixel determined by the corresponding byte.
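To make the byte-to-pixel idea concrete, here is a minimal Python sketch. It is not OFRAK's rendering code: the helper name `bytes_to_rows`, the default row width of 64 pixels, and the choice to use the byte value directly as a grayscale intensity are all assumptions made for illustration.

```python
from typing import List


def bytes_to_rows(data: bytes, width: int = 64) -> List[List[int]]:
    """Lay out one pixel per byte, left to right and top to bottom.

    Each pixel is a grayscale intensity equal to the byte value (0-255).
    The final row is shorter when len(data) is not a multiple of width.
    (Hypothetical helper; the width and color mapping are assumptions.)
    """
    if width <= 0:
        raise ValueError("width must be positive")
    return [list(data[i:i + width]) for i in range(0, len(data), width)]


# Ten bytes laid out four pixels per row: two full rows and one short row.
rows = bytes_to_rows(bytes(range(10)), width=4)
print(rows)
```

A real renderer would map each value through a color palette instead of using it raw, but the layout step stays the same.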
The minimap makes it easy to explore and navigate to the most important parts of the binary. Unlike the hex view (directly to the left of the minimap), which only shows a portion of a resource’s binary data at a time, the minimap provides a bird’s eye view of the entire resource.
To take the minimap view for a test drive, we’ll download some random binaries from the Internet and see what they look like. First up: a TinyCore Linux ISO.
Additional information can be found in the OFRAK documentation for the minimap view.
The entropy view shows that the file has high entropy throughout, with the exception of a few small stripes and padding regions at the beginning and end. The byteclass and magnitude views confirm that there is no discernible pattern to the data.
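For readers who want to reproduce an entropy readout outside the GUI, the following Python sketch computes Shannon entropy over fixed-size windows. The 256-byte window and the function names are assumptions for this example; we are not claiming this is how OFRAK computes its entropy view.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of a chunk in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    counts = Counter(chunk)
    # Sum of p * log2(1/p) over every byte value that occurs in the chunk.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def windowed_entropy(data: bytes, window: int = 256) -> List[float]:
    """Entropy of each consecutive window (window size is an assumption)."""
    return [shannon_entropy(data[i:i + window]) for i in range(0, len(data), window)]


# A block of null padding followed by a block that uses every byte value once.
sample = bytes(256) + bytes(range(256))
print(windowed_entropy(sample))
```

The padding window scores 0 bits per byte and the second window scores the maximum of 8, which is the same contrast that separates padding stripes from compressed data in the ISO.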
The ISO unpacks into two large, compressed files. One contains the kernel, and the other contains the root filesystem. The ISO also unpacks into a few additional, smaller files. The root filesystem is GZIP compressed, and once again, we see high entropy throughout.
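High entropy alone does not say which format we are looking at, so it helps to pair the visual check with a look at well-known signatures. The sketch below checks a handful of magic values in Python. The table and the `guess_format` helper are hypothetical and deliberately tiny; OFRAK has its own identification machinery, which this does not attempt to reproduce.

```python
from typing import Optional

# (name, offset, magic bytes) for a few formats mentioned in this post.
# This table is illustrative only and far from complete.
MAGICS = [
    ("gzip", 0, b"\x1f\x8b"),
    ("elf", 0, b"\x7fELF"),
    ("cpio-newc", 0, b"070701"),
    # ISO 9660 stores "CD001" one byte into the volume descriptor at sector 16.
    ("iso9660", 0x8001, b"CD001"),
]


def guess_format(data: bytes) -> Optional[str]:
    """Return the first format whose magic matches, or None."""
    for name, offset, magic in MAGICS:
        if data[offset:offset + len(magic)] == magic:
            return name
    return None


print(guess_format(b"\x1f\x8b\x08\x00"))
print(guess_format(b"not a known format"))
```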
Unzipping the core.gz root filesystem yields a decompressed CPIO with much more visible structure. In particular, in the byteclass view, we can see significant regions of the CPIO contain mostly ASCII text (shown in yellow). In the bottom half of the file, though, it appears there is still a large amount of compressed data stored on the filesystem.
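The byteclass idea can be approximated with a simple classifier. In this Python sketch the four class names and their boundaries are our own assumptions for illustration; the only detail taken from the GUI is that printable ASCII gets its own class.

```python
from collections import Counter


def byte_class(value: int) -> str:
    """Assign a byte to a coarse class (class names are assumptions)."""
    if value == 0x00:
        return "null"
    # Printable ASCII plus tab, newline, and carriage return.
    if value in (0x09, 0x0A, 0x0D) or 0x20 <= value <= 0x7E:
        return "ascii"
    if value < 0x80:
        return "control"
    return "high"


def class_histogram(data: bytes) -> Counter:
    """Count how many bytes of each class appear in data."""
    return Counter(byte_class(b) for b in data)


# A short text fragment, some null padding, then two high bytes.
sample = b"README.txt\n" + bytes(4) + b"\x8b\xff"
print(class_histogram(sample))
```

Computing this histogram per window, and reporting the share of `ascii` bytes, gives a numeric counterpart to the text-heavy regions visible in the CPIO.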
Unpacking the CPIO makes all of the files available for us to explore. Let’s first check out /etc/services, a local database of so-called “well-known” ports for network services.
The views make it apparent that this is a text file with newlines. The nearly uniform entropy confirms that the text itself is fairly homogeneous, such that no part of the file contains a disproportionate number of the same characters.
Now, let’s look at a more interesting file: /sbin/mke2fs. This is a userspace binary for creating EXT 2/3/4 filesystems from the command line. We can tell at a glance that this is a program.
Going from the top, this resource has the characteristic header regions at the beginning: a few bytes of data separated by padding regions of a few repeating null bytes. Next up is a symbol section, where there are clear columns of sparse, 16-byte-aligned structs containing lots of padding. The symbols are followed by a data section containing strings, with the byteclass view confirming null-terminated ranges of yellow, printable ASCII characters. After the strings comes the Procedure Linkage Table (PLT), with aligned structs whose data forms 8- and 16-byte-aligned columns. Note how the appearance of PLT columns differs from those in the symbol section. In the middle of the resource is a relatively high-entropy section of scrambled-looking data. This is the main code region containing the functionality unique to this program. Finally, there are some more padding regions of null bytes, followed by more globally-allocated constant string data and aligned structures.
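The null-terminated runs of printable ASCII that stand out in the byteclass view can also be pulled out programmatically. The Python sketch below is a rough stand-in for that visual check, similar in spirit to the Unix `strings` tool. The helper name `find_c_strings` and the minimum length of 4 are assumptions for this example.

```python
import re
from typing import List, Tuple


def find_c_strings(data: bytes, min_len: int = 4) -> List[Tuple[int, bytes]]:
    """Return (offset, text) for each printable ASCII run ending in a null byte.

    Runs shorter than min_len are ignored (the threshold is an assumption).
    """
    pattern = re.compile(rb"([\x20-\x7e]{%d,})\x00" % min_len)
    return [(m.start(1), m.group(1)) for m in pattern.finditer(data)]


# Padding, a string, two non-printable bytes, another string, and a too-short run.
sample = b"\x00\x00ext4\x00\x01\x02mke2fs\x00ab\x00"
for offset, text in find_c_strings(sample):
    print(hex(offset), text)
```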
Even if we didn’t already know, the minimap alone makes it possible to guess that the previous program was built for an x86 or amd64 architecture. The telltale sign is that the main code region is fairly high-entropy, with minimal padding and no obvious alignment. The x86 ISA has instructions of highly variable length, so compared to code from ISAs that limit instructions to either 2 or 4 bytes long, x86 instructions look more scrambled in the minimap.
For example, consider the same /sbin/mke2fs binary from an ARM system. Ignoring that there is somewhat less padding, the ARM version of the binary looks to have roughly the same structure as its x86 counterpart. The most obvious difference is that the main code region is mostly composed of 4-byte-aligned instructions that make fairly neat columns in the magnitude view.
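The neat columns have a measurable counterpart: in fixed-width code, some byte positions within each instruction vary much less than others. In 32-bit ARM (A32) encodings, for instance, the top four bits of an instruction hold the condition field, so in little-endian code the fourth byte of each word tends to repeat. The Python sketch below measures entropy separately for each offset modulo a stride. The function name, the stride of 4, and the synthetic test data are assumptions for illustration; real binaries will give noisier numbers.

```python
import math
from collections import Counter
from typing import List


def column_entropies(data: bytes, stride: int = 4) -> List[float]:
    """Entropy in bits per byte of the bytes found at each offset modulo stride."""
    result = []
    for offset in range(stride):
        column = data[offset::stride]
        total = len(column)
        counts = Counter(column)
        entropy = sum((n / total) * math.log2(total / n) for n in counts.values())
        result.append(entropy if total else 0.0)
    return result


# Synthetic stand-in for fixed-width code (not real ARM instructions):
# three varying bytes followed by a constant fourth byte in every word.
words = b"".join(
    bytes([i & 0xFF, (i * 7) & 0xFF, (i * 13) & 0xFF, 0xE5]) for i in range(256)
)
print(column_entropies(words))
```

One column with far lower entropy than its neighbors suggests a fixed instruction width at that stride, while variable-length x86 code would be expected to show no such standout column.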
Besides some of the observations made here, there are lots more patterns to be noticed by dropping unknown binaries into the OFRAK GUI and inspecting them visually. Humans’ spatial perception systems are strong, and we’re excited to be able to take advantage of them for binary analysis and reverse engineering using OFRAK.
What binaries do you think would look interesting when visualized this way? Send them to us on Twitter (@redballoonsec)!
Can you guess which minimaps go with which binary formats?
RBS Software Engineer
© 2024 Red Balloon Security.
All Rights Reserved.