Presentation

Hopper is a tool that will assist you in your static analysis of executable files.

This quick presentation gives you a good overview of what Hopper is and how it works.

Hopper is a feature-rich application, and not everything can be covered here. Don’t worry: you’ll quickly find your bearings and discover all its subtleties.

The interface is split into three main areas:

  • The left pane contains a list of all the symbols defined in the file, and the list of strings. The list can be filtered using tags and text.
  • The right pane is called the inspector. It contains various contextual information on the area being explored.
  • The center part is where the assembly language is displayed.

The Concept

The idea behind Hopper is to transform a set of bytes (the binary you want to analyze) into something that could be read by a human.

To do so, Hopper tries to associate a type with each byte of the file. Because doing this manually would be far too time-consuming, Hopper runs an automatic analysis as soon as you load a file.

The various types that can be used in Hopper are:

  • Data: an area is set to the data type when Hopper thinks it represents a constant, such as an array of int.
  • ASCII: a NULL-terminated C string.
  • Code: an instruction.
  • Procedure: a byte receives this type once it has been determined that it is part of a method that Hopper has successfully reconstructed.
  • Undefined: an area that has not yet been explored by Hopper.
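
The idea of one type per byte can be pictured with a small Python sketch. It only illustrates the concept and is not Hopper's implementation; the names ByteType, TypeMap and mark are hypothetical.

```python
from enum import Enum


class ByteType(Enum):
    UNDEFINED = "undefined"
    DATA = "data"
    ASCII = "ascii"
    CODE = "code"
    PROCEDURE = "procedure"


class TypeMap:
    """One type per byte of the file; everything starts as UNDEFINED."""

    def __init__(self, size):
        self.types = [ByteType.UNDEFINED] * size

    def mark(self, start, length, byte_type):
        """Assign byte_type to `length` bytes beginning at offset `start`."""
        if start < 0 or start + length > len(self.types):
            raise ValueError("range falls outside the file")
        for offset in range(start, start + length):
            self.types[offset] = byte_type


# Hypothetical 16-byte file: one 4-byte instruction, then a short C string.
type_map = TypeMap(16)
type_map.mark(0, 4, ByteType.CODE)
type_map.mark(8, 6, ByteType.ASCII)   # e.g. b"hello\0"
```

Bytes that no call to mark has touched stay undefined, which mirrors the state of a file before any analysis has run.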

Once an executable is loaded, you can change the type manually, using either the keyboard or the toolbar at the top of the window.

The toolbar contains one button for each type you can set (D for data, A for ASCII, etc.). These letters are also the keyboard shortcuts you can use directly.

The data type behaves in a slightly special way: the first time you apply it, Hopper turns the area into a byte. If you apply it again, the byte becomes a 16-bit integer, then a 32-bit integer, and so on…
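
To see what growing the width does to the displayed value, here is a minimal Python sketch that reads the same bytes as wider and wider integers. The 64-bit step and the little-endian byte order are assumptions made for the example, and read_as_integer is a hypothetical helper, not part of Hopper.

```python
# Widths in bytes: 8-, 16-, 32- and (assumed) 64-bit integers.
DATA_WIDTHS = [1, 2, 4, 8]


def read_as_integer(data, offset, presses):
    """Interpret the bytes at `offset` as they might appear after applying
    the data type `presses` times (1 = byte, 2 = 16 bits, ...)."""
    if not 1 <= presses <= len(DATA_WIDTHS):
        raise ValueError("this sketch only models widths up to 64 bits")
    width = DATA_WIDTHS[presses - 1]
    chunk = data[offset:offset + width]
    if len(chunk) < width:
        raise ValueError("not enough bytes left in the file")
    # Little-endian is an assumption; the real byte order depends on the CPU.
    return int.from_bytes(chunk, "little")


raw = bytes([0x78, 0x56, 0x34, 0x12, 0x00, 0x00, 0x00, 0x00])
as_byte = read_as_integer(raw, 0, 1)     # 0x78
as_word = read_as_integer(raw, 0, 2)     # 0x5678
as_dword = read_as_integer(raw, 0, 3)    # 0x12345678
```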

Feel free to apply transformations to explore the executable; Hopper provides an undo / redo feature, so you won’t lose previously applied transformations.

Navigating Through the File

Segments and Sections

An executable file is split into smaller pieces of data, called segments and sections.

When the operating system loads an executable, some of its bytes are mapped into memory. Each contiguous piece of the file mapped into memory is called a segment. Segments are split into smaller parts, called sections, each of which receives its own access properties.
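
The relationship between segments and sections can be sketched as a small data structure. This is only an illustration: the class names, the locate helper and the sample layout (loosely styled after a Mach-O file) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    name: str
    start: int
    length: int

    def contains(self, address):
        return self.start <= address < self.start + self.length


@dataclass
class Segment:
    name: str
    start: int
    length: int
    sections: list = field(default_factory=list)

    def contains(self, address):
        return self.start <= address < self.start + self.length


def locate(segments, address):
    """Return (segment name, section name or None), or None if unmapped."""
    for segment in segments:
        if segment.contains(address):
            for section in segment.sections:
                if section.contains(address):
                    return segment.name, section.name
            return segment.name, None
    return None


# Hypothetical layout for illustration only.
segments = [
    Segment("__TEXT", 0x1000, 0x2000, [
        Section("__text", 0x1000, 0x1800),
        Section("__cstring", 0x2800, 0x400),
    ]),
    Segment("__DATA", 0x3000, 0x1000, [Section("__data", 0x3000, 0x800)]),
]
```

An address can fall inside a segment without belonging to any of its sections, which is why locate may return a section name of None.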

You can navigate through these objects by using the Navigate > Show Segment List and Navigate > Show Section List menu items.

Symbols, Tags and Strings

Because it would be too difficult to remember the address of each piece of code in the executable, you can assign names, or symbols, to addresses.

To name an address, put the cursor on it and press N. A dialog pops up: simply type the name you want to set.

The symbol list is accessible in the left pane of the window.

Using the search field, you can filter the symbols listed below it. Hopper uses a kind of regular expression to filter the list: first, it presents the items that contain the term exactly as you typed it. Right below come the symbols that match with one insertion of extra text, then those with two insertions, and so on. This is what I call fuzzy search, and it can be disabled in the application preferences.
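
One possible reading of this ranking can be sketched in Python: count how many times extra text has to be inserted between the characters of the search term for it to appear in a symbol name, then sort by that count. This is an interpretation of the description above, not Hopper's actual algorithm; insertion_count and fuzzy_filter are hypothetical names, and the case-insensitive matching is an assumption.

```python
INF = float("inf")


def insertion_count(term, name):
    """Fewest gaps needed to find the characters of `term` in `name`, in
    order. 0 means `name` contains `term` as is; None means no match."""
    term, name = term.lower(), name.lower()
    if not term:
        return 0
    # best[j]: fewest gaps when the latest matched character sits at name[j].
    best = [0 if ch == term[0] else INF for ch in name]
    for wanted in term[1:]:
        following = [INF] * len(name)
        for j, ch in enumerate(name):
            if ch != wanted:
                continue
            adjacent = best[j - 1] if j > 0 else INF          # no new gap
            earlier = min(best[:j - 1], default=INF) + 1 if j > 1 else INF
            following[j] = min(adjacent, earlier)
        best = following
    result = min(best, default=INF)
    return None if result == INF else result


def fuzzy_filter(term, names):
    """Keep matching names, exact containment first, then 1 gap, 2 gaps..."""
    scored = [(insertion_count(term, name), name) for name in names]
    matches = [(score, name) for score, name in scored if score is not None]
    return [name for score, name in sorted(matches, key=lambda pair: pair[0])]


ranked = fuzzy_filter("prt", ["parse_text", "_start", "print"])
```

Here "print" needs a single insertion ("in" between "pr" and "t") and is listed before "parse_text", which needs two; "_start" has no "p" and is dropped.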

You can use tags to filter the symbol list even more efficiently. Tags are textual labels that can be put on an address, a basic-block of a procedure, or a whole procedure. You can open the Tag Scope element to see all the tags that exist in the current document. If you select a tag, only procedures that contain this tag are listed. Note that if you close the Tag Scope item, the filter is reset to all tags.
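
The effect of selecting a tag can be pictured with a few lines of Python. This is a simplified sketch: it keeps one set of tags per procedure, whereas Hopper can also attach tags to single addresses and basic-blocks. The procedure names, the filter_by_tag helper and the alphabetical ordering are assumptions made for the example.

```python
def filter_by_tag(procedures, tag=None):
    """Names of the procedures carrying `tag`; tag=None means all tags."""
    if tag is None:
        return sorted(procedures)
    return sorted(name for name, tags in procedures.items() if tag in tags)


# Hypothetical document: procedure name -> tags found inside it.
procedures = {
    "start": {"entrypoint"},
    "-[AppDelegate init]": {"AppDelegate"},
    "-[AppDelegate run]": {"AppDelegate"},
    "sub_1000": set(),
}

app_delegate_methods = filter_by_tag(procedures, "AppDelegate")
everything = filter_by_tag(procedures)   # like closing the Tag Scope item
```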

An interesting thing to note is that many tags are generated automatically while an executable is being loaded. For instance, every entry point receives a specific entrypoint tag, and each implementation of each Objective-C class is tagged with the name of the class (or category). This lets you navigate quickly through code written in Objective-C!

You can choose to display the strings contained in the file. In this mode, only the ASCII strings are displayed, and the Tag Scope has no effect.

The Navigation Stack

You can jump to an address or a symbol by double-clicking on it. The address where the cursor was is then pushed onto a stack. You can pop this stack and navigate back by pressing the escape key or the backspace key on your keyboard. You can also use the navigation toolbar items.

The right arrow jumps to the address under the cursor, and the left arrow goes back.
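
The push and pop behavior described above can be sketched as follows. This is a minimal illustration with hypothetical names (NavigationStack, jump, back), not Hopper's own code.

```python
class NavigationStack:
    def __init__(self, start_address):
        self.current = start_address
        self.history = []

    def jump(self, target):
        """Follow a reference: remember where the cursor was, then move."""
        self.history.append(self.current)
        self.current = target

    def back(self):
        """Return to the previous location; stay put if the stack is empty."""
        if self.history:
            self.current = self.history.pop()
        return self.current


nav = NavigationStack(0x1000)
nav.jump(0x2040)    # double-click on a symbol
nav.jump(0x3100)    # follow another reference from there
nav.back()          # escape or backspace: cursor returns to 0x2040
```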

The Navigation Bar

Just above the assembly, you’ll find the navigation bar.

This bar is used to navigate quickly through the file. A color scheme indicates the types given to the bytes of the file.

  • Blue parts represent code,
  • Yellow parts represent procedures,
  • Green parts represent ASCII strings,
  • Purple parts represent data,
  • Grey parts are undefined.

A little red arrow indicates where the cursor is currently located.
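
Such a bar can be thought of as a run-length view of the per-byte types. The sketch below collapses a list of types into colored runs, using the color scheme listed above; the function name and the sample layout are hypothetical, and this is not how Hopper necessarily draws its bar.

```python
from itertools import groupby

COLORS = {
    "code": "blue",
    "procedure": "yellow",
    "ascii": "green",
    "data": "purple",
    "undefined": "grey",
}


def color_runs(byte_types):
    """Collapse per-byte types into (color, start offset, length) runs."""
    runs = []
    offset = 0
    for byte_type, group in groupby(byte_types):
        length = len(list(group))
        runs.append((COLORS[byte_type], offset, length))
        offset += length
    return runs


# Hypothetical 15-byte file.
layout = ["procedure"] * 6 + ["code"] * 2 + ["ascii"] * 4 + ["undefined"] * 3
runs = color_runs(layout)
```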

Using the Inspector

The inspector is the rightmost part of the window. It contains various components, which are shown or hidden depending on the context at the cursor’s current location.

Here is a quick list of the components one can find in the inspector:

Instruction Encoding

This component displays the bytes of the current instruction. If the current processor has multiple CPU modes (like the ARM and Thumb modes of the ARM processor family), you’ll see a popup menu that lets you change the CPU mode at the current address.

Format

This component is used to change the display format of the operand of an instruction. You can choose between signed / unsigned hexadecimal, decimal, octal, address, etc.
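
To illustrate how one operand value reads under different formats, here is a small Python sketch. The style names and the format_operand helper are made up for the example, and the output notation is Python's, not necessarily what Hopper displays.

```python
def format_operand(value, bits, style):
    """Render an operand that is `bits` wide; `value` is its raw encoding."""
    mask = (1 << bits) - 1
    unsigned = value & mask
    # Two's complement: a value with the top bit set is negative.
    signed = unsigned - (1 << bits) if unsigned >> (bits - 1) else unsigned
    if style == "unsigned hex":
        return hex(unsigned)
    if style == "signed hex":
        return ("-" if signed < 0 else "") + hex(abs(signed))
    if style == "unsigned decimal":
        return str(unsigned)
    if style == "signed decimal":
        return str(signed)
    if style == "octal":
        return oct(unsigned)
    raise ValueError(f"unknown style: {style}")


operand = 0xFFFFFFF0   # a 32-bit immediate
as_signed_hex = format_operand(operand, 32, "signed hex")          # "-0x10"
as_unsigned_hex = format_operand(operand, 32, "unsigned hex")      # "0xfffffff0"
as_signed_decimal = format_operand(operand, 32, "signed decimal")  # "-16"
```

The signed forms are often easier to read for stack offsets and relative displacements, while the unsigned forms suit masks and addresses.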

Comment

You can associate a textual comment with a given address. Use this component to edit this comment.

Colors and Tags

This component lets you associate tags with addresses, basic-blocks of a procedure, or whole procedures. Those tags are useful to navigate efficiently through the file.

You can even put some colors on addresses in order to quickly, and visually, distinguish parts of the executable.

References

This is a very important component: it shows all the references that an instruction has to other instructions or to pieces of data. It also lists the references in the other direction, i.e. the other instructions that reference this one. You can even add your own references by hand if the analysis performed by Hopper missed some.
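
Keeping references in both directions can be sketched with two maps that are always updated together. This is an illustration only; CrossReferences and its attributes are hypothetical names, and the addresses are made up.

```python
from collections import defaultdict


class CrossReferences:
    def __init__(self):
        self.refs_from = defaultdict(set)   # address -> addresses it refers to
        self.refs_to = defaultdict(set)     # address -> addresses referring to it

    def add(self, source, target):
        """Record that the instruction at `source` refers to `target`."""
        self.refs_from[source].add(target)
        self.refs_to[target].add(source)


xrefs = CrossReferences()
xrefs.add(0x1000, 0x2000)   # e.g. a call found by the automatic analysis
xrefs.add(0x1010, 0x2000)   # e.g. a reference added by hand
```

Looking up xrefs.refs_to[0x2000] then answers the question "who uses this address?", which is the reverse view the component offers.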

Procedure

This component contains the information on the current procedure. For each basic-block, it displays the list of its predecessors and its successors.

At the bottom of the component, you’ll find a very useful button: Switch/case hint. This button is enabled on instructions like jmp REGISTER. It lets you help Hopper find the statements of a switch/case construction.

CFG

The CFG button in the toolbar brings up the Control Flow Graph view of the current procedure. The button is therefore only enabled when the cursor is inside a valid procedure (not in unstructured code).

You can move the basic-blocks of the graph, using the mouse. If you double-click on a basic-block, Hopper will select the corresponding instructions in the main window.

It is possible to export the graph into a PDF file.

Decompiler

The Pseudo Code button invokes the decompiler. Its role is to try to rebuild a valid procedure as a representation in a higher-level language. This language is fairly close to Objective-C, but it is clearly not recompilable.

Please note that there is currently no decompilation engine for the AArch64 processor, and that the decompiler works better with Intel processors than with ARM processors.

Modifying the File

The Hexadecimal Editor

Hopper provides a hexadecimal editor. The editor is synchronized with the assembly language view, and automatically highlights bytes that are part of the current instruction.

Double-click on a byte to modify it. You can use the Undo/Redo feature if you made a mistake.

The Assembler

An embedded assembler can be invoked from the Modify > Assemble Instruction… menu.

You can also use the Modify > NOP Region menu to replace the currently selected instructions with NOP instructions.
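
At the byte level, replacing a region with NOPs amounts to overwriting it with the processor's NOP encoding. The sketch below does this for x86, where NOP is the single byte 0x90; other processors use different, wider encodings. The nop_region helper and the sample bytes are hypothetical and only illustrate the idea.

```python
X86_NOP = 0x90   # single-byte NOP on x86 and x86-64


def nop_region(data, start, length):
    """Overwrite `length` bytes of a bytearray with x86 NOPs, in place.
    Returns the original bytes so the change can be undone."""
    if start < 0 or start + length > len(data):
        raise ValueError("region falls outside the buffer")
    original = bytes(data[start:start + length])
    data[start:start + length] = bytes([X86_NOP]) * length
    return original


# push ebp; mov ebp, esp; call <rel32>; pop ebp; ret
code = bytearray.fromhex("5589e5e8000000005dc3")
saved = nop_region(code, 3, 5)   # blank out the 5-byte call
```

Because every x86 NOP is one byte long, the patched region keeps its size and the instructions that follow stay at the same addresses.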




Find a tutorial from @0xabad1dea:
Analyzing Binaries with Hopper’s Decompiler