
Duh's not malicious, dude!

By Axelle Apvrille | December 10, 2009

iPhoneOS/Eeki.B!worm is said to contain two malicious binaries: sshd, the binary that searches for new victims, and duh, a binary found only in variant B and after which some antivirus companies named the worm. This article focuses on the latter.

Duh is called by a malicious script named syslog (of course, there is no relationship with the traditional UNIX daemon syslog; it is just named that way to look less suspicious):

<span style="color: #993300;">/private/var/mobile/home/duh /xml/p.php?id=$ID > /private/var/mobile/home/.tmp</span>

The first parameter is the malicious remote web server. The second parameter corresponds to the remote page to request: note that it is a PHP script that takes an identifier as a parameter. The output is redirected into a temporary file named .tmp.

The strings contained in duh do not look particularly malicious:

Can't set remote->sin_addr.s_addr, %s is not a valid IP address, Could not connect, Can't send query, Error receiving data, Can't create TCP socket, Can't get IP, Can't resolve host, GET /%s HTTP/1.0, Host: %s, User-Agent: %s, HTMLGET 1.0, _close, _connect, _fprintf, _fputs, _free, _gethostbyname, _herror, _inet_ntop, _inet_pton, _malloc, _memset, _perror, _recv, _send, _socket...

Those strings would typically be found in any HTTP or socket-based program, so I decided to take a closer look with a disassembler. As usual in C, the main function takes two arguments: argc, the number of arguments, and argv, an array of arguments. The first parameter in argv represents the hostname. Then the program tests whether there is another parameter. If so, the second parameter represents the page to request (/xml/p.php etc). Otherwise it defaults to /.
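In C, that argument handling could look roughly like the sketch below. It is a reconstruction from the description above, not the original source: the names struct target and parse_args are hypothetical, and returning an error when no hostname is given is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the two values the program needs. */
struct target {
    const char *host;  /* remote web server */
    const char *page;  /* page to request */
};

/* Fill t from the command line.
 * Returns 0 on success, -1 if no hostname was supplied (assumed behaviour). */
int parse_args(int argc, char **argv, struct target *t)
{
    assert(t != NULL);
    if (argc < 2)
        return -1;
    t->host = argv[1];
    /* A second parameter is optional; without it the page defaults to "/". */
    t->page = (argc > 2) ? argv[2] : "/";
    return 0;
}
```

Called with a hostname only, the sketch leaves page set to "/"; called with two parameters, page is the second one.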


This means that running...

<span style="color: #993300;">/private/var/mobile/home/duh > /private/var/mobile/home/.tmp</span>

...would be acceptable, and would actually request the default page, /.

Then, the program sets up network communication via sockets. On one side, it creates a client socket and stores the socket descriptor in a variable. On the other side, it resolves the IP address of the remote hostname and sets the remote address of the peer. Then, it connects both. Network developers will probably feel at home, as this is a standard way to implement TCP sockets. If an error occurs, an error message is printed and the program exits.
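For readers less familiar with the sockets API, here is a minimal sketch of that setup. It assumes the hostname has already been resolved to a dotted IPv4 string (the binary imports gethostbyname and inet_ntop for that step, which is left out here). The function names set_remote_addr and connect_tcp are hypothetical, and the sketch returns -1 on error where the binary prints a message and exits.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill remote with an IPv4 address and port.
 * Returns 0 on success, -1 if ip is not a valid IPv4 address. */
int set_remote_addr(struct sockaddr_in *remote, const char *ip,
                    unsigned short port)
{
    assert(remote != NULL && ip != NULL);
    memset(remote, 0, sizeof(*remote));
    remote->sin_family = AF_INET;
    remote->sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &remote->sin_addr) != 1)
        return -1;
    return 0;
}

/* Create a client TCP socket and connect it to ip:port.
 * Returns the socket descriptor, or -1 on any error. */
int connect_tcp(const char *ip, unsigned short port)
{
    struct sockaddr_in remote;
    int sock;

    if (set_remote_addr(&remote, ip, port) < 0)
        return -1;
    sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (sock < 0)
        return -1;
    if (connect(sock, (struct sockaddr *)&remote, sizeof(remote)) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```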


The program then calls a function named _build_get_query. Reverse engineering this function shows that it takes two arguments, the hostname and the page, and returns a string such as "GET /page HTTP/1.0\r\nHost: hostname\r\nUser-Agent: HTMLGET 1.0\r\n\r\n", which is nothing more than an HTTP GET request.
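A function with that behaviour can be written in a few lines. The sketch below reuses the format strings found in the binary ("GET /%s HTTP/1.0", "Host: %s", "User-Agent: %s", "HTMLGET 1.0"). Stripping a leading slash from the page and sizing the buffer with snprintf are my own assumptions about how to get the output described, not something read from the disassembly.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define USER_AGENT "HTMLGET 1.0"

/* Build an HTTP/1.0 GET request for page on host.
 * Returns a malloc'ed string the caller must free, or NULL on failure. */
char *build_get_query(const char *host, const char *page)
{
    const char *fmt = "GET /%s HTTP/1.0\r\nHost: %s\r\nUser-Agent: %s\r\n\r\n";
    char *query;
    int len;

    assert(host != NULL && page != NULL);
    /* The format already supplies the leading slash. */
    if (page[0] == '/')
        page++;
    /* First pass computes the required length, second pass writes. */
    len = snprintf(NULL, 0, fmt, page, host, USER_AGENT);
    if (len < 0)
        return NULL;
    query = malloc((size_t)len + 1);
    if (query == NULL)
        return NULL;
    snprintf(query, (size_t)len + 1, fmt, page, host, USER_AGENT);
    return query;
}
```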


Once this is done, the program actually sends the string. The number of bytes sent is initialized to 0 and incremented after each call to 'send' on the socket (send returns the number of bytes it successfully sent). The program loops until the entire string is sent. Again, if an error occurs, an appropriate message is printed and the program exits.
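This is the classic "send everything" loop. A minimal sketch, assuming a connected socket descriptor: the name send_all is hypothetical, and the sketch returns -1 on error where the binary prints a message and exits.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Send len bytes from data on sock, looping because send() may
 * transmit fewer bytes than requested.
 * Returns 0 once everything is sent, -1 on error. */
int send_all(int sock, const char *data, size_t len)
{
    size_t sent = 0;  /* bytes sent so far */

    assert(data != NULL);
    while (sent < len) {
        ssize_t n = send(sock, data + sent, len - sent, 0);
        if (n < 0) {
            perror("Can't send query");
            return -1;
        }
        sent += (size_t)n;
    }
    return 0;
}
```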

Then, the program waits for the remote server's answer and receives it in chunks of 1024 bytes until there is nothing more to receive (the recv function returns 0 when the peer has disconnected). In each chunk, it searches for the first occurrence of \r\n\r\n and, if it finds it, points just past that location. As an HTTP response consists of HTTP headers, then two consecutive carriage return + line feed pairs, then the content, the pointer indicates the beginning of the HTTP response's content. This content is printed to stdout.
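The receive loop can be sketched as follows. The names find_body and recv_body are hypothetical, and the output stream is passed as a parameter (the binary writes to stdout). Note that this simple per-chunk search, as described, would miss a \r\n\r\n separator that happens to be split across two chunks, and treating each chunk as a C string truncates content containing NUL bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define CHUNK 1024

/* Return a pointer just past the first "\r\n\r\n" in chunk,
 * or NULL if the chunk does not contain the separator. */
const char *find_body(const char *chunk)
{
    const char *sep;

    assert(chunk != NULL);
    sep = strstr(chunk, "\r\n\r\n");
    return sep ? sep + 4 : NULL;
}

/* Read from sock until the peer disconnects and write everything
 * after the HTTP headers to out. Returns 0 on success, -1 on error. */
int recv_body(int sock, FILE *out)
{
    char buf[CHUNK + 1];
    int in_body = 0;  /* set once the end of the headers was seen */
    ssize_t n;

    while ((n = recv(sock, buf, CHUNK, 0)) > 0) {
        const char *p = buf;
        buf[n] = '\0';
        if (!in_body) {
            p = find_body(buf);
            if (p == NULL)
                continue;  /* still inside the headers */
            in_body = 1;
        }
        fputs(p, out);
    }
    if (n < 0) {
        perror("Error receiving data");
        return -1;
    }
    return 0;
}
```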


Finally, the program closes the socket, and frees all temporary buffers (see Figure 4, tag "end").

So, the reverse engineering confirms that duh consists of standard, non-malicious network programming. It processes all URLs the same way, with no malicious functionality or hidden code. The fact that the implementation handles error codes indicates that duh is a clean implementation rather than a quick hack (thanks Alex for the hint!). It is worth a search on the Internet: we are looking for source code that performs an HTTP GET request and outputs the response's content. Another noticeable element is the user agent the program uses: "HTMLGET 1.0".

Bingo! The winner is...


Compare the source code to the assembly: no doubt, this is the same code. Duh is no more than htmlget.c.

-- The Crypto Girl
