Google patched a number of Android security vulnerabilities in early August. One of them, which I discovered, is a remote code execution vulnerability in Mediaserver (CVE-2016-3820). It could enable an attacker using a specially crafted file to cause memory corruption during media file and data processing. Google rated the issue as Critical because of the possibility of remote code execution within the context of the Mediaserver process. The Mediaserver process has access to audio and video streams, as well as to privileges that third-party apps cannot normally access. The affected functionality is provided as a core part of Android, and multiple applications allow it to be reached with remote content, most notably MMS and browser playback of media.
In this blog, we want to share our analysis of this vulnerability.
Proof of Concept
The vulnerability exists in the software-based H.264 decoder. Mediaserver normally prefers the hardware-based H.264 decoder shipped with most Android devices over the vulnerable software-based one, and if the hardware-based decoder is chosen to parse the PoC file, the vulnerability is not triggered. Applications that support H.264 media, however, could be vulnerable depending on which decoder they choose.
The testing was conducted on the following device and software setup:
- google/hammerhead/hammerhead:6.0.1/MOB30H/root08012302:userdebug/test-keys
- Android/aosp_hammerhead/hammerhead/android-6.0.1_r41
The standalone command stagefright can be used to trigger the vulnerability using the software codec option as below:
/system/bin/stagefright -s /sdcard/FG-VD-16-030_PoC_minimized.mp4
The crash log is shown below:
--------- beginning of crash
05-05 17:45:17.428 2054 2325 F libc : Fatal signal 11 (SIGSEGV), code 2, fault addr 0xb5cd47c8 in tid 2325 (le.h264.decoder)
05-05 17:45:17.522 6319 6319 W debuggerd: type=1400 audit(0.0:7279): avc: denied { search } for name="tmp" dev="mmcblk0p28" ino=627090 scontext=u:r:debuggerd:s0 tcontext=u:object_r:shell_data_file:s0 tclass=dir permissive=0
05-05 17:45:17.529 6319 6319 F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
05-05 17:45:17.529 6319 6319 F DEBUG : Build fingerprint: 'google/hammerhead/hammerhead:6.0.1/MMB29V/2554798:user/release-keys'
--------- beginning of system
05-05 17:45:17.529 820 943 W NativeCrashListener: Couldn't find ProcessRecord for pid 2054
05-05 17:45:17.529 6319 6319 F DEBUG : Revision: '0'
05-05 17:45:17.529 6319 6319 F DEBUG : ABI: 'arm'
05-05 17:45:17.529 6319 6319 E DEBUG : AM write failed: Broken pipe
05-05 17:45:17.530 6319 6319 F DEBUG : pid: 2054, tid: 2325, name: le.h264.decoder >>> /data/local/tmp/stagefright <<<
05-05 17:45:17.530 6319 6319 F DEBUG : signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0xb5cd47c8
05-05 17:45:17.536 6319 6319 F DEBUG : r0 b49807e8 r1 b4b09630 r2 00000001 r3 000001a0
05-05 17:45:17.536 6319 6319 F DEBUG : r4 00000081 r5 00000000 r6 b5cd47c8 r7 00000000
05-05 17:45:17.536 6319 6319 F DEBUG : r8 00000000 r9 b4b09630 sl b5cba5dc fp 000001a0
05-05 17:45:17.536 6319 6319 F DEBUG : ip b4b09fef sp b49806a8 lr b5c33aa3 pc b5cd47c8 cpsr 20070010
05-05 17:45:17.547 6319 6319 F DEBUG :
05-05 17:45:17.547 6319 6319 F DEBUG : backtrace:
05-05 17:45:17.547 6319 6319 F DEBUG : #00 pc 000547c8 [anon:libc_malloc]
05-05 17:45:17.547 6319 6319 F DEBUG : #01 pc 00018aa1 /system/lib/libstagefright_soft_avcdec.so (ih264d_process_intra_mb+4448)
05-05 17:45:17.547 6319 6319 F DEBUG : #02 pc 0000d67d /system/lib/libstagefright_soft_avcdec.so (ih264d_recon_deblk_slice+616)
05-05 17:45:17.547 6319 6319 F DEBUG : #03 pc 0000d949 /system/lib/libstagefright_soft_avcdec.so (ih264d_recon_deblk_thread+64)
05-05 17:45:17.548 6319 6319 F DEBUG : #04 pc 0003f45f /system/lib/libc.so (__pthread_start(void*)+30)
05-05 17:45:17.548 6319 6319 F DEBUG : #05 pc 00019b43 /system/lib/libc.so (__start_thread+6)
05-05 17:45:17.592 6319 6319 W debuggerd: type=1400 audit(0.0:7280): avc: denied { search } for name="tmp" dev="mmcblk0p28" ino=627090 scontext=u:r:debuggerd:s0 tcontext=u:object_r:shell_data_file:s0 tclass=dir permissive=0
05-05 17:45:17.602 6319 6319 W debuggerd: type=1400 audit(0.0:7281): avc: denied { search } for name="tmp" dev="mmcblk0p28" ino=627090 scontext=u:r:debuggerd:s0 tcontext=u:object_r:shell_data_file:s0 tclass=dir permissive=0
05-05 17:45:17.649 6319 6319 F DEBUG :
05-05 17:45:17.649 6319 6319 F DEBUG : Tombstone written to: /data/tombstones/tombstone_05
05-05 17:45:17.649 820 836 I BootReceiver: Copying /data/tombstones/tombstone_05 to DropBox (SYSTEM_TOMBSTONE)
Analysis
The vulnerability exists in the libavc H.264 decoder invoked by libstagefright. Mediaserver uses the stagefright lib to handle audio and video streams. Let’s look into this specially crafted .mp4 file first. A comparison between the normal MP4 file and the minimized PoC file is shown below.
Figure 1. PoC File vs The Original MP4 File
Figure 2. Parsing of the PoC File with 010 Editor
From Figure 1 and Figure 2, we can see that the only difference is a single byte at offset 0x1a65f, and this byte is located in atom ‘mdat’. The atom ‘mdat’ stores the H.264 media data. For H.264 specifications, please refer to https://www.itu.int/rec/T-REC-H.264.
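A single-byte difference like this one can be located with a plain byte-wise comparison of the two files once they are loaded into memory. The following is a minimal sketch; the function name `first_diff_offset` is our own and is not part of any tool used in this analysis.

```c
#include <stddef.h>

/*
 * Return the offset of the first byte at which buffers a and b differ.
 * If the first n bytes are identical, return n.
 */
size_t first_diff_offset(const unsigned char *a, const unsigned char *b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (a[i] != b[i])
            return i;   /* first mismatching offset */
    }
    return n;           /* no difference found */
}
```

Applied to the original MP4 file and the minimized PoC, a comparison of this kind would be expected to report offset 0x1a65f, in line with Figure 1.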
The Network Abstraction Layer (NAL) and Video Coding Layer (VCL) are the two main concepts in H.264. An H.264 file consists of a number of NAL units (NALU), and each NALU can be classified as VCL or non-VCL. Video data is processed by the codec and packed into NAL units. For more on NAL units, please refer to https://tools.ietf.org/html/rfc6184.
In an MP4 file, the H.264 media data is stored in the following format in the atom ‘mdat’.
|Len(4 bytes)|Type 'mdat'|NALU len|NALU(header+payload)|NALU len|NALU(header+payload)|...
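The layout above can be walked with a few lines of C. This is only a sketch under stated assumptions: it takes the payload of the 'mdat' atom (the bytes after the 8-byte length and type header), and it assumes each "NALU len" field is a 4-byte big-endian integer, which is common but is actually declared per file. The names `read_be32` and `walk_mdat_nalus` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit big-endian integer from p. */
uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Walk length-prefixed NAL units in an 'mdat' payload.
 * Stores up to max_out NAL unit lengths in out_len and returns the
 * number of NAL units found, or -1 if a length field runs past the end.
 * ASSUMPTION: every length prefix is 4 bytes wide.
 */
int walk_mdat_nalus(const uint8_t *payload, size_t size,
                    uint32_t *out_len, int max_out)
{
    size_t pos = 0;
    int count = 0;

    while (pos < size) {
        uint32_t len;

        if (size - pos < 4)
            return -1;              /* truncated length field */
        len = read_be32(payload + pos);
        pos += 4;
        if (len > size - pos)
            return -1;              /* NAL unit claims more bytes than remain */
        if (count < max_out)
            out_len[count] = len;
        count++;
        pos += len;                 /* skip header + payload of this NAL unit */
    }
    return count;
}
```

The two bounds checks matter when walking an untrusted file: without them, a corrupted length field would send the walker outside the buffer.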
We can extract the NAL unit that contains the byte at offset 0x1a65f from the PoC file and the original MP4 file as follows:
Figure 3. The NAL unit extracted from PoC
Figure 4. The NAL unit extracted from the original MP4 file
From Figure 3 and Figure 4, we know the length of the NAL unit is equal to 0x65.
The H.264 stream layer structure is shown below.
Figure 5. H.264 Stream Layer Structure
Next, we trace this NAL unit through dynamic debugging in GDB.
The backtrace in the ‘Proof of Concept’ section shows that SIGSEGV is raised in a worker thread (ih264d_recon_deblk_thread), not in the main thread. The vulnerability is therefore triggered in a multithreaded environment, which increases the complexity of debugging.
Let’s enter the debugging world!
First, here is our debugging environment.
- google/hammerhead/hammerhead:6.0.1/MOB30H/root08012302:userdebug/test-keys
- Android/aosp_hammerhead/hammerhead/android-6.0.1_r41
- Ubuntu 14.04 LTS Desktop (64-bit)
The function ih264d_parse_nal_unit (in https://android.googlesource.com/platform/external/libavc/+/android-6.0.1_r41/decoder/ih264d_parse_headers.c) is used to parse the NAL unit. Its definition is shown below.
WORD32 ih264d_parse_nal_unit(iv_obj_t *dec_hdl,
ivd_video_decode_op_t *ps_dec_op,
UWORD8 *pu1_buf,
UWORD32 u4_length)
{…
}
The third parameter, pu1_buf, points to the buffer of NAL unit data, and the fourth parameter, u4_length, is the length of that data. This allows us to set the following conditional breakpoint on this function to trace the NAL unit whose length is 0x65.
b ih264d_parse_headers.c:ih264d_parse_nal_unit if u4_length==0x65
As the program continues to run, the conditional breakpoint is eventually hit. The debug info is shown below.
(gdb) c
Continuing.
[New Thread 2533]
[Switching to Thread 2533]
Breakpoint 1, ih264d_parse_nal_unit (dec_hdl=dec_hdl@entry=0xb608e000, ps_dec_op=ps_dec_op@entry=0xb5ea2580,
pu1_buf=pu1_buf@entry=0xb5200000 "e\210\200\025\200\002o#k\177\322", , u4_length=101)
at external/libavc/decoder/ih264d_parse_headers.c:1011
1011 {
(gdb) info args
dec_hdl = 0xb608e000
ps_dec_op = 0xb5ea2580
pu1_buf = 0xb5200000 "e\210\200\025\200\002o#k\177\322",
u4_length = 101
(gdb) x/101b pu1_buf
0xb5200000: 0x65 0x88 0x80 0x15 0x80 0x02 0x6f 0x23
0xb5200008: 0x6b 0x7f 0xd2 0xc4 0x00 0x00 0x03 0x00
0xb5200010: 0x00 0x09 0x69 0x0f 0x71 0x2a 0xd7 0x1c
0xb5200018: 0x18 0x26 0x77 0x84 0x49 0x58 0x26 0x91
0xb5200020: 0x9d 0xee 0xad 0xcc 0x0b 0xad 0x81 0x30
0xb5200028: 0x26 0xa2 0x96 0xf9 0x3a 0x12 0x80 0xfb
0xb5200030: 0x51 0x5a 0x08 0x3c 0xa2 0x48 0x1c 0xdc
0xb5200038: 0x1d 0x75 0xae 0x82 0x22 0x6d 0xfb 0x57
0xb5200040: 0x37 0x7c 0xa5 0xaa 0xad 0x23 0x6d 0xcd
0xb5200048: 0xe1 0x40 0xe5 0xae 0xe3 0xe6 0x69 0xc5
0xb5200050: 0xe9 0xeb 0xcb 0x48 0xef 0x58 0x4c 0xa4
0xb5200058: 0x6b 0xb5 0x29 0xa5 0xb4 0xe7 0xf4 0x17
0xb5200060: 0x53 0x8e 0x2a 0x19 0xc0
It matches the NAL unit data in Figure 3. We then continue to debug in GDB.
(gdb)
1040 u1_nal_unit_type = NAL_UNIT_TYPE(u1_first_byte);
(gdb)
1044 switch(u1_nal_unit_type)
(gdb) p/x u1_nal_unit_type
$18 = 0x5
switch(u1_nal_unit_type)
{
case SLICE_DATA_PARTITION_A_NAL:
case SLICE_DATA_PARTITION_B_NAL:
case SLICE_DATA_PARTITION_C_NAL:
if(!ps_dec->i4_decode_header)
ih264d_parse_slice_partition(ps_dec, ps_bitstrm);
break;
case IDR_SLICE_NAL:
case SLICE_NAL:
/* ! */
DEBUG_THREADS_PRINTF("Decoding a slice NAL\n");
if(!ps_dec->i4_decode_header)
{
if(ps_dec->i4_header_decoded == 3)
{
/* ! */
ps_dec->u4_slice_start_code_found = 1;
ih264d_rbsp_to_sodb(ps_dec->ps_bitstrm);
i_status = ih264d_parse_decode_slice( //enter here.
(UWORD8)(u1_nal_unit_type
== IDR_SLICE_NAL),
u1_nal_ref_idc, ps_dec);
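The value 0x5 printed by GDB follows directly from the first byte of the NAL unit, 0x65. In H.264 the one-byte NAL unit header holds a 1-bit forbidden_zero_bit, a 2-bit nal_ref_idc and a 5-bit nal_unit_type. The sketch below decodes those fields; the struct and function names are our own, and it is not the libavc NAL_UNIT_TYPE macro itself.

```c
#include <stdint.h>

/* Fields of the one-byte H.264 NAL unit header. */
typedef struct {
    uint8_t forbidden_zero_bit;  /* bit 7, must be 0 in a valid stream */
    uint8_t nal_ref_idc;         /* bits 6..5 */
    uint8_t nal_unit_type;       /* bits 4..0, 5 means IDR slice */
} nal_header_t;

/* Split the first byte of a NAL unit into its three header fields. */
nal_header_t parse_nal_header(uint8_t first_byte)
{
    nal_header_t h;

    h.forbidden_zero_bit = (uint8_t)((first_byte >> 7) & 0x01);
    h.nal_ref_idc        = (uint8_t)((first_byte >> 5) & 0x03);
    h.nal_unit_type      = (uint8_t)(first_byte & 0x1f);
    return h;
}
```

For 0x65 the low five bits are 0x05, which is why execution takes the IDR_SLICE_NAL branch of the switch statement.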
The function ih264d_parse_decode_slice (in https://android.googlesource.com/platform/external/libavc/+/android-6.0.1_r41/decoder/ih264d_parse_slice.c) parses the slice. Its definition is shown below.
1014 WORD32 ih264d_parse_decode_slice(UWORD8 u1_is_idr_slice,
1015 UWORD8 u1_nal_ref_idc,
1016 dec_struct_t *ps_dec /* Decoder parameters */
1017 )
1018 {
...
1599 if(ps_dec->u1_separate_parse == 1)
1600 {
1601 if(ps_dec->u4_dec_thread_created == 0)
1602 {
1603 ithread_create(ps_dec->pv_dec_thread_handle, NULL,
1604 (void *)ih264d_decode_picture_thread,
1605 (void *)ps_dec); //create a thread
1606
1607 ps_dec->u4_dec_thread_created = 1;
1608 }
1609
1610 if((ps_dec->u4_num_cores == 3) &&
1611 ((ps_dec->u4_app_disable_deblk_frm == 0) || ps_dec->i1_recon_in_thread3_flag)
1612 && (ps_dec->u4_bs_deblk_thread_created == 0))
1613 {
1614 ps_dec->u4_start_recon_deblk = 0;
1615 ithread_create(ps_dec->pv_bs_deblk_thread_handle, NULL,
1616 (void *)ih264d_recon_deblk_thread,
1617 (void *)ps_dec); //create a thread
1618 ps_dec->u4_bs_deblk_thread_created = 1;
1619 }
1620 }
...
1873 if(u1_slice_type == I_SLICE)
1874 {
1875 ps_dec->ps_cur_pic->u4_pack_slc_typ |= I_SLC_BIT;
1876
1877 ret = ih264d_parse_islice(ps_dec, u2_first_mb_in_slice); //enter here
1878
1879 if(ps_dec->i4_pic_type != B_SLICE && ps_dec->i4_pic_type != P_SLICE)
1880 ps_dec->i4_pic_type = I_SLICE;
1881
1882 }
}
Continue to run until line 1599.
(gdb) until 1599
ih264d_parse_decode_slice (u1_is_idr_slice=, u1_is_idr_slice@entry=1 '\001', u1_nal_ref_idc=u1_nal_ref_idc@entry=0 '\000',
ps_dec=ps_dec@entry=0xb608f000) at external/libavc/decoder/ih264d_parse_slice.c:1599
1599 if(ps_dec->u1_separate_parse == 1)
(gdb) p/x ps_dec->u1_separate_parse
$11 = 0x1
...
(gdb)
1632 && (u1_slice_type != B_SLICE)
(gdb) info threads
[New Thread 2531]
[New Thread 2532]
[New Thread 2534]
[New Thread 3160]
[New Thread 3163]
Id Target Id Frame
7 Thread 3163 ih264d_recon_deblk_slice (ps_dec=ps_dec@entry=0xb608f000,
ps_tfr_cxt=ps_tfr_cxt@entry=0xb4bfd8bc) at external/libavc/decoder/ih264d_thread_compute_bs.c:420
6 Thread 3160 ih264d_decode_slice_thread (ps_dec=ps_dec@entry=0xb608f000)
at external/libavc/decoder/ih264d_thread_parse_decode.c:468
5 Thread 2534 syscall () at bionic/libc/arch-arm/bionic/syscall.S:44
4 Thread 2532 __ioctl () at bionic/libc/arch-arm/syscalls/__ioctl.S:8
3 Thread 2531 __ioctl () at bionic/libc/arch-arm/syscalls/__ioctl.S:8
* 2 Thread 2533 ih264d_parse_decode_slice (u1_is_idr_slice=,
u1_is_idr_slice@entry=1 '\001', u1_nal_ref_idc=u1_nal_ref_idc@entry=0 '\000',
ps_dec=ps_dec@entry=0xb608f000) at external/libavc/decoder/ih264d_parse_slice.c:1632
1 Thread 2501 syscall () at bionic/libc/arch-arm/bionic/syscall.S:44
(gdb) thread 6
[Switching to thread 6 (Thread 3160)]
#0 ih264d_decode_slice_thread (ps_dec=ps_dec@entry=0xb608f000)
at external/libavc/decoder/ih264d_thread_parse_decode.c:468
468 NOP(128);
(gdb) bt
#0 ih264d_decode_slice_thread (ps_dec=ps_dec@entry=0xb608f000)
at external/libavc/decoder/ih264d_thread_parse_decode.c:468
#1 0xb5ed3560 in ih264d_decode_picture_thread (ps_dec=0xb608f000)
at external/libavc/decoder/ih264d_thread_parse_decode.c:602
#2 0xb6b0e460 in __pthread_start (arg=0xb4cff930,
arg@entry=)
at bionic/libc/bionic/pthread_create.cpp:199
#3 0xb6ae8b44 in __start_thread (fn=, arg=)
at bionic/libc/bionic/clone.cpp:41
#4 0x00000000 in ?? ()
(gdb) thread 7
[Switching to thread 7 (Thread 3163)]
#0 ih264d_recon_deblk_slice (ps_dec=ps_dec@entry=0xb608f000, ps_tfr_cxt=ps_tfr_cxt@entry=0xb4bfd8bc)
at external/libavc/decoder/ih264d_thread_compute_bs.c:420
420 NOP(128);
(gdb) bt
#0 ih264d_recon_deblk_slice (ps_dec=ps_dec@entry=0xb608f000, ps_tfr_cxt=ps_tfr_cxt@entry=0xb4bfd8bc)
at external/libavc/decoder/ih264d_thread_compute_bs.c:420
#1 0xb5eb09f0 in ih264d_recon_deblk_thread (ps_dec=0xb608f000)
at external/libavc/decoder/ih264d_thread_compute_bs.c:702
#2 0xb6b0e460 in __pthread_start (arg=0xb4bfd930,
arg@entry=)
at bionic/libc/bionic/pthread_create.cpp:199
#3 0xb6ae8b44 in __start_thread (fn=, arg=)
at bionic/libc/bionic/clone.cpp:41
#4 0x00000000 in ?? ()
(gdb) thread 2
[Switching to thread 2 (Thread 2533)]
#0 ih264d_parse_decode_slice (u1_is_idr_slice=, u1_is_idr_slice@entry=1 '\001',
u1_nal_ref_idc=u1_nal_ref_idc@entry=0 '\000', ps_dec=ps_dec@entry=0xb608f000)
at external/libavc/decoder/ih264d_parse_slice.c:1632
1632 && (u1_slice_type != B_SLICE)
We can now see that the ithread_create call on line 1603 creates thread 3160, and the ithread_create call on line 1615 creates thread 3163.
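The u4_dec_thread_created and u4_bs_deblk_thread_created fields in the snippet above act as one-time guards around thread creation. The following toy model illustrates that pattern only. It is a sketch, not libavc code: the struct `guard_dec_t`, the function `start_worker_once`, and the `spawn_calls` counter (which stands in for the real ithread_create call) are all hypothetical.

```c
/* Toy decoder state holding only what the guard pattern needs. */
typedef struct {
    unsigned separate_parse;      /* mirrors u1_separate_parse */
    unsigned dec_thread_created;  /* mirrors u4_dec_thread_created */
    unsigned spawn_calls;         /* stand-in: counts would-be thread creations */
} guard_dec_t;

/* Create the worker on the first call only; later calls are no-ops. */
void start_worker_once(guard_dec_t *dec)
{
    if (dec->separate_parse == 1) {
        if (dec->dec_thread_created == 0) {
            dec->spawn_calls++;           /* real code would create the thread here */
            dec->dec_thread_created = 1;  /* remember that the worker exists */
        }
    }
}
```

With this guard in place, the worker threads created for one slice stay alive and are reused when later slices are parsed.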
Continue to run until line 1873.
(gdb) until 1873
ih264d_parse_decode_slice (u1_is_idr_slice=, u1_is_idr_slice@entry=1 '\001', u1_nal_ref_idc=u1_nal_ref_idc@entry=0 '\000',
ps_dec=ps_dec@entry=0xb608f000) at external/libavc/decoder/ih264d_parse_slice.c:1873
1873 if(u1_slice_type == I_SLICE)
...
(gdb)
1877 ret = ih264d_parse_islice(ps_dec, u2_first_mb_in_slice);
(gdb) s
ih264d_parse_islice (ps_dec=ps_dec@entry=0xb608f000, u2_first_mb_in_slice=u2_first_mb_in_slice@entry=0) at external/libavc/decoder/ih264d_parse_islice.c:1360
1360 dec_slice_params_t * ps_slice = ps_dec->ps_cur_slice;
(gdb) info args
ps_dec = 0xb608f000
u2_first_mb_in_slice = 0
Because the slice type is I_SLICE (the NAL unit type 0x5 is an IDR slice), execution enters the function ih264d_parse_islice (in https://android.googlesource.com/platform/external/libavc/+/android-6.0.1_r41/decoder/ih264d_parse_islice.c). Its definition is shown below.
1356 WORD32 ih264d_parse_islice(dec_struct_t *ps_dec,
1357 UWORD16 u2_first_mb_in_slice)
1358 {
...
1446 if(ps_pps->u1_entropy_coding_mode)
1447 {
1448 SWITCHOFFTRACE; SWITCHONTRACECABAC;
1449 if(ps_dec->ps_cur_slice->u1_mbaff_frame_flag)
1450 {
1451 ps_dec->pf_get_mb_info = ih264d_get_mb_info_cabac_mbaff;
1452 }
1453 else
1454 ps_dec->pf_get_mb_info = ih264d_get_mb_info_cabac_nonmbaff;
1455
1456 ret = ih264d_parse_islice_data_cabac(ps_dec, ps_slice,
1457 u2_first_mb_in_slice); // enter here.
1458 if(ret != OK)
1459 return ret;
1460 SWITCHONTRACE; SWITCHOFFTRACECABAC;
1461 }
1462 else
1463 {
1464 if(ps_dec->ps_cur_slice->u1_mbaff_frame_flag)
1465 {
1466 ps_dec->pf_get_mb_info = ih264d_get_mb_info_cavlc_mbaff;
1467 }
1468 else
1469 ps_dec->pf_get_mb_info = ih264d_get_mb_info_cavlc_nonmbaff;
1470 ret = ih264d_parse_islice_data_cavlc(ps_dec, ps_slice,
1471 u2_first_mb_in_slice);
1472 if(ret != OK)
1473 return ret;
1474 }
1475
1476 return OK;
}
Continue to run until line 1446.
(gdb) until 1446
ih264d_parse_islice (ps_dec=ps_dec@entry=0xb608f000, u2_first_mb_in_slice=u2_first_mb_in_slice@entry=0) at external/libavc/decoder/ih264d_parse_islice.c:1446
1446 if(ps_pps->u1_entropy_coding_mode)
(gdb) p/x ps_pps->u1_entropy_coding_mode
$22 = 0x1
Because ps_pps->u1_entropy_coding_mode is 0x1, execution next enters the function ih264d_parse_islice_data_cabac. Its definition is shown below.
972 WORD32 ih264d_parse_islice_data_cabac(dec_struct_t * ps_dec,
973 dec_slice_params_t * ps_slice,
974 UWORD16 u2_first_mb_in_slice)
975 {
...
1010 do
1011 {
...
1064 /* Parse Macroblock Data */
1065 if(25 == u1_mb_type)
1066 {
1067 /* I_PCM_MB */
1068 ps_cur_mb_info->ps_curmb->u1_mb_type = I_PCM_MB;
1069 ret = ih264d_parse_ipcm_mb(ps_dec, ps_cur_mb_info, u1_num_mbs);
1070 if(ret != OK)
1071 return ret;
1072 ps_cur_deblk_mb->u1_mb_qp = 0;
1073 }
1074 else
1075 {
1076 ret = ih264d_parse_imb_cabac(ps_dec, ps_cur_mb_info, u1_mb_type);
1077 if(ret != OK) // trace it.
1078 return ret;
1079 ps_cur_deblk_mb->u1_mb_qp = ps_dec->u1_qp;
1080 }
...
1154 }
1155 while(uc_more_data_flag);
...
1162 return ret;
1163 }
Next, set a conditional breakpoint on line 1077 as follows.
b ih264d_parse_islice.c:1077 if ret!=0x0
The debug info is shown below.
(gdb) b ih264d_parse_islice.c:1077 if ret!=0x0
Breakpoint 9 at 0xb5ebeeb2: file external/libavc/decoder/ih264d_parse_islice.c, line 1077.
Continue to run until the conditional breakpoint is hit.
(gdb) c
Continuing.
Breakpoint 9, ih264d_parse_islice_data_cabac (ps_dec=ps_dec@entry=0xb608f000, ps_slice=0xb52c2000, u2_first_mb_in_slice=)
at external/libavc/decoder/ih264d_parse_islice.c:1077
1077 if(ret != OK)
(gdb) p/x ret
$13 = 0x6e
The value of the variable 'ret' is 0x6E (ERROR_EOB_TERMINATE_T). This means that parsing fails on the slice in the NAL unit that we specially crafted in the PoC file, and the function returns the error code ERROR_EOB_TERMINATE_T. Meanwhile, we can check the status of the two worker threads.
(gdb) info threads
Id Target Id Frame
7 Thread 3163 sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
6 Thread 3160 sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
5 Thread 2534 syscall () at bionic/libc/arch-arm/bionic/syscall.S:44
4 Thread 2532 __ioctl () at bionic/libc/arch-arm/syscalls/__ioctl.S:8
3 Thread 2531 __ioctl () at bionic/libc/arch-arm/syscalls/__ioctl.S:8
* 2 Thread 2533 ih264d_parse_islice_data_cabac (ps_dec=ps_dec@entry=0xb608f000,
ps_slice=0xb52c2000, u2_first_mb_in_slice=)
at external/libavc/decoder/ih264d_parse_islice.c:1077
1 Thread 2501 syscall () at bionic/libc/arch-arm/bionic/syscall.S:44
(gdb) thread 6
[Switching to thread 6 (Thread 3160)]
#0 sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
9 mov r7, ip
(gdb) bt
#0 sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
#1 0xb5eba0d0 in ithread_yield () at external/libavc/common/ithread.c:116
#2 0xb5ed2ffc in ih264d_decode_recon_tfr_nmb_thread (ps_dec=ps_dec@entry=0xb608f000,
u1_num_mbs=, u1_num_mbs_next=, u1_end_of_row=)
at external/libavc/decoder/ih264d_thread_parse_decode.c:265
#3 0xb5ed34f4 in ih264d_decode_slice_thread (ps_dec=ps_dec@entry=0xb608f000)
at external/libavc/decoder/ih264d_thread_parse_decode.c:585
#4 0xb5ed3560 in ih264d_decode_picture_thread (ps_dec=0xb608f000)
at external/libavc/decoder/ih264d_thread_parse_decode.c:602
#5 0xb6b0e460 in __pthread_start (arg=0xb4cff930,
arg@entry=)
at bionic/libc/bionic/pthread_create.cpp:199
#6 0xb6ae8b44 in __start_thread (fn=, arg=)
at bionic/libc/bionic/clone.cpp:41
#7 0x00000000 in ?? ()
(gdb) thread 7
[Switching to thread 7 (Thread 3163)]
#0 sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
9 mov r7, ip
(gdb) bt
#0 sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
#1 0xb5eba0d0 in ithread_yield () at external/libavc/common/ithread.c:116
#2 0xb5eb06a8 in ih264d_recon_deblk_slice (ps_dec=ps_dec@entry=0xb608f000,
ps_tfr_cxt=ps_tfr_cxt@entry=0xb4bfd8bc) at external/libavc/decoder/ih264d_thread_compute_bs.c:572
#3 0xb5eb09f0 in ih264d_recon_deblk_thread (ps_dec=0xb608f000)
at external/libavc/decoder/ih264d_thread_compute_bs.c:702
#4 0xb6b0e460 in __pthread_start (arg=0xb4bfd930,
arg@entry=)
at bionic/libc/bionic/pthread_create.cpp:199
#5 0xb6ae8b44 in __start_thread (fn=, arg=)
at bionic/libc/bionic/clone.cpp:41
#6 0x00000000 in ?? ()
(gdb) thread 2
[Switching to thread 2 (Thread 2533)]
#0 ih264d_parse_islice_data_cabac (ps_dec=ps_dec@entry=0xb608f000, ps_slice=0xb52c2000,
u2_first_mb_in_slice=) at external/libavc/decoder/ih264d_parse_islice.c:1077
1077 if(ret != OK)
We can see that both threads, 3160 and 3163, are waiting in sched_yield, which forces the running thread to relinquish the processor until it again becomes the head of the thread list.
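A thread that polls for work and calls sched_yield between polls typically follows a pattern like the one sketched below. This is an illustration only, not libavc code: the function name `toy_wait_for_flag` and the retry bound `max_yields` are our own additions, and the bound exists purely so that the sketch always terminates.

```c
#include <sched.h>
#include <stdint.h>

/*
 * Poll *flag until another thread sets it to a non-zero value.
 * Between polls, give up the processor with sched_yield().
 * Returns 0 once the flag is set, or -1 after max_yields attempts.
 */
int toy_wait_for_flag(const volatile uint8_t *flag, unsigned max_yields)
{
    unsigned yields = 0;

    while (*flag == 0) {
        if (yields >= max_yields)
            return -1;      /* give up: nobody set the flag */
        sched_yield();      /* let other runnable threads proceed */
        yields++;
    }
    return 0;
}
```

A waiter of this kind makes no progress on its own. It resumes only when some other thread writes the byte it is polling, so whoever controls those writes controls when the waiter wakes up and which data it then works on.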
The function ih264d_parse_islice_data_cabac returns the error code ERROR_EOB_TERMINATE_T, and that error code is passed back up the call chain until it reaches line 2013 in the ih264d_video_decode function.
(gdb) n
ih264d_video_decode (dec_hdl=0xb608e000, pv_api_ip=0xb5ea25f0, pv_api_op=0xb5ea2580) at external/libavc/decoder/ih264d_api.c:2013
2013 if(ret != OK)
(gdb) p/x ret
$14 = 0x6e
(gdb)
The following is the code snippet around line 2013.
1864 do
1865 {
1866 WORD32 buf_size;
1867 pu1_buf = (UWORD8*)ps_dec_ip->pv_stream_buffer
1868 + ps_dec_op->u4_num_bytes_consumed;
1869 u4_max_ofst = ps_dec_ip->u4_num_Bytes
1870 - ps_dec_op->u4_num_bytes_consumed;
…
2010 ps_dec->u4_return_to_app = 0;
2011 ret = ih264d_parse_nal_unit(dec_hdl, ps_dec_op,
2012 pu1_bitstrm_buf, buflen);
2013 if(ret != OK)
2014 {
…
…
…
2053 if(ps_dec->u4_num_cores == 3)
2054 {
2055 ih264d_signal_bs_deblk_thread(ps_dec);
2056 }
2057 return (IV_FAIL);
2058
2059 }
As you can see above, line 2011 is inside a loop. Because the value of ret does not meet the condition for exiting the loop, the program takes the next iteration. Continue to run until line 2011, where it parses the next NAL unit in the PoC file.
(gdb)
2011 ret = ih264d_parse_nal_unit(dec_hdl, ps_dec_op,
(gdb) s
ih264d_parse_nal_unit (dec_hdl=0xb608e000, ps_dec_op=0xb5ea2580, pu1_buf=0xb5200000 "e\002Ȉ\001X", u4_length=447)
at external/libavc/decoder/ih264d_parse_headers.c:1027
1027 if(u4_length)
(gdb) info args
dec_hdl = 0xb608e000
ps_dec_op = 0xb5ea2580
pu1_buf = 0xb5200000 "e\002Ȉ\001X"
u4_length = 447
(gdb) x/447b pu1_buf
0xb5200000: 0x65 0x02 0xc8 0x88 0x01 0x58 0x00 0x26
0xb5200008: 0xff 0xf5 0x9c 0x39 0x86 0x3f 0x47 0x0b
0xb5200010: 0xa2 0x47 0xf6 0x5c 0x1d 0x87 0x90 0x50
0xb5200018: 0x0a 0x0d 0x3f 0x88 0xcc 0x32 0x05 0xc4
0xb5200020: 0x53 0xda 0xe5 0x55 0x75 0xca 0x83 0xf6
…
0xb52001a0: 0xe8 0xd2 0x85 0x8a 0xf7 0xe9 0x64 0x3e
0xb52001a8: 0xa4 0x90 0xf5 0x77 0xec 0xd6 0xc8 0x7e
0xb52001b0: 0x44 0xe8 0xb7 0xb7 0x55 0x75 0x86 0xd2
0xb52001b8: 0xf5 0xff 0xe1 0x7b 0x14 0x08 0xce
The buffer pointed to by pu1_buf stores the next NAL unit data. It matches the NAL unit data in the PoC file. Continue to trace how the program handles it.
(gdb)
1068 i_status = ih264d_parse_decode_slice(
(gdb) s
ih264d_parse_decode_slice (u1_is_idr_slice=1 '\001', u1_nal_ref_idc=3 '\003', ps_dec=0xb608f000) at external/libavc/decoder/ih264d_parse_slice.c:1025
1025 WORD32 i4_poc = 0;
(gdb) p/x ps_dec->u1_separate_parse
$32 = 0x1
(gdb) p/x ps_dec->u4_dec_thread_created
$33 = 0x1
(gdb) p/x ps_dec->u4_bs_deblk_thread_created
$34 = 0x1
Because both ps_dec->u4_dec_thread_created and ps_dec->u4_bs_deblk_thread_created are 0x01, this time the program does not call ithread_create on lines 1603 and 1615 to create new threads. Now, continue to run until line 1873 in the function ih264d_parse_decode_slice.
(gdb) until 1873
ih264d_parse_decode_slice (u1_is_idr_slice=, u1_is_idr_slice@entry=1 '\001', u1_nal_ref_idc=u1_nal_ref_idc@entry=0 '\000',
ps_dec=ps_dec@entry=0xb608f000) at external/libavc/decoder/ih264d_parse_slice.c:1873
1873 if(u1_slice_type == I_SLICE)
...
1877 ret = ih264d_parse_islice(ps_dec, u2_first_mb_in_slice);
(gdb) s
ih264d_parse_islice (ps_dec=ps_dec@entry=0xb608f000, u2_first_mb_in_slice=u2_first_mb_in_slice@entry=88) at external/libavc/decoder/ih264d_parse_islice.c:1360
1360 dec_slice_params_t * ps_slice = ps_dec->ps_cur_slice;
(gdb) n
1361 UWORD32 *pu4_bitstrm_buf = ps_dec->ps_bitstrm->pu4_buffer;
Continue to run until line 1446 in function ih264d_parse_islice.
(gdb) until 1446
ih264d_parse_islice (ps_dec=ps_dec@entry=0xb608f000, u2_first_mb_in_slice=u2_first_mb_in_slice@entry=88) at external/libavc/decoder/ih264d_parse_islice.c:1446
1446 if(ps_pps->u1_entropy_coding_mode)
...
(gdb) s
1456 ret = ih264d_parse_islice_data_cabac(ps_dec, ps_slice,
(gdb) s
ih264d_parse_islice_data_cabac (ps_dec=ps_dec@entry=0xb608f000, ps_slice=0xb52c2000, u2_first_mb_in_slice=88) at external/libavc/decoder/ih264d_parse_islice.c:975
975 {
(gdb)
In the function ih264d_parse_islice_data_cabac, the program will call the function ih264d_parse_tfr_nmb on line 1137.
1135 if(ps_dec->u1_separate_parse)
1136 {
1137 ih264d_parse_tfr_nmb(ps_dec, u1_mb_idx, u1_num_mbs,
1138 u1_num_mbs_next, u1_tfr_n_mb, u1_end_of_row); //trace it.
1139 ps_dec->ps_nmb_info += u1_num_mbs;
1140 }
So set the following breakpoint.
b ih264d_thread_parse_decode.c:ih264d_parse_tfr_nmb
The debug info is shown below when the above breakpoint is hit.
(gdb) c
Continuing.
Breakpoint 4, ih264d_parse_tfr_nmb (ps_dec=ps_dec@entry=0xb608f000, u1_mb_idx=u1_mb_idx@entry=0 '\000', u1_num_mbs=u1_num_mbs@entry=22 '\026',
u1_num_mbs_next=0 '\000', u1_tfr_n_mb=1 '\001', u1_end_of_row=1 '\001') at external/libavc/decoder/ih264d_thread_parse_decode.c:68
68 {
(gdb)
The definition of the function ih264d_parse_tfr_nmb is shown below.
62 void ih264d_parse_tfr_nmb(dec_struct_t * ps_dec,
63 UWORD8 u1_mb_idx,
64 UWORD8 u1_num_mbs,
65 UWORD8 u1_num_mbs_next,
66 UWORD8 u1_tfr_n_mb,
67 UWORD8 u1_end_of_row)
68 {
69 WORD32 i, u4_mb_num;
70
71 const UWORD32 u1_mbaff = ps_dec->ps_cur_slice->u1_mbaff_frame_flag;
72 UWORD32 u4_n_mb_start;
73
74 UNUSED(u1_mb_idx);
75 UNUSED(u1_num_mbs_next);
76 if(u1_tfr_n_mb)
77 {
78
79
80 u4_n_mb_start = (ps_dec->u2_cur_mb_addr + 1) - u1_num_mbs;
81
82 // copy into s_frmMbInfo
83
84 u4_mb_num = u4_n_mb_start;
85 u4_mb_num = (ps_dec->u2_cur_mb_addr + 1) - u1_num_mbs; //u4_mb_num is 0x58
86
87 for(i = 0; i < u1_num_mbs; i++) // u1_num_mbs is 0x16
88 {
89 UPDATE_SLICE_NUM_MAP(ps_dec->pu2_slice_num_map, u4_mb_num,
90 ps_dec->u2_cur_slice_num);
91 DATA_SYNC();
92 UPDATE_MB_MAP_MBNUM_BYTE(ps_dec->pu1_dec_mb_map, u4_mb_num);
93
94 u4_mb_num++;
95 }
96
...
164 }
165 }
We can see that line 89 (UPDATE_SLICE_NUM_MAP) updates the buffer pointed to by ps_dec->pu2_slice_num_map with ps_dec->u2_cur_slice_num in a loop. The value of ps_dec->u2_cur_slice_num is 0x0. Let’s go back to see why it is 0x0.
1873 if(u1_slice_type == I_SLICE)
1874 {
1875 ps_dec->ps_cur_pic->u4_pack_slc_typ |= I_SLC_BIT;
1876
1877 ret = ih264d_parse_islice(ps_dec, u2_first_mb_in_slice); //ret is ERROR_EOB_TERMINATE_T
1878
1879 if(ps_dec->i4_pic_type != B_SLICE && ps_dec->i4_pic_type != P_SLICE)
1880 ps_dec->i4_pic_type = I_SLICE;
1881
1882 }
…
1909 if(ret != OK)
1910 return ret; // return here
1911
1912 ps_dec->u2_cur_slice_num++; //didn’t increase ps_dec->u2_cur_slice_num,so ps_dec->u2_cur_slice_num is still 0x0.
1913 /* storing last Mb X and MbY of the slice */
When the program handled the slice in the previous NAL unit, ih264d_parse_islice returned an error code, so ih264d_parse_decode_slice returned on line 1910 before reaching the increment on line 1912. The slice number is not incremented for slices that fail to parse, so ps_dec->u2_cur_slice_num is still 0x0. Line 92 (UPDATE_MB_MAP_MBNUM_BYTE) then updates the buffer pointed to by ps_dec->pu1_dec_mb_map with 0x01.
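The interaction between the early return and the map updates can be reproduced in a toy model. The sketch below is a simplification under stated assumptions, not the libavc implementation: the struct `toy_slice_dec_t`, both functions, and the map size are hypothetical, and the two assignments in the loop are simplified stand-ins for UPDATE_SLICE_NUM_MAP and UPDATE_MB_MAP_MBNUM_BYTE, whose real definitions are not shown in this post.

```c
#include <stdint.h>

#define TOY_OK 0
#define TOY_ERROR_EOB_TERMINATE 0x6E   /* value of ret seen in the GDB session */
#define TOY_MAX_MBS 256                /* arbitrary size for this sketch */

typedef struct {
    uint16_t cur_slice_num;
    uint16_t slice_num_map[TOY_MAX_MBS];
    uint8_t  dec_mb_map[TOY_MAX_MBS];
} toy_slice_dec_t;

/* islice_ret stands in for the result of ih264d_parse_islice(). */
int toy_parse_decode_slice(toy_slice_dec_t *dec, int islice_ret)
{
    int ret = islice_ret;

    if (ret != TOY_OK)
        return ret;            /* early return: the increment below is skipped */
    dec->cur_slice_num++;
    return TOY_OK;
}

/* Mark num_mbs macroblocks starting at first_mb as parsed. */
void toy_parse_tfr_nmb(toy_slice_dec_t *dec, uint32_t first_mb, uint32_t num_mbs)
{
    uint32_t i;

    for (i = 0; i < num_mbs; i++) {
        uint32_t mb = first_mb + i;

        if (mb >= TOY_MAX_MBS)
            break;                                    /* stay inside the toy maps */
        dec->slice_num_map[mb] = dec->cur_slice_num;  /* stand-in for line 89 */
        dec->dec_mb_map[mb] = 1;                      /* stand-in for line 92 */
    }
}
```

Calling `toy_parse_decode_slice` with the error code and then `toy_parse_tfr_nmb(&dec, 0x58, 0x16)` leaves macroblocks 0x58 to 0x6d tagged with slice number 0, the same number that would be used for the first slice of the picture.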
We can next check some variables and the status of threads.
(gdb) p/x ps_dec->u2_cur_slice_num
$37 = 0x0
87 for(i = 0; i < u1_num_mbs; i++)
(gdb)
89 UPDATE_SLICE_NUM_MAP(ps_dec->pu2_slice_num_map, u4_mb_num,
(gdb) p/x u1_num_mbs
$38 = 0x16
(gdb) p/x u4_mb_num
$39 = 0x58
(gdb) x/128b ps_dec->pu1_dec_mb_map
0xb60ff200: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff208: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff210: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff218: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff220: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff228: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff230: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff238: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff240: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff248: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff250: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff258: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0xb60ff260: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0xb60ff268: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0xb60ff270: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0xb60ff278: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
(gdb) p/x ps_dec->pu1_recon_mb_map
$31 = 0xb60ff400
(gdb) x/128b 0xb60ff400
0xb60ff400: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff408: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff410: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff418: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff420: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff428: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff430: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff438: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff440: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff448: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff450: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff458: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0xb60ff460: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0xb60ff468: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0xb60ff470: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0xb60ff478: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
(gdb) info threads
Id Target Id Frame
7 Thread 3163 sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
6 Thread 3160 sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
5 Thread 2534 syscall () at bionic/libc/arch-arm/bionic/syscall.S:44
4 Thread 2532 __ioctl () at bionic/libc/arch-arm/syscalls/__ioctl.S:8
3 Thread 2531 __ioctl () at bionic/libc/arch-arm/syscalls/__ioctl.S:8
* 2 Thread 2533 ih264d_parse_tfr_nmb (ps_dec=ps_dec@entry=0xb608f000,
u1_mb_idx=u1_mb_idx@entry=0 '\000', u1_num_mbs=u1_num_mbs@entry=22 '\026',
u1_num_mbs_next=, u1_tfr_n_mb=1 '\001', u1_end_of_row=1 '\001')
at external/libavc/decoder/ih264d_thread_parse_decode.c:80
1 Thread 2501 syscall () at bionic/libc/arch-arm/bionic/syscall.S:44
We need to monitor the buffers pointed to by ps_dec->pu1_dec_mb_map, ps_dec->pu2_slice_num_map and ps_dec->pu1_recon_mb_map.
Next, we look at why threads 3160 and 3163 are still yielding, and what allows them to continue running.
Thread 3160 yields in the function ih264d_decode_recon_tfr_nmb_thread (in https://android.googlesource.com/platform/external/libavc/+/android-6.0.1_r41/decoder/ih264d_thread_parse_decode.c).
200 WORD32 ih264d_decode_recon_tfr_nmb_thread(dec_struct_t * ps_dec,
201 UWORD8 u1_num_mbs,
202 UWORD8 u1_num_mbs_next,
203 UWORD8 u1_end_of_row)
204 {
205 WORD32 i,j;
...
228 while(1)
229 {
230
231 UWORD32 u4_max_mb = (UWORD32)(ps_dec->i2_dec_thread_mb_y + (1 << u1_mbaff)) * ps_dec->u2_frm_wd_in_mbs - 1;
232 u4_mb_num = u2_cur_dec_mb_num;
233 /*introducing 1 MB delay*/
234 u4_mb_num = MIN(u4_mb_num + u1_num_mbs + 1, u4_max_mb);
235
236 CHECK_MB_MAP_BYTE(u4_mb_num, ps_dec->pu1_dec_mb_map, u4_cond); // u4_cond becomes 0x01 once line 92 of ih264d_parse_tfr_nmb has set the byte at (ps_dec->pu1_dec_mb_map + u4_mb_num) to 0x01
237 if(u4_cond) // if u4_cond is 0x01, then break loop and stop thread yield.
238 {
239 break;
240 }
241 else
242 {
243 if(nop_cnt > 0)
244 {
245 nop_cnt -= 128;
246 NOP(128);
247 }
248 else
249 {
250 if(ps_dec->u4_output_present && (2 == ps_dec->u4_num_cores) &&
251 (ps_dec->u4_fmt_conv_cur_row < ps_dec->s_disp_frame_info.u4_y_ht))
252 {
253 ps_dec->u4_fmt_conv_num_rows =
254 MIN(FMT_CONV_NUM_ROWS,
255 (ps_dec->s_disp_frame_info.u4_y_ht
256 - ps_dec->u4_fmt_conv_cur_row));
257 ih264d_format_convert(ps_dec, &(ps_dec->s_disp_op),
258 ps_dec->u4_fmt_conv_cur_row,
259 ps_dec->u4_fmt_conv_num_rows);
260 ps_dec->u4_fmt_conv_cur_row += ps_dec->u4_fmt_conv_num_rows;
261 }
262 else
263 {
264 nop_cnt = 8*128;
265 ithread_yield();
266 }
267 }
268 }
269 }
270 /* N Mb MC Loop */
...
342 /* N Mb IQ IT RECON Loop */
343 for(j = 0; j < i; j++)
344 {
345 ps_cur_mb_info = &ps_dec->ps_frm_mb_info[ps_dec->cur_dec_mb_num];
346
347 if((ps_dec->u4_num_cores == 2) || !ps_dec->i1_recon_in_thread3_flag)
348 {
349 if(ps_cur_mb_info->u1_mb_type <= u1_skip_th)
350 {
351 ih264d_process_inter_mb(ps_dec, ps_cur_mb_info, j);
352 }
353 else if(ps_cur_mb_info->u1_mb_type != MB_SKIP)
354 {
355 if((u1_ipcm_th + 25) != ps_cur_mb_info->u1_mb_type)
356 {
357 ps_cur_mb_info->u1_mb_type -= (u1_skip_th + 1);
358 ih264d_process_intra_mb(ps_dec, ps_cur_mb_info, j);
359 }
360 }
361
362
363 if(ps_dec->u4_use_intrapred_line_copy == 1)
364 ih264d_copy_intra_pred_line(ps_dec, ps_cur_mb_info, j);
365 }
366
367 DATA_SYNC();
368
369 if(u1_mbaff)
370 {
371 if(u4_update_mbaff)
372 {
373 UWORD32 u4_mb_num = ps_cur_mb_info->u2_mbx
374 + ps_dec->u2_frm_wd_in_mbs
375 * (ps_cur_mb_info->u2_mby >> 1);
376 UPDATE_MB_MAP_MBNUM_BYTE(ps_dec->pu1_recon_mb_map, u4_mb_num); // update the byte at (ps_dec->pu1_recon_mb_map+u4_mb_num)
377 u4_update_mbaff = 0;
378 }
379 else
380 {
381 u4_update_mbaff = 1;
382 }
383 }
384 else
385 {
386 UWORD32 u4_mb_num = ps_cur_mb_info->u2_mbx
387 + ps_dec->u2_frm_wd_in_mbs * ps_cur_mb_info->u2_mby;
388 UPDATE_MB_MAP_MBNUM_BYTE(ps_dec->pu1_recon_mb_map, u4_mb_num); // update the byte at (ps_dec->pu1_recon_mb_map+u4_mb_num)
389 }
390 ps_dec->cur_dec_mb_num++;
391 }
...
...
...
435 return OK;
436 }
From the code above, we can see that CHECK_MB_MAP_BYTE on line 236 checks the buffer pointed to by ps_dec->pu1_dec_mb_map. Line 92 in the function ih264d_parse_tfr_nmb writes 0x01 into that buffer. Once the byte at (ps_dec->pu1_dec_mb_map + u4_mb_num) is set to 0x01, u4_cond becomes 0x01, the loop breaks, and the thread continues to run.
On lines 376 and 388, UPDATE_MB_MAP_MBNUM_BYTE updates the buffer pointed to by ps_dec->pu1_recon_mb_map at offset u4_mb_num.
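The handshake described above can be sketched in isolation. This is a minimal sketch, not the libavc code: the function names mark_mb_done, mb_is_done and wait_index are hypothetical stand-ins for UPDATE_MB_MAP_MBNUM_BYTE, CHECK_MB_MAP_BYTE and the index arithmetic on lines 231 to 234, and the real macros may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Producer side: flag one macroblock as finished (stand-in for
 * UPDATE_MB_MAP_MBNUM_BYTE). */
void mark_mb_done(volatile uint8_t *mb_map, uint32_t mb_num)
{
    assert(mb_map != 0);
    mb_map[mb_num] = 1;
}

/* Consumer side: read the flag (stand-in for CHECK_MB_MAP_BYTE). */
int mb_is_done(const volatile uint8_t *mb_map, uint32_t mb_num)
{
    assert(mb_map != 0);
    return mb_map[mb_num] != 0;
}

/* Index the decode thread polls: one macroblock past the current group,
 * clamped to the last macroblock of the row(s) being decoded. */
uint32_t wait_index(uint32_t cur_dec_mb_num, uint32_t num_mbs,
                    uint32_t thread_mb_y, uint32_t mbaff,
                    uint32_t frm_wd_in_mbs)
{
    uint32_t max_mb = (thread_mb_y + (1u << mbaff)) * frm_wd_in_mbs - 1;
    return SKETCH_MIN(cur_dec_mb_num + num_mbs + 1, max_mb);
}
```

In the decoder, the consumer keeps calling the equivalent of mb_is_done in a loop and yields while it returns 0, which is why the thread appears stuck until the parsing thread writes the byte.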
Thread 3163 yields in the function ih264d_recon_deblk_slice (in https://android.googlesource.com/platform/external/libavc/+/android-6.0.1_r41/decoder/ih264d_thread_compute_bs.c).
378 void ih264d_recon_deblk_slice(dec_struct_t *ps_dec, tfr_ctxt_t *ps_tfr_cxt)
379 {
...
...
...
531 while(1)
532 {
533 UWORD32 u4_cond = 0;
534 UWORD32 u4_mb_num = ps_dec->cur_recon_mb_num + recon_mb_grp - 1;
535
536 /*
537 * Wait for one extra mb of MC, because some chroma IQ-IT functions
538 * sometimes loads the pixels of the right mb and stores with the loaded
539 * values.
540 */
541 u4_mb_num = MIN(u4_mb_num + 1, (ps_dec->i2_recon_thread_mb_y + 1) * i2_pic_wdin_mbs - 1);
542
543 CHECK_MB_MAP_BYTE(u4_mb_num, ps_dec->pu1_recon_mb_map, u4_cond); // checks the byte at (ps_dec->pu1_recon_mb_map+u4_mb_num); if u4_cond is 0x01, the loop breaks and the thread continues to run. ps_dec->pu1_recon_mb_map is updated in the function ih264d_decode_recon_tfr_nmb_thread in thread 3160.
544 if(u4_cond)
545 {
546 break;
547 }
548 else
549 {
550 if(nop_cnt > 0)
551 {
552 nop_cnt -= 128;
553 NOP(128);
554 }
555 else
556 {
557 if(ps_dec->u4_output_present &&
558 (ps_dec->u4_fmt_conv_cur_row < ps_dec->s_disp_frame_info.u4_y_ht))
559 {
560 ps_dec->u4_fmt_conv_num_rows =
561 MIN(FMT_CONV_NUM_ROWS,
562 (ps_dec->s_disp_frame_info.u4_y_ht
563 - ps_dec->u4_fmt_conv_cur_row));
564 ih264d_format_convert(ps_dec, &(ps_dec->s_disp_op),
565 ps_dec->u4_fmt_conv_cur_row,
566 ps_dec->u4_fmt_conv_num_rows);
567 ps_dec->u4_fmt_conv_cur_row += ps_dec->u4_fmt_conv_num_rows;
568 }
569 else
570 {
571 nop_cnt = 8*128;
572 ithread_yield();
573 }
574 }
575 }
576 }
577
578 for(j = 0; j < recon_mb_grp; j++)
579 {
580 GET_SLICE_NUM_MAP(ps_dec->pu2_slice_num_map, ps_dec->cur_recon_mb_num,
581 u2_slice_num); // get slice num in ps_dec->pu2_slice_num_map, the slice num map was updated in line 89 UPDATE_SLICE_NUM_MAP of the function ih264d_parse_tfr_nmb. The u2_slice_num is always 0x0 because it's from ps_dec->u2_cur_slice_num.
582
583 if(u2_slice_num != ps_dec->u2_cur_slice_num_bs) // here ps_dec->u2_cur_slice_num_bs is 0x0 and u2_slice_num is also 0x0, so the loop does not break and the thread continues to run.
584 {
585 u4_slice_end = 1;
586 break;
587 }
588 if(ps_dec->i1_recon_in_thread3_flag)
589 {
590 ps_cur_mb_info = &ps_dec->ps_frm_mb_info[ps_dec->cur_recon_mb_num];
591
592 if(ps_cur_mb_info->u1_mb_type <= u1_skip_th)
593 {
594 ih264d_process_inter_mb(ps_dec, ps_cur_mb_info, j);
595 }
596 else if(ps_cur_mb_info->u1_mb_type != MB_SKIP)
597 {
598 if((u1_ipcm_th + 25) != ps_cur_mb_info->u1_mb_type)
599 {
600 ps_cur_mb_info->u1_mb_type -= (u1_skip_th + 1);
601 ih264d_process_intra_mb(ps_dec, ps_cur_mb_info, j); // trace it.
602 }
603 }
604
605 ih264d_copy_intra_pred_line(ps_dec, ps_cur_mb_info, j);
606 }
607 ps_dec->cur_recon_mb_num++;
608 }
From the code above, we see that CHECK_MB_MAP_BYTE on line 543 checks the byte at (ps_dec->pu1_recon_mb_map+u4_mb_num). If u4_cond is 0x01, the loop breaks and the thread continues to run. The buffer pointed to by ps_dec->pu1_recon_mb_map is updated in the function ih264d_decode_recon_tfr_nmb_thread in thread 3160.
GET_SLICE_NUM_MAP on line 580 reads the slice number from ps_dec->pu2_slice_num_map. The slice number map is updated by UPDATE_SLICE_NUM_MAP on line 89 of the function ih264d_parse_tfr_nmb. u2_slice_num is always 0x0 because it comes from ps_dec->u2_cur_slice_num.
On line 583, ps_dec->u2_cur_slice_num_bs is 0x0 and u2_slice_num is also 0x0, so the loop does not break and the thread continues to run.
From the analysis above, we now know what allows these two threads to continue running.
Next, go back to GDB.
Set the following conditional breakpoint to catch the moment the loop is broken in the function ih264d_decode_recon_tfr_nmb_thread.
b ih264d_thread_parse_decode.c:237 if u4_cond==1
Continue until the conditional breakpoint above is hit, then run the following command.
set scheduler-locking on
Continue to line 388 (UPDATE_MB_MAP_MBNUM_BYTE(ps_dec->pu1_recon_mb_map, u4_mb_num);). In a loop, it sets bytes in the buffer pointed to by ps_dec->pu1_recon_mb_map to 0x01.
In the function ih264d_recon_deblk_slice, line 543 (CHECK_MB_MAP_BYTE(u4_mb_num, ps_dec->pu1_recon_mb_map, u4_cond);) runs in a loop and keeps checking whether the byte at (ps_dec->pu1_recon_mb_map+u4_mb_num) is 0x01. Once u4_cond becomes 0x01, the loop breaks and this thread continues to run.
Next, while still debugging in the function ih264d_decode_recon_tfr_nmb_thread, set the following breakpoint.
b ih264d_process_intra_mb.c:ih264d_process_intra_mb
Run the following command to disable scheduler-locking:
set scheduler-locking off
Continue to run until the above breakpoint is hit. The debug info is shown below:
(gdb) c
Continuing.
Breakpoint 6, ih264d_process_intra_mb (ps_dec=ps_dec@entry=0xb608f000,
ps_cur_mb_info=ps_cur_mb_info@entry=0xb52b44dc, u1_mb_num=u1_mb_num@entry=0 '\000')
at external/libavc/decoder/ih264d_process_intra_mb.c:725
725 {
(gdb) x/16b ps_dec->pv_proc_tu_coeff_data
0xb5140600: 0x01 0x01 0x01 0x01 0xff 0xff 0xff 0xff
0xb5140608: 0x09 0x20 0x00 0x00 0x00 0x00 0x00 0x00
(gdb) p/x ps_dec->pu1_recon_mb_map
$31 = 0xb60ff400
(gdb) x/128b 0xb60ff400
0xb60ff400: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff408: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff410: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff418: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff420: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff428: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff430: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff438: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff440: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff448: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff450: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff458: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff460: 0x01 0x01 0x01 0x01 0x01 0x01 0x01 0x01
0xb60ff468: 0x01 0x01 0x01 0x01 0x01 0x01 0x00 0x00
0xb60ff470: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0xb60ff478: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
ih264d_process_intra_mb is called from the loop that starts on line 578 in the function ih264d_recon_deblk_slice. The debug info below shows the state when the breakpoint is hit for the 11th time.
(gdb) c
Continuing.
Breakpoint 6, ih264d_process_intra_mb (ps_dec=ps_dec@entry=0xb608f000,
ps_cur_mb_info=ps_cur_mb_info@entry=0xb52b46f8, u1_mb_num=u1_mb_num@entry=10 '\n')
at external/libavc/decoder/ih264d_process_intra_mb.c:725
725 {
(gdb) x/16b ps_dec->pv_proc_tu_coeff_data
0xb5140828: 0x00 0x00 0x40 0x03 0x80 0xfc 0x80 0xfc
0xb5140830: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
...
(gdb)
764 UWORD8 *pu1_prev_intra4x4_pred_mode_data = (UWORD8 *)ps_dec->pv_proc_tu_coeff_data; //Pointer to keep track of intra4x4_pred_mode data in pv_proc_tu_coeff_data buffer
(gdb) n
767 u4_num_pmbair = (u1_mb_num >> u1_mbaff);
(gdb) p/x pu1_prev_intra4x4_pred_mode_data
$20 = 0xb5140828
(gdb) x/16b pu1_prev_intra4x4_pred_mode_data
0xb5140828: 0x00 0x00 0x40 0x03 0x80 0xfc 0x80 0xfc
0xb5140830: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
...
(gdb) p/x pu1_prev_intra4x4_pred_mode_flag
$21 = 0xb5140828
(gdb) p/x pu1_rem_intra4x4_pred_mode
$22 = 0xb514082c
(gdb) x/4b pu1_rem_intra4x4_pred_mode
0xb514082c: 0x80 0xfc 0x80 0xfc
(gdb) x/4b pu1_prev_intra4x4_pred_mode_flag
0xb5140828: 0x00 0x00 0x40 0x03
1652 i1_intra_pred = ((i1_left_pred_mode < 0) | (i1_top_pred_mode < 0)) ?
(gdb)
1665 if(!pu1_prev_intra4x4_pred_mode_flag[u1_sub_mb_num])
(gdb)
1669 >= i1_intra_pred);
(gdb) p/x i1_intra_pred
$25 = 0x2
(gdb) n
1668 + (pu1_rem_intra4x4_pred_mode[u1_sub_mb_num]
(gdb)
1667 i1_intra_pred = pu1_rem_intra4x4_pred_mode[u1_sub_mb_num]
(gdb) n
1671 if(i1_intra_pred<0)
(gdb) p/x i1_intra_pred
$26 = 0x81
(gdb) p i1_intra_pred
$27 = -127 '\201'
(gdb) p/x u1_sub_mb_num
$28 = 0x0
Go back to the source code (its line numbers differ from those shown in the GDB session above). Line 1563 starts a loop in which line 1634 calculates i1_intra_pred, giving 0x02. At line 1647, pu1_prev_intra4x4_pred_mode_flag points to the buffer |00 00 40 03| and u1_sub_mb_num is 0x0, so the flag byte is 0x00 and the ‘if’ condition is true. Line 1649 then recalculates i1_intra_pred. pu1_rem_intra4x4_pred_mode points to the buffer |80 fc 80 fc|, so i1_intra_pred = 0x80+(0x80>=0x02)=0x81.
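The recalculation can be reproduced on the bytes from the GDB dump. The following is a minimal sketch under stated assumptions: the function name intra_pred_mode_sum is hypothetical, the layout (four flag bytes followed by four remainder bytes) is taken from the pointers printed above, and the result is kept as a plain int so that the narrowing to a signed char is not yet applied.

```c
#include <assert.h>
#include <stdint.h>

/* coeff_data: 4 "prev_intra4x4_pred_mode_flag" bytes followed by
 * 4 "rem_intra4x4_pred_mode" bytes, as seen in the GDB session.
 * predicted: the mode derived from the neighbours (0x02 in the trace).
 * Returns the mode as a plain int, before it is stored in a signed char. */
int intra_pred_mode_sum(const uint8_t *coeff_data, unsigned sub_mb,
                        int predicted)
{
    const uint8_t *prev_flag = coeff_data;
    const uint8_t *rem_mode = coeff_data + 4;

    assert(coeff_data != 0 && sub_mb < 4);

    if (!prev_flag[sub_mb]) {
        /* Flag is zero: take the remainder mode from the bitstream and
         * add one when it is not below the predicted mode. */
        return rem_mode[sub_mb] + (rem_mode[sub_mb] >= predicted);
    }
    /* Flag is set: keep the predicted mode. */
    return predicted;
}
```

With the dumped bytes 00 00 40 03 80 fc 80 fc, sub-block 0 and a predicted mode of 2, the function returns 0x81. Nothing in this step limits the remainder byte to the range of valid prediction modes.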
The definition of pu1_rem_intra4x4_pred_mode is on line 1359.
1359 UWORD8 *pu1_rem_intra4x4_pred_mode = pu1_prev_intra4x4_pred_mode_data + 4;
The definition of i1_intra_pred is on line 1357.
1357 WORD8 i1_intra_pred;
pu1_rem_intra4x4_pred_mode is an unsigned char pointer, and i1_intra_pred is a signed char. pu1_rem_intra4x4_pred_mode[0] is therefore an unsigned value, and the sum 0x81 (129) does not fit in a signed char, so storing it in i1_intra_pred wraps it to -127. Next, go to line 1681.
1681 ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
au1_ngbr_pels, pu1_luma_rec_buffer, 1,
ui_rec_width,
((u1_is_top_sub_block << 2) | u1_is_left_sub_block));
}
ps_dec->apf_intra_pred_luma_8x8 is an array of function pointers with a length of 0x09. C performs no bounds checking on array indices, so a negative index compiles and runs, but it reads memory located before the start of the array. Here the program loads a bogus function pointer and jumps to an unexpected memory address, which could lie in a code segment or a data segment.
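The effect of the signed store and the unchecked index can be worked through with a small sketch. The helper names below are hypothetical, and the 4-byte pointer size is an assumption based on the 32-bit ARM target (the disassembly scales the index with lsl #2). The range check at the end illustrates the kind of guard that is missing at the call site; it is not Google's fix.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Number of entries in the table, as stated in the text (0x09). */
#define LUMA_8X8_TABLE_LEN 9

/* Keep the low 8 bits of v and reinterpret them as a signed byte, which
 * is what storing the value in a WORD8 does on this target. memcpy is
 * used so the sketch does not rely on implementation-defined casts. */
int8_t store_as_signed_byte(int v)
{
    uint8_t low = (uint8_t)v;
    int8_t out;
    memcpy(&out, &low, sizeof out);
    return out;
}

/* Byte distance from the start of a pointer table to the slot that
 * idx selects; negative means "before the table". */
long table_byte_offset(int8_t idx, long pointer_size)
{
    assert(pointer_size > 0);
    return (long)idx * pointer_size;
}

/* Guard that would reject the index before the indirect call. */
int index_in_range(int idx)
{
    return idx >= 0 && idx < LUMA_8X8_TABLE_LEN;
}
```

For the traced value, store_as_signed_byte(0x81) gives -127, and with 4-byte pointers the selected slot sits 508 bytes before the start of the table, so whatever data happens to be there is used as the call target.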
The following is the debug info.
(gdb) n
1713 ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb) si
0xb5efbc00 1713 ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb) si
0xb5efbc04 1713 ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb) x/10i $pc
=> 0xb5efbc04: movs r2, #1
0xb5efbc06: ldr r0, [sp, #84] ; 0x54
0xb5efbc08: ldr r3, [sp, #32]
0xb5efbc0a: add.w r8, r1, r7, lsl #2
0xb5efbc0e: mov r1, r9
0xb5efbc10: ldr.w r7, [r8, #4]
0xb5efbc14: blx r7
0xb5efbc16: ldr r0, [sp, #60] ; 0x3c
0xb5efbc18: ldrb r3, [r0, #2]
0xb5efbc1a: asrs r3, r6
(gdb) si
0xb5efbc06 1713 ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb)
0xb5efbc08 1713 ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb)
0xb5efbc0a 1713 ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb) si
0xb5efbc0e 1713 ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb) x/10i $pc
=> 0xb5efbc0e: mov r1, r9
0xb5efbc10: ldr.w r7, [r8, #4]
0xb5efbc14: blx r7
0xb5efbc16: ldr r0, [sp, #60] ; 0x3c
0xb5efbc18: ldrb r3, [r0, #2]
0xb5efbc1a: asrs r3, r6
0xb5efbc1c: lsls r0, r3, #31
0xb5efbc1e: bpl.n 0xb5efbc78
0xb5efbc20: ldr r7, [sp, #96] ; 0x60
0xb5efbc22: movs r1, #1
(gdb) si
0xb5efbc10 1713 ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb)
0xb5efbc14 1713 ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb) i r
r0 0xb4c407e8 3032745960
r1 0xb4e2adc0 3034754496
r2 0x1 1
r3 0x1a0 416
r4 0x0 0
r5 0x81 129
r6 0x0 0
r7 0xb60f8fc8 3054473160
r8 0xb60d05dc 3054306780
r9 0xb4e2adc0 3034754496
r10 0xb60cf4a1 3054302369
r11 0x0 0
r12 0xb4e2b77f 3034756991
sp 0xb4c40688 0xb4c40688
lr 0x0 0
pc 0xb5efbc14 0xb5efbc14
cpsr 0x70030 458800
(gdb) si
Cannot access memory at address 0x0
0xb60f8fc8 in ?? ()
(gdb) x/8i $pc
=> 0xb60f8fc8: ldrbtlt r12, [r10], #3104 ; 0xc20
0xb60f8fcc: ldrbtlt r0, [sp], #400 ; 0x190
0xb60f8fd0: muleq r0, r0, r0
0xb60f8fd4: andeq r0, r0, r0
0xb60f8fd8: andeq r0, r0, r0
0xb60f8fdc: lsreq r0, r0, #3
0xb60f8fe0: adcseq r0, r8, r0, ror r1
0xb60f8fe4: andeq r0, r0, r0
We can use the command "cat /proc/[pid]/maps" to check the memory map.
b60c0000-b6100000 rw-p 00000000 00:00 0 [anon:libc_malloc]
The address 0xb60f8fc8 lies between 0xb60c0000 and 0xb6100000, a data region (anon:libc_malloc) mapped rw-p, that is, without execute permission. The attempt to execute code there therefore faults and crashes the process.
In summary, the flow chart below shows how the vulnerability is triggered in a multithreaded environment.
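The same check can be done programmatically. The sketch below parses one line in the /proc/[pid]/maps format and tests whether an address falls inside the region and whether the region is executable. The struct and function names are hypothetical, and only the first two fields of the line are parsed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct map_region {
    unsigned long long start;  /* inclusive */
    unsigned long long end;    /* exclusive */
    char perms[5];             /* e.g. "rw-p" */
};

/* Parse "start-end perms ..." from one maps line. Returns 1 on success. */
int parse_maps_line(const char *line, struct map_region *out)
{
    assert(line != 0 && out != 0);
    int fields = sscanf(line, "%llx-%llx %4s",
                        &out->start, &out->end, out->perms);
    return fields == 3 && strlen(out->perms) == 4;
}

int region_contains(const struct map_region *r, unsigned long long addr)
{
    return addr >= r->start && addr < r->end;
}

/* The third permission character is 'x' for executable mappings. */
int region_is_executable(const struct map_region *r)
{
    return r->perms[2] == 'x';
}
```

Applied to the line above, the region contains 0xb60f8fc8 and is not executable, which matches the manual reading of the map.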
Figure 6. The code execution flow to trigger vulnerability
Finally, let’s look at Google’s patch for this issue, available at https://android.googlesource.com/platform/external/libavc/+/a78887bcffbc2995cf9ed72e0697acf560875e9e. Google fixed the slice number increment for error clips. The changes to ih264d_parse_slice.c are shown below.
diff --git a/decoder/ih264d_parse_slice.c b/decoder/ih264d_parse_slice.c
index 5ff92f8..73bc45d 100644
--- a/decoder/ih264d_parse_slice.c
+++ b/decoder/ih264d_parse_slice.c
@@ -374,6 +374,7 @@
ps_dec->ps_parse_cur_slice = &(ps_dec->ps_dec_slice_buf[0]);
ps_dec->ps_decode_cur_slice = &(ps_dec->ps_dec_slice_buf[0]);
ps_dec->ps_computebs_cur_slice = &(ps_dec->ps_dec_slice_buf[0]);
+ ps_dec->u2_cur_slice_num = 0;
/* Initialize all the HP toolsets to zero */
ps_dec->s_high_profile.u1_scaling_present = 0;
@@ -573,7 +574,6 @@
ps_dec->u2_mv_2mb[1] = 0;
ps_dec->u1_last_pic_not_decoded = 0;
- ps_dec->u2_cur_slice_num = 0;
ps_dec->u2_cur_slice_num_dec_thread = 0;
ps_dec->u2_cur_slice_num_bs = 0;
ps_dec->u4_intra_pred_line_ofst = 0;
@@ -1425,7 +1425,10 @@
}
if (ps_dec->u4_first_slice_in_pic == 0)
+ {
ps_dec->ps_parse_cur_slice++;
+ ps_dec->u2_cur_slice_num++;
+ }
ps_dec->u1_slice_header_done = 0;
@@ -1908,7 +1911,6 @@
if(ret != OK)
return ret;
- ps_dec->u2_cur_slice_num++;
/* storing last Mb X and MbY of the slice */
ps_dec->i2_prev_slice_mbx = ps_dec->u2_mbx;
ps_dec->i2_prev_slice_mby = ps_dec->u2_mby;
From the patch above, we can see that Google changed where the slice number is incremented so that error slices are counted as well.
Combining this with our analysis: after the patch, ps_dec->u2_cur_slice_num is 0x01 once the specially crafted NAL unit has been handled. When the program handles the next NAL unit, it runs the following loop in the function ih264d_parse_tfr_nmb.
87 for(i = 0; i < u1_num_mbs; i++) // u1_num_mbs is 0x16
88 {
89 UPDATE_SLICE_NUM_MAP(ps_dec->pu2_slice_num_map, u4_mb_num,
90 ps_dec->u2_cur_slice_num);
91 DATA_SYNC();
92 UPDATE_MB_MAP_MBNUM_BYTE(ps_dec->pu1_dec_mb_map, u4_mb_num);
93
94 u4_mb_num++;
95 }
In this loop, each entry of the buffer pointed to by ps_dec->pu2_slice_num_map, starting at offset u4_mb_num, is set to 0x01. Next, the program continues to run in the function ih264d_recon_deblk_slice.
578 for(j = 0; j < recon_mb_grp; j++)
579 {
580 GET_SLICE_NUM_MAP(ps_dec->pu2_slice_num_map, ps_dec->cur_recon_mb_num,
581 u2_slice_num);
582
583 if(u2_slice_num != ps_dec->u2_cur_slice_num_bs) // here ps_dec->u2_cur_slice_num_bs is 0x0
584 {
585 u4_slice_end = 1;
586 break;
587 }
588 if(ps_dec->i1_recon_in_thread3_flag)
…
598 if((u1_ipcm_th + 25) != ps_cur_mb_info->u1_mb_type)
599 {
600 ps_cur_mb_info->u1_mb_type -= (u1_skip_th + 1);
601 ih264d_process_intra_mb(ps_dec, ps_cur_mb_info, j);
602 }
603 }
GET_SLICE_NUM_MAP on line 580 now returns 0x01 in u2_slice_num. ps_dec->u2_cur_slice_num_bs is 0x0, so the comparison on line 583 breaks the loop. As a result, the function ih264d_process_intra_mb on line 601 is never called, and the vulnerability is not triggered.
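The difference between the unpatched and patched behaviour can be modelled with a short sketch. It is a simplified model, not the decoder code: fill_slice_num_map and count_processed_mbs are hypothetical names standing in for the loop on lines 87 to 95 and the loop on lines 578 to 608, and the map is treated as a plain array of 16-bit slice numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Parsing side: record the current slice number for a run of
 * macroblocks (models UPDATE_SLICE_NUM_MAP in the loop on line 87). */
void fill_slice_num_map(uint16_t *slice_num_map, unsigned first_mb,
                        unsigned num_mbs, uint16_t cur_slice_num)
{
    assert(slice_num_map != 0);
    for (unsigned i = 0; i < num_mbs; i++)
        slice_num_map[first_mb + i] = cur_slice_num;
}

/* Recon side: count how many macroblocks would reach the intra/inter
 * processing call before the slice-number check ends the loop. */
unsigned count_processed_mbs(const uint16_t *slice_num_map,
                             unsigned first_mb, unsigned recon_mb_grp,
                             uint16_t cur_slice_num_bs)
{
    unsigned processed = 0;

    assert(slice_num_map != 0);
    for (unsigned j = 0; j < recon_mb_grp; j++) {
        if (slice_num_map[first_mb + j] != cur_slice_num_bs)
            break;          /* slice end: stop before processing */
        processed++;
    }
    return processed;
}
```

With 0x16 macroblocks and u2_cur_slice_num_bs equal to 0x0, a map filled with slice number 0x0 (the unpatched case) lets all 22 macroblocks through, while a map filled with 0x01 (the patched case) lets none through.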
Demo
As mentioned in the “Proof of Concept” section, this vulnerability exists in the software-based H.264 decoder. Mediaserver normally prefers the hardware-based H.264 decoder shipped with most Android devices over the vulnerable software-based one. If the hardware-based H.264 decoder is chosen to parse the PoC file, the vulnerability is not triggered. Applications supporting H.264 media, however, could be affected, depending on which decoder they choose.
We developed an Android app that can demonstrate this vulnerability. From the video below, you can see that the Mediaserver crashed and restarted.
Mitigation
All users of Google Android are encouraged to upgrade to the latest version of the software. Additionally, organizations that have deployed Fortinet IPS solutions are already protected from this vulnerability with the signature Google.Android.Mediaserver.Remote.Code.Execution.
Timeline
2016-05-06: Kai Lu of Fortinet's FortiGuard Labs reported this vulnerability to Google
2016-05-31: Google confirmed this vulnerability and set the severity to Critical
2016-08-01: Google released the patch
2016-08-05: Advisory posted by Fortinet's FortiGuard