Debugging Tools for Windows

Annotated Itanium Disassembly

Source Code

The following is the code for the function that will be analyzed.

HRESULT CUserView::CloseView(void)
{
    if (m_fDestroyed) return S_OK;

    BOOL fViewObjectChanged = FALSE;
    ReleaseAndNull(&m_pdtgt);

    if (m_psv) {
        m_psb->EnableModelessSB(FALSE);
        if (m_pws) m_pws->ViewReleased();

        IShellView* psv;

        HWND hwndCapture = GetCapture();
        if (hwndCapture && hwndCapture == m_hwnd) {
            SendMessage(m_hwnd, WM_CANCELMODE, 0, 0);
        }

        m_fHandsOff = TRUE;
        m_fRecursing = TRUE;
        NotifyClients(m_psv, NOTIFY_CLOSING);
        m_fRecursing = FALSE;

        m_psv->UIActivate(SVUIA_DEACTIVATE);

        psv = m_psv;
        m_psv = NULL;

        ReleaseAndNull(&_pctView);

        if (m_pvo) {
            IAdviseSink *pSink;
            if (SUCCEEDED(m_pvo->GetAdvise(NULL, NULL, &pSink)) && pSink) {
                if (pSink == (IAdviseSink *)this)
                    m_pvo->SetAdvise(0, 0, NULL);
                pSink->Release();
            }

            fViewObjectChanged = TRUE;
            ReleaseAndNull(&m_pvo);
        }

        if (psv) {
            psv->SaveViewState();
            psv->DestroyViewWindow();
            psv->Release();
        }

        m_hwndView = NULL;
        m_fHandsOff = FALSE;

        if (m_pcache) {
            GlobalFree(m_pcache);
            m_pcache = NULL;
        }

        m_psb->EnableModelessSB(TRUE);

        CancelPendingActions();
    }

    ReleaseAndNull(&_psf);

    if (fViewObjectChanged)
        NotifyViewClients(DVASPECT_CONTENT, -1);

    if (m_pszTitle) {
        LocalFree(m_pszTitle);
        m_pszTitle = NULL;
    }

    SetRect(&m_rcBounds, 0, 0, 0, 0);
    return S_OK;
}

Assembly Code

This section contains the annotated disassembly.

HRESULT CUserView::CloseView(void)

On entry to a function, the parameters are passed in registers r32 through r39. Remember that for C++ functions, the first parameter is the secret "this" pointer. Therefore, on entry to CUserView::CloseView, the registers are:

r32 = this
rp = return address

Parameters are passed in r32 through r39.

The rp register contains the function return address.

The Itanium separates its registers into several categories. The ones you will see most are r (regular integer registers), b (branch registers, used for branching), and p (predicate registers, which can hold the value TRUE or FALSE).

It so happens that CloseView is a method on ViewState, which is at offset 12 in the underlying object. This pointer will be referred to as "this," although when there is possible confusion with another base class, it will be more carefully specified as "(ViewState*)this".

sample!.CUserView__CloseView:

Notice that the symbol name is preceded by a dot. Recall that function pointers on the Itanium are not pointers to code. Instead, they point to a descriptor block (Intel calls it the PLABEL), which contains information about the function (including the address of its first instruction). The symbol without a leading dot represents the function descriptor. The symbol with a leading dot is the first line of code.

{
717796c0   {          alloc    r39 = ar.pfs, 0ah, 00h, 05h, 00h
717796c4              mov      r40 = pr

The alloc instruction builds a stack frame. In this case, the stack frame has the local region with 10 (0ah) registers and it needs 5 output registers. Recall that in disassembly, the input registers and local registers are combined to form the local region. Because you know that you have only one parameter ("this"), there must be 9 local registers.

Nearly every function begins with an alloc instruction. (A leaf function that needs no registers beyond its inputs can omit it.)

Because you have a total of 10 registers in your local region, and the local region begins with register r32, the local region registers must be r32 through r41, leaving r42 through r46 as the 5 output registers.
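The register arithmetic above can be sketched in C++. This is only an illustration: FrameLayout and layoutFromAlloc are hypothetical names, not part of any debugger API, and the sketch assumes the disassembler's convention of reporting inputs and locals as one combined local region.

```cpp
#include <cassert>

// Register ranges implied by an alloc instruction as this disassembly
// prints it: one size for the combined input+local region, and one for
// the output registers. (Hypothetical helper, for illustration only.)
struct FrameLayout {
    int firstLocal;   // first register of the local region (always r32)
    int lastLocal;    // last register of the local region
    int firstOutput;  // first output register
    int lastOutput;   // last output register
};

inline FrameLayout layoutFromAlloc(int localRegionSize, int outputCount) {
    const int kFirstStacked = 32;  // stacked registers run from r32 to r127
    assert(localRegionSize >= 1 && outputCount >= 1);
    assert(localRegionSize + outputCount <= 96);

    FrameLayout frame;
    frame.firstLocal  = kFirstStacked;
    frame.lastLocal   = kFirstStacked + localRegionSize - 1;
    frame.firstOutput = frame.lastLocal + 1;
    frame.lastOutput  = frame.firstOutput + outputCount - 1;
    return frame;
}
```

Calling layoutFromAlloc(0x0a, 0x05) with the operands from the alloc instruction above yields the local region r32 through r41 and the output registers r42 through r46.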

The second instruction saves the predicate registers (pr) into register r40. This allows the predicate registers to be restored before the function returns.

    if (m_fDestroyed) return S_OK;
717796c8              adds     r31 = 0180h, r32          }// r31 = &this->m_fDestroyed

This is equivalent to "r31 = 0180h + r32". The adds instruction adds a small integer to a register. There is a corresponding addl instruction that adds a large integer, but it can only add to registers r0, gp, r2, and r3.

Arithmetic operations are of the form "op dst = src, src, src".

Before proceeding with the computations, you need to finish the function prologue.

717796d0   {          adds     sp = -32, sp ;;    // local stack space
717796d4              ld8.nta  r3 = [sp]          // stack probe
717796d8              mov      r38 = rp       } // save return address
717796e0   {          or       r41 = gp, r0 ;;    // save gp

Notice that the integer argument to the adds sp instruction is disassembled in decimal rather than hexadecimal. However, the debugger default is to assume hexadecimal for all inputs, so the command ?sp-32 will typically display as "sp-0x32". You can use the n (Set Number Base) command to change the default radix to 10. (The default radix can be overridden with the 0x hexadecimal prefix or the 0n decimal prefix.)

The next adds instruction allocates some space on the stack. It is followed by a load instruction that appears unusual.

The ld instruction loads a register from memory.

The "nta" suffix means "this memory location will not be accessed for a long time" and is an indication to the processor to allow it to better optimize its L2 cache. The only place you will see it in regular code is at the top of a function to perform a stack probe.

The third instruction saves the return address into register r38 so you know where to go when this function is finished.

The fourth instruction says, "r41 = gp | r0".

The r0 register is hard-wired to the value zero.

Therefore, ORing something with zero and storing the result is the same as copying the value.

"or dest = src, r0" copies a register.

The gp register, by convention, always points to the current module's global variables.

Because you are calling functions that might belong to other DLLs, and those functions require the gp register to point to their own global variables, you need to save the gp register so you can restore it and access your global variables after the calls return.

717796e4              ld4      r31 = [r31]       // r31 = m_fDestroyed
717796e8              nop.i    00h ;;            }

The ld4 instruction is another memory fetch, but this time you load only 32 bits (4 bytes) from memory instead of a full 64 bits.

The ld4 instruction zero-extends the loaded value to 64 bits.
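As a rough C++ model of that behavior (the function name ld4 is borrowed from the mnemonic purely for illustration), a 4-byte load that zero-extends looks like this:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Model of "ld4 dst = [addr]": read 4 bytes from memory and widen the
// value to 64 bits by filling the upper 32 bits with zeros.
inline std::uint64_t ld4(const void* addr) {
    assert(addr != nullptr);
    std::uint32_t low = 0;
    std::memcpy(&low, addr, sizeof low);     // fetch exactly 4 bytes
    return static_cast<std::uint64_t>(low);  // zero-extend, never sign-extend
}
```

Note that a 32-bit value of -1 in memory is loaded as 0x00000000FFFFFFFF, not as a 64-bit -1.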

The next instruction is a NOP. The suffix indicates that this is actually an integer NOP. However, this is not important, because the instruction does nothing.

717796f0   {          cmp4.eq  p14, p15 = r0, r31         // m_fDestroyed == 0?
717796f4              nop.f    00h
717796f8     (p15)    br.cond.dpnt.few  $+07c0h ;;       }// N: jump to ReturnSOK

The cmp4.eq instruction compares the bottom 32 bits of the two registers r0 and r31. The result of the comparison is saved into the p14 register, and the opposite result is saved to p15.

The cmp instruction compares two registers. The result of the comparison is saved in the first destination register; the opposite is saved in the second destination register.

The cmp4.eq instruction compares the r31 register (which you just loaded) with r0 (which is zero). If the registers are equal, then p14 is set to TRUE; otherwise, p14 is set to FALSE. The p15 register is set to the opposite of p14 (that is, FALSE or TRUE, respectively).
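The two-destination behavior can be modeled in a few lines of C++. This is a sketch only; PredicatePair and cmp4_eq are illustrative names, not real APIs.

```cpp
#include <cassert>
#include <cstdint>

// The pair of predicate registers written by one compare instruction.
struct PredicatePair {
    bool first;   // receives the result of the comparison
    bool second;  // receives the opposite result
};

// Model of "cmp4.eq p1, p2 = a, b": only the low 32 bits are compared.
inline PredicatePair cmp4_eq(std::uint64_t a, std::uint64_t b) {
    const bool equal =
        static_cast<std::uint32_t>(a) == static_cast<std::uint32_t>(b);
    PredicatePair result;
    result.first = equal;
    result.second = !equal;
    return result;
}

inline void cmp4EqExample() {
    // m_fDestroyed == 0: p14 is TRUE, p15 is FALSE.
    PredicatePair notDestroyed = cmp4_eq(0, 0);
    assert(notDestroyed.first && !notDestroyed.second);

    // m_fDestroyed == 1: p14 is FALSE, p15 is TRUE.
    PredicatePair destroyed = cmp4_eq(0, 1);
    assert(!destroyed.first && destroyed.second);
}
```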

After another NOP, check the result of the comparison.

A parenthesized register in the left margin indicates that the instruction is executed only if that register is TRUE.

In this case, you execute the branch instruction only if the p15 register is TRUE, which happens when the previous comparison is FALSE, because p15 was the second destination of the comparison.

The suffixes on the br.cond instruction are hints to the processor for optimization and do not affect the execution semantics.

The target of the branch instruction is not quite right. The target is disassembled as $+07c0h, but the actual target address is 07b8h bytes away. That is because the branch target is computed relative to the beginning of the instruction bundle (in other words, relative to the preceding open brace). Thus, when you are disassembling through code and trying to follow the flow of the code, be careful how you compute your jump targets.

Jump instructions are relative to the start of the bundle, not to the start of the instruction.
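A small C++ helper makes the computation concrete. It is a sketch under two assumptions: bundles are 16 bytes long and 16-byte aligned, and the debugger shows the three instruction slots at offsets 0, 4, and 8 within the bundle, as in the listing above. The name branchTarget is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Compute the real target of a branch that is disassembled as "$+disp".
// The displacement is relative to the start of the bundle containing the
// branch, so round the instruction address down to a 16-byte boundary.
inline std::uint64_t branchTarget(std::uint64_t instructionAddress,
                                  std::int64_t displacement) {
    const std::uint64_t bundleStart =
        instructionAddress & ~static_cast<std::uint64_t>(0xF);
    return bundleStart + static_cast<std::uint64_t>(displacement);
}

inline void branchTargetExample() {
    // The branch at 717796f8 is shown as $+07c0h; its bundle starts at
    // 717796f0, so the target is 717796f0 + 07c0 = 71779eb0, which is
    // 07b8h bytes past the branch instruction itself.
    assert(branchTarget(0x717796f8, 0x7c0) == 0x71779eb0);
    assert(branchTarget(0x717796f8, 0x7c0) - 0x717796f8 == 0x7b8);
}
```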

Now you can see how that gp register gets used. The compiler has interleaved some instructions for performance, so the next step it does is actually the initial step for calling an imported function.

    ReleaseAndNull(&m_pdtgt);
71779700   {          addl     r31 = -2025856, gp         // r31 = &__imp__ReleaseAndNull
71779704              adds     r42 = 0198h, r32           // r42 = &this->m_pdtgt

Because the gp register points to your global variables, the addl instruction against the gp register is computing the address of a global variable. In this case, it is the import table entry for the ReleaseAndNull function.

As you computed at the beginning of the function, the output registers start at r42, and here you see the first output register being loaded with the address of the m_pdtgt member. This is the first, and in this case, the only parameter of ReleaseAndNull.

    if (m_psv)
71779708              adds     r33 = 0110h, r32          }// r33 = &this->m_psv

The compiler has aggressively started emitting instructions for the line two ahead of where you are.

    BOOL fViewObjectChanged = FALSE;
71779710   {          or       r37 = r0, r0 ;;             // r37 = 0

The compiler used the r37 register to represent the local variable fViewObjectChanged. Because the r0 register is hard-wired to zero, the OR instruction zeroes out the r37 register.

    ReleaseAndNull(&m_pdtgt);
71779714              ld8      r31 = [r31]                // r31 = &ReleaseAndNull
71779718              nop.i    00h ;;                    }
71779720   {          ld8      r30 = [r31], 08h ;;        // r30 = .ReleaseAndNull
71779724              ld8      gp = [r31]                 // gp  = gp.ReleaseAndNull
71779728              mov      b6 = r30, $+0x0           }// b6  = .ReleaseAndNull
71779730   {          nop.m    00h
71779734              nop.f    00h
71779738              br.call.dptk.many  rp = b6 ;;      }// call .ReleaseAndNull

Here is the remainder of the function call. Recall that a function pointer is really a pointer to a descriptor. Thus, after getting the address of the descriptor, you read the actual code address and the gp of the target function.

The following diagram outlines this concept.

__imp__Function    Function            .Function
+--------------+   +---------------+   +-----------------------+
| Function    ---> | code pointer ---> | code...               |
+--------------+   +---------------+   |                       |
                   | gp            |   |                       |
                   +---------------+   |                       |

Remember, a function name without a period in front of it refers to the function descriptor. The function with a period represents the first line of code.

"ld dest = [src], n" is a post-incremented load.

In this case, "ld8 r30 = [r31], 08h" means "load the r30 register from the 8 bytes pointed to by r31, and then increment r31 by 8."

After you load the code address into r30, you move it into the b6 register so you can branch to it. You cannot move from memory directly into a branch register; you have to go through a regular integer register.

Finally, execute a br.call instruction to call the target function. (Again, the other suffixes are optimization hints.)

You will see this pattern repeatedly; it is the template for calling an imported function.

To call an imported function
  1. Load the address of the import table entry.
  2. Load from the import table entry to get the descriptor.
  3. Load the code address from the descriptor, with postincrement +8.
  4. Load the gp register from the descriptor.
  5. Move the code address into a branch register.
  6. Call through the branch register.
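The steps above can be modeled in C++ as a sketch. Everything here is hypothetical scaffolding for illustration: FunctionDescriptor stands in for the descriptor block, the global g_gp stands in for the gp register, and callImported is not a real API. The save and restore of gp around the call mirror the "or r41 = gp, r0" and "or gp = r41, r0" instructions in the listing.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a function descriptor: the code address comes first,
// followed 8 bytes later by the gp value the callee expects.
struct FunctionDescriptor {
    void (*code)();
    std::uint64_t gp;
};

// Stand-in for the gp register (an assumption of this model).
static std::uint64_t g_gp = 0;

// importSlot plays the role of &__imp__Function computed from gp.
inline void callImported(FunctionDescriptor* const* importSlot) {
    assert(importSlot != nullptr && *importSlot != nullptr);
    const std::uint64_t savedGp = g_gp;              // or  r41 = gp, r0
    const FunctionDescriptor* desc = *importSlot;    // ld8 r31 = [r31]
    void (*target)() = desc->code;                   // ld8 r30 = [r31], 08h
    g_gp = desc->gp;                                 // ld8 gp = [r31]
    target();                                        // mov b6 = r30, then br.call rp = b6
    g_gp = savedGp;                                  // or  gp = r41, r0
}
```

In this model the callee observes its own gp value while it runs, and the caller's gp is back in place once the call returns.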

    if (m_psv)
71779740   {          ld8      r29 = [r33]                // r29 = m_psv
71779744              or       gp = r41, r0 ;;            // restore gp
71779748              cmp.eq   p14, p15 = r0, r29        }// m_psv == 0?
71779750   {          nop.m    00h
71779754              nop.f    00h
71779758     (p14)    br.cond.dpnt.few  $+0620h ;;       }// Y: jump to NoPSV

After control returns, dereference the r33 register, which you set up before calling ReleaseAndNull. The processor automatically preserves the local region across calls, so the r33 register still contains the value you set up earlier.

After returning from the imported function call, restore the gp register to point to your own global variables so you can access them once again. (Strictly speaking, this is often unnecessary, because you do not access any global variables before the next call overwrites gp anyway.)

The rest is again a comparison followed by a predicated branch instruction: check whether the value you loaded is zero, and branch if it is.

        m_psb->EnableModelessSB(FALSE);
71779760   {          adds     r36 = 0a0h, r32            // r36 = &this->m_psb
71779764              addl     r43 = 00h, r0              // r43 = 0
71779768              nop.b    00h ;;                    }
71779770   {          ld8      r42 = [r36] ;;             // r42 = this->m_psb
71779774              ld8      r31 = [r42]                // r31 = this->m_psb.vtbl
71779778              nop.i    00h ;;                    }
71779780   {          adds     r31 = 048h, r31 ;;         // r31 = &vtbl.EnableModelessSB
71779784              ld8      r30 = [r31]                // r30 = &EnableModelessSB
71779788              nop.i    00h ;;                    }
71779790   {          ld8      r29 = [r30], 08h ;;        // r29 = .EnableModelessSB
71779794              ld8      gp = [r30]                 // gp  = gp.EnableModelessSB
71779798              mov      b6 = r29, $+0x0           }// b6  = .EnableModelessSB
717797a0   {          nop.m    00h
717797a4              nop.f    00h
717797a8              br.call.dptk.many  rp = b6 ;;      }// call .EnableModelessSB

Now you get to see how a virtual C++ method call gets compiled. As before, registers r42 and r43 receive the parameters to the function.

This time, instead of getting the function pointer from your gp, you load the vtbl out of m_psb, step forward to the method you want to call, then set up for another function call.

The following diagram outlines this process.

object      object vtbl
+-------+   +-----------------+
| vtbl ---> | QueryInterface  |
+-------+   +-----------------+
| ...   |   | AddRef          |
            +-----------------+
            | Release         |
            +-----------------+
            | ...             |   EnableModeless      .EnableModeless
            +-----------------+   +---------------+   +-----------------------+
            | EnableModeless ---> | code pointer ---> | code...               |
            +-----------------+   +---------------+   |                       |
            | ...             |   | gp            |   |                       |
                                  +---------------+   |                       |

The following is a pattern you should recognize.

How a virtual C++ method call is compiled
  1. Load the object into an output register.
  2. Load the vtbl from the object.
  3. Add to the vtbl to get to the function you want to call.
  4. Load from the vtbl to get the descriptor.
  5. Load the code address from the descriptor, with postincrement +8.
  6. Load the gp register from the descriptor.
  7. Move the code address into a branch register.
  8. Call through the branch register.
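The same sequence can be written out by hand in C++ as a rough model. All of the names below (MethodDescriptor, ComObject, g_gpRegister, callVirtual) are hypothetical, and the vtbl is indexed by slot number rather than byte offset; with 8-byte pointers, the 048h offset used for EnableModelessSB corresponds to slot 9.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for a function descriptor reached through a vtbl entry.
struct MethodDescriptor {
    long (*code)(void* self, int arg);  // address of the first instruction
    std::uint64_t gp;                   // gp value the method expects
};

// Stand-in for an object whose first member is its vtbl pointer.
struct ComObject {
    const MethodDescriptor* const* vtbl;
};

// Stand-in for the gp register (an assumption of this model).
static std::uint64_t g_gpRegister = 0;

inline long callVirtual(ComObject* object, std::size_t slot, int arg) {
    assert(object != nullptr && object->vtbl != nullptr);
    const std::uint64_t savedGp = g_gpRegister;
    // Step 1: the object pointer is the first outgoing argument.
    const MethodDescriptor* const* vtbl = object->vtbl;   // step 2
    const MethodDescriptor* const* entry = vtbl + slot;   // step 3
    const MethodDescriptor* descriptor = *entry;          // step 4
    long (*code)(void*, int) = descriptor->code;          // step 5
    g_gpRegister = descriptor->gp;                        // step 6
    const long result = code(object, arg);                // steps 7 and 8
    g_gpRegister = savedGp;                               // restore gp after call
    return result;
}
```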

        if (m_pws) m_pws->ViewReleased();
717797b0   {          adds     r28 = 0228h, r32           // r28 = &this->m_pws
717797b4              or       gp = r41, r0               // restore gp after call
717797b8              nop.b    00h ;;                    }
717797c0   {          ld8      r42 = [r28] ;;             // r42 = this->m_pws
717797c4              cmp.eq   p14, p15 = r0, r42         // equal to zero?
717797c8              nop.i    00h                       }
717797d0   {          nop.m    00h
717797d4              nop.f    00h
717797d8     (p14)    br.cond.dpnt.few  $+050h ;;        }// Y: Jump to NoWS
717797e0   {          ld8      r31 = [r42] ;;             // r31 = this->m_pws.vtbl
717797e4              adds     r31 = 018h, r31            // r31 = &vtbl.ViewReleased
717797e8              nop.i    00h ;;                    }
717797f0   {          ld8      r30 = [r31] ;;             // r30 = &ViewReleased
717797f4              ld8      r29 = [r30], 08h           // r29 = .ViewReleased
717797f8              nop.i    00h ;;                    }
71779800   {          ld8      gp = [r30]                 // gp  = gp.ViewReleased
71779804              mov      b6 = r29, $+0x0            // b6  = .ViewReleased
71779808              br.call.dptk.many  rp = b6 ;;      }// call  .ViewReleased
71779810   {          or       gp = r41, r0               // restore gp after call
71779814              nop.f    00h
71779818              nop.b    00h ;;                    }
NoWS:

You are now familiar enough to read the entire preceding code sequence, which consists of another variable test, conditional jump, and method call.

        HWND hwndCapture = GetCapture();
71779820   {          addl     r31 = -2024992, gp ;;      // r31 = &__imp__GetCapture
71779824              ld8      r30 = [r31]                // r30 = &GetCapture
71779828              nop.i    00h ;;                    }
71779830   {          ld8      r29 = [r30], 08h ;;        // r29 = .GetCapture
71779834              ld8      gp = [r30]                 // gp  = gp.GetCapture
71779838              mov      b6 = r29, $+0x0           }// b6  = .GetCapture
71779840   {          nop.m    00h
71779844              nop.f    00h
71779848              br.call.dptk.many  rp = b6 ;;      }// call .GetCapture
71779850   {          or       gp = r41, r0               // restore gp after call

Again, this is a pattern you already recognize, but this time it is a call to an imported function.

        if (hwndCapture && hwndCapture == m_hwnd) {
71779854              cmp.eq   p14, p15 = r0, ret0        // hwndCapture == NULL?
71779858     (p14)    br.cond.dpnt.few  $+080h ;;        }// Y: Jump to NoCapture
71779860   {          adds     r31 = 0b8h, r32 ;;         // r31 = &this->m_hwnd
71779864              ld8      r42 = [r31]                // r42 = this->m_hwnd
71779868              nop.i    00h ;;                    }
71779870   {          cmp.eq   p15, p14 = ret0, r42       // hwndCapture == this->m_hwnd?
71779874              nop.f    00h
71779878     (p14)    br.cond.dptk.few  $+060h ;;        }// N: Jump to NoCapture

Functions return their value in the ret0 register.

Check if hwndCapture is NULL or not equal to m_hwnd, and jump past the SendMessage call if either condition is met.

            SendMessage(m_hwnd, WM_CANCELMODE, 0, 0);
        }
71779880   {          addl     r31 = -2025920, gp         // r31 = &__imp__SendMessage
71779884              addl     r45 = 00h, r0              // r45 = 0 (lParam)
71779888              addl     r44 = 00h, r0             }// r44 = 0 (wParam)
71779890   {          addl     r43 = 01fh, r0 ;;          // r43 = WM_CANCELMODE (uMsg)
71779894              ld8      r30 = [r31]                // r30 = &SendMessage
71779898              nop.i    00h ;;                    }
717798a0   {          ld8      r29 = [r30], 08h ;;        // r29 = .SendMessage
717798a4              ld8      gp = [r30]                 // gp  = gp.SendMessage
717798a8              mov      b6 = r29, $+0x0           }// b6  = .SendMessage
717798b0   {          nop.m    00h
717798b4              nop.f    00h
717798b8              br.call.dptk.many  rp = b6 ;;      }// call .SendMessage
717798c0   {          or       gp = r41, r0               // restore gp after call
717798c4              nop.f    00h
717798c8              nop.b    00h                       }
NoCapture:

This time, you have to call a function that takes four parameters, so you fill the output registers r42 through r45 with the parameters. Note that you set up r42 ahead of time while you were still deciding whether to call SendMessage.

        m_fHandsOff = TRUE;
        m_fRecursing = TRUE;
        NotifyClients(m_psv, NOTIFY_CLOSING);
717798d0   {          ld8      r43 = [r33]                // r43 = m_psv
717798d4              adds     r42 = 040h, r32            // r42 = (CNotifySource*)this
717798d8              adds     r35 = 0218h, r32          }// r35 = &this->bitfield
717798e0   {          addl     r31 = 08001h, r0 ;;        // r31 = 0x8001
717798e4              addl     r44 = 04h, r0              // r44 = 4
717798e8              nop.i    00h ;;                    }
717798f0   {          ld8      r29 = [r42]                // r29 = (CNotifySource*)this->vtbl
717798f4              ld4      r30 = [r35]                // r30 = bitfield
717798f8              nop.b    00h ;;                    }
71779900   {          or       r31 = r31, r30 ;;          // set the two bits in r30
71779904              st4      [r35] = r31                // store the result

Now you get to see how bitfields get compiled, with some instructions from the upcoming method call interleaved. Recall that you set up the r33 register to point to the m_psv member, so all you have to do is dereference it to fetch the value.

The other instructions prepare for the method call by setting up the output registers (r42 through r44) and loading the vtbl of (CNotifySource*)this.

Meanwhile, the two bits in the bitfield are set by loading the full bitfield, ORing the values, then storing the result.

        NotifyClients(m_psv, NOTIFY_CLOSING);
71779908              adds     r29 = 018h, r29 ;;        }// r29 = &vtbl.NotifyClients
71779910   {          ld8      r30 = [r29] ;;             // r30 = NotifyClients
71779914              ld8      r31 = [r30], 08h           // r31 = .NotifyClients
71779918              nop.i    00h ;;                    }
71779920   {          ld8      gp = [r30]                 // gp  = gp.NotifyClients
71779924              mov      b6 = r31, $+0x0            // b6  = .NotifyClients
71779928              br.call.dptk.many  rp = b6 ;;      }// call .NotifyClients

That is the rest of the method call.

        m_fRecursing = FALSE;
71779930   {          ld4      r28 = [r35]                // r28 = bitfield
71779934              ld8      r42 = [r33]                // r42 = m_psv
71779938              addl     r29 = -32769, r0 ;;       }// r29 = ~0x8000
71779940   {          and      r29 = r29, r28             // r29 = bitfield & ~0x8000
71779944              or       gp = r41, r0               // restore gp after call
71779948              addl     r43 = 00h, r0 ;;          }// r43 = 0 (SVUIA_DEACTIVATE)
71779950   {          st4      [r35] = r29                // store updated bitfield

That is how to clear a bit in a bitfield.
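Both bitfield operations in this walkthrough, setting two bits and later clearing one, can be sketched in C++ as explicit load, modify, and store steps. The masks 0x8001 and ~0x8000 come from the disassembly. Which bit belongs to which member is an inference: because m_fRecursing = FALSE clears 0x8000, the sketch assumes m_fRecursing is 0x8000 and m_fHandsOff is 0x0001. The constant and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bit assignments, inferred from the masks in the disassembly.
const std::uint32_t kHandsOffBit  = 0x0001;
const std::uint32_t kRecursingBit = 0x8000;

// m_fHandsOff = TRUE; m_fRecursing = TRUE;
inline void setHandsOffAndRecursing(std::uint32_t* bitfield) {
    assert(bitfield != nullptr);
    std::uint32_t value = *bitfield;          // ld4 r30 = [r35]
    value |= kHandsOffBit | kRecursingBit;    // or  r31 = r31, r30 (r31 = 0x8001)
    *bitfield = value;                        // st4 [r35] = r31
}

// m_fRecursing = FALSE;
inline void clearRecursing(std::uint32_t* bitfield) {
    assert(bitfield != nullptr);
    std::uint32_t value = *bitfield;          // ld4 r28 = [r35]
    value &= ~kRecursingBit;                  // and r29 = r29, r28 (r29 = ~0x8000)
    *bitfield = value;                        // st4 [r35] = r29
}
```

Neighboring bits in the same 32-bit word are left untouched in both cases, which is why the compiler has to load the whole word before modifying it.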

        m_psv->UIActivate(SVUIA_DEACTIVATE);
71779954              ld8      r31 = [r42]                // r31 = m_psv.vtbl
71779958              nop.b    00h ;;                    }
71779960   {          adds     r31 = 038h, r31 ;;         // r31 = &vtbl.UIActivate
71779964              ld8      r30 = [r31]                // r30 = &UIActivate
71779968              nop.i    00h ;;                    }
71779970   {          ld8      r29 = [r30], 08h ;;        // r29 = .UIActivate
71779974              ld8      gp = [r30]                 // gp  = gp.UIActivate
71779978              mov      b6 = r29, $+0x0           }// b6  = .UIActivate
71779980   {          nop.m    00h
71779984              nop.f    00h
71779988              br.call.dptk.many  rp = b6 ;;      }// call .UIActivate

These method calls are becoming routine.

        psv = m_psv;
        m_psv = NULL;
        ReleaseAndNull(&_pctView);
71779990   {          ld8      r34 = [r33]                // r34 (psv) = m_psv
71779994              or       gp = r41, r0               // restore gp after call
71779998              adds     r42 = 0100h, r32          }// r42 = &this->_bbt._pctView
717799a0   {          st8      [r33] = r0 ;;              // m_psv = 0
717799a4              addl     r31 = -2025856, gp         // r31 = &__imp__ReleaseAndNull
717799a8              nop.i    00h ;;                    }
717799b0   {          ld8      r30 = [r31] ;;             // r30 = &ReleaseAndNull
717799b4              ld8      r29 = [r30], 08h           // r29 = .ReleaseAndNull
717799b8              nop.i    00h ;;                    }
717799c0   {          ld8      gp = [r30]                 // gp  = gp.ReleaseAndNull
717799c4              mov      b6 = r29, $+0x0            // b6  = .ReleaseAndNull
717799c8              br.call.dptk.many  rp = b6 ;;      }// call .ReleaseAndNull

Another imported function call, with the psv and m_psv assignments interleaved in it.

        if (m_pvo) {
717799d0   {          adds     r33 = 0178h, r32           // r33 = &this->m_pvo
717799d4              or       gp = r41, r0               // restore gp after call
717799d8              nop.b    00h ;;                    }
717799e0   {          ld8      r42 = [r33] ;;             // r42 = this->m_pvo
717799e4              cmp.eq   p14, p15 = r0, r42         // m_pvo == NULL?
717799e8              nop.i    00h                       }
717799f0   {          nop.m    00h
717799f4              nop.f    00h
717799f8     (p14)    br.cond.dpnt.few  $+01a0h ;;       }// Y: Jump to NoPVO

This simply checks a variable.

            IAdviseSink *pSink;
            if (SUCCEEDED(m_pvo->GetAdvise(NULL, NULL, &pSink)) ...
71779a00   {          ld8      r31 = [r42]                // r31 = this->m_pvo->vtbl
71779a04              adds     r45 = 010h, sp             // r45 = &pSink
71779a08              addl     r44 = 00h, r0             }// r44 = NULL
71779a10   {          addl     r43 = 00h, r0 ;;           // r43 = NULL
71779a14              adds     r37 = 010h, sp             // r37 = &pSink
71779a18              nop.i    00h ;;                    }
71779a20   {          adds     r31 = 040h, r31 ;;         // r31 = &vtbl.GetAdvise
71779a24              ld8      r30 = [r31]                // r30 = &GetAdvise
71779a28              nop.i    00h ;;                    }
71779a30   {          ld8      r29 = [r30], 08h ;;        // r29 = .GetAdvise
71779a34              ld8      gp = [r30]                 // gp  = gp.GetAdvise
71779a38              mov      b6 = r29, $+0x0           }// b6  = .GetAdvise
71779a40   {          nop.m    00h
71779a44              nop.f    00h
71779a48              br.call.dptk.many  rp = b6 ;;      }// call .GetAdvise

Now call the GetAdvise method, and check the return value.

            if (SUCCEEDED(...) && pSink) {
71779a50   {          ld8      r42 = [r37]                // r42 = pSink
71779a54              cmp4.eq  p14, p15 = 01h, r0         // p14 = FALSE, p15 = TRUE
71779a58              or       gp = r41, r0 ;;           }// restore gp after call
71779a60   {          cmp4.gt.or.andcm  p14, p15 = r0, ret0 // if (0 > ret0) { p14 = TRUE; p15 = FALSE }
71779a64              cmp.eq.or.andcm  p14, p15 = r0, r42 // if (0 == pSink) { p14 = TRUE; p15 = FALSE }
71779a68     (p14)    br.cond.dpnt.few  $+0f0h ;;        }// if FAILED or NULL, jump to NoSink

Now you see the unusual conditional instructions in action.

The first comparison instruction seems pointless in that it compares one against zero. It does this to initialize the predicate registers p14 to FALSE and p15 to TRUE.

The next two comparison instructions execute in parallel. The first one checks if 0 > ret0; that is, if the return value is negative. The second one checks if pSink is NULL. The .or.andcm suffixes indicate that the result is ORed into the first destination, and that its complement is ANDed (andcm) into the second destination.

After doing the combination testing, you jump if p14 is TRUE, which happens if one of the combination tests was true.

There are two interesting points to note about this sequence. First, the two combination comparisons were executed in parallel. (Notice that they execute as part of the same instruction group.) Second, you only executed a single conditional jump. Furthermore, that conditional jump was part of the same instruction group, so it all executes in a single cycle.

A traditional CPU would have had to test the return value, jump conditionally, then test pSink, and jump conditionally again. Those four operations are dependent on each other, so it would take four cycles to perform the combination test instead of the one cycle that the Itanium required.
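The predicate logic of that sequence can be modeled in C++ as follows. This is a sketch only: Predicates and failedOrNull are illustrative names, and the two updates run one after the other here, whereas the hardware performs them in the same instruction group.

```cpp
#include <cassert>
#include <cstdint>

struct Predicates {
    bool p14;  // TRUE means "skip the block" (FAILED or pSink == NULL)
    bool p15;  // the complementary predicate
};

// Model of an ".or.andcm" compare: OR the result into the first
// predicate and AND its complement into the second.
inline void orAndcm(Predicates* p, bool result) {
    p->p14 = p->p14 || result;
    p->p15 = p->p15 && !result;
}

inline Predicates failedOrNull(std::int32_t hr, const void* pSink) {
    Predicates p;
    p.p14 = false;                    // cmp4.eq p14, p15 = 01h, r0
    p.p15 = true;
    orAndcm(&p, 0 > hr);              // cmp4.gt.or.andcm p14, p15 = r0, ret0
    orAndcm(&p, pSink == nullptr);    // cmp.eq.or.andcm  p14, p15 = r0, r42
    return p;
}

inline void failedOrNullExample() {
    int sink = 0;
    assert(!failedOrNull(0, &sink).p14);    // succeeded, non-NULL: fall through
    assert(failedOrNull(-1, &sink).p14);    // negative HRESULT: branch taken
    assert(failedOrNull(0, nullptr).p14);   // NULL pSink: branch taken
}
```

Because each compare can only turn p14 on and p15 off, the order in which the two updates are applied does not matter, which is what allows them to execute in parallel.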

                if (pSink == (IAdviseSink *)this)
71779a70   {          adds     r31 = -40, r32 ;;          // r31 = (CUserView*)this
71779a74              cmp.eq   p14, p15 = r0, r31         // r31 == NULL?
71779a78              adds     r31 = 028h, r32           }// r31 = (IAdviseSink*)this
71779a80   {          nop.m    00h
71779a84              nop.f    00h
71779a88     (p15)    br.cond.dpnt.few  $+020h ;;        }// N: Jump to NotNULL
71779a90   {          or       r31 = r0, r0               // r31 = NULL
71779a94              nop.f    00h
71779a98              nop.b    00h ;;                    }
NotNULL:

This is a quirk of the C++ specification, namely that if p is a NULL pointer, then (T*)p must be equal to NULL for all types T. This means that changing from one base class to another requires the compiler to stick in special checks for NULL.
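A short C++ example illustrates the rule. The classes BaseA, BaseB, and Derived are hypothetical and unrelated to the classes in this walkthrough; toBaseBByHand shows roughly what the compiler must emit, with the offset passed in as a parameter because the real value is known only to the compiler.

```cpp
#include <cassert>
#include <cstddef>

struct BaseA { virtual ~BaseA() {} int a; };
struct BaseB { virtual ~BaseB() {} int b; };
struct Derived : BaseA, BaseB {};

// Ordinary conversion: the compiler adjusts non-null pointers and
// preserves null pointers.
inline BaseB* toBaseB(Derived* d) {
    return d;
}

// Hand-written equivalent: test for NULL first, then add the offset.
inline BaseB* toBaseBByHand(Derived* d, std::ptrdiff_t offsetOfBaseB) {
    if (d == nullptr) {
        return nullptr;  // without this check, NULL would become NULL + offset
    }
    return reinterpret_cast<BaseB*>(reinterpret_cast<char*>(d) + offsetOfBaseB);
}

inline void nullPreservingCastExample() {
    Derived object;
    Derived* none = nullptr;
    const std::ptrdiff_t offset =
        reinterpret_cast<char*>(static_cast<BaseB*>(&object)) -
        reinterpret_cast<char*>(&object);
    assert(toBaseB(none) == nullptr);
    assert(toBaseBByHand(none, offset) == nullptr);
    assert(toBaseBByHand(&object, offset) == toBaseB(&object));
}
```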

At the end of the preceding sequence, the r31 register contains the result of the cast (IAdviseSink*).

                if (pSink == (IAdviseSink *)this)
                    m_pvo->SetAdvise(0, 0, NULL);
71779aa0   {          cmp.eq   p15, p14 = r42, r31        // r31 == pSink?
71779aa4              nop.f    00h
71779aa8     (p14)    br.cond.dptk.few  $+070h ;;        }// N: Jump to NotSameSink
71779ab0   {          ld8      r42 = [r33]                // r42 = this->m_pvo
71779ab4              addl     r45 = 00h, r0              // r45 = 0
71779ab8              addl     r44 = 00h, r0             }// r44 = 0
71779ac0   {          addl     r43 = 00h, r0 ;;           // r43 = NULL
71779ac4              ld8      r31 = [r42]                // r31 = this->m_pvo->vtbl
71779ac8              nop.i    00h ;;                    }
71779ad0   {          adds     r31 = 038h, r31 ;;         // r31 = &vtbl.SetAdvise
71779ad4              ld8      r30 = [r31]                // r30 = SetAdvise
71779ad8              nop.i    00h ;;                    }
71779ae0   {          ld8      r29 = [r30], 08h ;;        // r29 = .SetAdvise
71779ae4              ld8      gp = [r30]                 // gp  = gp.SetAdvise
71779ae8              mov      b6 = r29, $+0x0           }// b6  = .SetAdvise
71779af0   {          nop.m    00h
71779af4              nop.f    00h
71779af8              br.call.dptk.many  rp = b6 ;;      }// call .SetAdvise
71779b00   {          ld8      r42 = [r37]                // r42 = pSink (restore register variable)
71779b04              or       gp = r41, r0               // restore gp after call
71779b08              nop.b    00h ;;                    }
NotSameSink:

Notice that the compiler had to reload the r42 register after the call, because r42 is an output register, and output registers can be modified across a call. (That is not important in this case, because you also explicitly destroyed the r42 register.)

                pSink->Release();
            }
71779b10   {          ld8      r31 = [r42] ;;             // r31 = pSink->vtbl
71779b14              adds     r31 = 010h, r31            // r31 = &vtbl.Release
71779b18              nop.i    00h ;;                    }
71779b20   {          ld8      r30 = [r31] ;;             // r30 = &Release
71779b24              ld8      r29 = [r30], 08h           // r29 = .Release
71779b28              nop.i    00h ;;                    }
71779b30   {          ld8      gp = [r30]                 // gp  = gp.Release
71779b34              mov      b6 = r29, $+0x0            // b6  = .Release
71779b38              br.call.dptk.many  rp = b6 ;;      }// call .Release
71779b40   {          or       gp = r41, r0               // restore gp after call
71779b44              nop.f    00h
71779b48              nop.b    00h ;;                    }
NoSink:

Another imported function call:

            fViewObjectChanged = TRUE;
            ReleaseAndNull(&m_pvo);
        }
71779b50   {          addl     r31 = -2025856, gp         // r31 = &__imp__ReleaseAndNull
71779b54              or       r42 = r33, r0              // r42 = r33 = &this->m_pvo
71779b58              addl     r37 = 01h, r0 ;;          }// r37 (fViewObjectChanged) = 1 = TRUE
71779b60   {          ld8      r30 = [r31] ;;             // r30 = &ReleaseAndNull
71779b64              ld8      r29 = [r30], 08h           // r29 = .ReleaseAndNull
71779b68              nop.i    00h ;;                    }
71779b70   {          ld8      gp = [r30]                 // gp  = gp.ReleaseAndNull
71779b74              mov      b6 = r29, $+0x0            // b6  = .ReleaseAndNull
71779b78              br.call.dptk.many  rp = b6 ;;      }// call .ReleaseAndNull
71779b80   {          or       gp = r41, r0               // restore gp after call
71779b84              nop.f    00h
71779b88              nop.b    00h                       }

Recall that the fViewObjectChanged local variable is being kept in the r37 register.

        if (psv) {
            psv->SaveViewState();
            psv->DestroyViewWindow();
            psv->Release();
        }
71779b90   {          cmp.eq   p14, p15 = r0, r34         // r34 (psv) == NULL?
71779b94              nop.f    00h
71779b98     (p14)    br.cond.dptk.few  $+0d0h ;;        }// Y: Jump to NoPSV2
71779ba0   {          ld8      r31 = [r34]                // r31 = psv->vtbl
71779ba4              or       r42 = r34, r0 ;;           // r42 = psv
71779ba8              adds     r31 = 068h, r31 ;;        }// r31 = &vtbl.SaveViewState
71779bb0   {          ld8      r30 = [r31] ;;             // r30 = &SaveViewState
71779bb4              ld8      r29 = [r30], 08h           // r29 = .SaveViewState
71779bb8              nop.i    00h ;;                    }
71779bc0   {          ld8      gp = [r30]                 // gp  = gp.SaveViewState
71779bc4              mov      b6 = r29, $+0x0            // b6  = .SaveViewState
71779bc8              br.call.dptk.many  rp = b6 ;;      }// call .SaveViewState

71779bd0   {          ld8      r28 = [r34]                // r28 = psv->vtbl
71779bd4              or       gp = r41, r0               // restore gp after call (pointless)
71779bd8              or       r42 = r34, r0 ;;          }// r42 = psv
71779be0   {          adds     r31 = 050h, r28 ;;         // r31 = &vtbl.DestroyViewWindow
71779be4              ld8      r29 = [r31]                // r29 = &DestroyViewWindow
71779be8              nop.i    00h ;;                    }
71779bf0   {          ld8      r27 = [r29], 08h ;;        // r27 = .DestroyViewWindow
71779bf4              ld8      gp = [r29]                 // gp  = gp.DestroyViewWindow
71779bf8              mov      b6 = r27, $+0x0           }// b6  = .DestroyViewWindow
71779c00   {          nop.m    00h
71779c04              nop.f    00h
71779c08              br.call.dptk.many  rp = b6 ;;      }// call .DestroyViewWindow

71779c10   {          ld8      r30 = [r34]                // r30 = psv->vtbl
71779c14              or       gp = r41, r0               // restore gp after call (pointless)
71779c18              or       r42 = r34, r0 ;;          }// r42 = psv
71779c20   {          adds     r31 = 010h, r30 ;;         // r31 = &vtbl.Release
71779c24              ld8      r28 = [r31]                // r28 = &Release
71779c28              nop.i    00h ;;                    }
71779c30   {          ld8      r29 = [r28], 08h ;;        // r29 = .Release
71779c34              ld8      gp = [r28]                 // gp  = gp.Release
71779c38              mov      b6 = r29, $+0x0           }// b6  = .Release
71779c40   {          nop.m    00h
71779c44              nop.f    00h
71779c48              br.call.dptk.many  rp = b6 ;;      }// call .Release
71779c50   {          or       gp = r41, r0               // restore gp after call
71779c54              nop.f    00h
71779c58              nop.b    00h                       }
NoPSV2:

The restores of gp after the first two calls are pointless, because the next call is going to overwrite gp anyway. This is a genuine compiler shortcoming (a missed optimization rather than incorrect code): its optimizer should have noticed that gp is not read before it is rewritten.
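The following C++ sketch models why the intermediate restores are dead. It is only a model under stated assumptions: FunctionDescriptor, Registers, and the helper functions are hypothetical names, and the sketch tracks nothing except writes to gp. It relies on the fact, visible in the listing, that each virtual call loads the callee's gp from a descriptor and that the vtable address comes from the object rather than from gp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a function descriptor: entry point plus callee gp.
struct FunctionDescriptor {
    std::uintptr_t entry;
    std::uintptr_t gp;
};

// Hypothetical model of the registers that matter here.
struct Registers {
    std::uintptr_t gp;   // current global pointer
    std::uintptr_t r41;  // caller's gp, saved earlier in the function
    int gpWrites;        // number of times gp was written
};

// Models "ld8 gp = [r30]" before br.call: gp takes the callee's value.
inline void CallThroughDescriptor(Registers& regs, const FunctionDescriptor& target) {
    regs.gp = target.gp;
    ++regs.gpWrites;
}

// Models "or gp = r41, r0".
inline void RestoreGp(Registers& regs) {
    regs.gp = regs.r41;
    ++regs.gpWrites;
}

// Calls every target in order. With restoreAfterEach set, gp is restored
// after each call, as in the listing; otherwise only after the last call.
inline Registers CallSequence(Registers regs, const FunctionDescriptor* targets,
                              int count, bool restoreAfterEach) {
    for (int i = 0; i < count; ++i) {
        CallThroughDescriptor(regs, targets[i]);
        if (restoreAfterEach || i == count - 1) {
            RestoreGp(regs);
        }
    }
    return regs;
}
```

Both variants finish with gp equal to the saved value in r41; the variant that restores only once simply performs fewer writes. The restore before an imported function call is a different matter, because the import address is computed relative to gp.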

        m_hwndView = NULL;
        m_fHandsOff = FALSE;
71779c60   {          ld4      r30 = [r35]                // r30 = this->bitfield
71779c64              adds     r31 = 0120h, r32           // r31 = &m_hwndView
71779c68              adds     r33 = 0108h, r32 ;;       }// r33 = &m_pcache
71779c70   {          ld8      r42 = [r33]                // r42 = m_pcache
71779c74              st4      [r31] = r0, 04h            // [r31] = NULL, r31 += 4
71779c78              and      r30 = -2, r30 ;;          }// r30 = r30 & ~1 (clear bit 0)
71779c80   {          st4      [r31] = r0                 // [r31] = NULL (other half)
71779c84              st4      [r35] = r30                // bitfield = r30

The compiler is making additional errors. It splits the store to m_hwndView into two parts, writing the NULL one DWORD at a time instead of as a single 64-bit store. The compiler thinks that the m_hwndView member is unaligned, so it has to split the store. As a result, you now get to see a post-incremented store.

"st [dest] = src, n" is a post-incremented store: it writes src to the address held in dest and then adds n to dest.

The "and r30 = -2, r30" clears the bottom bit.
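As a rough C++ equivalent of these two idioms, consider the sketch below. The function names are hypothetical, and the use of memcpy to express an alignment-agnostic store is an assumption of the sketch, not something taken from the original source.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Clears bit 0, like "and r30 = -2, r30": -2 is all ones except bit 0.
inline std::uint32_t ClearBottomBit(std::uint32_t bits) {
    return bits & static_cast<std::uint32_t>(-2);
}

// Zeroes a 64-bit field as two 32-bit stores, advancing the destination
// after the first store the way "st4 [r31] = r0, 04h" does.
inline void ZeroAsTwoDwords(unsigned char* dest) {
    const std::uint32_t zero = 0;
    std::memcpy(dest, &zero, sizeof zero);  // st4 [dest] = 0 ...
    dest += 4;                              // ... then dest += 4
    std::memcpy(dest, &zero, sizeof zero);  // st4 [dest] = 0 (other half)
}
```

The net effect of ZeroAsTwoDwords is the same as one 64-bit store of zero; the split matters only when the destination might not be 8-byte aligned.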

        if (m_pcache) {
71779c88              cmp.eq   p14, p15 = r0, r42        }// m_pcache == NULL?
71779c90   {          nop.b    00h
71779c94              nop.b    00h
71779c98     (p14)    br.cond.dpnt.few  $+090h ;;        }// Y: Jump to NoCache

The r42 register was set up while you were clearing bits in the previous block of code.

            GlobalFree(m_pcache);
71779ca0   {          addl     r31 = -2029288, gp ;;      // r31 = &__imp__GlobalFree
71779ca4              ld8      r30 = [r31]                // r30 = &GlobalFree
71779ca8              nop.i    00h ;;                    }
71779cb0   {          ld8      r29 = [r30], 08h ;;        // r29 = .GlobalFree
71779cb4              ld8      gp = [r30]                 // gp  = gp.GlobalFree
71779cb8              mov      b6 = r29, $+0x0           }// b6  = .GlobalFree
71779cc0   {          nop.m    00h
71779cc4              nop.f    00h
71779cc8              br.call.dptk.many  rp = b6 ;;      }// call .GlobalFree

Recall that r42 already contains the value of m_pcache.

            m_pcache = NULL;
        }
71779cd0   {          st1      [r33] = r0, 01h            // [r33] = 0, r33++
71779cd4              or       gp = r41, r0               // restore gp after call
71779cd8              nop.b    00h ;;                    }
71779ce0   {          st1      [r33] = r0, 01h ;;         // [r33] = 0, r33++
71779ce4              st1      [r33] = r0, 01h            // [r33] = 0, r33++
71779ce8              nop.i    00h ;;                    }
71779cf0   {          st1      [r33] = r0, 01h ;;         // [r33] = 0, r33++
71779cf4              st1      [r33] = r0, 01h            // [r33] = 0, r33++
71779cf8              nop.i    00h ;;                    }
71779d00   {          st1      [r33] = r0, 01h ;;         // [r33] = 0, r33++
71779d04              st1      [r33] = r0, 01h            // [r33] = 0, r33++
71779d08              nop.i    00h ;;                    }
71779d10   {          st1      [r33] = r0                 // [r33] = 0
71779d14              nop.f    00h
71779d18              nop.b    00h                       }
NoCache:

More compiler errors. The compiler assumes that m_pcache is unaligned, so it zeroes the member by writing 8 bytes, one at a time. Yet it read the same member with a single 8-byte load at address 71779c70, which assumes that the address is aligned.

        m_psb->EnableModelessSB(TRUE);
71779d20   {          ld8      r42 = [r36]                // r42 = this->m_psb
71779d24              addl     r43 = 01h, r0              // r43 = TRUE
71779d28              nop.b    00h ;;                    }
71779d30   {          ld8      r31 = [r42] ;;             // r31 = this->m_psb->vtbl
71779d34              adds     r31 = 048h, r31            // r31 = &vtbl.EnableModeless
71779d38              nop.i    00h ;;                    }
71779d40   {          ld8      r30 = [r31] ;;             // r30 = &EnableModeless
71779d44              ld8      r29 = [r30], 08h           // r29 = .EnableModeless
71779d48              nop.i    00h ;;                    }
71779d50   {          ld8      gp = [r30]                 // gp  = gp.EnableModeless
71779d54              mov      b6 = r29, $+0x0            // b6  = .EnableModeless
71779d58              br.call.dptk.many  rp = b6 ;;      }// call .EnableModeless
71779d60   {          or       gp = r41, r0               // restore gp after call
        CancelPendingActions();
    }
71779d64              adds     r42 = -40, r32            // r42 = (CUserView*)this
71779d68              br.call.sptk.many  rp = $-15424 ;;  } // call .CancelPendingActions
NoPSV:

This is a direct call rather than an imported function or a virtual method call. Notice that you did not have to set up the gp register: because the call is direct, you know that CancelPendingActions resides in your own DLL and, therefore, its gp is equal to your gp. It also means that you do not need to restore gp after the call.

    ReleaseAndNull(&_psf);
71779d70   {          addl     r31 = -2025856, gp         // r31 = &__imp__ReleaseAndNull
71779d74              adds     r42 = 0118h, r32           // r42 = &this->_psf
71779d78              nop.b    00h ;;                    }
71779d80   {          ld8      r30 = [r31] ;;             // r30 = &ReleaseAndNull
71779d84              ld8      r29 = [r30], 08h           // r29 = .ReleaseAndNull
71779d88              nop.i    00h ;;                    }
71779d90   {          ld8      gp = [r30]                 // gp  = gp.ReleaseAndNull
71779d94              mov      b6 = r29, $+0x0            // b6  = .ReleaseAndNull
71779d98              br.call.dptk.many  rp = b6 ;;      }// call .ReleaseAndNull
71779da0   {          or       gp = r41, r0               // restore gp after call
    if (fViewObjectChanged)
        NotifyViewClients(DVASPECT_CONTENT, -1);
71779da4              cmp4.eq  p14, p15 = r0, r37         // r37 (fViewObjectChanged) == FALSE?
71779da8     (p14)    br.cond.dpnt.few  $+060h ;;        }// Y: Jump to NoChange
71779db0   {          adds     r42 = -40, r32             // r42 = (CBaseBrowser*)this
71779db4              addl     r44 = -1, r0               // r44 = -1
71779db8              addl     r43 = 01h, r0 ;;          }// r43 = 1 = DVASPECT_CONTENT
71779dc0   {          ld8      r31 = [r42] ;;             // r31 = this->vtbl
71779dc4              adds     r31 = 048h, r31            // r31 = &vtbl.NotifyViewClients
71779dc8              nop.i    00h ;;                    }
71779dd0   {          ld8      r30 = [r31] ;;             // r30 = &NotifyViewClients
71779dd4              ld8      r29 = [r30], 08h           // r29 = .NotifyViewClients
71779dd8              nop.i    00h ;;                    }
71779de0   {          ld8      gp = [r30]                 // gp  = gp.NotifyViewClients
71779de4              mov      b6 = r29, $+0x0            // b6  = .NotifyViewClients
71779de8              br.call.dptk.many  rp = b6 ;;      }// call .NotifyViewClients
71779df0   {          or       gp = r41, r0               // restore gp after call
71779df4              nop.f    00h
71779df8              nop.b    00h                       }
NoChange:
    if (m_pszTitle) {
        LocalFree(m_pszTitle);
        m_pszTitle = NULL;
    }
71779e00   {          adds     r33 = 0128h, r32 ;;        // r33 = &this->m_pszTitle
71779e04              ld8      r42 = [r33]                // r42 = this->m_pszTitle
71779e08              nop.i    00h ;;                    }
71779e10   {          cmp.eq   p14, p15 = r0, r42         // r42 == NULL?
71779e14              nop.f    00h
71779e18     (p14)    br.cond.dpnt.few  $+050h ;;        }// Y: Jump to NoTitle
71779e20   {          addl     r31 = -2029752, gp ;;      // r31 = &__imp__LocalFree
71779e24              ld8      r30 = [r31]                // r30 = &LocalFree
71779e28              nop.i    00h ;;                    }
71779e30   {          ld8      r29 = [r30], 08h ;;        // r29 = .LocalFree
71779e34              ld8      gp = [r30]                 // gp  = gp.LocalFree
71779e38              mov      b6 = r29, $+0x0           }// b6  = .LocalFree
71779e40   {          nop.m    00h
71779e44              nop.f    00h
71779e48              br.call.dptk.many  rp = b6 ;;      }// call .LocalFree
71779e50   {          st8      [r33] = r0                 // this->m_pszTitle = 0
71779e54              or       gp = r41, r0               // restore gp after call
71779e58              nop.b    00h ;;                    }
NoTitle:

Nothing new in the following.

    SetRect(&m_rcBounds, 0, 0, 0, 0);
71779e60   {          addl     r31 = -2024936, gp         // r31 = &__imp__SetRect
71779e64              addl     r46 = 00h, r0              // r46 = 0
71779e68              addl     r45 = 00h, r0             }// r45 = 0
71779e70   {          addl     r44 = 00h, r0 ;;           // r44 = 0
71779e74              ld8      r30 = [r31]                // r30 = &SetRect
71779e78              addl     r43 = 00h, r0             }// r43 = 0
71779e80   {          adds     r42 = 0208h, r32 ;;        // r42 = &this->m_rcBounds
71779e84              ld8      r29 = [r30], 08h           // r29 = .SetRect
71779e88              nop.i    00h ;;                    }
71779e90   {          ld8      gp = [r30]                 // gp  = gp.SetRect
71779e94              mov      b6 = r29, $+0x0            // b6  = .SetRect
71779e98              br.call.dptk.many  rp = b6 ;;      }// call .SetRect
71779ea0   {          or       gp = r41, r0               // restore gp after call
71779ea4              nop.f    00h
71779ea8              nop.b    00h                       }
    return S_OK;
ReturnSOK:
71779eb0   {          or       ret0 = r0, r0              // ret0 = 0 = S_OK

Finally, the preceding instruction sets the return value.

}
71779eb4              mov      rp = r38, $+0x0            // restore return address
71779eb8              adds     sp = 020h, sp ;;          }// clean up stack
71779ec0   {          nop.m    00h
71779ec4              mov      pr = r40, -2 ;;            // restore predicate registers
71779ec8              mov.i    ar.pfs = r39              }// clean up stack frame
71779ed0   {          nop.m    00h
71779ed4              nop.f    00h
71779ed8              br.ret.sptk.many  rp ;;            }// return to caller

The preceding instructions form the function epilogue.
