All Your Base Are Belong To Us

Mostly .NET internals and other kinds of gory details

Obtaining Reliable Thread Call Stacks of 64-bit Processes

The x64 calling convention is a great improvement over the state of affairs in x86. Few would argue about this. After all, remembering the differences between __stdcall and __cdecl, when to use each, which API defaults to which calling convention, and which specific variation of __fastcall JIT compilers use when given the choice -- is not the best use of developer time and not the best in terms of debugging productivity.

With that said, the x64 calling convention often makes it very difficult to retrieve parameter values from the call stack if you don't have private symbols for the relevant frame. In a nutshell, the problem is that the x64 calling convention allows many parameters to be passed in volatile registers, which can then be modified by the callee. Often enough, the compiler spills parameters from volatile registers to a predefined location on the stack when these volatile registers must be used for another purpose. In other cases, however, parameter values might vanish without a trace, and make stack reconstruction exceptionally difficult, especially when you're dealing with a dump file and not a live process in which you can set up breakpoints and examine the context at any point.

But, enough said: let's take a look at an example where we're interested in obtaining parameter values from the stack. In this case, we have a UI thread that called the WaitForMultipleObjects API, and we're interested in the first two parameters passed to WaitForMultipleObjects: the number of synchronization objects for which the thread is waiting, and the array of handles to these objects. A first attempt involves the kb command, which takes a guess at what the method's parameters are by dumping out the first three QWORDs at RBP+8 (immediately following the method's return address if FPO wasn't used):

0:000> kb
RetAddr           : Args to Child                                                           : Call Site
000007f9`346212d2 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!NtWaitForMultipleObjects+0xa
000007f9`368e1292 : 00000000`00000001 000007f6`1cfc9000 00000000`00000001 000000b9`e75faa68 : KERNELBASE!WaitForMultipleObjectsEx+0xe5
000007f6`1d3e1da9 : 00000000`00000001 cccccccc`cccccccc cccccccc`cccccccc cccccccc`cccccccc : KERNEL32!WaitForMultipleObjects+0x12
000007f9`1f710c37 : 000000b9`e5cff3b0 000000b9`e5cfeba0 000000b9`e5cfe2b8 cccccccc`cccccccc : BatteryMeter!CBatteryMeterDlg::OnCPUSelectorChanged+0x159
...snipped for brevity...

These values are, of course, utter nonsense. If we had the source code for the BatteryMeter module, we could inspect it and try to identify the parameters. Without source code, however, we must resort to disassembling the function around the call to WaitForMultipleObjects:

0:000> uf BatteryMeter!CBatteryMeterDlg::OnCPUSelectorChanged
  ...snipped for brevity...
  185 000007f6`1d3e1d8d 41b9ffffffff    mov     r9d,0FFFFFFFFh
  185 000007f6`1d3e1d93 41b801000000    mov     r8d,1
  185 000007f6`1d3e1d99 488d542448      lea     rdx,[rsp+48h]
  185 000007f6`1d3e1d9e b904000000      mov     ecx,4
  185 000007f6`1d3e1da3 ff1597df0200    call    qword ptr [BatteryMeter!_imp_WaitForMultipleObjects (000007f6`1d40fd40)]
  ...snipped for brevity...

Note that this is indeed the call site: the CALL instruction (which is six bytes long: ff1597df0200) is at 000007f6`1d3e1da3, whereas the return address for WaitForMultipleObjects is at 000007f6`1d3e1da9, six bytes later.
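The same check can be written down as a small calculation. This is only a sketch of the arithmetic, with the addresses and instruction bytes copied from the listings above; the variable names are made up for illustration. It also decodes the RIP-relative displacement of the CALL to confirm that it points at the import slot shown by the disassembler.

```python
# Values copied from the uf listing above.
call_address = 0x000007F61D3E1DA3
call_bytes = bytes.fromhex("ff1597df0200")  # FF 15 disp32: call qword ptr [rip+disp32]

# The return address is the address of the instruction following the CALL.
return_address = call_address + len(call_bytes)

# The last four bytes are a signed little-endian displacement, relative to
# the address of the next instruction (that is, the return address).
displacement = int.from_bytes(call_bytes[2:], "little", signed=True)
import_slot = return_address + displacement

print(f"return address: {return_address:016x}")
print(f"import slot:    {import_slot:016x}")
```

The first value matches the RetAddr column of the KERNEL32!WaitForMultipleObjects frame, and the second matches the address the debugger printed for BatteryMeter!_imp_WaitForMultipleObjects.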

Once we have the call site, we can recall the parameter order in the x64 calling convention. Specifically, WaitForMultipleObjects has four parameters: the number of synchronization objects (a DWORD), the array of handles, a Boolean indicating whether to wait for all the objects to become signaled or any of them, and finally a timeout (a DWORD). These parameters are passed in the ECX, RDX, R8D, and R9D registers, respectively. (Recall that RnD is an alias for the least-significant 32 bits of the 64-bit Rn register.)
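As a quick reference, the mapping from parameter position to location can be sketched as follows. The helper name argument_location is hypothetical, and the sketch covers only integer and pointer parameters (floating-point parameters travel in XMM registers instead). Stack offsets are given as seen at the call site, just before the CALL instruction pushes the return address.

```python
# First four integer/pointer parameters in the Microsoft x64 convention.
INTEGER_ARGUMENT_REGISTERS = ("rcx", "rdx", "r8", "r9")

# The caller reserves 32 bytes of home space for the four register parameters;
# stack-passed parameters start right after it.
HOME_SPACE_BYTES = 0x20


def argument_location(index):
    """Return where the zero-based integer parameter `index` is passed."""
    if index < 0:
        raise ValueError("parameter index must be non-negative")
    if index < len(INTEGER_ARGUMENT_REGISTERS):
        return INTEGER_ARGUMENT_REGISTERS[index]
    offset = HOME_SPACE_BYTES + 8 * (index - len(INTEGER_ARGUMENT_REGISTERS))
    return f"[rsp+{offset:#x}]"


# Parameter names as documented for WaitForMultipleObjects.
for position, name in enumerate(
        ("nCount", "lpHandles", "bWaitAll", "dwMilliseconds")):
    print(name, "->", argument_location(position))
```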

At this point we know the number of objects in the array -- it is a constant, 4. Furthermore, even though the RDX register was probably clobbered by the callee, we can still determine the address of the array by inspecting the stack location RSP+48h in the caller's frame. To find the value of RSP, we can use the k command, which lists each frame's stack pointer in the Child-SP column:

0:000> k
Child-SP          RetAddr           Call Site
000000b9`e5cfdbd8 000007f9`346212d2 ntdll!NtWaitForMultipleObjects+0xa
000000b9`e5cfdbe0 000007f9`368e1292 KERNELBASE!WaitForMultipleObjectsEx+0xe5
000000b9`e5cfdec0 000007f6`1d3e1da9 KERNEL32!WaitForMultipleObjects+0x12
000000b9`e5cfdf00 000007f9`1f710c37 BatteryMeter!CBatteryMeterDlg::OnCPUSelectorChanged+0x159
...snipped for brevity...

Now, we can inspect the handles themselves:

0:000> dq 000000b9`e5cfdf00+48 L4
000000b9`e5cfdf48  00000000`00000118 00000000`00000120
000000b9`e5cfdf58  00000000`00000128 00000000`0000012c
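The address arithmetic behind that dq command is simple enough to spell out. The following is a sketch with values copied from the k output and the disassembly above; the variable names are illustrative only.

```python
# Child-SP of the BatteryMeter!CBatteryMeterDlg::OnCPUSelectorChanged frame.
caller_rsp = 0x000000B9E5CFDF00

# lea rdx,[rsp+48h] -> address of the handle array in the caller's frame.
array_address = caller_rsp + 0x48

# mov ecx,4 -> number of handles; each HANDLE is 8 bytes in a 64-bit process.
handle_count = 4
HANDLE_SIZE = 8

slot_addresses = [array_address + i * HANDLE_SIZE for i in range(handle_count)]
for address in slot_addresses:
    print(f"{address:016x}")
```

The first address printed is the one the dq output starts at, and the four slots cover exactly the two lines of output shown above.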

...or even ask the debugger to print out the handle information for the handles in the array:

0:000> .foreach /pS 1 /ps 1 (h {dq /c 1 000000b9`e5cfdf00+48 L4}) {!handle h f}
Handle 118
  Type          Thread
  Attributes    0
  GrantedAccess 0x1fffff:
         Delete,ReadControl,WriteDac,WriteOwner,Synch
         Terminate,Suspend,Alert,GetContext,SetContext,SetInfo,QueryInfo,SetToken,Impersonate,DirectImpersonate
  HandleCount   3
  PointerCount  786412
  Name          <none>
  Object Specific Information
    Thread Id   1940.1158
    Priority    12
    Base Priority 0
    Start Address 1d3e1fa0 BatteryMeter!CPUSelectorThread
Handle 120
  Type          Thread
  Attributes    0
  GrantedAccess 0x1fffff:
         Delete,ReadControl,WriteDac,WriteOwner,Synch
         Terminate,Suspend,Alert,GetContext,SetContext,SetInfo,QueryInfo,SetToken,Impersonate,DirectImpersonate
  HandleCount   3
  PointerCount  786416
  Name          <none>
  Object Specific Information
    Thread Id   1940.12d0
    Priority    10
    Base Priority 0
    Start Address 1d3e1ff0 BatteryMeter!HardwareChangeDetectorThread
Handle 128
  Type          Thread
  Attributes    0
  GrantedAccess 0x1fffff:
         Delete,ReadControl,WriteDac,WriteOwner,Synch
         Terminate,Suspend,Alert,GetContext,SetContext,SetInfo,QueryInfo,SetToken,Impersonate,DirectImpersonate
  HandleCount   3
  PointerCount  786420
  Name          <none>
  Object Specific Information
    Thread Id   1940.1220
    Priority    12
    Base Priority 0
    Start Address 1d3e2040 BatteryMeter!LocationAwarenessThread
Handle 12c
  Type          Thread
  Attributes    0
  GrantedAccess 0x1fffff:
         Delete,ReadControl,WriteDac,WriteOwner,Synch
         Terminate,Suspend,Alert,GetContext,SetContext,SetInfo,QueryInfo,SetToken,Impersonate,DirectImpersonate
  HandleCount   3
  PointerCount  786424
  Name          <none>
  Object Specific Information
    Thread Id   1940.1814
    Priority    10
    Base Priority 0
    Start Address 1d3e20a0 BatteryMeter!TemperaturePropagationThread

To summarize, with some effort we were able to discover that the WaitForMultipleObjects function was invoked with an array of four threads, which we can now go ahead and inspect. But this was an easy case, in which the parameters haven't been clobbered -- it boggles the mind to think that you have to go through disassembly listings every time you want to dump parameter values.

Enter CMKD -- a free debugging extension that streamlines the analysis of 64-bit call stacks. This extension performs some fairly clever analysis of the stack structure, non-volatile register storage areas, and function call sites to display parameter values or at least explain where they came from. In our particular case, this extension works brilliantly:

0:000> !stack -p -t
Call Stack : 44 frames
## Stack-Pointer    Return-Address   Call-Site       
...snipped for brevity...
01 000000b9e5cfdbe0 000007f9368e1292 KERNELBASE!WaitForMultipleObjectsEx+e5 
  Parameter[0] = 0000000000000004 : rcx saved in current frame into NvReg rbx which is saved by child frames
  Parameter[1] = 000000b9e5cfdf48 : rdx saved in current frame into NvReg r13 which is saved by child frames
  Parameter[2] = 0000000000000001 : r8  saved in current frame into stack 
  Parameter[3] = 0000000000000000 : r9  saved in current frame into NvReg r14 which is saved by child frames
02 000000b9e5cfdec0 000007f61d3e1da9 KERNEL32!WaitForMultipleObjects+12 
  Parameter[0] = 0000000000000004 : rcx setup in parent frame by movb instruction @ 000007f61d3e1d9e from immediate data 
  Parameter[1] = 000000b9e5cfdf48 : rdx setup in parent frame by lea instruction @ 000007f61d3e1d99 from mem @ 000000b9e5cfdf48 
  Parameter[2] = 0000000000000001 : r8  setup in parent frame by movb instruction @ 000007f61d3e1d93 from immediate data 
  Parameter[3] = 00000000ffffffff : r9  setup in parent frame by movb instruction @ 000007f61d3e1d8d from immediate data 
03 000000b9e5cfdf00 000007f91f710c37 BatteryMeter!CBatteryMeterDlg::OnCPUSelectorChanged+159 
  Parameter[0] = (unknown)        : 
  Parameter[1] = (unknown)        : 
  Parameter[2] = (unknown)        : 
  Parameter[3] = (unknown)        : 
...snipped for brevity...

In the preceding output, the parameters listed for frame 02 (KERNEL32!WaitForMultipleObjects) correspond to what we previously discovered with manual labor. CMKD deduced the parameter values and explained where they came from -- the specific instructions (reported as movb and lea) that initialized the registers.

To conclude: CMKD makes it much easier to analyze method calls that use the x64 calling convention. It's a valuable addition to your arsenal if you need to debug dumps of optimized binaries for which you do not have private symbols and source code.


I am posting short links and updates on Twitter as well as on this blog. You can follow me: @goldshtn

Virtual Machines Are The New Processes

Once upon a time, threads were a new thing. Hardcore Unix architectures were processes-only, relied on cheap forking, and would have none of this lightweight threads business. Some system architects -- stuck in the 1970s -- still produce architectures for modern operating systems that consist of dozens of processes. I have personally seen a complex UI application on Windows that relies on >35 processes, of which eight different processes display parts of the application's UI (at the same time!). There is much good to be said about the isolation benefits of multiple processes, but having a Unix-inspired fear of threads is often completely unjustified today, especially in the face of the cost of inter-process communication and the complexity of starting up, coordinating, and shutting down multiple processes.

I am now seeing the same thing happening with virtual machines. It is very cheap to put up a virtual machine on Azure, Amazon, or Rackspace. Very easy, too, and scriptable -- in ten minutes' time you can set up a farm of twenty virtual machines doing your bidding. This leads to architectures where every system component -- which used to live in a separate process or thread -- now lives in a separate virtual machine. Starting up, coordinating, and shutting down multiple virtual machines is harder and slower than doing the same with processes, and it seems to me that we're falling into the same trap again.

If you're separating your system into a bunch of virtual machines, I think it would be valuable to stop and ask yourself if they are all strictly necessary. In a system with good separation of concerns and good decoupling between independent components, you can always scale to multiple VMs if necessary. This is another case of premature optimization, which has a measurable financial cost in maintenance, operations, and management overhead.



Building the Next YouTube: Windows Azure Media Services

My third (and last) talk at the SELA Developer Practice was about Windows Azure Media Services. If you haven't explored it yet, it's a SaaS offering for uploading, encoding, managing, and delivering media to a variety of devices, scaled by the power of Windows Azure. A couple of months ago this blog featured a detailed overview of one of the proof-of-concept workflows I built with Windows Azure Media Services, so I won't repeat myself.

If you are considering Windows Azure Media Services for your own application or service, feel free to contact me and I'll be happy to help. If you're in Israel, Microsoft Israel would also love to be a fly on the wall in our conversation :-P



Attacking Web Applications

My first breakout session at the SELA Developer Practice covered the most common attacks against web applications and how to defend against these attacks. When planning this talk, I knew that 60 minutes would hardly be enough to cover all common vulnerabilities -- especially if I wanted to show any demos -- so I decided to focus on the three most prevalent vulnerability types, according to the OWASP Top 10:

  1. Injection (command injection and SQL injection)
  2. Broken authentication or session management
  3. Cross-site scripting (and CSRF as a bonus)

I've demonstrated these common vulnerabilities in a series of demos using toy applications -- DVWA and Google Gruyere. Unfortunately, with very little effort, I could find real targets to exploit by doing a simple Google search. Knowledge is power, and power must be wielded wisely...

OS Command Injection
I used DVWA (Damn Vulnerable Web Application) to demonstrate OS command injection. DVWA ships with a helpful "free ping tool" that lets you input an IP address and run an OS 'ping' command on your behalf. When given the following input, DVWA hangs and waits for connections on port 13371 and runs whatever commands you type through that connection:

127.0.0.1;mkfifo /tmp/pipe;sh /tmp/pipe | nc -l 13371 > /tmp/pipe
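The usual defense against this class of bug is to validate the input strictly and to never hand it to a shell. The following is a minimal sketch of that idea, not DVWA's code: the function names are hypothetical, and the "-c 4" flag assumes a Unix-style ping.

```python
import ipaddress
import subprocess


def build_ping_command(user_input):
    """Validate the input as an IP address and build an argument list.

    ipaddress.ip_address raises ValueError for anything that is not a
    literal IPv4/IPv6 address, so shell metacharacters never get through.
    """
    address = ipaddress.ip_address(user_input.strip())
    # Assumption: a Unix-style ping where -c sets the packet count.
    return ["ping", "-c", "4", str(address)]


def run_ping(user_input):
    """Run ping without a shell; the address is passed as a single argument."""
    command = build_ping_command(user_input)
    return subprocess.run(command, capture_output=True, text=True, timeout=30)


print(build_ping_command("127.0.0.1"))
```

With this approach the input shown above is rejected with a ValueError before any process is started, and even a value that passed validation would reach ping as one argument rather than as shell text.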

Broken Cookies
I used Google Gruyere for this demo -- specifically, the cookie manipulation vulnerability. You can read more about it in the official Gruyere documentation.

Insecure Password Storage
I demonstrated -- using the LinkedIn hash leak -- that storing weakly hashed passwords is not enough. Specifically, LinkedIn users chose horrible passwords like "password" and "iloveyou", and a plain unsalted SHA-1 of this kind of password is not going to withstand even the free online rainbow tables, like OnlineHashCrack.com. We then discussed salting and hashing passwords, and using slower hash functions like bcrypt (at the risk of DoS'ing your service when multiple concurrent login requests are issued).
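To make the salting and slow-hashing advice concrete, here is a minimal sketch. It uses PBKDF2 from the Python standard library as a stand-in for bcrypt (which requires a third-party package); the iteration count and function names are assumptions chosen for illustration, not recommendations from the talk.

```python
import hashlib
import hmac
import secrets

# Assumption: an illustrative work factor; tune it for your own hardware.
ITERATIONS = 100_000


def hash_password(password, salt=None):
    """Return (salt, digest) using a per-user random salt and a slow KDF."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("iloveyou")
print(verify_password("iloveyou", salt, stored))   # correct password
print(verify_password("password", salt, stored))   # wrong password
```

Because every user gets a different salt, two users with the same password end up with different digests, which defeats precomputed rainbow tables; the iteration count is what makes each individual guess expensive.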

XSS and CSRF
Again, I used Google Gruyere for this demo -- it allows multiple opportunities for persistent and temporary XSS. For example, when the URL is invalid, Gruyere reflects that URL into the error page without escaping, which allows embedding scripts in the URL which Gruyere then happily plays back to the user. For more types of XSS and CSRF vulnerabilities in Gruyere, see the official docs.
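The fix for the reflected case is to escape untrusted text before placing it into HTML. The sketch below shows the idea with the standard library; the page template and function name are hypothetical and unrelated to Gruyere's actual code.

```python
import html


def render_invalid_url_page(requested_url):
    """Build an error page that reflects the URL as text, not as markup."""
    # html.escape converts <, >, & and (with quote=True) quotes to entities.
    safe_url = html.escape(requested_url, quote=True)
    return "<p>Invalid request: " + safe_url + "</p>"


page = render_invalid_url_page("/<script>alert(1)</script>")
print(page)
```

The script tag arrives in the browser as inert text. Escaping has to match the output context, so a value placed inside a JavaScript string or a URL attribute needs different encoding than one placed in element content.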

Security Misconfiguration
Finally, I talked about how it's easy to find open admin consoles that allow anyone on the Internet to pry deep into the bowels of your web server and even obtain logs and traces that contain sensitive information, such as authentication cookies. As an example, try this Google search for ELMAH pages that contain an ASP.NET authentication cookie.

One Advisory To Rule Them All
I concluded the talk by mentioning the recent vulnerability advisory for a certain class of DLink routers. This single advisory contains XSS and CSRF issues, information disclosure holes, insecure password storage problems, and OS command injection opportunities.

Thank you for coming to the talk, and I look forward to meeting you during the rest of the conference or at the next SDP!



Next Week: Sela Developer Practice 2013

Next week, May 5-9, is our biggest developer conference yet. We have developers from more than 150 software companies attending more than 70 sessions and workshops taught by local and international speakers. We are very happy to host industry rockstars like Jesse Liberty, Shawn Wildermuth, and Udi Dahan -- and we're looking forward to making SDP even bigger and more interesting for software developers everywhere.

We are expanding our technology reach beyond the traditional .NET stack, with talks on Node.js, PhoneGap, RavenDB, Hadoop, Solr, TypeScript, and more. Fourteen of the workshops have sold out, and some of the breakout session tracks are also full. As is often the case with the SDP, we had to duplicate five workshops because of lack of classroom space. My own workshop, Improving .NET Performance, was duplicated twice -- and I'm very happy to see new developers learn about performance measurement and performance optimization every year.

If you're coming to the SDP next week, please come and say hi and enjoy the conference! All slides and demos will be posted online by the speakers, and we will also publish video recordings from all the breakout sessions and some of the workshops (to registered attendees only).

And finally, if you can't make it this time, stay tuned for the next SDP later this year!



Two More Ways for Diagnosing For Which Synchronization Object Your Thread Is Waiting

It is as though there is an infinite variety of heuristics that you can use to determine which synchronization object your thread is waiting for. In fact, these are heuristics for retrieving fastcall parameters passed in registers that have been clobbered by subsequent method calls.

Method 1: Inspect the handle passed to WaitForMultipleObjectsEx
The CLR uses an auto-reset event to implement sync block synchronization, which means that every attempt to acquire an owned sync block will result in a call to WaitForMultipleObjectsEx. If you inspect this method's parameters, you'll find a handle that you might be able to correlate with the internal (undocumented) structure that the CLR maintains for each sync block.

First, let's take a look at the waiting thread and the parameters it passes to WaitForMultipleObjectsEx:

0:007> kb
ChildEBP RetAddr  Args to Child              
06eded3c 7538c752 00000001 009d6e88 00000001 ntdll!NtWaitForMultipleObjects+0xc
06edeec0 796561fa 00000001 00000000 00000000 KERNELBASE!WaitForMultipleObjectsEx+0x10b
06edef28 79655e27 00000001 009d6e88 00000000 mscorwks!WaitForMultipleObjectsEx_SO_TOLERANT+0x6f
06edef48 79655f30 00000001 009d6e88 00000000 mscorwks!Thread::DoAppropriateAptStateWait+0x3c
06edefcc 79655fc5 00000001 009d6e88 00000000 mscorwks!Thread::DoAppropriateWaitWorker+0x13c
06edf01c 79656149 00000001 009d6e88 00000000 mscorwks!Thread::DoAppropriateWait+0x40
06edf078 794f56b8 ffffffff 00000001 00000000 mscorwks!CLREvent::WaitEx+0xf7
06edf08c 7964d517 ffffffff 00000001 00000000 mscorwks!CLREvent::Wait+0x17
06edf118 79655348 090a4390 ffffffff 090a4390 mscorwks!AwareLock::EnterEpilog+0x8c
06edf134 796552cc 9c1267bc 06edf218 02812934 mscorwks!AwareLock::Enter+0x61
06edf1d4 00ba0c55 00000000 06edf1ac 06edf28c mscorwks!JIT_MonEnterWorker_Portable+0xb3
WARNING: Frame IP not in any known module. Following frames may be wrong.
06edf228 05252a7e 02e0ca3c 06edf248 04b3045f 0xba0c55
[...snipped...]

0:007> dd 009d6e88 L1
009d6e88  000003cc

0:007> !handle 3cc f
Handle 3cc
  Type          Event
  Attributes    0
  GrantedAccess 0x1f0003:
         Delete,ReadControl,WriteDac,WriteOwner,Synch
         QueryState,ModifyState
  HandleCount   2
  PointerCount  524289
  Name          <none>
  Object Specific Information
    Event Type Auto Reset
    Event is Waiting

Now that we have the handle, let's take a look at all the sync blocks and see if we can tell which one's internal data structure contains that handle value:

0:007> !syncblk
Index SyncBlock MonitorHeld Recursion Owning Thread Info  SyncBlock Owner
   19 009d6e74            3         1 009ad008   e60   0   028138e4 System.String
   20 009d6ea4            3         1 090a4390  1a70   7   02813908 System.String
-----------------------------
Total           20
CCW             1
RCW             0
ComClassFactory 0
Free            0

0:007> dd 009d6e74            
009d6e74  00000003 00000001 009ad008 00000001
009d6e84  80000013 000003cc 0000000d 00000000
009d6e94  00000000 00000000 00000000 00000000
[...snipped...]

All right -- so now we know that the current thread (thread #7) is waiting for the sync block 009d6e74, which is owned by thread #0.
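The correlation step can also be expressed as a lookup: find the sync block whose address most closely precedes the pointer that was passed to WaitForMultipleObjectsEx. This is a sketch of the heuristic only, using the addresses from the !syncblk and kb output above; it assumes the handle is stored inside the sync block structure, which is an undocumented CLR implementation detail.

```python
import bisect

# SyncBlock address -> index, copied from the !syncblk output above.
sync_blocks = {0x009D6E74: 19, 0x009D6EA4: 20}

# Second argument to WaitForMultipleObjectsEx in the kb output (handle array).
handle_array_pointer = 0x009D6E88


def containing_sync_block(pointer, blocks):
    """Return the closest sync block address at or below `pointer`, or None."""
    starts = sorted(blocks)
    position = bisect.bisect_right(starts, pointer) - 1
    if position < 0:
        return None
    return starts[position]


block = containing_sync_block(handle_array_pointer, sync_blocks)
offset = handle_array_pointer - block
print(f"sync block {block:08x} (index {sync_blocks[block]}), offset {offset:#x}")
```

The result points at sync block 009d6e74 with the handle at offset 0x14, which is the same location where the dd output shows the value 000003cc.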

Method 2: If the object is stored on the stack, retrieve it using ESP or EBP
If at any point the address of the synchronization object has been stored on the stack, it is very likely that it is still there while the Monitor.Enter method is executing. The only problem is that retrieving values from the stack often requires reverse engineering the method to understand which EBP or ESP offsets are used for the stack access. Furthermore, if any frame higher on the stack uses frame pointer omission (FPO), i.e., uses the EBP register for arbitrary purposes, EBP reconstruction is impossible and you have to derive the frame's EBP from its ESP value in order to locate values on the stack.

For example, consider the following stack trace. We want to recover the parameter passed to Monitor.Enter, so we inspect the disassembly around the call to Monitor.Enter:

0:007> !clrstack
OS Thread Id: 0x1a70 (7)
ESP       EIP     
06edf0b4 773d1318 [GCFrame: 06edf0b4] 
06edf184 773d1318 [...] System.Threading.Monitor.Enter(System.Object)
06edf1dc 00ba0c55 FileExplorer.MainForm.LaunchNotepad(System.Object)
06edf230 05252a7e System.Threading.ThreadHelper.ThreadStart_Context(System.Object)
06edf23c 04b3045f System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
06edf254 0502c90a System.Threading.ThreadHelper.ThreadStart(System.Object)
06edf47c 794f1b6c [GCFrame: 06edf47c] 

0:007> !u 00ba0c55 
Normal JIT generated code
FileExplorer.MainForm.LaunchNotepad(System.Object)
Begin 00ba0b98, size 16d
00ba0b98 55              push    ebp
00ba0b99 8bec            mov     ebp,esp
[...snipped...]
00ba0c3e 8945d0          mov     dword ptr [ebp-30h],eax
00ba0c41 8b45dc          mov     eax,dword ptr [ebp-24h]
00ba0c44 8b804c010000    mov     eax,dword ptr [eax+14Ch]
00ba0c4a 8945c4          mov     dword ptr [ebp-3Ch],eax
00ba0c4d 8b4dc4          mov     ecx,dword ptr [ebp-3Ch]
00ba0c50 e8701f9578      call    mscorwks!JIT_MonEnterWorker (794f2bc5)
00ba0c55 90              nop
00ba0c56 90              nop
00ba0c57 8b45dc          mov     eax,dword ptr [ebp-24h]
00ba0c5a 8945bc          mov     dword ptr [ebp-44h],eax
00ba0c5d 8b45bc          mov     eax,dword ptr [ebp-44h]
00ba0c60 8945b8          mov     dword ptr [ebp-48h],eax
00ba0c63 837dcc00        cmp     dword ptr [ebp-34h],0
00ba0c67 752c            jne     00ba0c95
[...snipped...]

We can conclude that EBP has been used to access the parameter that is then passed to Monitor.Enter in the ECX register. If the value of EBP is not available, we can inspect the method's prologue to determine the offset between ESP and EBP while the method was executing:

0:007> !u 00ba0c55 
Normal JIT generated code
FileExplorer.MainForm.LaunchNotepad(System.Object)
Begin 00ba0b98, size 16d
00ba0b98 55              push    ebp
00ba0b99 8bec            mov     ebp,esp
00ba0b9b 57              push    edi
00ba0b9c 56              push    esi
00ba0b9d 53              push    ebx
00ba0b9e 83ec40          sub     esp,40h
[...snipped...]

Let's see; after EBP and ESP are aligned on the second instruction, we push three 32-bit registers and then subtract another 0x40 bytes from ESP. As a result, the offset between ESP and EBP is 4+4+4+0x40 = 0x4c bytes.

Now that we know the offset, we find the ESP value from the !CLRStack output, and then add the offset to obtain the value of EBP, and, as a result, the address of the parameter passed to Monitor.Enter, which we can verify to be a sync block owner. Great success.

0:007> !clrstack
OS Thread Id: 0x1a70 (7)
ESP       EIP     
06edf0b4 773d1318 [GCFrame: 06edf0b4] 
06edf184 773d1318 [...] System.Threading.Monitor.Enter(System.Object)
06edf1dc 00ba0c55 FileExplorer.MainForm.LaunchNotepad(System.Object)
06edf230 05252a7e System.Threading.ThreadHelper.ThreadStart_Context(System.Object)
06edf23c 04b3045f System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
06edf254 0502c90a System.Threading.ThreadHelper.ThreadStart(System.Object)
06edf47c 794f1b6c [GCFrame: 06edf47c] 

0:007> dd 06edf1dc+0x4c-0x3c L1
06edf1ec  028138e4

0:007> !syncblk
Index SyncBlock MonitorHeld Recursion Owning Thread Info  SyncBlock Owner
   19 009d6e74            3         1 009ad008   e60   0   028138e4 System.String
   20 009d6ea4            3         1 090a4390  1a70   7   02813908 System.String
-----------------------------
Total           20
CCW             1
RCW             0
ComClassFactory 0
Free            0
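The arithmetic in the dd command above can be double-checked with a few lines. This sketch only restates the calculation with values copied from the prologue, the disassembly, and the !clrstack output; the variable names are illustrative.

```python
# ESP of the FileExplorer.MainForm.LaunchNotepad frame, from !clrstack.
frame_esp = 0x06EDF1DC

# Prologue after "mov ebp,esp": push edi, push esi, push ebx, sub esp,40h.
pushed_registers = 3
REGISTER_SIZE = 4
locals_size = 0x40

ebp_to_esp_offset = pushed_registers * REGISTER_SIZE + locals_size
frame_ebp = frame_esp + ebp_to_esp_offset

# mov ecx,dword ptr [ebp-3Ch] loads the argument for Monitor.Enter.
parameter_slot = frame_ebp - 0x3C

print(f"offset: {ebp_to_esp_offset:#x}")
print(f"EBP:    {frame_ebp:08x}")
print(f"slot:   {parameter_slot:08x}")
```

The reconstructed EBP agrees with the ChildEBP value that kb reported for the 0xba0c55 frame in Method 1, and the slot address is the one dd printed.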



Revisiting Value Types vs. Reference Types

Why do C#, the .NET Framework, and the CLR need value types and reference types? Why two categories of types? Why the added complexity in training developers to understand why and when to use each type of type?

There are many answers, but very few get to the crux of the matter. You could try to justify the need for two types of types by looking at the semantic differences C# affords each. For example, you know that by default, instances of value types are copied when passed to a function, but instances of reference types are not -- only the references are copied. Or, you could say that by default, the Equals method compares whether two instances of a reference type are identical (point to the same memory location), but for instances of value types it compares their contents. There are many other superficial semantic differences, too. But do they justify having two types of types?

It seems that the standard reasons of "stack vs. heap", "by value vs. by reference", "identity vs. contents" are not by themselves enough to justify the associated language and implementation complexity of having two categories of types. Here is an attempt at an alternative explanation.

Consider C#, Java, and C++ for a moment. All three languages have types that are more lightweight than others:

Language   Lightweight Types                          Heavyweight Types
Java       Primitives (int, boolean)                  Object types (String, Integer)
C#         Value types                                Reference types
C++        Structs/classes without virtual methods    Structs/classes with virtual methods

What do the "heavyweight" types have in common? Instances of these types -- in all three languages -- are provided some additional services by the compiler/runtime/execution environment at the expense of additional overhead. These services, and that overhead, are the reason for having two types of types.

What are those services, then? Although it somewhat depends on the language and environment, all three examples above give heavyweight types support for polymorphism, namely virtual method invocation. Virtual methods in Java, C#, and C++ rely on a method table stored in memory, and pointed to from the header of each object instance. This pointer ("vfptr" in C++, "method table pointer" in the CLR) is the overhead, the cost heavyweight types must pay for a service that lightweight types do not have access to.

+---Ref Type Instance---+
| Object Header Word    |
| Method Table Pointer ------> +---Method Table (Simplified)---+
| ... object fields ... |      | Ptr to Base MT                |
+-----------------------+      | Ptr to Object.Equals          |
                               | Ptr to Object.ToString        |
                               | ... additional methods ...    |
                               +-------------------------------+

In the CLR and the JVM, reference types enjoy additional services on top of virtual method invocation. For example, reference types participate in Monitor synchronization: you can use the C# lock or Java synchronized keyword to use an arbitrary reference type instance for synchronization. Additionally, both the CLR and the JVM offer garbage collection for heap objects. Both of these services require additional memory overhead associated with each reference type.

It is not the semantic difference in copying vs. passing a reference or comparing identity vs. comparing contents that explains why two types of types are so prevalent. The additional services -- supporting virtual methods, object synchronization, garbage collection, finalization -- make the overhead necessary for reference types. This very overhead is not acceptable for small, primitive types of which millions of instances are likely to be required. Integers, floats, characters, Booleans, two-dimensional points, rectangle coordinates cannot afford to waste 4-16 bytes of overhead per instance.
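A back-of-the-envelope calculation shows why this matters. The sketch below assumes a two-dimensional point made of two 4-byte integers and an 8-byte per-instance header (an object header word plus a method table pointer at 4 bytes each, as on a 32-bit runtime); both sizes are illustrative assumptions within the 4-16 byte range mentioned above.

```python
def overhead_fraction(payload_bytes, header_bytes):
    """Fraction of each instance's memory that is header rather than data."""
    return header_bytes / (payload_bytes + header_bytes)


# Assumption: a 2D point with two 4-byte integer coordinates.
point_payload = 2 * 4
# Assumption: 4-byte object header word + 4-byte method table pointer.
reference_type_header = 4 + 4

instances = 1_000_000
wasted_bytes = instances * reference_type_header

print(f"overhead per instance: {overhead_fraction(point_payload, reference_type_header):.0%}")
print(f"header bytes for {instances} points: {wasted_bytes}")
```

Under these assumptions half of every heap-allocated point is bookkeeping, and that does not yet count the reference needed to reach each instance.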

This is why C#, Java, and C++ have two categories of types -- even if you don't think of them as different categories. And this is also why you should consider using value types: not because they make it easier to copy objects by value or compare their contents, but because they do not pay the cost of services you will not require of them.



Windows Azure Mobile Services "Rent a Home" Sample, Part 3: Authentication

Last time around, we explored the user interface and the server script for our apartment listings application. Today we'll see how to add authentication to the mix, and limit certain operations only to authenticated users. This is particularly important in the Rent a Home application, because you don't want anonymous users deleting and updating apartment listings! In fact, you'd probably want only the user that created an apartment listing to have the right to update or delete it.

NOTE: Windows Azure Mobile Services is configured by default to enable any user with the application URL and application key to perform create, retrieve, update, and delete operations on any tables belonging to that service. Because the application key is distributed to client devices, this is obviously a very bad idea for a production app -- anyone can extract the application key from your application and perform arbitrary modifications of your data. Therefore, in a production configuration, it is extremely important that you limit access to your data only to authenticated users. You might find it reasonable to allow read data to anyone with the application key, but create/update/delete operations should require authentication.

Windows Azure Mobile Services currently supports authentication with four authentication providers: Microsoft Account (Live Connect), Google, Twitter, and Facebook. (The Rent a Home application supports only Twitter authentication, but it would be trivial to add support for the other providers as well.)

Before you can authenticate with Twitter, you have to create a Twitter application (do so by visiting dev.twitter.com) and specify its consumer key and consumer secret in the Windows Azure portal. Then, you can immediately start making authentication requests from the supported clients:

Android

mobileService.login(MobileServiceAuthenticationProvider.Twitter,
  new UserAuthenticationCallback() {
    public void onCompleted(MobileServiceUser user,
                            Exception exception,
                            ServiceFilterResponse response) {
      if (exception != null) {
        displayError(exception);
      } else {
        setTitle("Welcome! User id: " + user.getUserId());
      }
    }
  });

NOTE: The user id is not the user name; it is an opaque number that users will be surprised to see. Later in this post, we'll see how to obtain the user's name from the user id.

iOS

UIViewController *vc =
  [self.client loginViewControllerWithProvider:@"twitter"
               completion:^(MSUser *user, NSError *error) {
    if (error) {
      NSLog(@"Authentication Error: %@", error);
      //error.code == -1503 means user cancelled the dialog
    } else {
      //Authentication was successful, use self.client.currentUser
      //to read information about the user
    }
    [self dismissViewControllerAnimated:YES completion:nil];
  }];
[self presentViewController:vc animated:YES completion:nil];

Windows Phone and Windows 8

var user = await App.MobileService.LoginAsync(
                MobileServiceAuthenticationProvider.Twitter);
btnLogin.Visibility = Visibility.Collapsed;
txtLogin.Visibility = Visibility.Visible;
txtLogin.Text = "Logged in using Twitter";

Server-Side Authentication
With this client-side infrastructure in place, every subsequent request performed by an authenticated user will be associated with that user, and available to table scripts as the user parameter of the respective script function. Having it available is not enough, however -- we need to explicitly associate our apartment listings with users. The following part of the insert script on the apartment table illustrates how easy this is:

function insert(item, user, request) {
  if (user) {
    item.user = user.userId;
  }
  //The rest of the code omitted for brevity
}

Thanks to dynamic schema, there's no need to pre-declare the user property on apartment listings. This property is actually enough to protect apartment listings from modification by the "wrong" user. For example, this could be our delete script on the apartment table:

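function del(id, user, request) {
  //del receives only the id, so look up the listing's owner first
  tables.getTable('apartment').where({ id: id }).read({
    success: function (results) {
      if (user && results.length > 0 &&
          results[0].user === user.userId) {
        request.execute();
      } else {
        request.respond(403, 'Only the user that created the ' +
                             'apartment listing may delete it.');
      }
    }
  });
}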
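
The same ownership rule applies to updates. As a minimal sketch, the comparison can be factored into a small helper and tried outside the service. The names isOwner, guardModification, and makeStubRequest are hypothetical and are not part of the Mobile Services API; the sketch assumes the insert script stored the creator's id in the listing's user property, and that request exposes execute() and respond() as it does in table scripts.

```javascript
// Hypothetical helper: true only when an authenticated user created the listing.
// Assumes the insert script stored user.userId in the listing's `user` property.
function isOwner(listing, user) {
  return Boolean(user && listing && listing.user === user.userId);
}

// Sketch of the decision an update or delete script would make.
// Exactly one of execute() or respond() is called for each request.
function guardModification(listing, user, request) {
  if (isOwner(listing, user)) {
    request.execute();
  } else {
    request.respond(403, 'Only the user that created the ' +
                         'apartment listing may modify it.');
  }
}

// Stub request object (illustration only) that records which call was made.
function makeStubRequest() {
  var log = [];
  return {
    log: log,
    execute: function () { log.push('execute'); },
    respond: function (status, message) { log.push('respond ' + status); }
  };
}
```

Keeping the check in one place makes it harder for the update and delete scripts to drift apart, and the if/else guarantees that a rejected request is never executed as well.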

As you saw in the application UI, it would also be valuable to display owner information to other users (this would form the basis of helping users contact the apartment owner to broker a deal). Because the user id is not a human-readable username, we have to use a service-specific API (in this case, Twitter's) to retrieve the username. This is done in the insert script of the apartment table, like so:

item.username = '<Unknown User>';
var identities = user ? user.getIdentities() : null;
if (identities && identities.twitter) {
  var id = user.userId.substring(user.userId.indexOf(':') + 1);
  var url = 'https://api.twitter.com/1/users/show.json?user_id='
            + id;
  reqModule(url, function (err, resp, body) {
    if (err || resp.statusCode !== 200) {
      request.respond(500, body);
    } else {
      try {
        item.username = JSON.parse(body).name;
        //Continue processing as usual
      } catch (ex) {
        request.respond(500, ex);
      }
    }
  });
}

Reading the specific user's data depends on the identity provider -- the API endpoint for Twitter users is obviously different from the API endpoint for Facebook users. Carlos Figueira has taken the time to provide a set of code snippets you can integrate in your backend to obtain user information from any of the four identity providers that are currently supported.
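
Because the lookup depends on the provider, it can help to split the user id into its provider and provider-specific parts before choosing an endpoint. The following is a minimal sketch, assuming the user id has the "provider:id" shape that the substring call above relies on; parseUserId is a hypothetical helper name, and "Twitter:12345" is a made-up value.

```javascript
// Splits a user id such as "Twitter:12345" into its provider and id parts.
// Returns null when the value does not have the "provider:id" shape.
function parseUserId(userId) {
  if (typeof userId !== 'string') {
    return null;
  }
  var separator = userId.indexOf(':');
  // Reject a missing separator, an empty provider, or an empty id.
  if (separator <= 0 || separator === userId.length - 1) {
    return null;
  }
  return {
    provider: userId.substring(0, separator).toLowerCase(),
    id: userId.substring(separator + 1)
  };
}
```

A script could then switch on the provider value to pick the matching API endpoint, and fall back to the '<Unknown User>' placeholder when the function returns null.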

Summary
At this point, we have added authentication support to our application. Users can sign in with Twitter and their apartment listings are associated with their Twitter identity. We also restricted access so that only authenticated users can create new apartment listings, and only the creating user is allowed to update or delete an apartment listing. One additional thing that would be "nice to have" is support for a custom identity, i.e. custom username and password credentials for your service. Josh Twist of the Mobile Services team explains what would be necessary to provide such support on your application's side.

Finally, there's one fairly annoying detail about the current implementation: it requires the user to sign in every time they open the application. Although the backend doesn't require re-authentication, the client SDK does not currently have an automatic way to persist the authentication token in some secure storage. This shouldn't stop you, however, from persisting the authentication token yourself, and passing it to the appropriate login method of the Mobile Service client. For a complete example, see Josh Twist's post.

In the next installment, we'll take a look at how to configure push notifications on all four platforms, and how to deliver push notifications to registered clients.


I am posting short links and updates on Twitter as well as on this blog. You can follow me: @goldshtn

Windows Azure Mobile Services "Rent a Home" Sample, Part 2: UI and Data

In the previous installment, we saw the general UI of the application. We'll now turn to see how that UI was implemented on all four platforms. If you're looking for a quick start or documentation on Mobile Services, you should take a look at the Windows Azure Mobile Developer Center.

Android
The model class for apartment listings on Android is the following:

public class Apartment implements Serializable {
  private int id; 
  private String address;
  private boolean published;
  private int bedrooms;
  private double latitude;
  private double longitude;
  private String username;
  //Getters and setters omitted for brevity
}

Note that the class doesn't have to be Java-serializable, but its field types are restricted to what Mobile Services currently supports. The Android SDK relies on the gson library (Google's fast and extensible JSON serializer) to serialize objects on the wire. The reason my class implements the marker Serializable interface is so that it can be passed across activities.

The main activity on Android loads a simple layout that consists of a ListView, bound to a custom adapter (derived from ArrayAdapter<Apartment>). It displays three TextViews for each apartment listing: the apartment's address, the number of bedrooms, and the user who submitted that apartment listing.

public class ApartmentAdapter extends ArrayAdapter<Apartment> {
  public ApartmentAdapter(Context context,
                          List<Apartment> apartments) {
    super(context, R.layout.apartment_row, apartments);
  }
  
  @Override
  public View getView(int position, View row, ViewGroup parent) {
    Apartment apartment = getItem(position);
    if (row == null) {
      LayoutInflater inflater = LayoutInflater.from(getContext());
      row = inflater.inflate(R.layout.apartment_row, null);
    }
    TextView address = (TextView)row.findViewById(R.id.txtAddress);
    TextView username = (TextView)row.findViewById(R.id.txtSecondary);
    TextView bedrooms = (TextView)row.findViewById(R.id.txtBedrooms);
    address.setText(apartment.getAddress());
    username.setText("added by " + apartment.getUserName());
    bedrooms.setText(Integer.toString(apartment.getBedrooms()));
    return row;
  } 
}

When the activity initializes, it retrieves a list of apartment listings from the Mobile Services backend, and binds it to the UI using the custom adapter:

mobileService = new MobileServiceClient(
            MOBILESERVICE_URL, MOBILESERVICE_APIKEY, this);
apartmentTable = mobileService.getTable("apartment", Apartment.class);
apartmentTable
  .where()
  .field("published").eq(true).and()
  .field("bedrooms").gt(1)
  .orderBy("bedrooms", QueryOrder.Descending)
  .execute(new TableQueryCallback<Apartment>() {
    public void onCompleted(List<Apartment> items, int count,
                            Exception exception,
                            ServiceFilterResponse response) {
      if (exception != null) {
        displayError(exception);
      } else {
        //Inside the callback, "this" is not a Context; use the list's
        ApartmentAdapter aa = new ApartmentAdapter(
            listApartments.getContext(), items);
        listApartments.setAdapter(aa);
      }
    }
  });

To add a new apartment listing, the user taps the menu/action bar "Add" item, and is presented with a dialog that collects the listing's information and submits it to Mobile Services:

//Dialog view setup omitted for brevity
AlertDialog.Builder builder = new AlertDialog.Builder(this);
builder.setTitle("Add New Apartment");
builder.setView(innerLayout);
builder.setPositiveButton("Submit", new OnClickListener() {
  public void onClick(DialogInterface dialog, int which) {
    Apartment apartment = new Apartment();
    apartment.setAddress(editAddress.getText().toString());
    apartment.setBedrooms((Integer)spinBedrooms.getSelectedItem());
    apartment.setPublished(true);
    apartmentTable.insert(apartment,
      new TableOperationCallback() /* omitted for brevity */);
  }
});
builder.setNegativeButton("Cancel", null);
builder.create().show();

Finally, when the map activity is invoked, it displays apartment listings using a simple map overlay on top of Google's MapView (using Google Maps in your application requires an API key, which you obtain online). The overlay relies on the coordinates provided by the server when a new apartment listing is inserted (we'll see how that happens later). When an overlay item is tapped, the overlay displays a simple dialog with the listing's details.

public class ApartmentOverlay extends ItemizedOverlay<OverlayItem> {
  private List<OverlayItem> items = new ArrayList<OverlayItem>();
  private List<Apartment> apartments = new ArrayList<Apartment>();
  private Context context;

  public void addApartment(Apartment apartment) {
    Location location = apartment.getLocation();
    items.add(new OverlayItem(new GeoPoint(
      (int) (location.getLatitude() * 1E6),
      (int) (location.getLongitude() * 1E6)),
      "Apartment",
      apartment.getAddress()));
    apartments.add(apartment);
    populate();
  }

  public ApartmentOverlay(Context ctx, Drawable defaultMarker) {
    super(boundCenterBottom(defaultMarker));
    context = ctx;
    populate();
  }

  @Override
  protected OverlayItem createItem(int i) {
    return items.get(i);
  }

  @Override
  public int size() {
    return items.size();
  }

  @Override
  protected boolean onTap(int index) {
    Apartment apartment = apartments.get(index);
    AlertDialog.Builder builder = new AlertDialog.Builder(context);
    builder.setTitle("Apartment");
    builder.setMessage("Address: " + apartment.getAddress() + "\n" +
                       apartment.getBedrooms() + " bedrooms");
    builder.create().show();
    return super.onTap(index);
  }
}

iOS
The Mobile Services framework on iOS does not rely on static types to convey information across the wire. Instead, it uses the NSDictionary class, which contains a collection of key-value pairs. Although my implementation could provide a static Apartment class, which would be "serialized" to and from the NSDictionary representation, I opted to use NSDictionary throughout. If the model were more complex, I might have considered the static type approach.

The main view controller on iOS contains a UITableView that uses the subtitle table view cell style. When the view controller is initialized, it retrieves a list of apartment listings from the Mobile Services backend, and provides it to the UITableView through the UITableViewDataSource methods numberOfSectionsInTableView:, tableView:numberOfRowsInSection:, and tableView:cellForRowAtIndexPath:.

- (void)viewDidLoad {
  [super viewDidLoad];
  self.client = [MSClient clientWithApplicationURLString:kMobileAppURL
                                      withApplicationKey:kMobileAppKey];
  self.table = [self.client getTable:@"apartment"];
  NSPredicate *predicate = [NSPredicate
                            predicateWithFormat:@"published == YES"];
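  [self.table readWhere:predicate
             completion:^(NSArray *results, NSInteger totalCount, NSError *error) {
    //Error handling omitted for brevity
    self.items = [results mutableCopy];
    //Refresh the table now that the listings have arrived
    [self.tableView reloadData];
  }];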
}

- (NSInteger)numberOfSectionsInTableView:(UITableView *)tableView {
    return 1;
}

- (NSInteger)tableView:(UITableView *)tableView
 numberOfRowsInSection:(NSInteger)section {
    return [self.items count];
}

- (UITableViewCell *)tableView:(UITableView *)tv
         cellForRowAtIndexPath:(NSIndexPath *)indexPath {
  static NSString *CellIdentifier = @"Cell";
  UITableViewCell *cell = [tv dequeueReusableCellWithIdentifier:CellIdentifier
                              forIndexPath:indexPath];
  NSDictionary *apt = [self.items objectAtIndex:indexPath.row];
  cell.textLabel.text = apt[@"address"];
  //NSInteger is 64 bits wide on 64-bit devices, so format it as a long
  cell.detailTextLabel.text = [NSString stringWithFormat:@"%ld bedrooms",
                               (long)[apt[@"bedrooms"] integerValue]];
  return cell;
}

To add an apartment listing, the user navigates to a secondary view controller that uses a UITableView with static cells to collect the apartment's address and number of bedrooms. When the user taps "Save", the secondary view controller creates a new NSDictionary with the apartment's details, and provides it to its delegate (which is the home view controller), which in turn inserts the apartment listing into the Mobile Services backend:

//In the secondary view controller:
- (IBAction)saveTapped:(id)sender {
  NSDictionary *apartment = @{
    @"address" : self.itemText.text,
    @"bedrooms" : @(self.bedrooms.selectedSegmentIndex+1),
    @"published" : @(YES)
  };
  if ([self.delegate respondsToSelector:@selector(saveApartment:)]) {
    [self.delegate performSelector:@selector(saveApartment:)
                        withObject:apartment];
  }
}

//In the primary view controller:
- (void)saveApartment:(NSDictionary *)apartment {
  [self.navigationController popViewControllerAnimated:YES];

  __weak HomeController *s = self;
  [self.table insert:apartment completion:^(NSDictionary *result, NSError *error) {
    //Error handling omitted for brevity
    NSUInteger index = [s.items count];
    [s.items addObject:result];
    NSIndexPath *indexPath = [NSIndexPath indexPathForRow:index
                                                inSection:0];
    [s.tableView insertRowsAtIndexPaths:@[ indexPath ]
                       withRowAnimation:UITableViewRowAnimationTop];
  }];
}

Finally, if the user navigates to the map view controller, it displays the apartment listings using an ApartmentAnnotation class that implements the MKAnnotation protocol -- this tells the MKMapView where to display the apartments on the map.

//The MapViewController's viewDidLoad method:
- (void)viewDidLoad {
  [super viewDidLoad];
  for (NSDictionary *apartment in self.apartments) {
    //MKAnnotation is a protocol, so refer to it through id<MKAnnotation>
    id<MKAnnotation> annotation = [[ApartmentAnnotation alloc]
                                   initWithApartment:apartment];
    [self.mapView addAnnotation:annotation];
  }
}

//The ApartmentAnnotation class:
@interface ApartmentAnnotation : NSObject <MKAnnotation>

//The listing backing this annotation
@property (nonatomic, strong) NSDictionary *apartment;

- (id)initWithApartment:(NSDictionary *)apartment;

@end

@implementation ApartmentAnnotation

- (id)initWithApartment:(NSDictionary *)apartment {
    if (self = [super init]) {
        self.apartment = apartment;
    }
    return self;
}

- (NSString *)title {
    return self.apartment[@"address"];
}

- (NSString *)subtitle {
    return [NSString stringWithFormat:@"%ld bedrooms",
            (long)[self.apartment[@"bedrooms"] integerValue]];
}

- (CLLocationCoordinate2D)coordinate {
    return CLLocationCoordinate2DMake(
      [self.apartment[@"latitude"] doubleValue],
      [self.apartment[@"longitude"] doubleValue]
    );
}

@end

Windows Phone 8
The Windows Phone Azure Mobile Services SDK supports typed data, much like the Android version. The Apartment class is very similar to the Android version, and the [DataTable]/[DataMember] attributes help with customizing the serialized JSON output to fit the backend model.

[DataTable(Name = "apartment")]
public class Apartment
{
  public int Id { get; set; }

  [DataMember(Name = "address")]
  public string Address { get; set; }

  [DataMember(Name = "published")]
  public bool Published { get; set; }

  [DataMember(Name = "bedrooms")]
  public int Bedrooms { get; set; }

  [DataMember(Name = "latitude")]
  public double Latitude { get; set; }

  [DataMember(Name = "longitude")]
  public double Longitude { get; set; }

  [DataMember(Name = "username")]
  public string UserName { get; set; }
}

The Windows Phone UI uses the Pivot control, which enables swipe navigation from the apartments list to the map that displays them, and to an additional page that is used to add new apartment listings. The apartment list is a simple ListBox control that has a data template with a few TextBlocks. The Pivot control is set up as follows:

<phone:Pivot Title="RENT A HOME">
  <phone:PivotItem Header="apartments">
    <StackPanel>
      <ListBox x:Name="listApartments">
        <ListBox.ItemTemplate>
          <DataTemplate>
            <StackPanel Orientation="Vertical">
              ... three TextBlock controls omitted for brevity ...
            </StackPanel>
          </DataTemplate>
        </ListBox.ItemTemplate>
      </ListBox>
    </StackPanel>
  </phone:PivotItem>
  <phone:PivotItem Header="map">
    <maps:Map x:Name="mapApartments" CartographicMode="Hybrid"
                                     LandmarksEnabled="True" />
  </phone:PivotItem>
  <phone:PivotItem Header="new">
    <StackPanel>
      ... standard UI for adding listings omitted for brevity ...
    </StackPanel>
  </phone:PivotItem>
</phone:Pivot>

When the page loads, the application fetches apartment listings from the mobile service backend and binds the resulting list to the ListBox. The LINQ-like syntax is very convenient for expressing queries, such as retrieving only published apartment listings, and the C# support for async methods makes it very easy to perform this operation asynchronously using the await operator:

var items = await MobileService.GetTable<Apartment>()
                               .Where(a => a.Published == true)
                               .ToListAsync();
listApartments.ItemsSource = items;

The Windows Phone application uses the Nokia Maps control, which is the recommended maps framework for Windows Phone 8 (Microsoft.Phone.Maps namespace). Apartment listings are displayed on top of the map as simple overlays that, when tapped, display a message with the apartment's details and zoom in to the listing's location on the map:

mapApartments.Layers.Clear();
MapLayer layer = new MapLayer();
foreach (Apartment apartment in apartments)
{
  MapOverlay overlay = new MapOverlay();
  overlay.GeoCoordinate = new GeoCoordinate(
                  apartment.Latitude, apartment.Longitude);
  overlay.PositionOrigin = new Point(0, 0);
  Grid grid = new Grid
  {
    Height = 40,
    Width = 25,
    Background = new SolidColorBrush(Colors.Red)
  };
  TextBlock text = new TextBlock
  {
    Text = apartment.Bedrooms.ToString(),
    VerticalAlignment = VerticalAlignment.Center,
    HorizontalAlignment = HorizontalAlignment.Center
  };
  grid.Children.Add(text);
  overlay.Content = grid;
  grid.Tap += (s, e) =>
  {
    MessageBox.Show(
      "Address: " + apartment.Address + Environment.NewLine +
      apartment.Bedrooms + " bedrooms",
      "Apartment", MessageBoxButton.OK);
    mapApartments.SetView(overlay.GeoCoordinate, 15,
                          MapAnimationKind.Parabolic);
  };
  layer.Add(overlay);
}
mapApartments.Layers.Add(layer);

Windows 8
The Windows 8 implementation is strikingly similar to the Windows Phone one. In fact, the latest release of Windows Azure Mobile Services consolidates most of the .NET frameworks into a single portable class library, with only minor parts provided as separate auxiliary assemblies (this was enabled by introducing much-awaited portable class library support for the HttpClient class). This means that our application's model could be placed in a portable class library as well, and reused from all supporting .NET platforms. (This is not currently the case.)

Because of the larger screen real estate, the Windows 8 application doesn't have multiple pages -- the entire UI can fit on the screen. The apartment listings are bound to a ListView control, and the map on the right displays them alongside.

The code responsible for manipulating the model is not very interesting, but the maps framework is worth mentioning. The Windows 8 application uses the Bing Maps control, which requires an API key that you obtain online. In the Bing Maps parlance, overlays are called pushpins, and here's how you place them on the map:

mapApartments.Children.Clear();
foreach (Apartment apartment in apartments)
{
  Pushpin pushpin = new Pushpin
  {
    Text = apartment.Bedrooms.ToString()
  };
  mapApartments.Children.Add(pushpin);
  Location location = new Location(
                    apartment.Latitude, apartment.Longitude);
  MapLayer.SetPosition(pushpin, location);
  pushpin.Tapped += (s, e) =>
  {
    mapApartments.SetView(location, 15);
  };
}

Server Scripts
On the backend side, we need a server script to enrich our apartment listing with geographical coordinates. The user provides an address, such as "One Microsoft Way, Redmond WA", which we have to convert to a latitude-longitude pair. (This process is called geocoding.)

Even though each mobile platform supports some geocoding service (for example, on Android it's the Google geocoding service, exposed through the Geocoder class), it would be a fairly bad idea to perform geocoding on the client. One reason is that the geocoding process is slow and expensive. Another reason is that on each platform, geocoding the same address might lead to a different result, which is wildly unintuitive. This is why this work is best offloaded to the service's backend.

Specifically, here's the relevant part from the insert script on the apartment table, which performs geocoding with Google's free geocoding API using the request Node.js module provided by Windows Azure Mobile Services:

function insert(item, user, request) {
  var reqModule = require('request');
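  var base = 'http://maps.googleapis.com/maps/api/geocode/json?sensor=false';
  //encodeURIComponent, unlike escape, handles non-ASCII text and '+'
  var what = encodeURIComponent(item.address);
  reqModule(base + '&address=' + what,
    function(error, response, body) {
      if (!error && response.statusCode === 200) {
        var geoResult = JSON.parse(body);
        //The results array is empty if the address was not found
        if (geoResult.results && geoResult.results.length > 0) {
          var location = geoResult.results[0].geometry.location;
          item.latitude = location.lat;
          item.longitude = location.lng;
        }
      }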
      //Continue processing the request, omitted for brevity
    }
  );
}
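
The parsing step is the part most likely to fail on real input, so it may be worth isolating. Here is a minimal sketch of a helper that extracts the first coordinate pair and returns null for anything unexpected. It assumes the response carries a status field alongside the results array; extractLocation is a hypothetical name, and the coordinates in the usage example are made up for illustration.

```javascript
// Pulls the first latitude/longitude pair out of a geocoding response body.
// Returns null for malformed JSON, a non-OK status, or an empty result list.
function extractLocation(body) {
  var geoResult;
  try {
    geoResult = JSON.parse(body);
  } catch (ex) {
    return null;
  }
  if (!geoResult || geoResult.status !== 'OK' ||
      !Array.isArray(geoResult.results) || geoResult.results.length === 0) {
    return null;
  }
  var geometry = geoResult.results[0].geometry;
  if (!geometry || !geometry.location) {
    return null;
  }
  return {
    latitude: geometry.location.lat,
    longitude: geometry.location.lng
  };
}

// Usage with a hand-written response body (coordinates are illustrative only).
var sample = JSON.stringify({
  status: 'OK',
  results: [{ geometry: { location: { lat: 47.5, lng: -122.25 } } }]
});
var found = extractLocation(sample);
```

With a helper like this, the insert script can store the coordinates when a location is returned and still insert the listing without them when geocoding fails.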

Summary
This concludes our whirlwind tour of the Rent a Home application, more specifically its UI-related parts and the maps frameworks used. In the next installment, we'll look at how authentication (with Twitter) was integrated into the app on all four platforms, and how it was then used to associate apartment listings with the name of the user who added them.



Windows Azure Mobile Services "Rent a Home" Sample, Part 1: Introduction

For my Visual Studio Live! talk on Windows Azure Mobile Services, I decided to go beyond the "todolist" quick start samples and implement an application that illustrates more framework-specific and platform-specific features. The application is called "Rent a Home", and helps users share apartments for rent and view apartments for rent on a map around their location. Although this is not a production quality application -- for one thing, there is no way to contact the apartment owner! -- it's a more realistic illustration of why you would want a shared backend for your mobile application on all four major mobile platforms: Android, iOS, Windows Phone, and Windows 8.

In this post, we'll look at the feature list and some screenshots from the four platforms. In Part 2 we'll look at the user interface implementation and how it is connected to the backing Mobile Service. We will also see how server-side scripts enrich the user experience by providing geocoding of user-specified apartment addresses. In Part 3 we will explore authentication support (with Twitter), and obtain the user's name from the Twitter API. Finally, in Part 4 we'll look at push notifications on the four platforms, and see what's necessary on the server side to send push notifications whenever a new apartment listing is added. You might want to follow along by looking at the code, which is available on GitHub.

The Rent a Home application has the following features on all four platforms:

  • Users can add an apartment listing by providing a street address and the number of bedrooms
  • Users can unpublish an apartment listing that they previously published
  • Users can view the apartment listings in a table or on a map
  • Users can sign in with Twitter to associate apartment listings with their identity
  • Users can receive push notifications whenever a new apartment listing is added

Android screenshots

iOS screenshots

Windows Phone screenshots

Windows 8 screenshot

In the next part, we'll look at the implementation of the application's UI on all platforms, and how server scripts enrich this implementation with location information for the maps to work properly.



Wish List for the Visual Studio Editor and Debugger: Drawing Inspiration from Other IDEs

One of the benefits of using more than one development platform, more than one IDE, and more than one debugger is that you gain a better understanding of what your personal ideal development workflow looks like. It might well be the case that no single tool provides every feature you're excited about, which is what I feel these days. Because Visual Studio is my long-time (since 1999) favorite IDE and debugger, here are some features from other tools I'd like to see integrated in Visual Studio.

Inspired by command-line debuggers, like WinDbg/GDB/DDD/DBX

Command-line debuggers tend to make automation much easier than their visual counterparts. Visual Studio 2010 used to offer Visual Basic macros, which you could use for fairly sophisticated automation. For instance, you could have a breakpoint trigger a macro that would set another breakpoint, dump out all thread stacks, look for local variable values, and evaluate complex expressions. In Visual Studio 2012, macro support has been removed from the IDE, which makes command-line tools even more powerful in comparison.

Here are some of the things you can do in command-line debuggers that you can't do in Visual Studio today:

  • Set a breakpoint when a large memory location is read/written.
    This is something you can do in WinDbg using the !vprotect extension command to configure a PAGE_GUARD page, and then process the exception Windows will raise when the memory is accessed.
  • Run an arbitrary debugger command when a breakpoint is hit.
    In WinDbg, any breakpoint you set (including data breakpoints) can run any debugger command -- and decide whether to actually stop execution or continue as if the breakpoint wasn't hit.
  • Execute a set of debugger commands from a script file (the aptly named $$>< command in WinDbg).
  • Inspect managed heap contents, possibly running a command for each interesting heap object you encountered.
    This is very useful for managed memory leak diagnostics. For example, you could run the !gcroot command for all objects of a specific type:
    .foreach (obj {!dumpheap -type MyNamespace.MyClass -short}) {!gcroot obj; .echo -----;}
  • Load debugger extensions (like SOS or SOSEX) and extend your debugger with arbitrary commands. For example, SOSEX has a great variety of stack listing, object inspection, and synchronization-related commands that you could then use.

Inspired by Xcode

The Xcode IDE has had a mixed reception from developers. I think most people moving to Xcode from Visual Studio will find it inadequate, but it actually has a few very nice features, both at development time and during debugging. This is not to say that I would actually want to do most of my development in Xcode -- but I've already been spoiled by Visual Studio :-)

One thing Xcode is really good at is static code analysis, which it performs on the fly. Now, Visual C++ also has static code analysis (the /analyze compiler switch), but it only runs when you compile your project, and takes quite a while to produce errors. Also, there's a big problem with signal to noise ratio, which I wish were addressed in future releases of the compiler. Here are a couple of examples:

  • Detecting a null pointer dereference and explaining where it's coming from. (This is actually something Visual C++ can do, but with less explanatory detail.)
  • Detecting reference counting errors.
  • Detecting errors that result from calling methods on a nil pointer, which would return garbage in the case of structs.

Another thing the Xcode debugger is pretty good at (thanks to the underlying lldb debugger engine) is smart breakpoints, and performing a variety of actions when a breakpoint is hit. The Apple version of macros is AppleScript, and Xcode breakpoints can run an arbitrary AppleScript script as well as any lldb command.

For example, you could configure a breakpoint so that it takes a screenshot of your iPhone Simulator whenever it's hit, using the following AppleScript script [source]:

tell application "iPhone Simulator"
       activate
end tell

tell application "System Events"
       tell process "iOS Simulator"
               tell menu bar 1
                       tell menu bar item "iOS Simulator"
                               tell menu "iOS Simulator"
                                       click menu item "Hide Others"
                               end tell
                       end tell
               end tell
       end tell
end tell

do shell script "screencapture -m /tmp/screencapture.png"

Inspired by Eclipse

Although Eclipse draws lots of criticism -- it's slow, buggy, and some say ugly -- it has an unparalleled set of built-in refactorings and code-fixes that make most other IDEs pale in comparison.

First, just take a look at the Refactor menu in Eclipse:

[Screenshot: the Refactor menu in Eclipse]

Now, many of these things can be added to Visual Studio through ReSharper or CodeRush or similar tools, but it's actually very nice to have these features in the out-of-the-box IDE experience. Also, the code completion and error-fixing offered by Eclipse are so powerful that some people say you can create a new project in Eclipse, and just by using Ctrl+Space generate the entire source code for Eclipse itself ;-)

For example, here's Eclipse offering to fix a problem where an anonymous class method is trying to access a non-final local variable from the enclosing scope:

[Screenshot: the Eclipse quick-fix menu]

Summary

I often suffer every minute I have to spend outside of Visual Studio. The shortcut muscle memory, the powerful code editor, and of course my favorite language -- C# -- make it somewhat a pain to use anything else. That is why I am constantly looking for more features to extend that experience -- so that I can spend even more time in Visual Studio and be even more productive.



Wishes for the CLR JIT in the 2020s

There have been some very interesting discussions at the MVP Summit concerning the CLR JIT, what we expect of it, and how to evolve it. I obviously can't disclose any NDA materials, but what I can do is share my hopes and dreams for the JIT, going forward. This is not a terribly popular subject, but there are some UserVoice suggestions around the JIT, such as adding SIMD support to C#.

The state of the JIT today is that it's a fairly quick compiler that does a fairly bad job at optimization. There are some tricks it employs that are not available to statically compiled languages, such as interface method inlining via profiling, but compared to state-of-the-art dynamic JITs, it lags far behind. Some of the biggest weaknesses, in my opinion, include:

  • Almost complete lack of support for SIMD operations, both automatic and programmer-specified
  • Issues with inlining support of methods that return or accept complex value types
  • Disparity between x86 and x64 JITs in terms of code quality and compilation speed
  • Lack of flexibility on inlining and "hot-spot" optimization decisions
  • No extra, breath-taking optimizations in NGen compared to the runtime JIT
  • Insufficient knobs for tweaking JIT behavior, aggressiveness, memory utilization

Of all these, I think the biggest pain point is support for SIMD operations. In Visual C++ 2012, the compiler can automatically vectorize loops over integers or floats and produce super-efficient vector operations using 128- and 256-bit registers. The potential speedup from auto-vectorization can reach 8x on modern hardware (for example, eight 32-bit floats fit in a single 256-bit register). The biggest challenge with auto-vectorization is figuring out whether it's safe to perform it -- i.e., making sure that no dependencies exist that would break the vectorized version. But that kind of analysis is easier in C# than it is in C++.
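
As a rough illustration of that safety analysis, here is a sketch of two loops: one whose iterations are independent, and one with a loop-carried dependency. The function names are made up for this example, and whether a particular compiler actually vectorizes the first loop depends on its version and flags.

```cpp
#include <cstddef>
#include <vector>

// Each iteration reads b[i] and c[i] and writes only a[i], so iterations are
// independent and a vectorizer is free to process several elements at once.
// Assumes b and c are at least as long as a. (Hypothetical helper.)
void add_arrays(std::vector<float>& a, const std::vector<float>& b,
                const std::vector<float>& c) {
  for (std::size_t i = 0; i < a.size(); ++i) {
    a[i] = b[i] + c[i];
  }
}

// Iteration i reads a[i - 1], which iteration i - 1 has just written.
// This loop-carried dependency means a naive "several elements at a time"
// rewrite would compute different results. (Hypothetical helper.)
void prefix_sum(std::vector<float>& a) {
  for (std::size_t i = 1; i < a.size(); ++i) {
    a[i] += a[i - 1];
  }
}
```

The second loop is the kind of case the compiler has to detect and either leave scalar or transform with a smarter algorithm.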

Even without auto-vectorization support, C++ compilers have offered intrinsic operations for years that give low-level developers the opportunity to optimize their loops, game engines, and math operations, at the expense of nastiness such as:

for (int i = 0; i < size; i += 8) {
  __m128i vb = _mm_load_si128((__m128i const*)&b[i]);
  __m128i vc = _mm_load_si128((__m128i const*)&c[i]);
  __m128i vd = _mm_load_si128((__m128i const*)&d[i]);
  vc = _mm_add_epi16(vc, vtwo);
  vd = _mm_add_epi16(vd, vk);
  __m128i mask = _mm_cmpgt_epi16(vb, vzero);
  vc = _mm_and_si128(vc, mask);
  vd = _mm_andnot_si128(mask, vd);
  __m128i vr = _mm_or_si128(vc, vd);
  _mm_store_si128((__m128i*)&a[i], vr);
}
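
For readers who don't speak SSE2, the loop above appears to be the vectorized form of a simple per-element select. The scalar sketch below assumes that a, b, c, and d are arrays of 16-bit integers, that vzero holds zero in every lane, that vtwo holds 2 in every lane, and that vk holds some constant k in every lane. None of those definitions appear in the snippet, so treat them as guesses.

```cpp
#include <cstddef>
#include <cstdint>

// Scalar sketch: where b[i] is positive take c[i] + 2, otherwise d[i] + k.
// The constants 2 and k are assumptions about what vtwo and vk contain.
void select_add(std::int16_t* a, const std::int16_t* b, const std::int16_t* c,
                const std::int16_t* d, std::int16_t k, std::size_t size) {
  for (std::size_t i = 0; i < size; ++i) {
    // The cast narrows the int result back to 16 bits, like the epi16 adds.
    a[i] = static_cast<std::int16_t>(b[i] > 0 ? c[i] + 2 : d[i] + k);
  }
}
```

The intrinsic version handles eight of these elements per iteration, which is why its loop counter advances by 8.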

Now, I'm not an advocate for including every single intrinsic into C# so that it's supported by the JIT. In fact, I think it would be an ugly approach to take, albeit a relatively easy one. Another option is simulating existing vector libraries for C++ code, using a small number of built-in types whose operations are automatically promoted to SIMD instructions. For example:

Vector4f a = new Vector4f(1.0f, 2.0f, 3.0f, 4.0f);
Vector4f b = a.Reverse();
Vector4f c = -a * b/2;
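
To show the kind of C++ vector library this idea borrows from, here is a minimal sketch of such a type. The Vector4f struct, its lanes member, and its operators are hypothetical and written only for this illustration. The code is plain scalar C++, and the point is that every operation is lane-wise, which is the shape a compiler or JIT could map onto a single 128-bit register.

```cpp
#include <array>

// Hypothetical four-lane float vector; not part of any real library.
struct Vector4f {
  std::array<float, 4> lanes;

  // Returns a copy with the lane order reversed.
  Vector4f Reverse() const {
    return Vector4f{{lanes[3], lanes[2], lanes[1], lanes[0]}};
  }
};

// Lane-wise negation.
inline Vector4f operator-(const Vector4f& v) {
  return Vector4f{{-v.lanes[0], -v.lanes[1], -v.lanes[2], -v.lanes[3]}};
}

// Lane-wise multiplication.
inline Vector4f operator*(const Vector4f& x, const Vector4f& y) {
  Vector4f r{};
  for (int i = 0; i < 4; ++i) r.lanes[i] = x.lanes[i] * y.lanes[i];
  return r;
}

// Division of every lane by the same scalar.
inline Vector4f operator/(const Vector4f& x, float s) {
  Vector4f r{};
  for (int i = 0; i < 4; ++i) r.lanes[i] = x.lanes[i] / s;
  return r;
}
```

With a = (1, 2, 3, 4), the expression -a * a.Reverse() / 2 evaluates to (-2, -3, -3, -2), mirroring the C# snippet above.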

But there definitely exist more elegant approaches that also decouple the source code from the processor implementation, so that, for example, when larger registers are available (e.g. 512-bit registers), they could be used automatically. This would likely require either fully automatic vectorization, so that you write "a simple loop" and the compiler generates appropriate instructions, or significant new syntax/attributes that would be used by the developer to hint that automatic vectorization is possible. Perhaps a new type of arrays, or a new type of array indexing syntax, would go toward enabling this experience.

My next item on the list after SIMD support would be extra work in NGen. Currently, running NGen on your application has (almost) only the effect of improving startup performance, because there's no need to perform compilation at runtime. However, I'd be willing to sacrifice compilation time when running NGen -- i.e., accept a much slower NGen turnaround time -- in exchange for more efficient generated code and additional optimizations.

You are welcome to chime in with your suggestions and thoughts in the comments, and make sure to visit UserVoice and make suggestions. (Although you should first skim through the existing ones to avoid diluting votes on already-posted suggestions.)



Easier Tuple-Like Classes in C#

A discussion during the MVP Summit prompted me to think about what would make it easier to use tuple-like classes while preserving valuable naming information of the tuple's constituents. The purpose of this exercise is to try and come up with a solution that does not require modification of existing C# syntax.

To set the scene, consider the following method:

bool ParseRequest(string request, out string operation, out int id) {
	// out parameters must be assigned on every return path
	operation = null;
	id = 0;
	string[] parts;
	if (request == null || (parts = request.Split(' ')).Length != 2) {
		return false;
	}
	operation = parts[1];
	return int.TryParse(parts[0], out id);
}

Using this method entails the very inconvenient syntax typical for out parameters:

string request = ...;
string operation;
int id;
if (ParseRequest(request, out operation, out id)) ...

We can resort to tuples, but it makes things that much uglier on the caller's side:

Tuple<bool, string, int> ParseRequest(string request) ...

var result = ParseRequest(request);
if (result.Item1) ... //continue to use result.Item2, result.Item3

One alternative that requires syntax changes relies on providing methods with multiple return values, or at least syntax for unpacking tuples such as the following:

(success, operation, id) = ParseRequest(request);
if (success) ...

In the latter syntax, the types of success, operation, and id would be implicitly determined by the compiler based on the return type of the ParseRequest method. But again, that requires syntax changes. Here's an idea that doesn't:

struct ParseResult {
	public bool Success;
	public string Operation;
	public int Id;
}

ParseResult ParseRequest(string request) {
	//proceeds to return new ParseResult { ... }
}

This is nice, but loses many of the advantages that the Tuple class has, such as immutability. How about we combine the two?

class ParseResult : Tuple<bool, string, int> {}

That's bad for the caller -- we still have to access the Item1, Item2, and Item3 properties. In other words, we need renaming:

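class ParseResult : Tuple<bool, string, int> {
	// Tuple has no parameterless constructor, so forward the values to it
	public ParseResult(bool success, string operation, int id)
		: base(success, operation, id) {}
	public bool Success { get { return Item1; } }
	public string Operation { get { return Item2; } }
	public int Id { get { return Item3; } }
}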

Now, this is the kind of thing compilers are good at. It is tempting to go back to the drawing board and add some auxiliary syntax, maybe something like:

class ParseResult (bool success, string operation, int id) {}

The compiler could then go ahead and generate exactly the same thing as the previous snippet. If you don't like the parens before the class definition, I guess we could settle for the following:

[RenamedTuple("success", "operation", "id")]
class ParseResult : Tuple<bool, string, int> {}

This is even something you could easily feed into an AOP framework and expect it to generate the boilerplate code for you. Alternatively, if you can live with the perf hit introduced by dynamic invocation, you could have the following:

class ParseResult : DynamicTuple<bool, string, int> {
	public ParseResult() : base("success", "operation", "id") {}
}

Here, the DynamicTuple class can be essentially a dynamic key-value store that would provide the necessary Success, Operation, and Id properties for the caller, as well as possibly a convenient constructor-like initialization syntax for the callee.

There are probably dozens of additional solutions that all nudge the verbosity in different directions, but nothing that can be concise without introducing additional syntax to the language. Hopefully this illustrates that it's easy to complain about the language, but not easy at all to come up with a satisfactory alternative that would be agreeable for everyone :-)



Identify the User-Mode Drivers Loaded into a WUDFHost.exe Instance

Once upon a time, it was fairly challenging to determine which services were running in an individual svchost.exe process. Today, with Process Explorer, there’s nothing easier – just hover over the svchost.exe process and you get a list of services, or double-click an svchost.exe process and go to the Services tab:

image

A similar problem can arise with user-mode drivers (UMDF). User-mode drivers are COM DLLs loaded into WUDFHost.exe processes, and some WUDFHost.exe processes may contain more than one user-mode driver. Process Explorer does not help in identifying which user-mode drivers are loaded into a WUDFHost.exe process, and although you can look at the list of DLLs and try to identify the ones that represent drivers, a more reliable way is desired. One option is to look at the list of threads inside the process, and identify command threads for UMDF drivers, such as this one:

image

A more reliable approach that will give you additional information on the driver and the device stack is the following:

  1. Run WinDbg as an administrator and attach (File > Attach to Process) to the WUDFHost.exe process in which you are interested.
  2. Type .load wudfext
  3. Type !umdevstacks

The resulting output will be similar to the following, and allow you to identify which device stacks (and hence user-mode drivers) are hosted in that process:

0:009> .load wudfext
0:009> !umdevstacks
Number of device stacks: 1
  Device Stack: 0x0000009d88ad5810    Pdo Name: \Device\0000001c
    Active: Yes
    Number of UM devices: 1
    Device 0
      Driver Config Registry Path: SensorsSimulatorDriver
      UMDriver Image Path: C:\Windows\system32\DRIVERS\UMDF\SensorsSimulatorDriver.dll
      Fx Driver: IWDFDriver 0x9d88d39e28
      Fx Device: IWDFDevice 0x9d88d3a118
        IDriverEntry: (unknown type) 0x0000009d88af21b0
      Open UM files (use !umfile <addr> for details): <None>
      Device XFerMode: CopyImmediately RW: Buffered CTL: Buffered
      Object Tracker Address: 0x0000000000000000
        Object   Tracking OFF
        Refcount Tracking OFF
    DevStack XFerMode: CopyImmediately RW: Buffered CTL: Buffered

This gives you enough information to identify everything that’s going on inside that process. By the way, if you’re into UMDF development, you should certainly check out the other commands in the wudfext extension, which give you insight into specific I/O requests, queues, and other UMDF objects.



Windows Performance Analyzer

In 2008, I blogged about the just-released Windows Performance Toolkit, and the xperf tool that collects ETW events (including stack traces) and displays them in a form that allows basic analysis. Since then, ETW generation and collection have taken a huge leap forward. Microsoft has released a great library for creating ETW providers, and a set of tools (PerfMonitor, PerfView) for analyzing ETW traces in .NET apps.

With the release of the Windows 8 SDK, xperf has been superseded by two new tools: WPR (Windows Performance Recorder), which enables ETW providers and captures traces, and WPA (Windows Performance Analyzer), which displays traces in graphical form including graphs and detail tables.

I wouldn’t want to sound like a broken record, but ETW is truly one of the most incredible instrumentation and diagnostic tools on Windows. The wealth of information you can discern from a properly captured ETW trace is overwhelming, and many seemingly-impossible problems have been solved in the past with simple ETW traces. For example, check out this story about identifying a faulty Western Digital hard disk driver that was doing 4GB memory allocations, or this story about performance issues in Windows Live Photo Gallery.

Getting started with WPA can be a little intimidating, but in the end it displays the same set of information. Moreover, you can use WPA to open ETW traces recorded with xperf – the file format is, of course, completely interoperable. As an example, let’s record a trace with the Base kernel group (that group includes sampling profiling events) and stackwalks for the Profile kernel flag:

xperf -on Base -stackwalk Profile

Now, after performing some activity (I chose to run a dir /s command in a command prompt window), turn off the data collection and merge the log:

xperf -d profile.etl

Finally, open the resulting file in WPA:

wpa profile.etl

image

The window looks a bit empty, so go ahead and expand some graphs on the left. When you encounter an interesting graph, drag it to the main view. In my case, I would like to see the stack activity for the cmd.exe and conhost.exe processes, so I’ll drag out the System Activity > Stacks Counts and System Activity > Processes Lifetime graphs:

image

Notice how after selecting a process in the lower graph, I get the same time interval highlighted in the upper graph. That’s a feature I was direly missing in xperfview.

Finally, to see detailed stack information for the relevant processes, click the toolbar icon on the upper left that says “Display graph and table”. The resulting table is quite similar to what xperf had to offer – you can drag columns to the left of the gold bar for grouping, and expand stack traces (after loading symbols with Trace > Load Symbols) to see the weight for each individual function. For example, after drilling down into the conhost.exe process, I found that it spends most of its time asking gdi32.dll to draw text on the screen – what a surprise!

image

To conclude, ETW is still very awesome and WPR/WPA make it somewhat easier to record and analyze ETW traces. For managed applications, you really should consider looking at PerfMonitor and PerfView, and Vance Morrison has a great set of blog posts and videos covering their various features. Chapter 2 of the Pro .NET Performance book covers some of these tools and concepts as well.

