Unroll Skia ARGB8888 blitter loops.

Measured a roughly 25% performance improvement on Tegra 2 for
S32_Opaque_BlitRow32.

Benchmark command:
  skia_bench -config 8888 -repeat 100 -match bitmap_8888_A
Iteration time drops from ~130ms to ~100ms.

Change-Id: I7345a04b21d5e0c696b1f4aca763bfeda822c7b5