Apply ARM patches from NVIDIA for improved drawing performance.

Add S32A_Opaque_BlitRow32 with TEST_SRC_ALPHA
Add optimization for 32-bit blits on NEON
Optimize S32A_D565 pixel loop for non-NEON CPUs
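
For reviewers unfamiliar with the source-alpha test idea, here is a minimal
portable sketch of what a blit row that tests source alpha might look like.
It is not the code in this patch (which is ARM-specific). The names
scale_pixel and blend_row_test_alpha are hypothetical, and the sketch assumes
premultiplied 32-bit pixels with alpha in the top byte.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: scale all four 8-bit channels of a packed pixel
// by scale/256, where scale is in the range 0..256.
static inline uint32_t scale_pixel(uint32_t c, uint32_t scale) {
    const uint32_t mask = 0x00FF00FFu;
    uint32_t rb = ((c & mask) * scale) >> 8;   // channels in bits 0-7 and 16-23
    uint32_t ag = ((c >> 8) & mask) * scale;   // channels in bits 8-15 and 24-31
    return (rb & mask) | (ag & ~mask);
}

// Hypothetical src-over row blit that inspects the source alpha first.
// Assumes premultiplied pixels with alpha in bits 24-31.
static void blend_row_test_alpha(uint32_t* dst, const uint32_t* src, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        uint32_t s = src[i];
        uint32_t a = s >> 24;
        if (a == 0) {
            continue;            // fully transparent: leave dst untouched
        }
        if (a == 255) {
            dst[i] = s;          // fully opaque: plain copy, no blend math
            continue;
        }
        // Partial alpha: src + dst * (256 - a) / 256
        dst[i] = s + scale_pixel(dst[i], 256 - a);
    }
}
```

The two early-outs are the point of the test: rows dominated by fully
transparent or fully opaque pixels skip the multiply entirely.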

bug: 6467331
Change-Id: I3e0b0a8f711bf2ed97b480b81232a52f6f94dbe3
2 files changed