Both getVarint32Ptr and getVarint32PtrFallback return nullptr when
p >= limit. In that case, skip the call to getVarint32PtrFallback and
return from getVarint32 immediately.
As it is safe to call getVarint32 with p >= limit, skip the extra check in
each getDifferentialVarInt32 loop iteration. This speeds up decoding
for the 3 benchmark cases by ~10%.
Remove the unnecessary reinterpret_cast<char*>(p); p is already a char*.
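
The first two changes can be sketched as follows. This is a minimal illustration, not the code from this patch: decodeVarint32, decodeVarint32Fallback and decodeDeltas are hypothetical stand-ins, and their signatures and the base-128 varint layout are assumptions.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical slow path: decodes multi-byte values and returns nullptr
// when the input is truncated or longer than five bytes.
inline const char* decodeVarint32Fallback(const char* p, const char* limit,
                                          uint32_t* value) {
    uint32_t result = 0;
    for (uint32_t shift = 0; shift <= 28 && p < limit; shift += 7) {
        uint32_t byte = static_cast<unsigned char>(*p++);
        if (byte & 0x80) {
            result |= (byte & 0x7F) << shift;  // continuation bit set
        } else {
            result |= byte << shift;           // last byte of this value
            *value = result;
            return p;
        }
    }
    return nullptr;
}

// Hypothetical fast path: one-byte values are decoded inline.
inline const char* decodeVarint32(const char* p, const char* limit,
                                  uint32_t* value) {
    if (p >= limit) {
        return nullptr;  // same result as the fallback, without the call
    }
    uint32_t byte = static_cast<unsigned char>(*p);
    if ((byte & 0x80) == 0) {
        *value = byte;
        return p + 1;
    }
    return decodeVarint32Fallback(p, limit, value);
}

// Hypothetical delta decoder: each varint is the difference from the
// previous value. The loop has no separate p < limit test because
// decodeVarint32 already returns nullptr once p reaches limit.
inline std::vector<uint32_t> decodeDeltas(const char* p, const char* limit) {
    std::vector<uint32_t> out;
    uint32_t previous = 0;
    uint32_t delta = 0;
    while ((p = decodeVarint32(p, limit, &delta)) != nullptr) {
        previous += delta;
        out.push_back(previous);
    }
    return out;
}
```

In this sketch the loop stops on nullptr both at the end of the input and on a malformed varint; a caller that needs to tell the two apart would have to track the last valid position itself.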