Optimize the loop in gdalwarp which builds a mask. Handle the majority of an image in a loop, creating 8 bits at a time. Later, handle the edge case where less than 8 bits are packed.