It might be because the image is ambiguous and the brain cannot tell how to interpret it. But if the brain is given more clues, the cube becomes easy to see.
Ambiguous image: one cannot determine which side is at the front and which is at the back. Both views are correct.
Here your brain has extra clues to interpret the image.