To understand how cognitive mechanisms are grounded in perception and action, we used a simple detection task to determine how long it takes to predict an action goal from the perception of grasp postures and whether this prediction is under strategic control. Healthy observers detected visual probes over small or large objects after seeing either a precision grip or a power grip posture. Although the posture was uninformative about probe location, it induced attention shifts to the grasp-congruent object within 350 ms. When the posture predicted target appearance over the grasp-incongruent object, observers' initial strategic allocation of attention was overruled by the congruency between grasp and object. These results might help to characterize the human mirror neuron system and reveal how joint attention tunes early perceptual processes toward action prediction.