Current text-to-speech systems achieve very good intelligibility, but most are still easily identified as artificial voices, and no commercial system incorporates prosodic variation resulting from emotion and related factors. This is largely due to the complexity of identifying and categorising the emotional factors in natural human speech, and of implementing these factors within synthetic speech. However, prosodic content in synthetic speech is increasingly seen as important, and there is presently renewed interest in the investigation of human vocal emotion and in the expansion of synthesis models to allow greater prosodic variation. Such models could also serve as practical tools for investigating and validating models of emotion and other speech-altering stressors. This paper reviews progress to date in the investigation of human vocal emotions and their simulation in synthetic speech, and outlines the research required to develop this area further.