Title of Paper: The segmentation of speech and the emergence of language structure: survival of the fittest in a statistical environment

Abstract: Empirical observations indicate that the ability to communicate with structured language, rather than with an unstructured stream of words, confers an evolutionary advantage. This paper looks at the process by which this ability could have developed, and shows how an intermediate stage, with its own advantage, was likely to evolve. Well-known tools from information theory are used in a novel way to evaluate the effectiveness of communication when language is or is not structured.

The structure of language can be seen as having three aspects. First, there are relationships between adjacent words, which can partially be modelled by Markov processes. Second, words can be grouped together to create constituents, which can then be organised in a hierarchical syntactic structure. Third, there are relationships between elements of constituents. In this paper we consider the grouping together of words into constituents. We will show that the transmission of information is more efficient when a stream of words is appropriately segmented, and that such segments are related to syntactic elements.

Human speech capabilities have evolved for speed, reliability and scope, but at a significant physiological cost. The ability to communicate effectively must have outweighed the concomitant disadvantages. Given this evidence of selection for efficient communication, we assume that speech will have developed so that messages can be encoded and decoded as efficiently as possible. To evaluate ease of decoding, the entropy of sequences of spoken words is compared under varying conditions. Lower entropy means that successive words in a sequence are more easily predicted, and entropy declines as more of the context is taken into account.
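For reference, the statement that entropy declines with context can be written in the standard Shannon form. The notation here is an assumption introduced for illustration and is not taken from the paper: $w_i$ denotes the $i$-th symbol (a word or its tag) and $H_n$ the entropy of a symbol given the $n-1$ symbols before it.

```latex
H_n = -\sum_{w_1,\dots,w_n} p(w_1,\dots,w_n)\,\log_2 p(w_n \mid w_1,\dots,w_{n-1}),
\qquad
H_{n+1} \le H_n .
```

The inequality holds for the true distribution of a stationary source; estimates from a finite corpus only approximate it.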
The contribution we make is to show that entropy can also be reduced by explicitly representing appropriate segmentation markers. The appropriate segmentation is the one given by the discontinuities observed in spontaneous spoken English. There is a relationship between prosody and syntactic structure, and these prosodic segments can usually be seen as syntactic elements.

This investigation was carried out using the MARSEC corpus of spoken English, which is annotated with prosodic markers. Unscripted news broadcasts were used. Words were mapped onto part-of-speech tags, and the entropy was then calculated with and without the major and minor discontinuities represented. Entropy declines when discontinuities are represented. If an arbitrary pattern of segmentation is imposed, the entropy rises sharply.

As language has evolved, we expect selection pressure to encourage methods of efficient communication, and our work indicates that this includes decomposing the speech stream into constituents with some syntactic cohesion. This process is a step on the way to the development of a hierarchical language structure, but the evolutionary advantage of reaching this stage can be explained without invoking the merits of a fully developed language structure.

References include Arnfield; Bell, Cleary and Witten; Charniak; Cover and Thomas; Fang and Huckvale; Jelinek; Lieberman; Lyon and Brown; Morgan and Demuth; Ostendorf and Veilleux; Shannon; Taylor and Black.
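The comparison described in the abstract, entropy of a tag sequence with and without discontinuity markers, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function names, the tag labels, the marker symbols `|` (minor) and `||` (major), and the toy sequence are all hypothetical, and the estimate is a plain maximum-likelihood n-gram conditional entropy.

```python
import math
from collections import Counter


def conditional_entropy(tokens, order=2):
    """Estimate, in bits, the entropy of a token given the previous
    (order - 1) tokens, using maximum-likelihood n-gram counts."""
    if order < 1 or len(tokens) < order:
        raise ValueError("need order >= 1 and at least `order` tokens")
    ngrams = Counter(
        tuple(tokens[i:i + order]) for i in range(len(tokens) - order + 1)
    )
    # Context counts are derived from the same n-grams, so the
    # conditional probabilities for each context sum to one.
    contexts = Counter()
    for gram, count in ngrams.items():
        contexts[gram[:-1]] += count
    total = sum(ngrams.values())
    entropy = 0.0
    for gram, count in ngrams.items():
        joint = count / total
        conditional = count / contexts[gram[:-1]]
        entropy -= joint * math.log2(conditional)
    return entropy


def strip_markers(tokens, markers=("|", "||")):
    """Return the sequence with the (hypothetical) boundary symbols removed."""
    return [t for t in tokens if t not in markers]


# Hypothetical toy sequence of part-of-speech tags with boundary markers.
tagged = ["DET", "N", "V", "|", "DET", "ADJ", "N", "||",
          "PRON", "V", "DET", "N", "|", "PREP", "DET", "N", "||"]

with_markers = conditional_entropy(tagged, order=2)
without_markers = conditional_entropy(strip_markers(tagged), order=2)
print(f"with markers:    {with_markers:.3f} bits")
print(f"without markers: {without_markers:.3f} bits")
```

A sequence this short says nothing about real speech; the sketch only shows the shape of the calculation. An arbitrary segmentation could be simulated in the same way by inserting the marker symbols at random positions before calling `conditional_entropy`.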