Information complexity in animals is an indicator of advanced communication and an intricate socio-ecology. Zipf's law of least effort has been used to assess the potential information content of animal repertoires, including whether a particular animal communication system could be 'language-like'. As all human languages follow Zipf's law, with a power law coefficient (PLC) close to −1, animal signals with similar probability distributions are postulated to possess information characteristics similar to language. However, estimating the PLC from limited empirical datasets (e.g. most animal communication studies) is problematic because of biases arising from small sample sizes. The traditional approach to estimating the Zipf's law PLC is to fit the slope of a log–log rank–frequency plot. Our alternative exploits the underlying equivalence between Shannon entropy (i.e. whether successive elements of a sequence are unpredictable or repetitive) and the PLC. Here, we test whether an entropy approach yields more robust estimates of the Zipf's law PLC than the traditional approach. We examined the efficacy of the entropy approach in two ways. First, we estimated the PLC from synthetic datasets generated with a priori known power law probability distributions. This revealed that PLC estimates from the traditional method are particularly inaccurate for highly stereotyped sequences, even at modest repertoire sizes. Estimation via Shannon entropy is accurate with modest sample sizes even for repertoires with thousands of distinct elements. Second, we applied both approaches to empirical data from 11 animal species. Shannon entropy produced a more robust estimate of the PLC, with lower variance than the traditional method, even when the true PLC is unknown. Our approach reveals, for the first time, Zipf's law operating in the vocal systems of multiple lineages: songbirds, hyraxes and cetaceans.
As different methods of estimating the PLC can lead to misleading results in real data, estimating the balance of a communication system between simplicity and complexity is best performed using the entropy approach. This provides a more robust way to investigate the evolutionary constraints and processes that have acted on animal communication systems, and the parallels between these processes and the evolution of language.
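The two estimators contrasted above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the OLS fit, and the grid inversion of the entropy-to-PLC map are our own assumptions about how such estimators are typically written.

```python
import numpy as np

def zipf_probs(n, a):
    """Normalised Zipf probabilities p_r proportional to r**a, ranks 1..n (a < 0)."""
    ranks = np.arange(1, n + 1, dtype=float)
    weights = ranks ** a
    return weights / weights.sum()

def plc_traditional(counts):
    """Traditional estimator: OLS slope of the log-log rank-frequency plot."""
    freq = np.sort(counts[counts > 0])[::-1]          # frequencies, descending
    ranks = np.arange(1, len(freq) + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(freq), 1)
    return slope

def shannon_entropy(p):
    """Shannon entropy in nats, ignoring zero-probability elements."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def plc_entropy(counts, n, grid=np.linspace(-3.0, 0.0, 601)):
    """Entropy estimator: for a repertoire of n elements, the theoretical Zipf
    entropy is a monotone function of the exponent, so we pick the exponent on
    a grid whose theoretical entropy best matches the observed entropy."""
    h_obs = shannon_entropy(counts / counts.sum())
    h_theory = np.array([shannon_entropy(zipf_probs(n, a)) for a in grid])
    return grid[np.argmin(np.abs(h_theory - h_obs))]
```

On counts drawn exactly from a Zipf distribution both estimators recover the true exponent; the paper's point is that with small samples from stereotyped (steep) distributions the regression slope degrades much faster than the entropy inversion.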
Bibliographical note: Publisher Copyright:
© 2020 British Ecological Society
- Shannon entropy
- Zipf's Law
- animal communication
- information theory