Md. Izhar Ashraf, Sitabhra Sinha. Published: January 17, 2018. https://doi.org/10.1371/journal.pone.0190735
Language, which allows complex ideas to be communicated through symbolic sequences, is a characteristic feature of our species and manifested in a multitude of forms. Using large written corpora for many different languages and scripts, we show that the occurrence probability distributions of signs at the left and right ends of words have a distinct heterogeneous nature. Characterizing this asymmetry using quantitative inequality measures, viz. information entropy and the Gini index, we show that the beginning of a word is less restrictive in sign usage than the end. This property is not simply attributable to the use of common affixes as it is seen even when only word roots are considered. We use the existence of this asymmetry to infer the direction of writing in undeciphered inscriptions that agrees with the archaeological evidence. Unlike traditional investigations of phonotactic constraints which focus on language-specific patterns, our study reveals a property valid across languages and writing systems. As both language and writing are unique aspects of our species, this universal signature may reflect an innate feature of the human cognitive phenomenon.
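To make the two inequality measures named in the abstract concrete, here is a minimal Python sketch that computes the Gini index and the Shannon entropy of the sign-usage distributions at the first and last positions of words. The function names (`gini`, `entropy`, `terminal_stats`) and the toy word list are hypothetical. Counting over the full set of signs seen anywhere in the corpus, so that signs never used at a terminal position count as zeros, is an assumption of this sketch and not necessarily the authors' exact procedure.

```python
from collections import Counter
from math import log2


def gini(counts):
    """Gini index of a list of non-negative counts (0 = perfectly even usage)."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted formula on the ascending-sorted values.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def entropy(counts):
    """Shannon entropy in bits of the distribution given by the counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def terminal_stats(words):
    """Gini index and entropy of sign usage at the first and last positions."""
    words = [w for w in words if w]
    # Assumption: the sign inventory is every sign seen anywhere in the corpus.
    signs = sorted({ch for w in words for ch in w})
    positions = {
        "first": Counter(w[0] for w in words),
        "last": Counter(w[-1] for w in words),
    }
    stats = {}
    for name, counter in positions.items():
        counts = [counter.get(s, 0) for s in signs]
        stats[name] = {"gini": gini(counts), "entropy": entropy(counts)}
    return stats


# Toy word list (hypothetical, far too small to support any real conclusion).
toy_words = ["cat", "dog", "bat", "pit", "hut"]
for position, values in terminal_stats(toy_words).items():
    print(position, values)
```

A lower Gini index, or equivalently a higher entropy, at one end means sign usage there is more evenly spread, which is what the abstract calls "less restrictive". In this toy list the word beginnings are more evenly spread than the endings, but a real test needs a large corpus.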
From the paper:
We have used a database where the relatively few sequences which are believed to have been written from left to right have been reversed so as to be oriented in the same direction as the majority, following standard procedure used for constructing concordances for Indus Valley Civilization inscriptions. We observe from Fig 3 that the ΔG for sign usage distribution is positive, indicating that the choice of signs is less restricted in the right terminal position than the left. This would suggest, based on the connection previously seen between the sign of ΔG and the direction of writing, that the IVC inscriptions are written from right-to-left, which corroborates the consensus view as mentioned above.
Yog