In an earlier post, I mentioned that the Ashtadhyayi has 3959 rules. I don’t know where I got this number from, but apparently it’s incorrect. Every source I’ve checked, however, simply says the text has “slightly less than 4000 rules.” Perhaps I’m not looking hard enough. But if 4000 is the hard upper limit, then I’ve certainly memorized a quarter of the rules so far.
Still: am I really a quarter of the way through the text? Perhaps all of the long rules are waiting for me in later chapters.
An answer
Let’s try to answer this question. First, we have to quantify how long a rule is. We can use:
- Alphabetic length (+1 per letter)
- Syllabic length (+1 per syllable)
- Prosodic length (+2 per heavy syllable, +1 per light syllable)
Syllabic length seems most intuitive to me, so that’s the measure I chose to use. Since each syllable has exactly one vowel, we can just count the number of vowels in a rule. That number defines the length of the rule.
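To make the three measures concrete, here is a rough sketch that computes all of them for a single rule. It assumes the rule is written in SLP1, a transliteration that uses one character per sound, and it uses a simplified notion of weight: a syllable counts as heavy if its vowel is long, or if the vowel is followed by two or more consonants, or by anusvara or visarga (`M` or `H`). The function name `measures` and this weight rule are illustrative assumptions, not part of any library.

```python
VOWELS = set('aAiIuUfFxXeEoO')  # SLP1 vowels
SHORT = set('aiufx')            # every other vowel in VOWELS is long


def measures(rule):
    """Return (alphabetic, syllabic, prosodic) length of an SLP1 string."""
    # Drop spaces and punctuation, so consonants on either side of a
    # word boundary are treated as one cluster (an assumption).
    letters = [c for c in rule if c.isalpha()]
    syllabic = prosodic = 0
    for i, c in enumerate(letters):
        if c not in VOWELS:
            continue
        syllabic += 1
        # Collect everything between this vowel and the next one.
        coda = []
        for d in letters[i + 1:]:
            if d in VOWELS:
                break
            coda.append(d)
        # Simplified weight rule (assumption): long vowel, consonant
        # cluster, or a following anusvara/visarga makes it heavy.
        heavy = (c not in SHORT
                 or len(coda) >= 2
                 or (len(coda) == 1 and coda[0] in 'MH'))
        prosodic += 2 if heavy else 1
    return len(letters), syllabic, prosodic


print(measures('vfdDirAdEc'))  # (10, 4, 7)
print(measures('adeN guRaH'))  # (9, 4, 6)
```

The three numbers move together on these two rules, which is part of why the simplest of the three, the vowel count, is a reasonable proxy.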
I downloaded the full text of the Ashtadhyayi from Sanskrit Documents and converted the ITRANS scheme to SLP1 using Sanscript. SLP1 is easier to deal with because each vowel is represented by exactly one character. That way, we don’t have to worry about symbols like ai or R^i.
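Here is a small illustration of why that matters. In ITRANS some vowels take several characters, so a character-by-character count overshoots unless the multi-character vowels are matched first. The ITRANS spelling of rule 1.1.1 used here and the regular expression are assumptions made for the sake of the example, and the pattern covers only common ITRANS vowel spellings.

```python
import re

SLP1_VOWELS = set('aAiIuUfFxXeEoO')

# Assumed ITRANS vowel spellings, multi-character alternatives first so
# that "ai" or "R^i" is matched as one vowel rather than two.
ITRANS_VOWEL = re.compile(
    r'R\^[iI]|RR[iI]|L\^[iI]|LL[iI]|aa|ai|au|ii|uu|[aAiIuUeo]')


def naive_count(text, vowel_chars):
    """Count vowels one character at a time."""
    return sum(1 for c in text if c in vowel_chars)


def itrans_count(text):
    """Count ITRANS vowels, treating multi-character vowels as one."""
    return len(ITRANS_VOWEL.findall(text))


itrans_rule = 'vR^iddhirAdaich'  # assumed ITRANS spelling of rule 1.1.1
slp1_rule = 'vfdDirAdEc'         # the same rule in SLP1

print(naive_count(itrans_rule, 'aAiIuUeo'))  # 5 (overcounts "ai")
print(itrans_count(itrans_rule))             # 4
print(naive_count(slp1_rule, SLP1_VOWELS))   # 4
```

With SLP1 the naive count is already the right one, so no special-case pattern is needed.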
With that, it’s just a matter of counting the vowels on each line:
```python
#!/usr/bin/python
import re
from collections import Counter

# Full text of the Ashtadhyayi in SLP1 (abridged here)
text = """
1.1.1 vfdDirAdEc .
1.1.2 adeN guRaH .
...
8.4.67 nodAttasvaritodayamagArgyakASyapagAlavAnAm .
8.4.68 a a iti .
"""

vowels = set('aAiIuUfFxXeEoO')
# Groups: book.chapter prefix, rule number, rule text
pattern = re.compile(r"([0-9]\.[0-9])\.([0-9]+) (.+)")
mal = Counter()  # mass-adjusted length
prev = None

for line in (x.strip() for x in text.split("\n") if x.strip()):
    results = pattern.match(line)
    if results:
        # Start of a rule
        prefix, num, rule = results.groups()
        prev = (prefix, num)
    else:
        # Wrapped rule: continue the previous rule's chapter
        prefix, num = prev
        rule = line
    # One vowel per syllable, so this adds the rule's syllable count
    mal[prefix] += len([x for x in rule if x in vowels])

# My current progress
total = sum(mal.values())
subtotal = sum(mal[x] for x in '1.1 1.2 1.3 1.4 2.1 2.2 2.3 2.4 3.1 3.2'.split())
frac = (subtotal*100.0)/total
print('Total mass: {total} vowels.'.format(**locals()))
print('Subtotal: {subtotal} vowels ({frac}%)'.format(**locals()))

# Stats for each chapter
print('Per chapter composition:')
for prefix, value in mal.most_common():
    frac = (value*100.0)/total
    print(' {prefix} : {value} ({frac}%)'.format(**locals()))
```
Results
Running the script above, we get the following:
```
Total mass: 30496 vowels.
Subtotal: 7888 vowels (25.8656873033%)
Per chapter composition:
 6.2 : 1656 (5.43022035677%)
 6.1 : 1561 (5.11870409234%)
 3.2 : 1540 (5.04984260231%)
 4.1 : 1356 (4.44648478489%)
 5.4 : 1326 (4.3481112277%)
 3.1 : 1265 (4.14808499475%)
 3.3 : 1235 (4.04971143757%)
 4.3 : 1217 (3.99068730325%)
 6.4 : 1210 (3.96773347324%)
 6.3 : 1196 (3.92182581322%)
 5.2 : 1161 (3.80705666317%)
 4.2 : 1066 (3.49554039874%)
 5.1 : 1036 (3.39716684155%)
 3.4 : 948 (3.10860440714%)
 4.4 : 933 (3.05941762854%)
 8.3 : 906 (2.97088142707%)
 5.3 : 888 (2.91185729276%)
 7.3 : 864 (2.83315844701%)
 7.2 : 857 (2.810204617%)
 1.4 : 835 (2.73806400839%)
 8.2 : 766 (2.51180482686%)
 1.3 : 717 (2.35112801679%)
 2.3 : 717 (2.35112801679%)
 7.4 : 681 (2.23307974816%)
 2.1 : 676 (2.2166841553%)
 2.4 : 673 (2.20684679958%)
 1.2 : 644 (2.11175236097%)
 8.1 : 627 (2.05600734523%)
 7.1 : 619 (2.02977439664%)
 1.1 : 540 (1.77072402938%)
 8.4 : 499 (1.63628016789%)
 2.2 : 281 (0.921432318993%)
```
So excluding the 40 rules I know from 3.3, I am a little more than a quarter of the way through the text (about 25.9% by vowel count).
I’ve found some vindication in knowing that 3.2 is as monstrously long as I thought it was. But it’s not especially encouraging to see two chapters that are longer. And it’s a little disheartening to find that 3.1 and 3.2 are the only chapters I’ve seen that are in the top half. I guess I just have to keep pressing onward.
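For anyone who wants to check those rankings, here is a short sketch that works directly from the per-chapter totals reported in the results. The variable names are made up for this example; the numbers are copied from the output above it.

```python
# Per-chapter vowel counts, copied from the script's output
counts = {
    '6.2': 1656, '6.1': 1561, '3.2': 1540, '4.1': 1356, '5.4': 1326,
    '3.1': 1265, '3.3': 1235, '4.3': 1217, '6.4': 1210, '6.3': 1196,
    '5.2': 1161, '4.2': 1066, '5.1': 1036, '3.4': 948, '4.4': 933,
    '8.3': 906, '5.3': 888, '7.3': 864, '7.2': 857, '1.4': 835,
    '8.2': 766, '1.3': 717, '2.3': 717, '7.4': 681, '2.1': 676,
    '2.4': 673, '1.2': 644, '8.1': 627, '7.1': 619, '1.1': 540,
    '8.4': 499, '2.2': 281,
}
# Chapters covered so far (the same list the script uses)
studied = '1.1 1.2 1.3 1.4 2.1 2.2 2.3 2.4 3.1 3.2'.split()

total = sum(counts.values())
subtotal = sum(counts[c] for c in studied)

# Chapters sorted from longest to shortest, then cut in half
ranked = sorted(counts, key=counts.get, reverse=True)
top_half = set(ranked[:len(ranked) // 2])
top_half_seen = [c for c in studied if c in top_half]

print(total, subtotal)                     # 30496 7888
print(round(100.0 * subtotal / total, 1))  # 25.9
print(top_half_seen)                       # ['3.1', '3.2']
print(ranked.index('3.2') + 1)             # 3 (two chapters are longer)
```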