Aryan genes, European Pride and shoddy genetics

The literal meaning of the name “Sanskrit” (actually Samskruta bhasha) is “perfect language”. European philologists who learned this language in India in the 18th century were impressed by the perfection in its structure, grammar and antiquity and were surprised to find that Sanskrit opened up and revealed the links between most European languages. Soon thereafter, European linguists began a search for a common ancestral language that could be stated to be older than Sanskrit, which would be the “mother language” of all European languages.

By the 19th century, European linguists had already decided, using fond personal biases, minimal logic and zero evidence, that the ancestral language must have originated in some area between Europe and India. An origin of the mother language outside the borders of an India that was full of dark-skinned non-Christians was important for European pride. Pre-Nazi German linguists claimed that the Sanskrit language Mitanni texts of Syria were not Sanskrit and could only have come from Europe. A 20th-century linguist, J. P. Mallory, said that the search for the origins of the language was a search for European origins. The place of origin of Indo-European languages was, and remains, a matter of European pride and cannot willy-nilly be gifted away to some faraway heathen oriental race that happened to have preserved a “perfect language”. In this way, philologists arrived at the conclusion that Sanskrit had been carried to India by some people, and they assigned a date for its arrival in India based on imagination and guesswork. This was subsequently recorded in multiple texts as “historical fact”. The language, they said, was taken to India by “Aryan” conquerors, and when unproven ideas of conquest and invasion became embarrassing, the erstwhile conquerors were given the name “migrants”. All that remained was to find the place of origin or “urheimat”.

Unfortunately for this concocted linguistic story, there is no textual or archaeological evidence for such a migration towards India. This did not stop philologists from cooking up a place of origin of the migrants, with the Eurasian steppe region being the current “favourite horse”, so to speak. The fact that there is no evidence of any ancient language in this imaginary place of origin did not prevent linguists from claiming that an ancestral language did exist in that area. Absence of evidence probably aided the creation of evidence. This too was duly noted in hallowed European scholarly circles as “established fact”. This is where genetics researchers stepped in. The concoctions of philologists, recorded as “facts” in reference books and papers, along with dates and routes of migration were swallowed whole by genetics researchers who were trying to search for human migrations using genetics. The inconvenient fact that the whole story was imaginary did not concern geneticists as they set about trying to “prove” that people really did migrate as imagined in fanciful linguistics texts. Since then a series of genetics researchers have desperately tried to force-fit facts to theory conjuring up a migration of people to India in the time frame claimed by philologists, paying no attention to the fact that movement of people and genes does not mean that the people spoke any particular language. Human genes do not reveal the language people spoke or the mode of transport they used, be it horses or flying saucers. This oversight can only be shoddiness in research or deliberate attempts to conjure up proof for falsified history.

Genetics research has blossomed in the last three decades or so. Somewhere along the way a marker called “R1a1” was discovered. R1a1 is a “genetic marker”, not a gene. A genetic marker can change with time without affecting the human being carrying that marker, just as a dent or scratch on a car body does not affect the function of the car in any way. R1 is a marker that is found only in men, and it was discovered that R1a1 was present in many men over huge areas of Europe and India. Immediately, some geneticists, supported by breathless and eager philologists and their fans, claimed that R1a1 was the “Aryan gene” that carried Indo-European languages all over Europe and India. Hilariously, all these scholarly geneticists and linguists failed to notice that genes do not say what language people speak, so eager were they to announce the “Aryan gene”. Since then R1a1 as the “Aryan gene” has died a slow death, with everyone avoiding that story like a bad odour.

Figure 1 shows the area of spread of R1a1 over Europe (green) and India (blue). The area does not correspond accurately to the area where Indo-European languages are spoken.

Figure 1

The first blow was the sobering discovery that R1a1 was nearly absent from western Europe – the home ground of the Indo-European speaking people who wanted a claim on the mother language. What was worse for the Aryan gene story was that it was present over vast areas of south India where people were not supposed to have the “Aryan” gene. Figure 1 shows the areas where R1 is found. The whole of Western Europe is missing from the R1a1 footprint (green), while it is present in South India (blue), killing the idea that it is an Aryan gene related to Indo-European languages.

But the R1 “Aryan gene” story does not end here. The malicious racist story cooked up by European linguists about language coming to India with “Aryans” had a nasty sub-plot in which those Aryans became Hindus, speaking Aryan languages. These Aryans, after allegedly driving out Dravidians who did not speak Aryan languages, were accused of creating a caste system to stay separate from Dravidians. This fake but politically explosive fable led to a desperate search to separate “Aryan genes” in India from “Dravidian genes” or tribal or “aboriginal” genes in India.

The proof that some genetics researchers were highly biased with regard to the people they selected for analysis is freely available for those who look for it. In fact one well-intentioned genetics paper noted in 2012 that “In contrast to Pakistani populations, populations of Indian origin have been underrepresented in previous genomic scans ….” One only needs to dig a little deeper into published genetics papers and look at the data in those papers listing the people who were selected to provide blood samples for research. In my search for detailed data about the R1 marker I collected many papers and selected those that provided details about which Indian groups had been selected for genetic analysis. What I found is an interesting commentary on how genetics researchers pick the samples they feel would support their biases in order to prove those biases. One paper claimed that the R1a1 marker found in India originated near Iran, but this has been contradicted by others. So I looked at the people who had been studied for the paper that placed the origin of R1a1 (Z93) in Iran. Out of over 1300 people analysed in this paper, 72 were from Iran, 44 from Pakistan and 30 from Nepal. 12 Indians were from Gujarat and Punjab and only 45 from the rest of India. Not only are Indians under-represented in this paper, the selected populations show a massive bias towards people from the Northwest of India and from Pakistan. This is not surprising because many genetics studies in western laboratories have depended on blood samples taken from the large communities of Gujaratis and Punjabis now living in Europe and the US who were early labour and business migrants. Pakistani migrants in large numbers in the West as well as Gurkha (Nepali) soldiers have also provided samples for Western researchers who have reached conclusions about Indian genetics without having to step outside their home laboratories. This has conveniently avoided the need to travel to India or involve Indian researchers in India. The results of this paper simply cannot claim to reflect the genetics of the Indian population or the origin of the R1a1 marker. Like Alexander’s soldiers, this paper does not enter India beyond the Indus River, but claims to speak about genes that Indians have.
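As a rough check on the proportions quoted above, the sample counts can be turned into percentages. The sketch below is only an illustration: the variable and function names are made up for this example, and because the text gives the total only as “over 1300”, it assumes 1300 as a lower bound, so every percentage it prints is an upper limit.

```python
# Sample counts as quoted in the text. The total is stated only as
# "over 1300", so 1300 is ASSUMED here as a lower bound; each share
# computed below is therefore an upper limit, not an exact figure.
TOTAL_AT_LEAST = 1300

counts = {
    "Iran": 72,
    "Pakistan": 44,
    "Nepal": 30,
    "India (Gujarat and Punjab)": 12,
    "India (rest)": 45,
}


def share(n, total=TOTAL_AT_LEAST):
    """Return n as a percentage of total."""
    return 100.0 * n / total


# All samples described in the text as Indian, regardless of region.
indian_total = counts["India (Gujarat and Punjab)"] + counts["India (rest)"]

for group, n in counts.items():
    print(f"{group:28s} {n:4d}  at most {share(n):.1f}%")
print(f"{'All Indian samples':28s} {indian_total:4d}  at most {share(indian_total):.1f}%")
```

On these figures the 57 Indian samples make up at most about 4.4% of the people analysed, fewer than the 72 samples from Iran alone.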

I found 3 genetics papers with a list of over 2600 Indians which I used for my analysis. Here again I found a highly biased selection. Of those 2600 people, 24% were Brahmins, 37% Scheduled tribes and 39% Scheduled castes. According to the latest census figures, Brahmins form a mere 5% of the Indian population, Tribes are 7.6% and the scheduled castes constitute 15% of the population. These three groups together form only about 28% of the Indian population. The remaining 72% or so of Indians, like non-Brahmin forward castes, OBCs and minority groups, were not represented at all. This is totally unrepresentative of India. Clearly the selection of people by those genetics researchers has been biased in deliberately picking Brahmins on one side and tribes on the other side to try and detect the maximum possible difference in genetic make-up to “prove” that “Aryan” language speakers created castes to subjugate and discriminate against “indigenous tribes”. Despite bending over backwards to prove what they already believe to be true, these genetics researchers have failed to demonstrate anything of the sort, as my analysis demonstrates.
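The degree of over-sampling can be expressed as a simple ratio of each group's share of the sample to its share of the population. The following is a minimal sketch using only the percentages quoted above; the dictionary and function names are hypothetical and chosen just for this illustration.

```python
# Percentages as quoted in the text: share of the ~2600 sampled men
# versus share of the Indian population according to the census figures cited.
sample_share = {"Brahmins": 24.0, "Scheduled tribes": 37.0, "Scheduled castes": 39.0}
census_share = {"Brahmins": 5.0, "Scheduled tribes": 7.6, "Scheduled castes": 15.0}


def representation_ratio(sample_pct, population_pct):
    """How many times larger a group's sample share is than its population share."""
    return sample_pct / population_pct


ratios = {
    group: representation_ratio(sample_share[group], census_share[group])
    for group in sample_share
}

# Share of the population belonging to the three sampled groups,
# and the share that has no representation in the sample at all.
covered = sum(census_share.values())
uncovered = 100.0 - covered

for group, ratio in ratios.items():
    print(f"{group:18s} sampled at {ratio:.1f}x its population share")
print(f"Population covered by sampled groups: {covered:.1f}%")
print(f"Population not represented:           {uncovered:.1f}%")
```

On the quoted figures, Brahmins and Scheduled tribes are each sampled at nearly five times their census share and Scheduled castes at about 2.6 times, while the three groups together account for 27.6% of the population.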

Let me first place on record a few details of what I analysed using the data of over 2600 people from a collection of genetics papers. In genetics research no one has any doubt that most Indians have been born from women who originated in India. In an ironic example of male chauvinism the argument is all about the fathers, the men. If the men are claimed to have come from “outside”, their descendants are dubbed “foreign” inputs. Maternal ancestry does not seem to provide the same proof of indigenous origin. People have been desperately trying to prove that R1a1 in Indian men came from somewhere outside for two reasons. The first is to try and show that Indo-European languages came with R1a1-carrying men from outside India. The second is to prove that these people were the originators of a caste system in India. R1a1 is certainly not the only paternal marker among Indian men. There are at least 14 known markers. Of these, 4 markers occur in nearly 80% of Indian men and these are called L, H, J and R (R1a1 is a subset of R). The three papers that I selected for my analysis all had data for H, J and R for 2600 men, but not for L, so I had to exclude L from my analysis. I organized castes and tribes as groups based on region (North, South, East, Central or West) rather than as individual caste/tribe names like “Konkanastha Brahmins” or “Mullukurumba tribe”.

The graph in Figure 2 shows the levels of H (blue line), J (red line) and R (yellow line) among Indian men from various regions like North, South, East, Central and West India, and also indicates whether the men were Brahmins (Br), members of tribes (Tr) or scheduled castes (SC).

The first thing that should strike anyone looking at this image is that every group, whether North, South, East or West, Tribal, Brahmin or SC, has paternal genes from all the markers H, J and R. It is not as if there were some “Aryan genes” preserved by Brahmins that were not shared with others. Every group has every marker, all mixed up among castes and tribes. Also note that the much advertised “Aryan gene” R1a1, which was supposed to have come from the Northwest and stayed with upper castes, is an epic fail in that department. R1a1 (yellow line) is most common in the eastern Bengal and Bihar area, among all three groups: Brahmins, Tribes and Scheduled Castes. As one moves West and South, the proportion of R1a1 decreases. The marker called “H” (blue line) is most common in the South of India, gradually decreases going towards central India and is least common over Bengal and Bihar. The marker called “J” (red line) is said to have originated in West Asia about 15,000 years ago. Every Indian group in this dataset has it and it is, as expected, most common among Indians of the Western regions, less common in the Central, North and South and least prevalent over Bengal and Bihar. It is obvious that the distribution of paternal genes in India is related more to geography and region than to caste. It is also clear that all castes and tribes have all the markers and that the R1a1 marker is most common in the Eastern parts of India rather than the West, as would be expected if the R1a1 marker had come from the Eurasian steppe via Afghanistan and the Khyber Pass into Western India as claimed by linguists.

Figure 2

It is important to note that not all genetics papers are biased or shoddy. I found some papers that were both well written and had no agenda of trying to force-fit genetic proof on to linguistics stories. Among these, there was at least one paper that found that the R1a1 branch in India was older than that outside India, suggesting movement out of India. Another paper, written as far back as 2006 by Sanghamitra Sahoo and colleagues, was particularly gratifying to me because their analysis included 900 people of all groups from all over India and the findings reflect what I found in my analysis above. The distribution of male lineages is related more to geography and the region the people hail from than to caste. In addition this paper goes on to show that the R1 marker in India is unique to India and has no fathers or uncles outside India to suggest migration from the Eurasian Steppe or anywhere else. They also point out that the high proportion of the marker R2 in India suggests an Indian origin for the R marker in India rather than influx from outside.

One genetics paper in 2018 co-authored by 92 authors received a great deal of media attention and spawned a controversy. This paper claimed that Indians had ancestry from the steppe region of Eurasia and that this proved that people from the steppe came to India with Indo-European languages. This is shoddy genetics at its lowest ebb. The study reaches broad conclusions about Indian genetic ancestry without testing a single gene from India. It is astounding that the paper is co-authored by 92 purported scholars who do not manage to recall that genes cannot indicate language spoken. What is worse is that those 92 names have not even bothered to check the veracity of claims by linguists about the language in the steppe before 1500 BCE. There is absolutely no evidence of the language spoken in the steppe in that era. The idea that people there spoke an Indo-European language that was a “mother language” to Sanskrit is an outright bluff. This needs to be contrasted with the evidence available on multiple counts in Sanskrit texts of events that took place in India before 2000 BCE. However even bumbling genetics conclusions using faulty or fake inputs manage to make the grade for publication in academic media in these days of dog-eat-dog “publish or perish” rivalry in academia. Linguists who led the charge about “Aryan migration”, leading genetics research up the garden path are now fighting back against criticism using their last resort – polemic. In a real life example of European pot calling Indian kettle black, a fake Aryan theory created for jingoistic European pride is now being defended by accusing critics of being agents of Hindutva. Irony could not have died a more certain death.


  1. There are no “Aryan genes”. There was no Aryan race.
  2. The R1a1 marker does not indicate speakers of Indo-European languages.
  3. There is no genetic evidence of 1500 BCE migrations into India contributing to the genetic signatures of Indians.
  4. “Evidence” for migration is created when representative Indian gene samples are excluded from research.
  5. The genetic picture of Indian males shows a mixture of multiple types of markers no matter what caste group they are from.
  6. The proportion of various markers in Indian men varies with geography rather than caste.
  7. There is evidence that the H marker originated in India while R markers in India are unique and older than those found in central Asia.

Finally a bit of 20/20 hindsight in the light of the mix of paternal genes that Brahmin populations show. It has been claimed that the ancient sage Manu decreed that Brahmin men should not produce children from women of any other caste and any such children could never be Brahmins. Since male genes in India can be one of many, including H, J, R, L, O and others – the first Brahmin could have had only one of these. Subsequently – all his descendants should have had the same marker and no Brahmin should have any other marker if sage Manu’s instructions were real or serious. Clearly this is not the case. Alternatively if we make the claim that Brahmins already had a mix of paternal genes when Manu came along and made his laws – then it would mean that every single Brahmin group should have retained the same relative proportions of markers like H, J, R or others. This is also not true. From these genetic facts it follows that the story that Manu’s words were laws that were followed strictly by all Brahmins is a fake allegation most probably made for political ends. When science proves something, it is incumbent on people to discard fake stories like an Aryan migration bringing language to India or that Manu’s laws were being followed to the letter exactly as written.

An incurable patriot, Dr. Shivsankar Sastry is a surgeon by profession; and a historian, thinker, sociologist and military aviation enthusiast by choice

3 thoughts on “Aryan genes, European Pride and shoddy genetics”

  1. “Alternatively if we make the claim that Brahmins already had a mix of paternal genes when Manu came along and made his laws – then it would mean that every single Brahmin group should have retained the same relative proportions of markers like H, J, R or others” --> Yes, but the Y-chromosome is only passed down from father to son. What if some of the H, J males had girl children while R had male children? In such a case, wouldn’t the proportion of haplogroups change in a few generations?

    “It has been claimed that the ancient sage Manu decreed that Brahmin men should not produce children from women of any other caste and any such children could never be Brahmins.” --> There is another way to debunk this, and that is mtDNA haplogroup analysis (mtDNA comes from the mother). If Brahmin men didn’t marry women from other communities we would have the same set of mtDNA haplogroups across the Brahmin subgroups, but this is definitely not the case in all the papers I have read. You will find some ‘west-eurasian’ mtDNA haplogroups in Brahmins from western states like Punjab and Kashmir, ‘east-eurasian’/Tibetan mtDNA haplogroups in Brahmins from the state of Uttarakhand and ‘south-indian’ mtDNA haplogroups in south-Indian Brahmins.
    Dr Shiv shastry, try checking the mtDNA haplogroup distribution. Take that as an exercise 🙂

  2. Very well argued, doc. You laid out the case quite well and logically. Vagheesh et al. argue that those with elevated steppe ancestry (assuming it is not decomposed into even more aDNA components in the future and given a new name and geographic location) among the priestly classes were IE carriers from the NW, but we see that is not the case. Also the whole dramatic language displacement of the IVC is just waved off, like it just happened.

    I am more and more convinced that Sarianidi’s BMAC as IE homeland hypothesis is indeed closest to the truth.

  3. “then it would mean that every single Brahmin group should have retained the same relative proportions of markers like H, J, R or others” —> Yes, but Y-chromosome is only passed from father to son. If some H or L men have more girl children compared to boys, wouldn’t the haplogroup % in successive generations change ?

    “that Brahmin men should not produce children from women of any other caste and any such children could never be Brahmins” —> There is another way to debunk this BS and that’s through mtDNA haplogroup analysis. If Brahmin men were only supposed to marry women from ‘brahmin’ varna then that would mean that the mtDNA haplogroups across all the brahmins of India should remain same !! But this is not the case as we see Brahmin from the western most states(like Kashmir, Punjab,Gujarat) have more % of ‘West-Eurasian’ mtDNA , Brahmins from Uttarakhand,West Bengal have ‘East Eurasian mtDNA’ while Brahmins from Tamil Nadu will have mtDNA haplogroups corresponding to that region.

