Sufficient high-quality Y-chromosome data are now available (especially Poznik et al. [58]) to draw clear conclusions about the timing and direction of dispersal of R1a (Fig. 5). The indigenous South Asian subclades are too young to signal Early Neolithic dispersals from Iran, and instead strongly support Bronze Age incursions from Central Asia. The derived R1a-Z93 and the further derived R1a-Z94 subclades harbour the bulk of Central and South Asian R1a lineages [55, 58], and also include some Russian and European lineages. They have been variously dated to 5.6 [4.0;7.3] ka [55], 4.5-5.3 ka with expansions ~4.0-4.5 ka [58], or 4.7 [4.0;5.5] ka (Yfull tree v4.10 [54]). The South Asian R1a-L657, dated to ~4.2 [3.3;5.1] ka (Yfull tree v4.10 [54]), is the largest (in the 1KG dataset) of several closely related subclades within R1a-Z94 of very similar time depth. Moreover, R1a has been found in all Sintashta and Sintashta-derived Andronovo and Srubnaya remains analysed to date at the genome-wide level (nine in total) [76, 77]. It had previously been identified, using microsatellite analysis, in a majority of Andronovo (2/3) and post-Andronovo Iron Age (Tagar and Tachtyk: 6/6) male samples from southern central Siberia [101], and it has also been identified in other remains across Europe and Central Asia ranging from the Mesolithic to the Iron Age (Fig. 5).
The other major member of haplogroup R in South Asia, R2, shows a strikingly different pattern. It also has deep non-Subcontinental branches, within which a South Asian-specific subclade is nested. However, the deep lineages are mainly seen in the eastern part of the Near East, rather than in Central Asia or eastern Europe, and the Subcontinent-specific subclade is older, dating to ~8 ka [55].
Altogether, therefore, the recently refined Y-chromosome tree strongly suggests that R1a is indeed a highly plausible marker for the long-contested Bronze Age spread of Indo-Aryan speakers into South Asia, although dated aDNA evidence will be needed for a precise estimate of its arrival in various parts of the Subcontinent. aDNA will also be needed to test the hypothesis that there were several streams of Indo-Aryan immigration (each with a different pantheon), for example with the earliest arriving ~3.4 ka and those following the Rigveda several centuries later [12]. There do seem to be at least three, and probably more, R1a founder clades within the Subcontinent [58], consistent with multiple waves of arrival, although these clades are closely related, suggesting that they likely spread from a single Central Asian source pool. Genomic Y-chromosome phylogeography is in its infancy compared to mito-genome analysis, so the picture will likely evolve as further South Asian Y-chromosomes are sequenced, but it is already sufficiently clear that we do not expect it to change drastically.