Significant TMRCA discrepancy between tsdate and Relate on 1KGP YRI (Chr 22) #508
Replies: 10 comments 1 reply
Out of interest, what if you don't specify the recombination rate in tsinfer (or set the mismatch ratio to 0, which is the same thing?) What is the correlation between log(tsdate) and log(tsinfer) time in both cases?
BTW, I think it's standard to use 27 or 29 years per generation in humans nowadays?
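When ages are held in generations, the generation time only enters as a linear rescaling when reporting years. A small sketch of that arithmetic follows; the function names and the example values (1,000,000 generations, an old generation time of 25) are illustrative assumptions, not values taken from this thread.

```python
def generations_to_years(generations, years_per_generation):
    """Convert an age in generations to years."""
    return generations * years_per_generation


def rescale_years(years, old_years_per_generation, new_years_per_generation):
    """Re-express an age in years that was computed with a different generation time."""
    return years * new_years_per_generation / old_years_per_generation


# Hypothetical age of 1,000,000 generations under the two suggested generation times.
for gen_time in (27, 29):
    print(gen_time, generations_to_years(1_000_000, gen_time))
```

Because the same factor applies to every age, switching between 27 and 29 years shifts both methods' estimates by the same proportion when the same value is used for both.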
(Also, could you show some log-time plots of tsdate vs tsinfer ages for the mutations?)
Thanks for pointing this out! Estimating the allele age of ancient variants is particularly challenging, so allele age estimation methods can provide inconsistent results in that case. For " You can see the discordance between methods in the pairwise comparison of allele ages across five methods in the paper (Figure S10). Age estimates can disagree by several orders of magnitude in some cases. For Relate, I obtained the allele ages used in the plot by doing a weighted average of population-specific estimates. In the relevant subplot (copied below - bottom axis is for tsinfer+tsdate) you can see that
The comparison above was based only on sites that were present in all five method datasets, which included SINGER estimates for YRI individuals only. Thus, the comparison between
It would be an interesting project to use a different method, such as PSMC (or better, PHLASH or GammaSMC), to estimate the deepest divergence dates between any pair of samples, and compare those to the expected values from e.g. Relate or tsinfer+tsdate. @Jesson-mark: if you manage to do this and get any useful results, please do post them here.
@Duncan-JR, thank you so much for the detailed explanation and for pointing me to the preprint and Figure S10. The distinction regarding the constraints on recent vs. deep ancestry really clarifies why the absolute estimates can vary so much. It is reassuring to know that despite the offset in absolute ages, the correlation between tsinfer+tsdate and Relate remains the highest among the methods. This gives me more confidence in using these tools for relative comparisons. Thanks again for your help!
@hyanwong, that sounds like a fascinating project! Comparing the deep divergence dates against methods like PSMC or GammaSMC would indeed provide a valuable benchmark. I’m definitely interested in trying this out. I’ll add this to my to-do list and look into it when I have some spare time. I will certainly post an update here if I manage to get some useful results.
Thanks, Jie Wang: keep us posted! I'm converting this to a discussion, so feel free to post here again if you have any follow-ups.
Hi,
I am currently using tsinfer + tsdate to reconstruct ARGs and estimate coalescence times. I recently ran this pipeline on Chromosome 22 of the Yoruba (YRI) samples from the 1000 Genomes Project and compared the results with those inferred by Relate.
I noticed a substantial discrepancy in the TMRCA estimates for local trees between the two methods. Specifically, tsdate produces much older estimates than Relate, especially in the deep past.
Observations
Here is the distribution of TMRCAs across Chromosome 22:
tsdate:
Relate:
As shown, the maximum TMRCA inferred by tsdate reaches ~40 MYA, whereas Relate's estimates top out at ~15 MYA. The median TMRCA from tsdate is also nearly double that from Relate.
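For anyone wanting to reproduce this kind of summary, a per-tree TMRCA distribution could be extracted roughly as follows. This is a sketch under assumptions, not the code used for the plots above: `tree_tmrcas` and `weighted_median` are hypothetical helper names, `ts` is assumed to be a tskit tree sequence (for example one returned by `tskit.load`), times are in generations, and trees without exactly one root (such as empty flanking regions in tsinfer output) are skipped.

```python
def tree_tmrcas(ts):
    """Return (spans, tmrcas) for every local tree that has a single root.

    `ts` is assumed to behave like a tskit.TreeSequence: it is iterated with
    ts.trees(), and each tree exposes num_roots, root, span and time(node).
    TMRCAs are in the tree sequence's own time units (typically generations).
    """
    spans, tmrcas = [], []
    for tree in ts.trees():
        if tree.num_roots != 1:
            # Empty or multi-root trees have no single TMRCA, so skip them.
            continue
        spans.append(tree.span)
        tmrcas.append(tree.time(tree.root))
    return spans, tmrcas


def weighted_median(values, weights):
    """Median of `values` where each value counts in proportion to its weight."""
    if not values:
        raise ValueError("empty input")
    half_total = sum(weights) / 2.0
    running = 0.0
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= half_total:
            return value
    return max(values)  # only reached through floating-point rounding
```

Weighting by `tree.span` keeps many short trees from dominating the summary; multiplying the results by a generation time converts them to years.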
Parameters & Reproducibility
I used the following parameters:
tsinfer:
tsdate:
Questions:
Is such a large discrepancy expected for African populations, which have deep coalescence times?
Could this be related to the prior $N_e$ used in tsdate? If I haven't specified a demographic history, would the default prior cause this overestimation in deep time?
Any insights or suggestions on how to align these estimates would be greatly appreciated.
Thanks!