vcfR2genlight() - variation in ploidy determination #128
Comments
Hi @fdchevalier, thanks for the detailed response! The solution I'm pursuing for #117 is to use the first non-NA variant to determine a sample's ploidy. So I think we could include 'ploidy' as a parameter for
I agree, this makes perfect sense.
Ok, we'll put it on the to-do list. With the caveat that I have collaborators asking me for manuscript updates so I haven't found a lot of time to code lately. But we'll get it done. Thanks!
Hi @knausb, I used vcfR2genlight() to convert my mixed-ploidy (diploid and tetraploid) VCF data (obtained with GATK). Unfortunately, all the tetraploid sites show "NA" after conversion (BSJ_1: diploid, BSJ_2: diploid, BSJ_3: tetraploid):
require(vcfR)
Do you have any solution? Thanks
Hi @lihe-s, I do not work on mixed-ploidy data sets, so I don't have much experience with this. If you could create a reproducible example we might be able to come up with a solution. Thanks!
Hi Brian,
When using vcfR2genlight() with a very small dataset, the function may set a different ploidy for each sample. This behavior comes from the automatic determination of ploidy when creating an object of class genlight (see the ploidy section of the genlight-class documentation). Below are a reproducible example and a solution.

The reproducible example
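A minimal base-R sketch of this kind of case, using made-up genotypes rather than a real VCF. It does not call adegenet; it only mimics the guessing rule from the genlight documentation (ploidy taken as the largest number of second alleles seen in a sample). The genotype values, the sample names, the count_alt() helper, and the floor of 1 for an all-reference sample are assumptions for illustration.

```r
# Hypothetical genotypes: 2 variants (rows) x 3 samples (columns).
gt <- matrix(
  c("0|0", "0|0",   # sample1
    "0|1", "0|0",   # sample2
    "1/1", "0|1"),  # sample3
  nrow = 2,
  dimnames = list(c("var1", "var2"), c("sample1", "sample2", "sample3"))
)

# Count the non-reference ("second") alleles in one genotype string.
count_alt <- function(g) {
  alleles <- strsplit(g, "[|/]")[[1]]
  sum(alleles != "0")
}

# Allele-count matrix, same shape as gt.
alt_counts <- apply(gt, c(1, 2), count_alt)

# Assumption: ploidy is guessed per sample as the maximum allele count,
# with a floor of 1 so that an all-reference sample is called haploid.
guessed_ploidy <- pmax(apply(alt_counts, 2, max), 1L)
print(guessed_ploidy)
```

With so few variants, the guess depends on which genotypes each sample happens to carry, not on the sample's real ploidy.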
When vcfR2genlight() translates the genotypes into binary code, the maximum allele code will be 0 (0|0), 1 (1|0 or 0|1), and 2 (1/1) for the first, second and third sample, respectively. So genlight will determine that the first two samples are haploid and the third is diploid.

The solution
The ploidy can be indicated when creating a genlight object. So the solution is to add this argument in the vcfR_conversion.R file:

But I think it would be better to have ploidy as an argument of vcfR2genlight(). This would allow:

If ploidy is an argument, a warning message about mixed ploidy in the resulting genlight object could be a plus.

Comment
A more elegant way could be to determine the ploidy from the data. I wonder if it would be worth writing a general function to estimate the ploidy from the VCF because this has been raised several times (#106, #117, #121).
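As a rough sketch of what such a function might look like, here is one way to read the ploidy off the genotype strings themselves, using the first non-missing genotype of each sample. The function name estimate_ploidy() and the toy matrix are hypothetical and not part of vcfR; the input is assumed to be a character matrix of GT values with variants in rows and samples in columns.

```r
# Hypothetical helper (not part of vcfR): estimate each sample's ploidy
# from its first non-missing genotype in a GT matrix (variants x samples).
estimate_ploidy <- function(gt) {
  apply(gt, 2, function(calls) {
    # Treat NA and genotypes containing missing alleles (e.g. "./.") as missing.
    called <- calls[!is.na(calls) & !grepl(".", calls, fixed = TRUE)]
    if (length(called) == 0) return(NA_integer_)
    # Ploidy = number of alleles separated by "/" or "|".
    length(strsplit(called[1], "[|/]")[[1]])
  })
}

# Toy data with made-up sample names: a diploid, a tetraploid and a haploid.
gt <- matrix(
  c(NA, "0/1", "0/0",
    "0/0/0/1", "0/1/1/1", NA,
    "./.", "1", "0"),
  nrow = 3,
  dimnames = list(paste0("var", 1:3), c("dip", "tet", "hap"))
)

ploidy <- estimate_ploidy(gt)
print(ploidy)

# Flag mixed ploidy, as suggested above.
mixed <- length(unique(ploidy[!is.na(ploidy)])) > 1
if (mixed) message("Mixed ploidy detected among samples.")
```

Because it counts alleles rather than allele codes, this does not depend on which alleles a sample carries. A matrix of this shape is what extract.gt() returns for the GT element, and the resulting vector could presumably be passed as the ploidy when the genlight object is created.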
Fred