Hi,
Is there a quantifiable way to decide how many genes to retain after filtering for gene tree and species tree construction?
I really like how phyling ranks the "informativeness" of genes using treeness/RCV. But once the genes are ranked, how do you choose the cutoff for how many to keep?
Previously, I built species trees from an increasing number of filtered genes (up to the maximum, with no filtering). I then computed the RF (Robinson-Foulds) distance between each tree and the "maximum tree" to see where the distance plateaus as more genes are used.
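To make the comparison concrete, here is a minimal sketch of the idea in plain Python. It is not phyling functionality: the nested-tuple tree representation, the toy trees, the gene counts, and the helper names (`bipartitions`, `rf_distance`, `plateau_point`) are all assumptions for illustration. In practice the trees would be parsed from Newick files with a tree library.

```python
from typing import Dict, FrozenSet, Set


def leaf_set(tree) -> FrozenSet[str]:
    """Collect leaf names from a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree) -> Set[FrozenSet[str]]:
    """Non-trivial splits of the tree, treated as unrooted.

    Each split is stored as the side that does NOT contain a fixed
    reference leaf, so the same split always has the same key.
    """
    all_leaves = leaf_set(tree)
    ref = min(all_leaves)
    splits: Set[FrozenSet[str]] = set()

    def walk(node) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(walk(child) for child in node))
        side = all_leaves - clade if ref in clade else clade
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
        return clade

    walk(tree)
    return splits


def rf_distance(a, b) -> int:
    """Unnormalised Robinson-Foulds distance: splits found in only one tree."""
    return len(bipartitions(a) ^ bipartitions(b))


def plateau_point(rf_by_n: Dict[int, int], tol: int = 0) -> int:
    """Smallest gene count from which RF stays <= tol for all larger counts."""
    counts = sorted(rf_by_n)
    for i, n in enumerate(counts):
        if all(rf_by_n[m] <= tol for m in counts[i:]):
            return n
    return counts[-1]


# Hypothetical species trees built from the top 50, top 100, and all 200 genes.
trees_by_n = {
    50: ((("A", "C"), "B"), ("D", "E")),
    100: ((("A", "B"), "C"), ("D", "E")),
    200: ((("A", "B"), "C"), ("D", "E")),  # the "maximum tree"
}
max_tree = trees_by_n[max(trees_by_n)]
rf_by_n = {n: rf_distance(t, max_tree) for n, t in trees_by_n.items()}
cutoff = plateau_point(rf_by_n)
print(rf_by_n, cutoff)
```

With these toy trees the RF distance drops to zero at 100 genes and stays there, so 100 would be the plateau. The `tol` argument is there because with real data the distance rarely reaches exactly zero.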
Do you suggest a better way to approach this?