Computes the Adjusted Rand Index (ARI) of the two clusterings of the population induced by the two trees. When covariates are correlated, two trees that split on entirely different variables may nonetheless describe similar partitions of the population, and this metric detects that. A value close to 1 indicates that the trees partition the population similarly; a value near 0 indicates agreement no better than chance.
treeSimilarity(tree1, tree2)
Argument | Description |
---|---|
tree1 | a model returned from splineTree() |
tree2 | a model returned from splineTree() |
The Adjusted Rand Index of the clusterings created by the two trees.
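To make the returned quantity concrete, the following is a minimal base R sketch of the standard Adjusted Rand Index formula applied to two vectors of cluster labels. It is an illustration only, not the package's implementation: the function name `adjusted_rand_index` and the toy label vectors are hypothetical, and how the leaf memberships are extracted from a fitted tree is not shown here.

```r
# Illustrative sketch (hypothetical helper, not part of the package):
# Adjusted Rand Index for two label vectors of equal length, base R only.
adjusted_rand_index <- function(x, y) {
  stopifnot(length(x) == length(y))
  tab <- table(x, y)                          # contingency table of the two labelings
  sum_cells <- sum(choose(tab, 2))            # pairs grouped together in both labelings
  sum_rows  <- sum(choose(rowSums(tab), 2))   # pairs grouped together in x
  sum_cols  <- sum(choose(colSums(tab), 2))   # pairs grouped together in y
  expected  <- sum_rows * sum_cols / choose(length(x), 2)  # value expected by chance
  max_index <- (sum_rows + sum_cols) / 2
  # Note: the ratio is undefined (NaN) when both labelings put everything in one cluster.
  (sum_cells - expected) / (max_index - expected)
}

# Toy labelings: b is a relabeling of a, so the partitions are identical.
a <- c(1, 1, 1, 2, 2, 2)
b <- c("x", "x", "x", "y", "y", "y")
# d cuts across a, so agreement is no better than chance.
d <- c(1, 2, 1, 2, 1, 2)

ari_same <- adjusted_rand_index(a, b)   # identical partitions give 1
ari_diff <- adjusted_rand_index(a, d)   # slightly below 0 for this toy case
```

Because the index depends only on which observations are grouped together, not on the labels themselves, two trees whose leaves are numbered differently or reached through different splitting variables can still score close to 1.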
mclust::adjustedRandIndex
splitForm <- ~SEX+Num_sibs+HGC_MOTHER+HGC_FATHER
nlsySubset <- nlsySample[nlsySample$ID %in% sample(unique(nlsySample$ID), 400),]
tree1 <- splineTree(splitForm, BMI~AGE, "ID", nlsySubset, degree=1, df=2, intercept=FALSE, cp=0.005)
tree2 <- splineTree(splitForm, BMI~AGE, "ID", nlsySubset, degree=1, df=3, intercept=TRUE, cp=0.005)
treeSimilarity(tree1, tree2)
#> [1] 0.1177811