Title: | A Tidyverse Extension for Ordinations and Biplots |
---|---|
Description: | Ordination comprises several multivariate exploratory and explanatory techniques with theoretical foundations in geometric data analysis; see Podani (2000, ISBN:90-5782-067-6) for techniques and applications and Le Roux & Rouanet (2005) <doi:10.1007/1-4020-2236-0> for foundations. Greenacre (2010, ISBN:978-84-923846) shows how the most established of these, including principal components analysis, correspondence analysis, multidimensional scaling, factor analysis, and discriminant analysis, rely on eigen-decompositions or singular value decompositions of pre-processed numeric matrix data. These decompositions give rise to a set of shared coordinates along which the row and column elements can be measured. The overlay of their scatterplots on these axes, introduced by Gabriel (1971) <doi:10.1093/biomet/58.3.453>, is called a biplot. 'ordr' provides inspection, extraction, manipulation, and visualization tools for several popular ordination classes supported by a set of recovery methods. It is inspired by and designed to integrate into 'tidyverse' workflows provided by Wickham et al (2019) <doi:10.21105/joss.01686>. |
Authors: | Jason Cory Brunson [aut, cre] , Emily Paul [ctb], John Gracey [aut] |
Maintainer: | Jason Cory Brunson <[email protected]> |
License: | GPL-3 |
Version: | 0.1.1.0002 |
Built: | 2025-02-01 23:22:46 UTC |
Source: | https://github.com/corybrunson/ordr |
These functions annotate the matrix factors of tbl_ords with additional variables, and retrieve these annotations.
The unexported annotation_*() and set_annotation_*() functions assign and retrieve values of the "*_annotation" attributes of x, which must have the same number of rows as get_*(x).
annot |
A data.frame having the same number of rows
as |
augmentation methods that must interface with annotation.
These functions return data associated with the cases, variables, and coordinates of an ordination object, and attach it to the object.
recover_aug_rows(x)
recover_aug_cols(x)
recover_aug_coord(x)
augment_ord(x, .matrix = "dims")
x |
An object of class 'tbl_ord'. |
.matrix |
A character string partially matched (lowercase) to several
indicators for one or both matrices in a matrix decomposition used for
ordination. The standard values are |
The recover_aug_*() S3 methods produce tibbles of values associated with the rows, columns, and artificial coordinates of an object of class 'tbl_ord'. The first field of each tibble is name, which contains the row, column, or coordinate names. Additional fields contain information about the rows, columns, or coordinates extracted from the ordination object.
The function augment_ord() returns the ordination with either or both matrix factors annotated with the result of recover_aug_*(). In this way augment_ord() works like generics::augment(), as popularized by the broom package, by extracting information about the rows and columns, but it differs in returning an annotated 'tbl_ord' rather than a 'tbl_df' object. The advantage of implementing separate methods for the rows, columns, and artificial coordinates is that more information contained in the original object becomes accessible to the user.
The recover_aug_*() functions return tibbles having the same numbers of rows as recover_*(). augment_ord() returns an augmented tbl_ord with the wrapped model unchanged.
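The following is a minimal sketch of the augmentation workflow described above, not an excerpt from the package's own examples. It assumes that 'ordr' is installed and that recovery methods are available for 'prcomp' objects; the built-in USArrests data set and the object names are chosen only for illustration.

```r
library(ordr)

# Fit a PCA to a data set with meaningful row names and wrap it as a 'tbl_ord'
# (USArrests and the names below are illustrative choices)
arrests_pca <- as_tbl_ord(prcomp(USArrests, scale. = TRUE))

# Recover tibbles of annotations for the rows, columns, and coordinates
rows_aug  <- recover_aug_rows(arrests_pca)
cols_aug  <- recover_aug_cols(arrests_pca)
coord_aug <- recover_aug_coord(arrests_pca)

# Attach the recovered annotations to both matrix factors;
# the result is still a 'tbl_ord', not a plain tibble
arrests_aug <- augment_ord(arrests_pca)
class(arrests_aug)
```

Because the wrapped model is left unchanged, the augmented object should remain usable wherever the original 'tbl_ord' was.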
tidiers and annotation methods that interface with augmentation.
Other generic recoverers: conference, recoverers, supplementation
These geometric element layers (geoms) pair conventional ggplot2 geoms with stat_rows() or stat_cols() in order to render elements for one or the other matrix factor of a tbl_ord. They understand the same aesthetics as their corresponding conventional geoms.
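As a hedged illustration of how the row and column layers combine, the sketch below draws a basic biplot. It assumes that 'ordr' and 'ggplot2' are installed, that ggbiplot() accepts an augmented 'tbl_ord', and that the augmented column factor carries a name field; the data set and object names are illustrative only.

```r
library(ordr)
library(ggplot2)

# Illustrative ordination: PCA of USArrests, augmented so that both
# matrix factors carry a `name` field
arrests_pca <- augment_ord(as_tbl_ord(prcomp(USArrests, scale. = TRUE)))

# Rows (states) are drawn as points and columns (variables) as vectors;
# each layer draws only the matrix factor named in its function
p <- ggbiplot(arrests_pca) +
  geom_rows_point(alpha = 0.5) +
  geom_cols_vector(aes(label = name))
p
```

Each geom_rows_*() or geom_cols_*() layer accepts the aesthetics of its conventional counterpart, so the alpha setting here behaves as it would in geom_point().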
geom_rows_point( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_point( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_rows_path( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., lineend = "butt", linejoin = "round", linemitre = 10, arrow = NULL, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_path( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., lineend = "butt", linejoin = "round", linemitre = 10, arrow = NULL, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_rows_polygon( mapping = NULL, data = NULL, stat = "identity", position = "identity", rule = "evenodd", ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_polygon( mapping = NULL, data = NULL, stat = "identity", position = "identity", rule = "evenodd", ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_rows_contour( mapping = NULL, data = NULL, stat = "contour", position = "identity", ..., bins = NULL, binwidth = NULL, breaks = NULL, lineend = "butt", linejoin = "round", linemitre = 10, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_contour( mapping = NULL, data = NULL, stat = "contour", position = "identity", ..., bins = NULL, binwidth = NULL, breaks = NULL, lineend = "butt", linejoin = "round", linemitre = 10, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_rows_density_2d( mapping = NULL, data = NULL, stat = "density_2d", position = "identity", ..., contour_var = "density", lineend = "butt", linejoin = "round", linemitre = 10, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_density_2d( mapping = NULL, data = NULL, stat = "density_2d", position = "identity", ..., contour_var = "density", lineend = "butt", linejoin = "round", linemitre = 10, na.rm = FALSE, show.legend = NA, 
inherit.aes = TRUE ) geom_rows_density_2d_filled( mapping = NULL, data = NULL, stat = "density_2d_filled", position = "identity", ..., contour_var = "density", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_density_2d_filled( mapping = NULL, data = NULL, stat = "density_2d_filled", position = "identity", ..., contour_var = "density", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_rows_text( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., parse = FALSE, nudge_x = 0, nudge_y = 0, check_overlap = FALSE, size.unit = "mm", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_text( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., parse = FALSE, nudge_x = 0, nudge_y = 0, check_overlap = FALSE, size.unit = "mm", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_rows_label( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., parse = FALSE, nudge_x = 0, nudge_y = 0, label.padding = unit(0.25, "lines"), label.r = unit(0.15, "lines"), label.size = 0.25, size.unit = "mm", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_label( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., parse = FALSE, nudge_x = 0, nudge_y = 0, label.padding = unit(0.25, "lines"), label.r = unit(0.15, "lines"), label.size = 0.25, size.unit = "mm", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_rows_text_repel( mapping = NULL, data = NULL, stat = "identity", position = "identity", parse = FALSE, ..., box.padding = 0.25, point.padding = 1e-06, min.segment.length = 0.5, arrow = NULL, force = 1, force_pull = 1, max.time = 0.5, max.iter = 10000, max.overlaps = getOption("ggrepel.max.overlaps", default = 10), nudge_x = 0, nudge_y = 0, xlim = c(NA, NA), ylim = c(NA, NA), na.rm = FALSE, show.legend = NA, direction = c("both", "y", "x"), seed = NA, verbose = FALSE, inherit.aes = TRUE ) geom_cols_text_repel( mapping = NULL, data = NULL, 
stat = "identity", position = "identity", parse = FALSE, ..., box.padding = 0.25, point.padding = 1e-06, min.segment.length = 0.5, arrow = NULL, force = 1, force_pull = 1, max.time = 0.5, max.iter = 10000, max.overlaps = getOption("ggrepel.max.overlaps", default = 10), nudge_x = 0, nudge_y = 0, xlim = c(NA, NA), ylim = c(NA, NA), na.rm = FALSE, show.legend = NA, direction = c("both", "y", "x"), seed = NA, verbose = FALSE, inherit.aes = TRUE ) geom_rows_label_repel( mapping = NULL, data = NULL, stat = "identity", position = "identity", parse = FALSE, ..., box.padding = 0.25, label.padding = 0.25, point.padding = 1e-06, label.r = 0.15, label.size = 0.25, min.segment.length = 0.5, arrow = NULL, force = 1, force_pull = 1, max.time = 0.5, max.iter = 10000, max.overlaps = getOption("ggrepel.max.overlaps", default = 10), nudge_x = 0, nudge_y = 0, xlim = c(NA, NA), ylim = c(NA, NA), na.rm = FALSE, show.legend = NA, direction = c("both", "y", "x"), seed = NA, verbose = FALSE, inherit.aes = TRUE ) geom_cols_label_repel( mapping = NULL, data = NULL, stat = "identity", position = "identity", parse = FALSE, ..., box.padding = 0.25, label.padding = 0.25, point.padding = 1e-06, label.r = 0.15, label.size = 0.25, min.segment.length = 0.5, arrow = NULL, force = 1, force_pull = 1, max.time = 0.5, max.iter = 10000, max.overlaps = getOption("ggrepel.max.overlaps", default = 10), nudge_x = 0, nudge_y = 0, xlim = c(NA, NA), ylim = c(NA, NA), na.rm = FALSE, show.legend = NA, direction = c("both", "y", "x"), seed = NA, verbose = FALSE, inherit.aes = TRUE ) geom_rows_axis( mapping = NULL, data = NULL, stat = "identity", position = "identity", axis_labels = TRUE, axis_ticks = TRUE, axis_text = TRUE, by = NULL, num = NULL, tick_length = 0.025, text_dodge = 0.03, label_dodge = 0.03, ..., axis.colour = NULL, axis.color = NULL, axis.alpha = NULL, label.angle = 0, label.colour = NULL, label.color = NULL, label.alpha = NULL, tick.linewidth = 0.25, tick.colour = NULL, tick.color = NULL, tick.alpha 
= NULL, text.size = 2.6, text.angle = 0, text.hjust = 0.5, text.vjust = 0.5, text.family = NULL, text.fontface = NULL, text.colour = NULL, text.color = NULL, text.alpha = NULL, parse = FALSE, check_overlap = FALSE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_axis( mapping = NULL, data = NULL, stat = "identity", position = "identity", axis_labels = TRUE, axis_ticks = TRUE, axis_text = TRUE, by = NULL, num = NULL, tick_length = 0.025, text_dodge = 0.03, label_dodge = 0.03, ..., axis.colour = NULL, axis.color = NULL, axis.alpha = NULL, label.angle = 0, label.colour = NULL, label.color = NULL, label.alpha = NULL, tick.linewidth = 0.25, tick.colour = NULL, tick.color = NULL, tick.alpha = NULL, text.size = 2.6, text.angle = 0, text.hjust = 0.5, text.vjust = 0.5, text.family = NULL, text.fontface = NULL, text.colour = NULL, text.color = NULL, text.alpha = NULL, parse = FALSE, check_overlap = FALSE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_rows_bagplot( mapping = NULL, data = NULL, stat = "bagplot", position = "identity", ..., bag.linewidth = sync(), bag.linetype = sync(), bag.colour = "black", bag.color = NULL, bag.fill = sync(), bag.alpha = NA, median.shape = 21L, median.stroke = sync(), median.size = 5, median.colour = sync(), median.color = NULL, median.fill = "white", median.alpha = NA, fence.linewidth = 0.25, fence.linetype = 0L, fence.colour = sync(), fence.color = NULL, fence.fill = sync(), fence.alpha = 0.25, outlier.shape = sync(), outlier.stroke = sync(), outlier.size = sync(), outlier.colour = sync(), outlier.color = NULL, outlier.fill = NA, outlier.alpha = NA, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_bagplot( mapping = NULL, data = NULL, stat = "bagplot", position = "identity", ..., bag.linewidth = sync(), bag.linetype = sync(), bag.colour = "black", bag.color = NULL, bag.fill = sync(), bag.alpha = NA, median.shape = 21L, median.stroke = sync(), median.size = 5, median.colour = sync(), median.color = 
NULL, median.fill = "white", median.alpha = NA, fence.linewidth = 0.25, fence.linetype = 0L, fence.colour = sync(), fence.color = NULL, fence.fill = sync(), fence.alpha = 0.25, outlier.shape = sync(), outlier.stroke = sync(), outlier.size = sync(), outlier.colour = sync(), outlier.color = NULL, outlier.fill = NA, outlier.alpha = NA, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_rows_interpolation( mapping = NULL, data = NULL, stat = "identity", position = "identity", new_data = NULL, type = c("centroid", "sequence"), arrow = default_arrow, ..., point.fill = NA, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_interpolation( mapping = NULL, data = NULL, stat = "identity", position = "identity", new_data = NULL, type = c("centroid", "sequence"), arrow = default_arrow, ..., point.fill = NA, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_rows_lineranges( mapping = NULL, data = NULL, stat = "center", position = "identity", ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_lineranges( mapping = NULL, data = NULL, stat = "center", position = "identity", ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_rows_pointranges( mapping = NULL, data = NULL, stat = "center", position = "identity", ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_pointranges( mapping = NULL, data = NULL, stat = "center", position = "identity", ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_rows_isoline( mapping = NULL, data = NULL, stat = "identity", position = "identity", isoline_text = TRUE, by = NULL, num = NULL, text_dodge = 0.03, ..., text.size = 3, text.angle = 0, text.colour = NULL, text.color = NULL, text.alpha = NULL, parse = FALSE, check_overlap = FALSE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_isoline( mapping = NULL, data = NULL, stat = "identity", position = "identity", isoline_text = TRUE, by = NULL, num = NULL, text_dodge = 0.03, ..., text.size = 3, 
text.angle = 0, text.colour = NULL, text.color = NULL, text.alpha = NULL, parse = FALSE, check_overlap = FALSE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_rows_rule( mapping = NULL, data = NULL, stat = "rule", position = "identity", axis_labels = TRUE, axis_ticks = TRUE, axis_text = TRUE, by = NULL, num = NULL, snap_rule = TRUE, tick_length = 0.025, text_dodge = 0.03, label_dodge = 0.03, ..., axis.colour = NULL, axis.color = NULL, axis.alpha = NULL, label.angle = 0, label.colour = NULL, label.color = NULL, label.alpha = NULL, tick.linewidth = 0.25, tick.colour = NULL, tick.color = NULL, tick.alpha = NULL, text.size = 2.6, text.angle = 0, text.hjust = 0.5, text.vjust = 0.5, text.family = NULL, text.fontface = NULL, text.colour = NULL, text.color = NULL, text.alpha = NULL, parse = FALSE, check_overlap = FALSE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_rule( mapping = NULL, data = NULL, stat = "rule", position = "identity", axis_labels = TRUE, axis_ticks = TRUE, axis_text = TRUE, by = NULL, num = NULL, snap_rule = TRUE, tick_length = 0.025, text_dodge = 0.03, label_dodge = 0.03, ..., axis.colour = NULL, axis.color = NULL, axis.alpha = NULL, label.angle = 0, label.colour = NULL, label.color = NULL, label.alpha = NULL, tick.linewidth = 0.25, tick.colour = NULL, tick.color = NULL, tick.alpha = NULL, text.size = 2.6, text.angle = 0, text.hjust = 0.5, text.vjust = 0.5, text.family = NULL, text.fontface = NULL, text.colour = NULL, text.color = NULL, text.alpha = NULL, parse = FALSE, check_overlap = FALSE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_rows_text_radiate( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., parse = FALSE, check_overlap = FALSE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_text_radiate( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., parse = FALSE, check_overlap = FALSE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) 
geom_rows_vector( mapping = NULL, data = NULL, stat = "identity", position = "identity", arrow = default_arrow, lineend = "round", linejoin = "mitre", vector_labels = TRUE, ..., label.colour = NULL, label.color = NULL, label.alpha = NULL, parse = FALSE, check_overlap = FALSE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) geom_cols_vector( mapping = NULL, data = NULL, stat = "identity", position = "identity", arrow = default_arrow, lineend = "round", linejoin = "mitre", vector_labels = TRUE, ..., label.colour = NULL, label.color = NULL, label.alpha = NULL, parse = FALSE, check_overlap = FALSE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Additional arguments passed to |
na.rm |
Passed to |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
linemitre |
Line mitre limit (number greater than 1). |
arrow |
Arrow specification, as created by |
rule |
Either |
bins |
Number of contour bins. Overridden by |
binwidth |
The width of the contour bins. Overridden by |
breaks |
One of:
Overrides |
contour_var |
Character string identifying the variable to contour
by. Can be one of |
parse |
If |
nudge_x, nudge_y |
Horizontal and vertical adjustment to nudge labels by.
Useful for offsetting text from points, particularly on discrete scales.
Cannot be jointly specified with |
check_overlap |
If |
size.unit |
How the |
label.padding |
Amount of padding around label. Defaults to 0.25 lines. |
label.r |
Radius of rounded corners. Defaults to 0.15 lines. |
label.size |
Size of label border, in mm. |
box.padding |
Amount of padding around bounding box, as unit or number.
Defaults to 0.25. (Default unit is lines, but other units can be specified
by passing |
point.padding |
Amount of padding around labeled point, as unit or
number. Defaults to 0. (Default unit is lines, but other units can be
specified by passing |
min.segment.length |
Skip drawing segments shorter than this, as unit or
number. Defaults to 0.5. (Default unit is lines, but other units can be
specified by passing |
force |
Force of repulsion between overlapping text labels. Defaults to 1. |
force_pull |
Force of attraction between a text label and its corresponding data point. Defaults to 1. |
max.time |
Maximum number of seconds to try to resolve overlaps. Defaults to 0.5. |
max.iter |
Maximum number of iterations to try to resolve overlaps. Defaults to 10000. |
max.overlaps |
Exclude text labels when they overlap too many other things. For each text label, we count how many other text labels or other data points it overlaps, and exclude the text label if it has too many overlaps. Defaults to 10. |
xlim, ylim |
Limits for the x and y axes. Text labels will be constrained to these limits. By default, text labels are constrained to the entire plot area. |
direction |
Direction in which to adjust the position of labels: "both", "x", or "y". |
seed |
Random seed passed to |
verbose |
If |
axis_labels , axis_ticks , axis_text
|
Logical; whether to include labels, tick marks, and text value marks along the axes. |
by , num
|
Intervals between elements or number of elements; specify only one. |
tick_length |
Numeric; the length of the tick marks, as a proportion of the minimum of the plot width and height. |
text_dodge |
Numeric; the orthogonal distance of tick mark text from the axis, as a proportion of the minimum of the plot width and height. |
label_dodge |
Numeric; the orthogonal distance of the axis label from the axis, as a proportion of the minimum of the plot width and height. |
axis.colour , axis.color , axis.alpha
|
Default aesthetics for axes. Set to NULL to inherit from the data's aesthetics. |
label.angle , label.colour , label.color , label.alpha
|
Default aesthetics for labels. Set to NULL to inherit from the data's aesthetics. |
tick.linewidth , tick.colour , tick.color , tick.alpha
|
Default aesthetics for tick marks. Set to NULL to inherit from the data's aesthetics. |
text.size , text.angle , text.hjust , text.vjust , text.family , text.fontface , text.colour , text.color , text.alpha
|
Default aesthetics for tick mark labels. Set to NULL to inherit from the data's aesthetics. |
bag.linetype , bag.linewidth , bag.colour , bag.color , bag.fill , bag.alpha
|
Default aesthetics for bags. Set to |
median.shape , median.stroke , median.size , median.colour , median.color , median.fill , median.alpha
|
Default aesthetics for medians. Set to |
fence.linetype , fence.linewidth , fence.colour , fence.color , fence.fill , fence.alpha
|
Default aesthetics for fences. Set to |
outlier.shape , outlier.stroke , outlier.size , outlier.colour , outlier.color , outlier.fill , outlier.alpha
|
Default aesthetics for outliers. Set to |
new_data |
A list (best structured as a data.frame)
of row ( |
type |
Character value matched to |
point.fill |
Default aesthetics for markers. Set to NULL to inherit from the data's aesthetics. |
isoline_text |
Logical; whether to include text value marks along the isolines. |
snap_rule |
Logical; whether to snap rule segments to grid values. |
vector_labels |
Logical; whether to include labels radiating outward from the vectors. |
A ggproto layer.
Other biplot layers: biplot-stats, stat_referent(), stat_rows()
# compute log-ratio analysis of Freestone primary class composition measurements
glass %>%
  ordinate(cols = c(SiO2, Al2O3, CaO, FeO, MgO),
           model = lra, compositional = TRUE) %>%
  confer_inertia("rows") %>%
  print() -> glass_lra

# row-principal biplot with ordinate-wise standard deviations
glass_lra %>%
  ggbiplot(aes(color = Site), sec.axes = "cols") +
  theme_biplot() +
  scale_color_brewer(type = "qual", palette = 6) +
  geom_cols_text(stat = "chull", aes(label = name), color = "#444444") +
  geom_rows_lineranges(fun.data = mean_sdl, linewidth = .75) +
  geom_rows_point(alpha = .5) +
  ggtitle(
    "Row-principal LRA biplot of Freestone glass measurements",
    "Ranges 2 sample standard deviations from centroids"
  )

# principal components analysis of glass composition measurements
glass[, c(5L, 7L, 8L, 10L, 11L)] %>%
  princomp(cor = TRUE) %>%
  as_tbl_ord() %>%
  cbind_rows(site = glass$Site, form = glass$Form) %>%
  augment_ord() %>%
  print() -> glass_pca

# note that column standard coordinates are unit vectors
rowSums(get_cols(glass_pca) ^ 2)

# plot column standard coordinates with a unit circle underlaid
glass_pca %>%
  ggbiplot(aes(label = name), sec.axes = "cols") +
  theme_biplot() +
  geom_rows_point(aes(color = site, shape = form), elements = "score") +
  geom_unit_circle(alpha = .5, scale.factor = 3) +
  geom_cols_vector()
These statistical transformations (stats) adapt conventional ggplot2 stats to one or the other matrix factor of a tbl_ord, in lieu of stat_rows() or stat_cols(). They accept the same parameters as their corresponding conventional stats.
stat_rows_density_2d( mapping = NULL, data = NULL, geom = "density_2d", position = "identity", ..., contour = TRUE, contour_var = "density", n = 100, h = NULL, adjust = c(1, 1), na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) stat_cols_density_2d( mapping = NULL, data = NULL, geom = "density_2d", position = "identity", ..., contour = TRUE, contour_var = "density", n = 100, h = NULL, adjust = c(1, 1), na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) stat_rows_density_2d_filled( mapping = NULL, data = NULL, geom = "density_2d_filled", position = "identity", ..., contour = TRUE, contour_var = "density", n = 100, h = NULL, adjust = c(1, 1), na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) stat_cols_density_2d_filled( mapping = NULL, data = NULL, geom = "density_2d_filled", position = "identity", ..., contour = TRUE, contour_var = "density", n = 100, h = NULL, adjust = c(1, 1), na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) stat_rows_ellipse( mapping = NULL, data = NULL, geom = "path", position = "identity", ..., type = "t", level = 0.95, segments = 51, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) stat_cols_ellipse( mapping = NULL, data = NULL, geom = "path", position = "identity", ..., type = "t", level = 0.95, segments = 51, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE ) stat_rows_bagplot( mapping = NULL, data = NULL, geom = "bagplot", position = "identity", fraction = 0.5, coef = 3, median = TRUE, fence = TRUE, outliers = TRUE, show.legend = NA, inherit.aes = TRUE, ... ) stat_cols_bagplot( mapping = NULL, data = NULL, geom = "bagplot", position = "identity", fraction = 0.5, coef = 3, median = TRUE, fence = TRUE, outliers = TRUE, show.legend = NA, inherit.aes = TRUE, ... 
) stat_rows_center( mapping = NULL, data = NULL, geom = "point", position = "identity", show.legend = NA, inherit.aes = TRUE, ..., fun.data = NULL, fun = NULL, fun.center = NULL, fun.min = NULL, fun.max = NULL, fun.ord = NULL, fun.args = list() ) stat_cols_center( mapping = NULL, data = NULL, geom = "point", position = "identity", show.legend = NA, inherit.aes = TRUE, ..., fun.data = NULL, fun = NULL, fun.center = NULL, fun.min = NULL, fun.max = NULL, fun.ord = NULL, fun.args = list() ) stat_rows_star( mapping = NULL, data = NULL, geom = "segment", position = "identity", show.legend = NA, inherit.aes = TRUE, ..., fun.data = NULL, fun = NULL, fun.center = NULL, fun.ord = NULL, fun.args = list() ) stat_cols_star( mapping = NULL, data = NULL, geom = "segment", position = "identity", show.legend = NA, inherit.aes = TRUE, ..., fun.data = NULL, fun = NULL, fun.center = NULL, fun.ord = NULL, fun.args = list() ) stat_rows_chull( mapping = NULL, data = NULL, geom = "polygon", position = "identity", show.legend = NA, inherit.aes = TRUE, ... ) stat_cols_chull( mapping = NULL, data = NULL, geom = "polygon", position = "identity", show.legend = NA, inherit.aes = TRUE, ... ) stat_rows_peel( mapping = NULL, data = NULL, geom = "polygon", position = "identity", breaks = c(0.5), cut = c("above", "below"), show.legend = NA, inherit.aes = TRUE, ... ) stat_cols_peel( mapping = NULL, data = NULL, geom = "polygon", position = "identity", breaks = c(0.5), cut = c("above", "below"), show.legend = NA, inherit.aes = TRUE, ... ) stat_rows_cone( mapping = NULL, data = NULL, geom = "path", position = "identity", origin = FALSE, show.legend = NA, inherit.aes = TRUE, ... ) stat_cols_cone( mapping = NULL, data = NULL, geom = "path", position = "identity", origin = FALSE, show.legend = NA, inherit.aes = TRUE, ... 
) stat_rows_depth( mapping = NULL, data = NULL, geom = "contour", position = "identity", contour = TRUE, contour_var = "depth", notion = "zonoid", notion_params = list(), n = 100L, show.legend = NA, inherit.aes = TRUE, ... ) stat_cols_depth( mapping = NULL, data = NULL, geom = "contour", position = "identity", contour = TRUE, contour_var = "depth", notion = "zonoid", notion_params = list(), n = 100L, show.legend = NA, inherit.aes = TRUE, ... ) stat_rows_depth_filled( mapping = NULL, data = NULL, geom = "contour_filled", position = "identity", contour = TRUE, contour_var = "depth", notion = "zonoid", notion_params = list(), n = 100L, show.legend = NA, inherit.aes = TRUE, ... ) stat_cols_depth_filled( mapping = NULL, data = NULL, geom = "contour_filled", position = "identity", contour = TRUE, contour_var = "depth", notion = "zonoid", notion_params = list(), n = 100L, show.legend = NA, inherit.aes = TRUE, ... ) stat_rows_projection( mapping = NULL, data = NULL, geom = "segment", position = "identity", referent = NULL, ref_subset = NULL, ref_elements = "active", ..., show.legend = NA, inherit.aes = TRUE ) stat_cols_projection( mapping = NULL, data = NULL, geom = "segment", position = "identity", referent = NULL, ref_subset = NULL, ref_elements = "active", ..., show.legend = NA, inherit.aes = TRUE ) stat_rows_rule( mapping = NULL, data = NULL, geom = "rule", position = "identity", fun.lower = "minpp", fun.upper = "maxpp", fun.offset = "minabspp", fun.args = list(), referent = NULL, show.legend = NA, inherit.aes = TRUE, ref_subset = NULL, ref_elements = "active", ... ) stat_cols_rule( mapping = NULL, data = NULL, geom = "rule", position = "identity", fun.lower = "minpp", fun.upper = "maxpp", fun.offset = "minabspp", fun.args = list(), referent = NULL, show.legend = NA, inherit.aes = TRUE, ref_subset = NULL, ref_elements = "active", ... 
) stat_rows_scale( mapping = NULL, data = NULL, geom = "point", position = "identity", show.legend = NA, inherit.aes = TRUE, ..., mult = 1 ) stat_cols_scale( mapping = NULL, data = NULL, geom = "point", position = "identity", show.legend = NA, inherit.aes = TRUE, ..., mult = 1 ) stat_rows_spantree( mapping = NULL, data = NULL, geom = "segment", position = "identity", engine = "mlpack", method = "euclidean", show.legend = NA, inherit.aes = TRUE, ... ) stat_cols_spantree( mapping = NULL, data = NULL, geom = "segment", position = "identity", engine = "mlpack", method = "euclidean", show.legend = NA, inherit.aes = TRUE, ... )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improve the display. The
|
... |
Additional arguments passed to |
contour |
If |
contour_var |
Character string identifying the variable to contour
by. Can be one of |
n |
Number of grid points in each direction. |
h |
Bandwidth (vector of length two). If |
adjust |
A multiplicative bandwidth adjustment to be used if 'h' is
'NULL'. This makes it possible to adjust the bandwidth while still
using a bandwidth estimator. For example, |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
type |
The type of ellipse.
The default |
level |
The level at which to draw an ellipse,
or, if |
segments |
The number of segments to be used in drawing the ellipse. |
fraction |
Fraction of the data to include in the bag. |
coef |
Scale factor of the fence relative to the bag. |
median , fence , outliers
|
Logical indicators of whether to include the median, fence, and outliers in the composite output. |
fun.data |
A function that is given the complete data and should
return a data frame with variables |
fun.center |
Deprecated alias to |
fun.min , fun , fun.max
|
Alternatively, supply three individual functions that are each passed a vector of values and should return a single number. |
fun.ord |
Alternatively to the |
fun.args |
Optional additional arguments passed on to the functions. |
breaks |
A numeric vector of fractions (between |
cut |
Character; one of |
origin |
Logical; whether to include the origin with the transformed
data. Defaults to |
notion |
Character; the name of the depth function (passed to
|
notion_params |
List of additional parameters passed via |
referent |
The reference data set; see Details. |
ref_elements , ref_subset
|
Analogues of |
fun.lower , fun.upper , fun.offset
|
Functions used to determine the limits
of the rules and the translations of the axes from the projections of
|
mult |
Numeric value used to scale the coordinates. |
engine |
A single character string specifying the package implementation
to use; |
method |
Passed to |
A ggproto layer.
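The breaks and cut arguments of the peel stats select among nested convex hulls. As a rough illustration of convex hull peeling in general, the sketch below uses only base R; peel_hulls() is a hypothetical helper, not an ordr function, and the rule used to pick a hull (the first one containing at most half of the points) is an assumption made for the example rather than a statement of how ordr resolves breaks and cut.

```r
# Convex hull peeling with base R only. This sketches the general technique;
# it is not ordr's implementation, and peel_hulls() is a hypothetical helper.
peel_hulls <- function(xy) {
  xy <- as.matrix(xy)
  n <- nrow(xy)
  remaining <- seq_len(n)
  hulls <- list()
  while (length(remaining) > 0L) {
    # chull() returns positions within its input, so map back to row indices
    on_hull <- remaining[grDevices::chull(xy[remaining, , drop = FALSE])]
    hulls[[length(hulls) + 1L]] <- list(
      vertices = on_hull,
      # fraction of all points lying on or inside this hull
      fraction = length(remaining) / n
    )
    # peel the hull vertices away and repeat on the interior points
    remaining <- setdiff(remaining, on_hull)
  }
  hulls
}

# peel the first two principal component scores of the iris measurements
xy <- prcomp(iris[, 1:4])$x[, 1:2]
hulls <- peel_hulls(xy)
fractions <- vapply(hulls, function(h) h$fraction, numeric(1))

# outermost hull that contains at most half of the points (assumed rule)
inner <- hulls[[which(fractions <= 0.5)[1L]]]
```

Each element of hulls records the row indices of one hull's vertices and the fraction of points it encloses, so hulls at several fractions can be drawn by subsetting xy with the stored indices.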
This statistical transformation is compatible with the convenience function ord_aes().
Some transformations (e.g. stat_center()) commute with projection to the lower (1 or 2)-dimensional biplot space. If they detect aesthetics of the form ..coord[0-9]+, then ..coord1 and ..coord2 are converted to x and y while any remaining are ignored.
Other transformations (e.g. stat_spantree()) yield different results in a lower-dimensional biplot when they are computed before versus after projection. If the stat layer detects these aesthetics, then the transformation is performed before projection, and the results in the first two dimensions are returned as x and y.
A small number of transformations (stat_rule()) are incompatible with ordination aesthetics but will accept ord_aes() without warning.
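The distinction between transformations that commute with projection and those that do not can be checked numerically. The following sketch uses base R only (prcomp() on the iris measurements) rather than ordr's stat layers, so it illustrates the geometry, not the layers' internals; the object names are chosen for the example.

```r
# Full-dimensional scores and their projection onto the first two coordinates
pca <- prcomp(iris[, 1:4])
scores <- pca$x            # all 4 principal coordinates
plane <- scores[, 1:2]     # what a 2-dimensional biplot displays

# Centroids commute with projection: the within-species means computed in
# 4 dimensions and then projected equal the means of the projected points.
counts <- as.vector(table(iris$Species))
cent_full <- rowsum(scores, iris$Species) / counts
cent_plane <- rowsum(plane, iris$Species) / counts
all.equal(cent_full[, 1:2], cent_plane)

# Distances do not commute: projection can only shrink them, so anything
# built on inter-point distances (such as a minimum spanning tree) may
# differ depending on whether it is computed before or after projection.
d_full <- dist(scores)
d_plane <- dist(plane)
summary(as.vector(d_full - d_plane))
```

The differences summarized in the last line are non-negative, and some are strictly positive because the third and fourth coordinates carry variance, which is why distance-based transformations are computed before projection when ordination aesthetics are supplied.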
Other biplot layers: biplot-geoms, stat_referent(), stat_rows()
iris_pca <- ordinate(iris, prcomp, cols = seq(4), scale. = TRUE)

# NB: Non-standard aesthetics are handled as in version > 3.5.1; see:
# https://github.com/tidyverse/ggplot2/issues/6191
# This prevents `scale_color_discrete(aesthetics = ...)` from synching them.
ggbiplot(iris_pca) +
  stat_rows_bagplot(
    aes(fill = Species),
    median_gp = list(color = sync()),
    fence_gp = list(linewidth = 0.25),
    outlier_gp = list(shape = "asterisk")
  ) +
  scale_color_discrete(name = "Species", aesthetics = c("color", "fill")) +
  geom_cols_vector(aes(label = name))
ggbiplot(iris_pca) +
  stat_rows_bagplot(
    aes(fill = Species, color = Species),
    median_gp = list(color = sync()),
    fence_gp = list(linewidth = 0.25),
    outlier_gp = list(shape = "asterisk")
  ) +
  geom_cols_vector(aes(label = name))

USJudgeRatings |>
  ordinate(prcomp) |>
  mutate_rows(surname = gsub("(,|\\.).*$", "", name)) ->
  judges_pca
ggbiplot(judges_pca, sec.axes = "cols") +
  geom_rows_bagplot() +
  geom_rows_text(aes(label = surname), size = 2) +
  geom_cols_vector(aes(label = name), size = 3, alpha = .5)

# scaled PCA of Anderson iris measurements
iris[, -5] %>%
  princomp(cor = TRUE) %>%
  as_tbl_ord() %>%
  mutate_rows(species = iris$Species) %>%
  print() -> iris_pca

# row-principal biplot with centroid-based stars
iris_pca %>%
  ggbiplot(aes(color = species)) +
  theme_bw() +
  scale_color_brewer(type = "qual", palette = 2) +
  stat_rows_star(alpha = .5, fun = "mean") +
  geom_rows_point(alpha = .5) +
  stat_rows_center(fun = "mean", size = 4, shape = 1L) +
  ggtitle(
    "Row-principal PCA biplot of Anderson iris measurements",
    "Segments connect each observation to its within-species centroid"
  )

# row-principal biplot with depth median-based stars
iris_pca %>%
  ggbiplot(aes(color = species)) +
  theme_bw() +
  scale_color_brewer(type = "qual", palette = 2) +
  stat_rows_star(alpha = .5, fun.ord = "depth_median") +
  geom_rows_point(alpha = .5) +
  stat_rows_center(fun.ord = "depth_median", size = 4, shape = 1L) +
  ggtitle(
    "Row-principal PCA biplot of Anderson iris measurements",
    "Segments connect each observation to its within-species depth median"
  )

# correspondence analysis of combined female and male hair and eye color data
HairEyeColor %>%
  rowSums(dims = 2L) %>%
  MASS::corresp(nf = 2L) %>%
  as_tbl_ord() %>%
  augment_ord() %>%
  print() -> hec_ca

# inertia across artificial coordinates (all singular values < 1)
get_inertia(hec_ca)

# in row-principal biplot, row coordinates are weighted averages of columns
hec_ca %>%
  confer_inertia("rows") %>%
  ggbiplot(aes(color = .matrix, fill = .matrix, shape = .matrix)) +
  theme_bw() +
  stat_cols_chull(alpha = .1) +
  geom_cols_point() +
  geom_rows_point() +
  ggtitle("Row-principal CA of hair & eye color")

# in column-principal biplot, column coordinates are weighted averages of rows
hec_ca %>%
  confer_inertia("cols") %>%
  ggbiplot(aes(color = .matrix, fill = .matrix, shape = .matrix)) +
  theme_bw() +
  stat_rows_chull(alpha = .1) +
  geom_rows_point() +
  geom_cols_point() +
  ggtitle("Column-principal CA of hair & eye color")

# centered principal components analysis of U.S. personal expenditure data
USPersonalExpenditure %>%
  prcomp() %>%
  as_tbl_ord() %>%
  augment_ord() %>%
  # allow radiating text to exceed plotting window
  ggbiplot(aes(label = name), clip = "off",
           sec.axes = "cols", scale.factor = 50) +
  geom_rows_label(size = 3) +
  # omit labels in the conical hull without the origin
  geom_cols_vector(vector_labels = FALSE) +
  stat_cols_cone(linetype = "dotted") +
  geom_cols_vector(stat = "cone", vector_labels = TRUE, color = "transparent") +
  ggtitle(
    "U.S. Personal Expenditure data, 1940-1960",
    "Row-principal biplot of centered PCA"
  )

# compute row-principal components of scaled iris measurements
iris[, -5] %>%
  prcomp(scale = TRUE) %>%
  as_tbl_ord() %>%
  mutate_rows(species = iris$Species) %>%
  print() -> iris_pca

# row-principal biplot with centroids and confidence elliptical disks
iris_pca %>%
  ggbiplot(aes(color = species)) +
  theme_bw() +
  geom_rows_point() +
  geom_polygon(
    aes(fill = species),
    color = NA, alpha = .25, stat = "rows_ellipse"
  ) +
  geom_cols_vector(color = "#444444") +
  scale_color_brewer(
    type = "qual", palette = 2,
    aesthetics = c("color", "fill")
  ) +
  ggtitle(
    "Row-principal PCA biplot of Anderson iris measurements",
    "Overlaid with 95% confidence disks"
  )

judge_pca <- ordinate(USJudgeRatings, princomp, cols = -c(1, 12))
ggbiplot(judge_pca, axis.type = "predictive") +
  geom_cols_axis() +
  geom_rows_point(elements = "score") +
  stat_rows_peel(
    aes(alpha = after_stat(hull)),
    color = "black", elements = "score",
    breaks = c(.9, .5, .1)
  )
ggbiplot(judge_pca, axis.type = "predictive") +
  geom_cols_axis() +
  geom_rows_point(elements = "score") +
  stat_rows_peel(
    aes(alpha = after_stat(hull)),
    color = "black", elements = "score",
    breaks = c(.9, .5, .1), cut = "below"
  )

iris_pca <- ordinate(iris, cols = 1:4, model = prcomp)
ggbiplot(iris_pca) +
  geom_rows_point(aes(color = Species), shape = "circle open") +
  stat_rows_peel(
    aes(fill = Species, alpha = after_stat(hull)),
    breaks = c(.9, .5, .1)
  )

# unscaled PCA
iris_pca <- ordinate(iris, cols = 1:4, model = prcomp)

# biplot canvas
iris_biplot <- iris_pca |>
  ggbiplot(aes(color = Species, label = name), axis.type = "predictive") +
  geom_rows_point() +
  geom_cols_axis(aes(center = center))

# print select cases
top_cases <- c(1, 51, 101)
iris[top_cases, ]

# subset variables
length_vars <- c(1, 3)
iris[, length_vars] |>
  aggregate(by = iris[, "Species", drop = FALSE], FUN = mean)

# project all cases onto all axes
iris_biplot + stat_rows_projection()

# project all cases onto select axes
iris_biplot + stat_rows_projection(ref_subset = length_vars)

# project select cases onto all axes
iris_biplot + stat_rows_projection(subset = top_cases)

# project select cases onto select axes
iris_biplot +
  stat_rows_projection(subset = top_cases, ref_subset = length_vars)

# project select cases onto manually provided axes
iris_cols <- as.data.frame(get_cols(iris_pca))
iris_biplot +
  stat_rows_projection(subset = top_cases, referent = iris_cols)

# project selected cases onto selected axes in full-dimensional space
iris_pca |>
  ggbiplot(ord_aes(iris_pca, color = Species, label = name),
           axis.type = "predictive") +
  geom_rows_point() +
  geom_cols_axis(aes(center = center)) +
  stat_rows_projection(subset = top_cases, ref_subset = length_vars)

# Freestone primary glass measurements
print(glass)

# default (standardized) linear discriminant analysis of sites on measurements
glass_lda <- MASS::lda(Site ~ SiO2 + Al2O3 + FeO + MgO + CaO, glass)

# bestow 'tbl_ord' class & augment observation, centroid, and variable fields
as_tbl_ord(glass_lda) %>%
  augment_ord() %>%
  print() -> glass_lda

# row-standard biplot
glass_lda %>%
  confer_inertia(1) %>%
  ggbiplot(aes(shape = grouping)) +
  theme_bw() +
  theme_biplot() +
  geom_rows_point(size = 4) +
  geom_rows_point(elements = "score") +
  stat_cols_rule(
    aes(label = name), color = "#888888", num = 8L,
    ref_elements = "score", fun.offset = \(x) minabspp(x, p = .1),
    text.size = 2.5, label_dodge = .04
  ) +
  scale_shape_manual(values = c(2L, 3L, 0L, 5L)) +
  ggtitle(
    "LDA of Freestone glass measurements",
    "Row-standard biplot of standardized LDA"
  )

# contribution LDA of sites on measurements
glass_lda <- lda_ord(Site ~ SiO2 + Al2O3 + FeO + MgO + CaO, glass,
                     axes.scale = "contribution")

# bestow 'tbl_ord' class & augment observation, centroid, and variable fields
as_tbl_ord(glass_lda) %>%
  augment_ord() %>%
  print() -> glass_lda

# symmetric biplot
glass_lda %>%
  confer_inertia(.5) %>%
  ggbiplot(aes(shape = grouping)) +
  theme_bw() +
  theme_biplot() +
  geom_rows_point() +
  stat_rows_density_2d(elements = "score", alpha = .5, color = "#444444") +
  stat_cols_rule(
    aes(label = name), geom = "axis", color = "#888888", num = 8L,
    ref_elements = "active", fun.offset = \(x) minabspp(x, p = .1),
    label_dodge = 0.04, text.size = 2.5, text_dodge = .025
  ) +
  scale_shape_manual(values = c(16L, 17L, 15L, 18L)) +
  ggtitle(
    "LDA of Freestone glass measurements",
    "Symmetric biplot of contribution LDA"
  )

## Not run:
# classical multidimensional scaling of road distances between European cities
euro_mds <- ordinate(eurodist, cmdscale_ord, k = 11)

# monoplot of city locations
euro_plot <- euro_mds %>%
  negate_ord("PCo2") %>%
  ggbiplot() +
  geom_cols_text(aes(label = name), size = 3)
print(euro_plot)

# biplot with minimal spanning tree based on plotting window distances
euro_plot +
  stat_cols_spantree(
    engine = "mlpack",
    alpha = .5, linetype = "dotted"
  )

# biplot with minimal spanning tree based on full-dimensional distances
euro_plot +
  stat_cols_spantree(
    ord_aes(euro_mds), engine = "mlpack",
    alpha = .5, linetype = "dotted"
  )
## End(Not run)
geom_rows_point() + stat_rows_density_2d(elements = "score", alpha = .5, color = "#444444") + stat_cols_rule( aes(label = name), geom = "axis", color = "#888888", num = 8L, ref_elements = "active", fun.offset = \(x) minabspp(x, p = .1), label_dodge = 0.04, text.size = 2.5, text_dodge = .025 ) + scale_shape_manual(values = c(16L, 17L, 15L, 18L)) + ggtitle( "LDA of Freestone glass measurements", "Symmetric biplot of contribution LDA" ) ## Not run: # classical multidimensional scaling of road distances between European cities euro_mds <- ordinate(eurodist, cmdscale_ord, k = 11) # monoplot of city locations euro_plot <- euro_mds %>% negate_ord("PCo2") %>% ggbiplot() + geom_cols_text(aes(label = name), size = 3) print(euro_plot) # biplot with minimal spanning tree based on plotting window distances euro_plot + stat_cols_spantree( engine = "mlpack", alpha = .5, linetype = "dotted" ) # biplot with minimal spanning tree based on full-dimensional distances euro_plot + stat_cols_spantree( ord_aes(euro_mds), engine = "mlpack", alpha = .5, linetype = "dotted" ) ## End(Not run)
Re-distribute inertia between rows and columns in an ordination.
recover_conference(x)

## Default S3 method:
recover_conference(x)

get_conference(x)

revert_conference(x)

confer_inertia(x, p)
x | A tbl_ord. |
p | Numeric vector of length 1 or 2. If length 1, the proportion of the inertia assigned to the cases, with the remainder |
The inertia of a singular value decomposition $X = U D V^\top$ consists of the
squares of the singular values (the diagonal elements of $D$), and
represents the variance, likened to the physical inertia, in the directions
of the orthogonal singular vectors (the columns of $U$ or of $V$).
Biplots superimpose the projections of the rows and the columns of $X$
onto these coordinate vectors, scaled by some proportion of the total
inertia: $U D^p$ and $V D^q$. A biplot is balanced if $p + q = 1$.
Read Orlov (2013) for more on conferring inertia in PCA.
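The following base R sketch illustrates what conferring inertia amounts to for a plain singular value decomposition, written here as X = U D V'. It is only an illustration under that assumption: the names `rows_p` and `cols_q` are hypothetical, and the code does not reproduce how 'ordr' stores or computes its factors.

```r
# centered numeric data matrix (Anderson iris measurements)
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)

# singular value decomposition X = U D V'
s <- svd(X)

# inertia: the squared singular values
inertia <- s$d^2

# confer a proportion p of the inertia to the rows
# and the remainder q = 1 - p to the columns
p <- 1
q <- 1 - p
rows_p <- s$u %*% diag(s$d^p)
cols_q <- s$v %*% diag(s$d^q)

# in a balanced biplot (p + q = 1), the product of the factors recovers X
reconstruction_error <- max(abs(rows_p %*% t(cols_q) - X))
reconstruction_error
```

Setting `p <- 0.5` in the same sketch gives the symmetric case, in which rows and columns each receive half of the inertia.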
recover_conference(), like the other recoverers, is an S3 method that is exported for convenience but not intended to be used directly.
Note: In case the "inertia" attribute is a rectangular matrix, one may only be able to confer it entirely to the cases (p = 1) or entirely to the variables (p = 0).
recover_conference() returns the (statically implemented) distribution of inertia between the rows and the columns as stored in the model. confer_inertia() returns a tbl_ord with a specified distribution of inertia but the wrapped model unchanged. get_conference() returns the distribution currently conferred.
Orlov K (2013) Answer to "Algebra of LDA. Fisher discrimination power of a variable and Linear Discriminant Analysis". CrossValidated, accessed 2019-07-26. https://stats.stackexchange.com/a/83114/68743
Other generic recoverers: augmentation, recoverers, supplementation
# illustrative ordination: correspondence analysis of hair & eye data
haireye_ca <- ordinate(
  as.data.frame(rowSums(HairEyeColor, dims = 2L)),
  cols = everything(), model = MASS::corresp
)
print(haireye_ca)
# check distribution of inertia
get_conference(haireye_ca)
# confer inertia to rows, then to columns
confer_inertia(haireye_ca, "rows")
confer_inertia(haireye_ca, "columns")
# confer inertia symmetrically
(haireye_ca <- confer_inertia(haireye_ca, "symmetric"))
# check redistributed inertia
get_conference(haireye_ca)
# restore default distribution of inertia
revert_conference(haireye_ca)
Geometric data analysis often requires that coordinates lie on the same scale. The coordinate system CoordRect, alias CoordSquare, provides control of both coordinate and window aspect ratios.
coord_rect(
  ratio = 1,
  window_ratio = ratio,
  xlim = NULL,
  ylim = NULL,
  expand = TRUE,
  clip = "on"
)
ratio | aspect ratio, expressed as |
window_ratio | aspect ratio of plotting window |
xlim, ylim | Limits for the x and y axes. |
expand | If |
clip | Should drawing be clipped to the extent of the plot panel? A setting of |
# ensures that the resolutions of the axes and the dimensions of the plotting
# window respect the specified aspect ratios
p <- ggplot(mtcars, aes(mpg, hp/10)) + geom_point()
p + coord_rect(ratio = 1)
p + coord_rect(ratio = 1, window_ratio = 2)
p + coord_rect(ratio = 1, window_ratio = 1/2)
p + coord_rect(ratio = 5)
p + coord_rect(ratio = 1/5)
p + coord_rect(xlim = c(15, 30))
p + coord_rect(ylim = c(15, 30))
2- (and 3-) dimensional biplots require that coordinates lie on the same scale but may additionally benefit from a square plotting window. While CoordRect provides control of coordinate and window aspect ratios, the convenience CoordScaffold system also fixes the coordinate aspect ratio at 1 and gives the user control only of the plotting window.
coord_scaffold(
  window_ratio = 1,
  xlim = NULL,
  ylim = NULL,
  expand = TRUE,
  clip = "on"
)
window_ratio | aspect ratio of plotting window |
xlim, ylim | Limits for the x and y axes. |
expand | If |
clip | Should drawing be clipped to the extent of the plot panel? A setting of |
# resize the plot to see that the specified aspect ratio is maintained
p <- ggplot(mtcars, aes(mpg, hp/10)) + geom_point()
p + coord_scaffold()
p + coord_scaffold(window_ratio = 2)
# prevent rescaling in response to `theme()` aspect ratio
p <- ggplot(mtcars, aes(mpg, hp/5)) + geom_point()
p + coord_equal() + theme(aspect.ratio = 1)
p + coord_scaffold() + theme(aspect.ratio = 1)
# NB: `theme(aspect.ratio = )` overrides `Coord*$aspect`:
p + coord_fixed(ratio = 1) + theme(aspect.ratio = 1)
p + coord_scaffold(window_ratio = 2) + theme(aspect.ratio = 1)
These functions adapt dplyr verbs to the factors of a tbl_ord.
The raw verbs are not defined for tbl_ords; instead, each verb has two analogues, corresponding to the two matrix factors. They each rely on a common workhorse function, which takes the composition of the dplyr verb with annotation_*, applied to the factor, removes any variables corresponding to coordinates or already annotated, and only then assigns it as the new "*_annotation" attribute of .data (see annotation). Note that these functions are not generics and so cannot be extended to other classes.
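The workhorse pattern described above can be sketched in base R. Everything in this block is a hypothetical stand-in: `toy` is a bare list rather than a tbl_ord, `verb_rows()` is an invented name, and the verb is an ordinary function on data frames rather than a dplyr verb. It is meant only to show the order of operations (apply the verb to the annotated factor, drop coordinate columns, store the rest as the annotation attribute), not the package's implementation.

```r
# toy stand-in for a tbl_ord: row coordinates plus a "rows_annotation" attribute
pca <- prcomp(iris[, 1:4])
toy <- structure(
  list(rows = pca$x[, 1:2]),
  rows_annotation = data.frame(species = iris$Species)
)

# hypothetical workhorse: apply a data frame verb to the annotated row factor
verb_rows <- function(x, verb) {
  coords <- as.data.frame(x$rows)
  annot <- attr(x, "rows_annotation")
  # apply the verb to the factor together with its annotations
  res <- verb(cbind(coords, annot))
  # remove the coordinate columns so that only annotations are stored
  res <- res[, setdiff(names(res), names(coords)), drop = FALSE]
  # assign the result as the new annotation; the coordinates are untouched
  attr(x, "rows_annotation") <- res
  x
}

# analogue of a mutate step: add a variable computed from the coordinates
toy2 <- verb_rows(toy, function(d) transform(d, radius = sqrt(PC1^2 + PC2^2)))
annotation_names <- names(attr(toy2, "rows_annotation"))
annotation_names
```

Because the coordinates are dropped before the attribute is assigned, the verb can read the coordinates but cannot overwrite them, which mirrors the statement that the wrapped model is unchanged.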
pull_factor(.data, var = -1, .matrix)

pull_rows(.data, var = -1)

pull_cols(.data, var = -1)

rename_rows(.data, ...)

rename_cols(.data, ...)

select_rows(.data, ...)

select_cols(.data, ...)

mutate_rows(.data, ...)

mutate_cols(.data, ...)

transmute_rows(.data, ...)

transmute_cols(.data, ...)

cbind_rows(.data, ..., elements = "all")

cbind_cols(.data, ..., elements = "all")

left_join_rows(.data, ...)

left_join_cols(.data, ...)
.data | An object of class 'tbl_ord'. |
var | A variable specified as in |
.matrix | A character string partially matched (lowercase) to several indicators for one or both matrices in a matrix decomposition used for ordination. The standard values are |
... | Comma-separated unquoted expressions as in, e.g., |
elements | Character vector; which elements of each factor for which to render graphical elements. One of |
A tbl_ord; the wrapped model is unchanged.
# illustrative ordination: LDA of iris data
(iris_lda <- ordinate(iris, cols = 1:4, lda_ord, grouping = iris$Species))
# extract a coordinate or annotation
head(pull_rows(iris_lda, Species))
pull_cols(iris_lda, LD2)
# rename an annotation
rename_cols(iris_lda, species = name)
# select annotations
select_rows(iris_lda, species = name, .element)
# create, modify, and delete annotations
mutate_cols(iris_lda, vec.length = sqrt(LD1^2 + LD2^2))
transmute_cols(iris_lda, vec.length = sqrt(LD1^2 + LD2^2))
# bind data frames of annotations
iris_medians <-
  stats::aggregate(iris[, 1:4], median, by = iris[, 5, drop = FALSE])
# TODO: Requirement of `.elements` for matching is fragile.
iris_lda %>%
  # retain '.element' in order to match by `elements`
  select_rows(.element) %>%
  cbind_rows(iris_medians, elements = "active")
iris_lda %>%
  select_rows(name, Species) %>%
  left_join_rows(iris_medians, by = c("name" = "Species"))
These key drawing functions supplement those built into ggplot2 for producing legends suitable to biplots.
draw_key_line(data, params, size)

draw_key_crosslines(data, params, size)

draw_key_crosspoint(data, params, size)
data | A single row data frame containing the scaled aesthetics to display in this key |
params | A list of additional parameters supplied to the geom. |
size | Width and height of key in mm. |
draw_key_line() is a horizontal counterpart to ggplot2::draw_key_vline(). draw_key_crosslines() superimposes these two keys, and draw_key_crosspoint() additionally superimposes an oversized ggplot2::draw_key_point().
A grid grob.
ggplot2::draw_key for key glyphs installed with ggplot2.
# scaled PCA of Anderson iris data with ranges and confidence intervals
iris[, -5] %>%
  prcomp(scale = TRUE) %>%
  as_tbl_ord() %>%
  confer_inertia(1) %>%
  augment_ord() %>%
  mutate_rows(species = iris$Species) %>%
  ggbiplot(aes(color = species)) +
  theme_bw() +
  scale_color_brewer(type = "qual", palette = 2) +
  geom_rows_lineranges(fun.data = mean_sdl, linewidth = .75) +
  geom_rows_density_2d(contour = TRUE, alpha = .5) +
  geom_cols_vector(aes(label = name), color = "#444444", size = 3) +
  ggtitle(
    "Row-principal PCA biplot of Anderson iris data",
    "Ranges 2 sample standard deviations from centroids"
  )
These methods of base::format() and base::print() render a (usually more) tidy readout of a tbl_ord that is consistent across all original ordination classes.
## S3 method for class 'tbl_ord'
format(
  x,
  width = NULL,
  ...,
  n = NULL,
  max_extra_cols = NULL,
  max_footer_lines = NULL
)

## S3 method for class 'tbl_ord'
print(
  x,
  width = NULL,
  ...,
  n = NULL,
  max_extra_cols = NULL,
  max_footer_lines = NULL
)
x | A tbl_ord. |
width | Width of text output to generate. This defaults to |
... | Additional arguments. |
n | Number of rows to show. If |
max_extra_cols | Number of extra columns to print abbreviated information for, if the width is too small for the entire tibble. If |
max_footer_lines | Maximum number of footer lines. If |
The format and print methods for class 'tbl_ord' are adapted from those for class 'tbl_df' and for class 'tbl_graph' from the tidygraph package.
Note: The format() function is tedious but cannot be easily modularized without invoking recoverers, annotation, and augmentation multiple times, thereby significantly reducing performance.
The format() method returns a vector of strings that are more elegantly printed by the print() method, which itself returns the tbl_ord invisibly.
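The division of labor between the two methods follows a common S3 pattern, which can be sketched in base R. The class 'toy_ord' and its header text are hypothetical and stand in for 'tbl_ord'; the real methods produce a much richer readout.

```r
# toy class standing in for 'tbl_ord' (hypothetical, for illustration only)
toy <- structure(list(rows = head(iris[, 1:2], 3)), class = "toy_ord")

# format() builds a character vector with one string per printed line
format.toy_ord <- function(x, ...) {
  c(
    paste0("# A toy_ord: ", nrow(x$rows), " rows"),
    utils::capture.output(print(x$rows))
  )
}

# print() writes those lines to the console and returns its input invisibly
print.toy_ord <- function(x, ...) {
  cat(format(x, ...), sep = "\n")
  invisible(x)
}

readout <- format(toy)
printed <- withVisible(print(toy))
```

Returning the object invisibly is what allows a `print()` call to sit in the middle of a pipeline, as in the examples that end with `print() -> iris_pca`.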
geom_axis() renders lines through or orthogonally translated from the origin and the position of each case or variable.
geom_axis(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  axis_labels = TRUE,
  axis_ticks = TRUE,
  axis_text = TRUE,
  by = NULL,
  num = NULL,
  tick_length = 0.025,
  text_dodge = 0.03,
  label_dodge = 0.03,
  ...,
  axis.colour = NULL,
  axis.color = NULL,
  axis.alpha = NULL,
  label.angle = 0,
  label.colour = NULL,
  label.color = NULL,
  label.alpha = NULL,
  tick.linewidth = 0.25,
  tick.colour = NULL,
  tick.color = NULL,
  tick.alpha = NULL,
  text.size = 2.6,
  text.angle = 0,
  text.hjust = 0.5,
  text.vjust = 0.5,
  text.family = NULL,
  text.fontface = NULL,
  text.colour = NULL,
  text.color = NULL,
  text.alpha = NULL,
  parse = FALSE,
  check_overlap = FALSE,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping | Set of aesthetic mappings created by |
data | The data to be displayed in this layer. There are three options: If A A |
stat | The statistical transformation to use on the data for this layer. When using a |
position | A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. The |
axis_labels, axis_ticks, axis_text | Logical; whether to include labels, tick marks, and text value marks along the axes. |
by, num | Intervals between elements or number of elements; specify only one. |
tick_length | Numeric; the length of the tick marks, as a proportion of the minimum of the plot width and height. |
text_dodge | Numeric; the orthogonal distance of tick mark text from the axis, as a proportion of the minimum of the plot width and height. |
label_dodge | Numeric; the orthogonal distance of the axis label from the axis, as a proportion of the minimum of the plot width and height. |
... | Additional arguments passed to |
axis.colour, axis.color, axis.alpha | Default aesthetics for axes. Set to NULL to inherit from the data's aesthetics. |
label.angle, label.colour, label.color, label.alpha | Default aesthetics for labels. Set to NULL to inherit from the data's aesthetics. |
tick.linewidth, tick.colour, tick.color, tick.alpha | Default aesthetics for tick marks. Set to NULL to inherit from the data's aesthetics. |
text.size, text.angle, text.hjust, text.vjust, text.family, text.fontface, text.colour, text.color, text.alpha | Default aesthetics for tick mark labels. Set to NULL to inherit from the data's aesthetics. |
parse | If |
check_overlap | If |
na.rm | Passed to |
show.legend | logical. Should this layer be included in the legends? |
inherit.aes | If |
Axes are lines that track the values of linear variables across a plot. Multivariate scatterplots may include more axes than plotting dimensions, in which case the plot may display only a fraction of the total variation in the data.
Gower & Hand (1996) recommend using axes to represent numerical variables in biplots. Consequently, Gardner & le Roux (2002) refer to these as Gower biplots.
Axes positioned orthogonally at the origin are a ubiquitous feature of scatterplots and are used both to recover variable values from case markers (prediction) and to position new case markers from variables (interpolation). When they are not orthogonal, these two uses conflict, so interpolative versus predictive axes must be used appropriately; see ggbiplot().
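The conflict between prediction and interpolation can be made concrete with a small base R sketch. It assumes a two-dimensional PCA biplot with case coordinates Z = X V and variable coordinates V, following the standard construction in Gower & Hand (1996); the object names are hypothetical, and no 'ordr' function is called.

```r
# centered PCA of the iris measurements, retaining two dimensions
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)
pca <- prcomp(X, center = FALSE)
V <- pca$rotation[, 1:2]  # variable (column) coordinates
Z <- X %*% V              # case (row) coordinates

# interpolation: a case is positioned at the sum of its values along the axes
z1 <- colSums(X[1, ] * V)
interpolation_ok <- isTRUE(all.equal(unname(z1), unname(Z[1, ])))

# prediction: a (centered) variable value is recovered by projecting
# the case marker onto the direction of that variable's axis
v <- V["Petal.Length", ]
predicted <- sum(Z[1, ] * v)

# marker positions for the value `mu` on the two kinds of axis
mu <- 1
marker_interp <- mu * v
marker_pred <- mu * v / sum(v^2)

# the predictive marker projects back to `mu`; the interpolative one falls short
back_pred <- sum(marker_pred * v)
back_interp <- sum(marker_interp * v)
c(predictive = back_pred, interpolative = back_interp)
```

The two calibrations agree only when the variable's coordinate vector has unit length, which is generally not the case once dimensions have been discarded.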
A ggproto layer.
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
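The data layout described above can be imitated in base R. The data frame below is a hand-built, hypothetical stand-in for the fortified output (the real one is produced by `ggplot2::fortify()` on a tbl_ord and may carry additional columns); it only shows what filtering on `.matrix` amounts to.

```r
# hypothetical stand-in for the fortified data frame of a PCA
pca <- prcomp(iris[, 1:4])
rows_df <- data.frame(
  pca$x[, 1:2],
  .matrix = "rows", name = rownames(iris)
)
cols_df <- data.frame(
  pca$rotation[, 1:2],
  .matrix = "cols", name = rownames(pca$rotation)
)
fortified <- rbind(rows_df, cols_df)

# what a row- or column-specific stat amounts to: a filter on `.matrix`
only_rows <- fortified[fortified$.matrix == "rows", ]
only_cols <- fortified[fortified$.matrix == "cols", ]
table(fortified$.matrix)
```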
geom_axis() understands the following aesthetics (required aesthetics are in bold):
x
y
lower
upper
yintercept or xintercept or xend and yend
linetype
linewidth
size
hjust
vjust
colour
alpha
label
family
fontface
center, scale
group
Gower JC & Hand DJ (1996) Biplots. Chapman & Hall, ISBN: 0-412-71630-5.
Gardner S, le Roux N (2002) "Biplot Methodology for Discriminant Analysis Based upon Robust Methods and Principal Curves". Classification, Clustering, and Data Analysis: Recent Advances and Applications: 169–176. https://link.springer.com/chapter/10.1007/978-3-642-56181-8_18
Other geom layers: geom_bagplot(), geom_interpolation(), geom_isoline(), geom_lineranges(), geom_origin(), geom_rule(), geom_text_radiate(), geom_vector()
# stack loss gradient
stackloss |>
  lm(formula = stack.loss ~ Air.Flow + Water.Temp + Acid.Conc.) |>
  coef() |>
  as.list() |> as.data.frame() |>
  subset(select = c(Air.Flow, Water.Temp, Acid.Conc.)) ->
  coef_data
# gradient axis with respect to two predictors
scale(stackloss, scale = FALSE) |>
  ggplot(aes(x = Acid.Conc., y = Air.Flow)) +
  coord_square() + geom_origin() +
  geom_point(aes(size = stack.loss, alpha = sign(stack.loss))) +
  scale_size_area() + scale_alpha_binned(breaks = c(-1, 0, 1)) +
  geom_axis(data = coef_data)
# unlimited axes with window forcing
stackloss_centered <- scale(stackloss, scale = FALSE)
stackloss_centered |>
  ggplot(aes(x = Acid.Conc., y = Air.Flow)) +
  coord_square() + geom_origin() +
  geom_point(aes(size = stack.loss, alpha = sign(stack.loss))) +
  scale_size_area() + scale_alpha_binned(breaks = c(-1, 0, 1)) +
  stat_rule(
    geom = "axis", data = coef_data,
    referent = stackloss_centered,
    fun.lower = \(x) minpp(x, p = 1),
    fun.upper = \(x) maxpp(x, p = 1),
    fun.offset = \(x) minabspp(x, p = 1)
  )
# NB: `geom_axis(stat = "rule")` would fail to pass positional aesthetics.

# eigen-decomposition of covariance matrix
ability.cov$cov |>
  cov2cor() |>
  eigen() |>
  getElement("vectors") |>
  as.data.frame() |>
  transform(test = rownames(ability.cov$cov)) ->
  ability_cor_eigen
# test axes in best-approximation space
ability_cor_eigen |>
  transform(E3 = ifelse(V3 > 0, "rise", "fall")) |>
  # FIXME: Component aesthetic data values aren't mapped to color values.
  ggplot(aes(V1, V2, color = E3)) +
  coord_square() +
  geom_axis(aes(label = test), text.color = "black", text.alpha = .5) +
  expand_limits(x = c(-1, 1), y = c(-1, 1))
Render bagplots from tagged data comprising medians, hulls, contours, and outlier specifications.
geom_bagplot(
  mapping = NULL,
  data = NULL,
  stat = "bagplot",
  position = "identity",
  ...,
  bag.linewidth = sync(),
  bag.linetype = sync(),
  bag.colour = "black",
  bag.color = NULL,
  bag.fill = sync(),
  bag.alpha = NA,
  median.shape = 21L,
  median.stroke = sync(),
  median.size = 5,
  median.colour = sync(),
  median.color = NULL,
  median.fill = "white",
  median.alpha = NA,
  fence.linewidth = 0.25,
  fence.linetype = 0L,
  fence.colour = sync(),
  fence.color = NULL,
  fence.fill = sync(),
  fence.alpha = 0.25,
  outlier.shape = sync(),
  outlier.stroke = sync(),
  outlier.size = sync(),
  outlier.colour = sync(),
  outlier.color = NULL,
  outlier.fill = NA,
  outlier.alpha = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping | Set of aesthetic mappings created by |
data | The data to be displayed in this layer. There are three options: If A A |
stat | The statistical transformation to use on the data for this layer. When using a |
position | A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. The |
... | Additional arguments passed to |
bag.linetype, bag.linewidth, bag.colour, bag.color, bag.fill, bag.alpha | Default aesthetics for bags. Set to |
median.shape, median.stroke, median.size, median.colour, median.color, median.fill, median.alpha | Default aesthetics for medians. Set to |
fence.linetype, fence.linewidth, fence.colour, fence.color, fence.fill, fence.alpha | Default aesthetics for fences. Set to |
outlier.shape, outlier.stroke, outlier.size, outlier.colour, outlier.color, outlier.fill, outlier.alpha | Default aesthetics for outliers. Set to |
na.rm | Passed to |
show.legend | logical. Should this layer be included in the legends? |
inherit.aes | If |
geom_bagplot() is designed to pair with stat_bagplot(), analogously to the pairing of ggplot2::geom_boxplot() with ggplot2::stat_boxplot().
Because the optional components are more expensive to compute in this setting, they are controlled by parameters passed to the stat. Auxiliary aesthetics like median.colour are available that override auxiliary defaults, and these in turn override the standard defaults. Auxiliary defaults also take effect when auxiliary aesthetics are passed NULL, so that stat_bagplot() and geom_bagplot() have the same default behavior. Pass sync() (instead of NULL, as in ggplot2::geom_boxplot()) to synchronize an auxiliary aesthetic with its standard counterpart.
WARNING: The trade-off between precision and runtime is greater for depth estimation than for density estimation. At the resolution of the default grid, basic examples may vary noticeably when starting from different random seeds.
A ggproto layer.
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
geom_bagplot() understands the following aesthetics (required aesthetics are in bold):
x
y
component
linewidth
linetype
colour
fill
alpha
shape
stroke
size
group
Other geom layers: geom_axis(), geom_interpolation(), geom_isoline(), geom_lineranges(), geom_origin(), geom_rule(), geom_text_radiate(), geom_vector()
# Motor Trends base plot with factorized cylinder counts
p <- mtcars |>
  transform(cyl = factor(cyl)) |>
  ggplot(aes(x = wt, y = disp)) +
  theme_bw()
# basic bagplot
p + geom_bagplot()
# group by cylinder count
p + geom_bagplot(
  fraction = 0.4, coef = 1.2,
  aes(fill = cyl, linetype = cyl, color = cyl)
)
# using normally unmapped aesthetics
p + geom_bagplot(
  fraction = 0.4, coef = 1.2,
  aes(fill = cyl, linetype = cyl, color = cyl),
  median.color = "black",
  fence.linetype = sync(), fence.colour = "black",
  outlier.shape = "asterisk", outlier.colour = "black"
)
# Motor Trends base plot with factorized cylinder counts p <- mtcars |> transform(cyl = factor(cyl)) |> ggplot(aes(x = wt, y = disp)) + theme_bw() # basic bagplot p + geom_bagplot() # group by cylinder count p + geom_bagplot( fraction = 0.4, coef = 1.2, aes(fill = cyl, linetype = cyl, color = cyl) ) # using normally unmapped aesthetics p + geom_bagplot( fraction = 0.4, coef = 1.2, aes(fill = cyl, linetype = cyl, color = cyl), median.color = "black", fence.linetype = sync(), fence.colour = "black", outlier.shape = "asterisk", outlier.colour = "black" )
geom_interpolation() renders a geometric construction that interpolates a new data matrix (row or column) element from its entries to its artificial coordinates.
geom_interpolation(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  new_data = NULL,
  type = c("centroid", "sequence"),
  arrow = default_arrow,
  ...,
  point.fill = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
new_data |
A list (best structured as a data.frame)
of row ( |
type |
Character value matched to |
arrow |
Specification for arrows, as created by |
... |
Additional arguments passed to |
point.fill |
Default aesthetics for markers. Set to NULL to inherit from the data's aesthetics. |
na.rm |
Passed to |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
Interpolation answers the following question: Given a new data element that might have appeared as a row (respectively, column) in the singular-value-decomposed data matrix, where should we expect the marker for this element to appear in the biplot? The solution is the vector sum of the column (row) units weighted by their values in the new row (column). Gower, Gardner–Lubbe, & le Roux (2011) provide two visualizations of this calculation: a tail-to-head sequence of weighted units (type = "sequence"), and a centroid of the weighted units scaled by the number of units (type = "centroid").
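The calculation described above can be checked by hand in base R. The following is a minimal sketch, not the layer's internal implementation: it assumes a PCA computed by stats::prcomp() on the centered and scaled iris measurements, and the object names (pca, new_row, z, units, seq_path, centroid, interpolated) are illustrative only.

```r
# PCA of the four iris measurements, centered and scaled
pca <- prcomp(iris[, -5], scale. = TRUE)

# a new row, standardized with the same center and scale as the ordinated data
new_row <- unlist(iris[42, -5])
z <- (new_row - pca$center) / pca$scale

# weight each column unit (a row of the rotation matrix) by the new row's value;
# the vector `z` is recycled down the columns, so row i is multiplied by z[i]
units <- pca$rotation[, 1:2] * z

# type = "sequence": tail-to-head path of the weighted units from the origin
seq_path <- apply(units, 2, cumsum)

# type = "centroid": centroid of the weighted units, scaled by their number
centroid <- colMeans(units)
interpolated <- centroid * nrow(units)

# both constructions end at the position that predict() assigns to the new row
all.equal(unname(seq_path[nrow(units), ]), unname(interpolated))
all.equal(unname(interpolated), unname(predict(pca, iris[42, -5])[1, 1:2]))
```

The rotation matrix returned by prcomp() holds the column elements in standard coordinates, which is the setting in which the weighted sum of units lands on the row marker.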
WARNING:
This layer is appropriate only with axes in standard coordinates (usually confer_inertia(p = "rows")) and interpolative calibration (ggbiplot(axis.type = "interpolative")).
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
geom_interpolation() requires the custom interpolate aesthetic, which tells the internals which columns of the new_data parameter contain the variables to be used for interpolation. Except in rare cases, new_data should contain the same rows or columns as the ordinated data and interpolate should be set to name (procured by augment_ord()).
geom_interpolation() additionally understands the following aesthetics (required aesthetics are in bold):
alpha
colour
linetype
size
fill
shape
stroke
center, scale
group
Gower JC, Gardner–Lubbe S, & le Roux NJ (2011) Understanding Biplots. Wiley, ISBN: 978-0-470-01255-0. https://www.wiley.com/go/biplots
Other geom layers: geom_axis(), geom_bagplot(), geom_isoline(), geom_lineranges(), geom_origin(), geom_rule(), geom_text_radiate(), geom_vector()
iris[, -5] %>%
  prcomp(scale = TRUE) %>%
  as_tbl_ord() %>%
  print() -> iris_pca
iris_pca <- mutate_rows(iris_pca, species = iris$Species)
iris_pca <- augment_ord(iris_pca)
# sample of one of each species, with some missing measurements
new_data <- iris[c(42, 61, 110), seq(5, 1), drop = FALSE]
new_data[3L, "Sepal.Width"] <- NA
new_data[1L, "Petal.Length"] <- NA
print(new_data)
# centroid interpolation method
iris_pca %>%
  augment_ord() %>%
  mutate_rows(obs = dplyr::row_number()) %>%
  mutate_cols(measure = name) %>%
  ggbiplot() +
  theme_bw() +
  scale_color_brewer(type = "qual", palette = 2) +
  geom_origin(marker = "cross", alpha = .5) +
  geom_cols_interpolation(
    aes(center = center, scale = scale, interpolate = name), size = 3,
    new_data = new_data, type = "centroid", alpha = .5
  ) +
  geom_rows_text(aes(label = obs, color = species), alpha = .5, size = 3)
# missing an entire variable
new_data$Petal.Length <- NULL
# sequence interpolation method
iris_pca %>%
  augment_ord() %>%
  mutate_rows(obs = dplyr::row_number()) %>%
  mutate_cols(measure = name) %>%
  ggbiplot() +
  theme_bw() +
  scale_color_brewer(type = "qual", palette = 2) +
  geom_origin(marker = "circle", alpha = .5) +
  geom_cols_interpolation(
    aes(center = center, scale = scale, interpolate = name, linetype = measure),
    new_data = new_data, type = "sequence", alpha = .5
  ) +
  geom_rows_text(aes(label = obs, color = species), alpha = .5, size = 3)
geom_isoline() renders isolines along row or column axes.
geom_isoline(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  isoline_text = TRUE,
  by = NULL,
  num = NULL,
  text_dodge = 0.03,
  ...,
  text.size = 3,
  text.angle = 0,
  text.colour = NULL,
  text.color = NULL,
  text.alpha = NULL,
  parse = FALSE,
  check_overlap = FALSE,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
isoline_text |
Logical; whether to include text value marks along the isolines. |
by , num
|
Intervals between elements or number of elements; specify only one. |
text_dodge |
Numeric; the orthogonal distance of the text from the axis or isoline, as a proportion of the minimum of the plot width and height. |
... |
Additional arguments passed to |
text.size , text.angle , text.colour , text.color , text.alpha
|
Default aesthetics for tick mark labels. Set to NULL to inherit from the data's aesthetics. |
parse |
If |
check_overlap |
If |
na.rm |
Passed to |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
Isolines are topographical features that separate a plot into regions in which a gradient of interest falls within a specified range. Greenacre (2010) uses them effectively to assist with the projection of markers onto axes.
A ggproto layer.
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
geom_isoline() understands the following aesthetics (required aesthetics are in bold):
x
y
colour
alpha
linewidth
linetype
center, scale
hjust
vjust
family
fontface
group
Greenacre MJ (2010) Biplots in Practice. Fundacion BBVA, ISBN: 978-84-923846. https://www.fbbva.es/microsite/multivariate-statistics/biplots.html
Other geom layers: geom_axis(), geom_bagplot(), geom_interpolation(), geom_lineranges(), geom_origin(), geom_rule(), geom_text_radiate(), geom_vector()
# stack loss gradient
stackloss |>
  lm(formula = stack.loss ~ Air.Flow + Water.Temp + Acid.Conc.) |>
  coef() |>
  as.list() |>
  as.data.frame() |>
  subset(select = c(Air.Flow, Water.Temp, Acid.Conc.)) ->
  coef_data
# isolines along strongest predictors
scale(stackloss, scale = FALSE) |>
  ggplot(aes(x = Water.Temp, y = Air.Flow)) +
  coord_square() +
  geom_origin() +
  geom_point(aes(size = stack.loss)) +
  scale_size_area() +
  geom_isoline(data = coef_data)
geom_lineranges() renders horizontal and vertical intervals for a specified subject or variable; geom_pointranges() additionally renders a point at their crosshairs.
geom_lineranges(
  mapping = NULL,
  data = NULL,
  stat = "center",
  position = "identity",
  ...,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_pointranges(
  mapping = NULL,
  data = NULL,
  stat = "center",
  position = "identity",
  ...,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Additional arguments passed to |
na.rm |
Passed to |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
A ggproto layer.
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
geom_lineranges() and geom_pointranges() understand the following aesthetics (required aesthetics are in bold):
x
xmin
xmax
y
ymin
ymax
alpha
colour
linewidth
linetype
size
group
Other geom layers: geom_axis(), geom_bagplot(), geom_interpolation(), geom_isoline(), geom_origin(), geom_rule(), geom_text_radiate(), geom_vector()
ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point(alpha = .25) +
  geom_lineranges()
ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point(alpha = .25) +
  geom_pointranges(fun.data = mean_sdl, shape = "circle open")
geom_origin() renders a symbol, either a set of crosshairs or a circle, at the origin. geom_unit_circle() renders the unit circle, centered at the origin with radius 1.
geom_origin(
  mapping = NULL,
  data = NULL,
  marker = "crosshairs",
  radius = unit(0.04, "snpc"),
  ...,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = FALSE
)

geom_unit_circle(
  mapping = NULL,
  data = NULL,
  segments = 60,
  scale.factor = 1,
  ...,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = FALSE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
marker |
The symbol to be drawn at the origin; matched to |
radius |
A |
... |
Additional arguments passed to |
na.rm |
Passed to |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
segments |
The number of segments to be used in drawing the circle. |
scale.factor |
The circle radius; should remain at its default value 1
or passed the same value as |
A ggproto layer.
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
geom_origin() accepts no aesthetics.
geom_unit_circle() understands the following aesthetics (none required):
alpha
colour
linetype
size
Other geom layers: geom_axis(), geom_bagplot(), geom_interpolation(), geom_isoline(), geom_lineranges(), geom_rule(), geom_text_radiate(), geom_vector()
ggplot(seals, aes(delta_long, delta_lat)) +
  theme_void() +
  geom_origin() +
  geom_point(alpha = .5)
iris |>
  split(~ Species) |>
  lapply(subset, select = -c(Species)) |>
  lapply(scale, center = TRUE, scale = FALSE) |>
  lapply(as.data.frame) |>
  unsplit(iris$Species) |>
  transform(Species = iris$Species) ->
  iris_ctr
ggplot(iris_ctr, aes(Petal.Width, Petal.Length)) +
  coord_equal() +
  facet_wrap(vars(Species)) +
  geom_unit_circle() +
  geom_point()
geom_rule() renders segments through or orthogonally translated from the origin.
geom_rule(
  mapping = NULL,
  data = NULL,
  stat = "rule",
  position = "identity",
  axis_labels = TRUE,
  axis_ticks = TRUE,
  axis_text = TRUE,
  by = NULL,
  num = NULL,
  snap_rule = TRUE,
  tick_length = 0.025,
  text_dodge = 0.03,
  label_dodge = 0.03,
  ...,
  axis.colour = NULL,
  axis.color = NULL,
  axis.alpha = NULL,
  label.angle = 0,
  label.colour = NULL,
  label.color = NULL,
  label.alpha = NULL,
  tick.linewidth = 0.25,
  tick.colour = NULL,
  tick.color = NULL,
  tick.alpha = NULL,
  text.size = 2.6,
  text.angle = 0,
  text.hjust = 0.5,
  text.vjust = 0.5,
  text.family = NULL,
  text.fontface = NULL,
  text.colour = NULL,
  text.color = NULL,
  text.alpha = NULL,
  parse = FALSE,
  check_overlap = FALSE,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
axis_labels , axis_ticks , axis_text
|
Logical; whether to include labels, tick marks, and text value marks along the axes. |
by , num
|
Intervals between elements or number of elements; specify only one. |
snap_rule |
Logical; whether to snap rule segments to grid values. |
tick_length |
Numeric; the length of the tick marks, as a proportion of the minimum of the plot width and height. |
text_dodge |
Numeric; the orthogonal distance of tick mark text from the axis, as a proportion of the minimum of the plot width and height. |
label_dodge |
Numeric; the orthogonal distance of the axis label from the axis, as a proportion of the minimum of the plot width and height. |
... |
Additional arguments passed to |
axis.colour , axis.color , axis.alpha
|
Default aesthetics for axes. Set to NULL to inherit from the data's aesthetics. |
label.angle , label.colour , label.color , label.alpha
|
Default aesthetics for labels. Set to NULL to inherit from the data's aesthetics. |
tick.linewidth , tick.colour , tick.color , tick.alpha
|
Default aesthetics for tick marks. Set to NULL to inherit from the data's aesthetics. |
text.size , text.angle , text.hjust , text.vjust , text.family , text.fontface , text.colour , text.color , text.alpha
|
Default aesthetics for tick mark labels. Set to NULL to inherit from the data's aesthetics. |
parse |
If |
check_overlap |
If |
na.rm |
Passed to |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
As implemented here, a rule is just an axis that has a fixed range, usually the limits of the data. See stat_rule() for further details.
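One way to picture "the limits of the data" along an axis is to project the points onto the axis direction and keep only the segment between the extreme projections. The base R sketch below illustrates that idea under stated assumptions; it is not taken from stat_rule(), and the data, the axis direction, and the names pts, axis_dir, proj, and rule_ends are hypothetical.

```r
# hypothetical data: two standardized variables from mtcars
pts <- as.matrix(scale(mtcars[, c("wt", "disp")]))

# hypothetical axis through the origin, given as a unit vector
axis_dir <- c(1, 2) / sqrt(sum(c(1, 2)^2))

# signed positions of the orthogonal projections of the points onto the axis
proj <- drop(pts %*% axis_dir)

# the rule is the part of the axis between the extreme projections
rule_ends <- outer(range(proj), axis_dir)
dimnames(rule_ends) <- list(c("lower", "upper"), c("x", "y"))
rule_ends
```

An unbounded axis would extend to the edges of the plotting window, whereas the segment between the lower and upper endpoints stays within the extent of the projected data.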
A ggproto layer.
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
geom_rule() understands the following aesthetics (required aesthetics are in bold):
x
y
lower
upper
yintercept or xintercept or xend and yend
linetype
linewidth
size
hjust
vjust
colour
alpha
label
family
fontface
center, scale
group
Other geom layers: geom_axis(), geom_bagplot(), geom_interpolation(), geom_isoline(), geom_lineranges(), geom_origin(), geom_text_radiate(), geom_vector()
USJudgeRatings |>
  subset(select = -c(1, 12)) |>
  dist(method = "maximum") |>
  cmdscale() |>
  as.data.frame() |>
  setNames(c("PCo1", "PCo2")) |>
  transform(name = rownames(USJudgeRatings)) ->
  judge_mds
USJudgeRatings |>
  subset(select = c(CONT, RTEN)) |>
  setNames(c("contacts", "recommendation")) ->
  judge_meta
lm(as.matrix(judge_meta) ~ as.matrix(judge_mds[, seq(2)])) |>
  getElement("coefficients") |>
  unname() |>
  t() |>
  as.data.frame() |>
  setNames(c("Intercept", "PCo1", "PCo2")) |>
  transform(variable = names(judge_meta)) ->
  judge_lm
ggplot(judge_mds, aes(x = PCo1, y = PCo2)) +
  coord_equal() +
  theme_void() +
  geom_text(aes(label = name), size = 3) +
  stat_rule(
    data = judge_lm, referent = judge_mds,
    aes(center = Intercept, label = variable)
  )
# NB: `geom_rule(stat = "rule")` would fail to pass positional aesthetics.
geom_text_radiate() is adapted from ggbiplot() in the off-CRAN extensions of the same name (Vu, 2014; Telford, 2017; Gegzna, 2018). It renders text at specified positions and angles that radiate out from the origin. This layer and its associated ggproto are deprecated.
geom_text_radiate(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  parse = FALSE,
  check_overlap = FALSE,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer.
Cannot be jointly specified with
|
... |
Other arguments passed on to
|
parse |
If |
check_overlap |
If |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
A ggproto layer.
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
geom_text_radiate() understands the following aesthetics (required aesthetics are in bold):
x
y
label
alpha
angle
colour
family
fontface
hjust
lineheight
size
vjust
group
Vincent Q. Vu (2014). ggbiplot: A 'ggplot2' based biplot. R package version 0.55. https://github.com/vqv/ggbiplot, experimental branch
Richard J Telford (2017). ggbiplot: A 'ggplot2' based biplot. R package version 0.6. https://github.com/richardjtelford/ggbiplot (fork), experimental branch
Vilmantas Gegzna (2018). ggbiplot: A 'ggplot2' based biplot. R package version 0.55. https://github.com/forked-packages/ggbiplot (fork), experimental branch
Other geom layers: geom_axis(), geom_bagplot(), geom_interpolation(), geom_isoline(), geom_lineranges(), geom_origin(), geom_rule(), geom_vector()
geom_vector() renders arrows from the origin to points, optionally with text radiating outward.
geom_vector(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  arrow = default_arrow,
  lineend = "round",
  linejoin = "mitre",
  vector_labels = TRUE,
  ...,
  label.colour = NULL,
  label.color = NULL,
  label.alpha = NULL,
  parse = FALSE,
  check_overlap = FALSE,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
arrow |
Specification for arrows, as created by |
lineend |
Line end style (round, butt, square). |
linejoin |
Line join style (round, mitre, bevel). |
vector_labels |
Logical; whether to include labels radiating outward from the vectors. |
... |
Additional arguments passed to |
label.colour , label.color , label.alpha
|
Default aesthetics for labels. Set to NULL to inherit from the data's aesthetics. |
parse |
If |
check_overlap |
If |
na.rm |
Passed to |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
Vectors are positions relative to some common reference point, in this case the origin; they comprise direction and magnitude. Vectors are usually represented with arrows rather than markers (points).
Vectors are commonly used to represent numerical variables in biplots, as by Gabriel (1971) and Greenacre (2010). Gardner & le Roux (2002) refer to these as Gabriel biplots. This layer, with optional radiating text labels, is adapted from ggbiplot() in the off-CRAN extensions of the same name (Vu, 2014; Telford, 2017; Gegzna, 2018).
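To make "direction and magnitude" concrete, the base R sketch below computes both for a set of points measured from the origin. It is an illustration only: the rule used to orient radiating labels (flipping those that point leftward so they read upright) is an assumption, not necessarily what geom_vector() does internally, and the names x, y, magnitude, direction, label_angle, and label_hjust are hypothetical.

```r
# state centers, shifted so that the origin is the midpoint of their ranges
x <- state.center$x - mean(range(state.center$x))
y <- state.center$y - mean(range(state.center$y))

# each vector from the origin has a magnitude (length) and a direction (angle)
magnitude <- sqrt(x^2 + y^2)
direction <- atan2(y, x) * 180 / pi  # degrees in (-180, 180]

# assumption: a radiating label follows the vector's direction, rotated by
# 180 degrees and right-justified when the vector points leftward
label_angle <- ifelse(abs(direction) > 90, direction + 180, direction)
label_hjust <- ifelse(abs(direction) > 90, 1, 0)

head(data.frame(state = state.abb, magnitude, direction, label_angle, label_hjust))
```

The positions are recovered from the polar form as magnitude * cos(direction) and magnitude * sin(direction), with the angle converted back to radians.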
A ggproto layer.
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
geom_vector() understands the following aesthetics (required aesthetics are in bold):
x
y
alpha
colour
linetype
label
size
angle
hjust
vjust
family
fontface
lineheight
group
Gabriel KR (1971) "The biplot graphic display of matrices with application to principal component analysis". Biometrika 58(3), 453–467. doi:10.1093/biomet/58.3.453
Greenacre MJ (2010) Biplots in Practice. Fundacion BBVA, ISBN: 978-84-923846. https://www.fbbva.es/microsite/multivariate-statistics/biplots.html
Gardner S, le Roux N (2002) "Biplot Methodology for Discriminant Analysis Based upon Robust Methods and Principal Curves". Classification, Clustering, and Data Analysis: Recent Advances and Applications: 169–176. https://link.springer.com/chapter/10.1007/978-3-642-56181-8_18
Vincent Q. Vu (2014). ggbiplot: A 'ggplot2' based biplot. R package version 0.55. https://github.com/vqv/ggbiplot, experimental branch
Richard J Telford (2017). ggbiplot: A 'ggplot2' based biplot. R package version 0.6. https://github.com/richardjtelford/ggbiplot (fork), experimental branch
Vilmantas Gegzna (2018). ggbiplot: A 'ggplot2' based biplot. R package version 0.55. https://github.com/forked-packages/ggbiplot (fork), experimental branch
Other geom layers: geom_axis(), geom_bagplot(), geom_interpolation(), geom_isoline(), geom_lineranges(), geom_origin(), geom_rule(), geom_text_radiate()
us_center <- sapply(state.center, \(x) (min(x) + max(x)) / 2)
state_center <- cbind(
  state = state.abb,
  sweep(as.data.frame(state.center), 2, us_center, "-")
)
ggplot(state_center, aes(x, y, label = state)) +
  coord_equal() +
  geom_vector()
Build a biplot visualization from ordination data wrapped as a tbl_ord object.
ggbiplot(
  ordination = NULL,
  mapping = aes(x = 1, y = 2),
  axis.type = "interpolative",
  xlim = NULL,
  ylim = NULL,
  expand = TRUE,
  clip = "on",
  axis.percents = TRUE,
  sec.axes = NULL,
  scale.factor = "inertia",
  scale_rows = NULL,
  scale_cols = NULL,
  ...
)

ord_aes(ordination, ...)
ordination |
A tbl_ord. |
mapping |
List of default aesthetic mappings to use for the biplot. The
default assigns the first two coordinates to the aesthetics |
axis.type |
Character, partially matched; whether to build an
|
xlim , ylim
|
Limits for the x and y axes. |
expand |
If |
clip |
Should drawing be clipped to the extent of the plot panel? A
setting of |
axis.percents |
Whether to concatenate default axis labels with inertia percentages. |
sec.axes |
Matrix factor character to specify a secondary set of axes. |
scale.factor |
Either a numeric value, used to scale the secondary axes
against the primary axes, or the name of a harmonizing function (currently
|
scale_rows , scale_cols
|
Either the character name of a numeric variable
in |
... |
Additional arguments passed to |
ggbiplot() produces a ggplot object from a tbl_ord object ordination. The baseline object is the default unadorned "ggplot"-class object p with the following differences from what ggplot2::ggplot() returns:
p$mapping is augmented with .matrix = .matrix, which expects either .matrix = "rows" or .matrix = "cols" from the biplot.
p$coordinates is defaulted to ggplot2::coord_equal() in order to faithfully render the geometry of an ordination. The optional parameters xlim, ylim, expand, and clip are passed to coord_equal() and default to its ggplot2 defaults.
When x or y are mapped to coordinates of ordination, and if axis.percents is TRUE, p$labels$x or p$labels$y are defaulted to the coordinate names concatenated with the percentages of inertia captured by the coordinates.
p is assigned the class "ggbiplot" in addition to "ggplot". This serves no functional purpose currently.
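The axis labels described above pair each coordinate name with its share of inertia. For a PCA that share can be computed by hand, as in the base R sketch below. The sketch assumes that inertia is the variance along each principal component, and the label format built with sprintf() is hypothetical and may differ from the one ggbiplot() produces.

```r
# PCA of the four iris measurements, centered and scaled
pca <- prcomp(iris[, -5], scale. = TRUE)

# inertia captured by each coordinate: the squared standard deviations
inertia <- pca$sdev^2
pct <- 100 * inertia / sum(inertia)

# hypothetical label format: coordinate name followed by its percentage
axis_labels <- sprintf("%s (%.1f%%)", colnames(pca$rotation), pct)
axis_labels[1:2]
```

The percentages across all coordinates sum to 100, so the first two labels indicate how much of the total inertia a 2-dimensional biplot displays.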
Furthermore, the user may feed single integer values to the x
and y
aesthetics, which will be interpreted as the corresponding coordinates in the
ordination. Currently only 2-dimensional biplots are supported, so both x
and y
must take coordinate values.
ord_aes()
is a convenience function that generates a full-rank set of
coordinate aesthetics ..coord1
, ..coord2
, etc. mapped to the shared
coordinates of the ordination object, along with any additional aesthetics
that are processed internally by ggplot2::aes()
.
The axis.type
parameter controls whether the biplot is interpolative or
predictive, though predictive biplots are still experimental and limited to
linear methods like PCA. Gower & Hand (1996) and Gower, Gardner–Lubbe, & le
Roux (2011) thoroughly explain the construction and interpretation of
predictive biplots.
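The difference between the two axis types can be illustrated for PCA with a few lines of base R. This is a sketch under the standard linear-biplot construction, not the package's internal code; the variable `j` and the marker values `mu` are arbitrary choices for illustration.

```r
pca <- prcomp(iris[, 1:4], scale. = TRUE)
Z <- pca$x[, 1:2]          # case coordinates in the plotting plane
V <- pca$rotation[, 1:2]   # variable loadings in the plotting plane

# one variable and a few standardized values to mark along its axis
j <- "Petal.Length"
mu <- c(-1, 0, 1)
v_j <- V[j, ]

# interpolative markers: where a case with value mu on variable j
# (and the mean on all other variables) would be placed
markers_interp <- outer(mu, v_j)

# predictive markers: rescaled so that the orthogonal projection of a case
# onto the axis reads off its approximated value of variable j
markers_pred <- outer(mu, v_j) / sum(v_j^2)

# read predictions off the axis: project each case onto the direction v_j,
# then convert the projection coefficient to predictive marker units
proj_coef <- drop(Z %*% v_j) / sum(v_j^2)
pred_j <- proj_coef * sum(v_j^2)

# the same values obtained from the rank-2 reconstruction of the data
approx_j <- (Z %*% t(V))[, j]
```

Both marker sets lie on the same line through the origin; only their spacing differs, which is why the choice of `axis.type` changes how the axis is read rather than where it points.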
A ggplot object.
ggbiplot()
uses ggplot2::fortify()
internally to produce a single data
frame with a .matrix
column distinguishing the subjects ("rows"
) and
variables ("cols"
). The stat layers stat_rows()
and stat_cols()
simply
filter the data frame to one of these two.
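A rough base-R analogue of this fortified data frame, and of the filtering done by the row and column stats, might look as follows. It is a simplified sketch: the real `fortify()` method carries additional annotation columns, and the object names here are made up for illustration.

```r
pca <- prcomp(iris[, 1:4], scale. = TRUE)

# stack case scores and variable loadings into one data frame,
# tagging each record with the matrix factor it came from
rows_df <- data.frame(pca$x[, 1:2], .matrix = "rows")
cols_df <- data.frame(pca$rotation[, 1:2], .matrix = "cols")
biplot_df <- rbind(rows_df, cols_df)

# what a row- or column-specific stat then does: keep one factor only
rows_only <- subset(biplot_df, .matrix == "rows")
cols_only <- subset(biplot_df, .matrix == "cols")
```

Keeping both factors in one data frame lets every layer share the same coordinate columns while still drawing from only one matrix.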
The geom layers geom_rows_*()
and geom_cols_*()
call the corresponding
stat in order to render plot elements for the corresponding factor matrix.
geom_dims_*()
selects a default matrix based on common practice, e.g.
points for rows and arrows for columns.
Gower JC & Hand DJ (1996) Biplots. Chapman & Hall, ISBN: 0-412-71630-5.
Gower JC, Gardner–Lubbe S, & le Roux NJ (2011) Understanding Biplots. Wiley, ISBN: 978-0-470-01255-0. https://www.wiley.com/go/biplots
ggplot2::ggplot2()
, on which ggbiplot()
is built
# compute PCA of Anderson iris measurements
iris[, -5] %>%
  princomp(cor = TRUE) %>%
  as_tbl_ord() %>%
  confer_inertia(1) %>%
  mutate_rows(species = iris$Species) %>%
  mutate_cols(measure = gsub("\\.", " ", tolower(names(iris)[-5]))) %>%
  print() -> iris_pca

# row-principal biplot with range-harmonized secondary axis
iris_pca %>%
  ggbiplot(aes(color = species), sec.axes = "cols", scale.factor = "range") +
  theme_bw() +
  scale_color_brewer(type = "qual", palette = 2) +
  geom_rows_point() +
  geom_cols_vector(aes(label = measure), color = "#444444") +
  ggtitle(
    "Row-principal PCA biplot of Anderson iris measurements",
    "Variable loadings scaled to secondary axes"
  ) +
  expand_limits(y = c(-1, 3.5))

# row-principal biplot with manually rescaled secondary axis
iris_pca %>%
  ggbiplot(aes(color = species), sec.axes = "cols", scale.factor = 2) +
  theme_bw() +
  scale_color_brewer(type = "qual", palette = 2) +
  geom_rows_point() +
  geom_cols_vector(aes(label = measure), color = "#444444") +
  ggtitle(
    "Row-principal PCA biplot of Anderson iris measurements",
    "Variable loadings scaled to secondary axes"
  ) +
  expand_limits(y = c(-1, 3.5))

# Performance measures can be regressed on the artificial coordinates of
# ordinated vehicle specs. Because the ordination of specs ignores performance,
# these coordinates will probably not be highly predictive. The gradient of each
# performance measure along the artificial axes is visualized by projecting the
# regression coefficients onto the ordination biplot.

# scaled principal components analysis of vehicle specs
mtcars_specs_pca <- ordinate(
  mtcars, cols = c(cyl, disp, hp, drat, wt, vs, carb),
  model = ~ princomp(., cor = TRUE)
)
# data frame of vehicle performance measures
mtcars %>%
  subset(select = c(mpg, qsec)) %>%
  as.matrix() %>%
  print() -> mtcars_perf
# regress performance measures on principal components
lm(mtcars_perf ~ get_rows(mtcars_specs_pca)) %>%
  as_tbl_ord() %>%
  augment_ord() %>%
  print() -> mtcars_pca_lm
# regression biplot
ggbiplot(mtcars_specs_pca, aes(label = name),
         sec.axes = "rows", scale.factor = .5) +
  theme_minimal() +
  geom_rows_text(size = 3) +
  geom_cols_vector(data = mtcars_pca_lm) +
  expand_limits(x = c(-2.5, 2))

# multidimensional scaling based on a scaled cosine distance of vehicle specs
cosine_dist <- function(x) {
  x <- as.matrix(x)
  num <- x %*% t(x)
  denom_rt <- as.matrix(rowSums(x^2))
  denom <- sqrt(denom_rt %*% t(denom_rt))
  as.dist(1 - num / denom)
}
mtcars %>%
  subset(select = c(cyl, disp, hp, drat, wt, vs, carb)) %>%
  scale() %>%
  cosine_dist() %>%
  cmdscale() %>%
  as.data.frame() -> mtcars_specs_cmds
# names must be consistent with `cmdscale_ord()` below
names(mtcars_specs_cmds) <- c("PCo1", "PCo2")
# regress performance measures on principal coordinates
lm(mtcars_perf ~ as.matrix(mtcars_specs_cmds)) %>%
  as_tbl_ord() %>%
  augment_ord() %>%
  print() -> mtcars_cmds_lm
# multidimensional scaling using `cmdscale_ord()`
mtcars %>%
  subset(select = c(cyl, disp, hp, drat, wt, vs, carb)) %>%
  scale() %>%
  cosine_dist() %>%
  cmdscale_ord() %>%
  as_tbl_ord() %>%
  augment_ord() %>%
  print() -> mtcars_specs_cmds_ord
# regression biplot
ggbiplot(mtcars_specs_cmds_ord, aes(label = name),
         sec.axes = "rows", scale.factor = 3) +
  theme_minimal() +
  geom_rows_text(size = 3) +
  geom_cols_vector(data = mtcars_cmds_lm) +
  expand_limits(x = c(-2.25, 1.25), y = c(-2, 1.5))

# PCA of iris data
iris_pca <- ordinate(iris, cols = 1:4, prcomp, scale = TRUE)

# row-principal predictive biplot
iris_pca %>%
  ggbiplot(axis.type = "predictive") +
  theme_bw() +
  scale_color_brewer(type = "qual", palette = 2) +
  geom_cols_axis(aes(label = name, center = center, scale = scale)) +
  geom_rows_point(aes(color = Species), alpha = .5) +
  ggtitle("Predictive biplot of Anderson iris measurements")

# with two calibrated axes
iris_pca %>%
  ggbiplot(axis.type = "predictive") +
  theme_bw() +
  scale_color_brewer(type = "qual", palette = 2) +
  geom_origin() +
  stat_cols_rule(
    subset = c(2, 4),
    fontface = "bold", text.fontface = "plain",
    aes(label = name, center = center, scale = scale)
  ) +
  geom_rows_point(aes(color = Species), alpha = .5) +
  expand_limits(x = c(-5, 5), y = c(-5, 5)) +
  ggtitle("Predictive biplot of Anderson iris measurements")
Sites, types, and compositions of glass samples from archaeological sites in Israel.
data(glass)
A tibble with 68 cases and 16 variables:
site at which sample was found
analysis identifier
furnace identifier
type of sample
normalized weight percent oxide of each component
Chunks of unformed glass from several furnaces found at the primary Byzantine-era site of Bet Eli'ezer, along with samples from other sites with weaker evidence of glass-making (Apollonia and Dor) and from an Islamic-era site (Banias), were analyzed using X-ray spectrometry to determine their major components.
Baxter & Freestone (2006) used these data to illustrate log-ratio analysis.
Freestone et al. (2000), Table 2.
Freestone IC, Gorin-Rosen Y, & Hughes MJ (2000) "Primary glass from Israel and the production of glass in Late Antiquity and the early Islamic period". La route du verre: Ateliers primaires et secondaires du second millénaire av. J.-C. au Moyen Âge: 65–83. https://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=1158762
Baxter MJ & Freestone IC (2006) "Log-Ratio Compositional Data Analysis in Archaeometry". Archaeometry, 48(3): 511–531. doi:10.1111/j.1475-4754.2006.00270.x
# subset glass data to one site and major components
head(glass)
glass_main <- subset(
  glass,
  Site == "Bet Eli'ezer",
  select = c("SiO2", "Na2O", "CaO", "Al2O3", "MgO", "K2O")
)
# format as a data frame with row names
glass_main <- as.data.frame(glass_main)
rownames(glass_main) <- subset(glass, Site == "Bet Eli'ezer")$Anal
# perform log-ratio analysis
glass_lra <- lra(glass_main, compositional = TRUE, weighted = FALSE)
# inspect LRA row and column coordinates
head(glass_lra$row.coords)
glass_lra$column.coords
# inspect singular values of LRA
glass_lra$sv
# plot samples and measurements in a biplot
biplot(
  x = glass_lra$row.coords %*% diag(glass_lra$sv),
  y = glass_lra$column.coords,
  xlab = "Sample (principal coord.)", ylab = ""
)
mtext("Component (standard coord.)", side = 4L, line = 3L)
This function replicates MASS::lda()
with options and defaults
to retain elements useful to the tbl_ord class and biplot calculations.
lda_ord(x, ...)

## S3 method for class 'formula'
lda_ord(formula, data, ..., subset, na.action)

## S3 method for class 'data.frame'
lda_ord(x, ...)

## S3 method for class 'matrix'
lda_ord(x, grouping, ..., subset, na.action)

## Default S3 method:
lda_ord(
  x,
  grouping,
  prior = proportions,
  tol = 1e-04,
  method = c("moment", "mle", "mve", "t"),
  CV = FALSE,
  nu = 5,
  ...,
  ret.x = TRUE,
  ret.grouping = TRUE,
  axes.scale = "unstandardized"
)

## S3 method for class 'lda_ord'
predict(
  object,
  newdata,
  prior = object$prior,
  dimen,
  method = c("plug-in", "predictive", "debiased"),
  ...
)
x |
(required if no formula is given as the principal argument.) a matrix or data frame or Matrix containing the explanatory variables. |
... |
arguments passed to or from other methods. |
formula |
A formula of the form |
data |
An optional data frame, list or environment from which variables
specified in |
subset |
An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.) |
na.action |
A function to specify the action to be taken if |
grouping |
(required if no formula principal argument is given.) a factor specifying the class for each observation. |
prior |
the prior probabilities of class membership. If unspecified, the class proportions for the training set are used. If present, the probabilities should be specified in the order of the factor levels. |
tol |
A tolerance to decide if a matrix is singular; it will reject variables
and linear combinations of unit-variance variables whose variance is
less than |
method |
|
CV |
If true, returns results (classes and posterior probabilities) for leave-one-out cross-validation. Note that if the prior is estimated, the proportions in the whole dataset are used. |
nu |
degrees of freedom for |
ret.x , ret.grouping
|
Logical; whether to retain as attributes the data
matrix ( |
axes.scale |
Character string indicating how to left-transform the
|
object |
object of class |
newdata |
data frame of cases to be classified or, if |
dimen |
the dimension of the space to be used. If this is less than |
Linear discriminant analysis relies on an eigendecomposition of the product
of the inverse of the within-class covariance matrix by the between-class
covariance matrix. This eigendecomposition can be motivated as the right half
of the singular value decomposition of the matrix of Mahalanobis distances
between the cases after "sphering" (linearly transforming them so that the
within-class covariance is the identity matrix). LDA is not traditionally
represented as a biplot, with some exceptions (Gardner & le Roux, 2005;
Greenacre, 2010, p. 109–117).
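The eigendecomposition described above can be checked numerically against `MASS::lda()`. The following is a sketch under stated assumptions: the within-class covariance uses the divisor n - k (as in the default `"moment"` method), the between-class covariance weights centroids by class size, and the names `C_W`, `C_B`, and `S` are introduced here only for illustration.

```r
library(MASS)

x <- as.matrix(iris[, 1:4])
g <- iris$Species
n <- nrow(x)
k <- nlevels(g)

fit <- lda(x, grouping = g)

# pooled within-class covariance (divisor n - k)
resid <- x - fit$means[as.integer(g), ]
C_W <- crossprod(resid) / (n - k)

# between-class covariance of the class centroids, weighted by class size
centroid <- colMeans(x)
dev <- sweep(fit$means, 2, centroid, "-")
C_B <- crossprod(sqrt(fit$counts) * dev) / (k - 1)

# unstandardized discriminant coefficients
S <- fit$scaling

# sphering: the coefficients turn the within-class covariance into the identity
sphered_ok <- isTRUE(all.equal(unname(t(S) %*% C_W %*% S), diag(2)))

# eigen-relation: columns of S are eigenvectors of solve(C_W) %*% C_B,
# with eigenvalues given by the squared singular values stored in fit$svd
eigen_ok <- isTRUE(all.equal(
  unname(solve(C_W) %*% C_B %*% S),
  unname(S %*% diag(fit$svd^2))
))
```

With the balanced iris classes and default priors, the overall centroid coincides with the prior-weighted centroid used by `lda()`; with unbalanced data or custom priors the between-class matrix would need to be weighted accordingly.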
LDA is implemented as MASS::lda()
in the MASS package, in which the
variables are transformed by a sphering matrix (Venables & Ripley,
2003, p. 331–333). The returned element
scaling
contains the
unstandardized discriminant coefficients, which define the discriminant
scores of the cases and their centroids as linear combinations of the
original variables.
The discriminant coefficients constitute one of several possible choices of
axes for a biplot representation of the LDA. The slightly modified function
lda_ord()
provides additional options:
The standardized discriminant coefficients are obtained by (re)scaling the coefficients by the variable standard deviations. These coefficients indicate the contributions of the variables to the discriminant scores after controlling for their variances (Orlov, 2013).
The variables' contributions to the Mahalanobis variance along each
discriminant axis are obtained by transforming the coefficients by the
inverse of the sphering matrix. Because the contribution biplot
derives from the eigendecomposition of the Mahalanobis distance matrix, the
projections of the centroids and cases onto the variable axes approximate
their variable values after centering and sphering (Greenacre, 2013).
Finally, in contrast to MASS::lda()
, lda_ord()
defaults both ret.x
and
ret.grouping
to TRUE
, so that these elements can be used to compute and
annotate case scores as supplementary elements.
Output from MASS::lda()
with an additional preceding class
'lda_ord' and up to three attributes:
the input data x
, if ret.x = TRUE
the class assignments grouping
, if ret.grouping = TRUE
if the parameter axes.scale
is not 'unstandardized', a matrix
axes.scale
that encodes the transformation of the row space
Gardner S & le Roux NJ (2005) "Extensions of Biplot Methodology to Discriminant Analysis". Journal of Classification 22(1): 59–86. doi:10.1007/s00357-005-0006-7 https://link.springer.com/article/10.1007/s00357-005-0006-7
Greenacre MJ (2010) Biplots in Practice. Fundacion BBVA, ISBN: 978-84-923846. https://www.fbbva.es/microsite/multivariate-statistics/biplots.html
Venables WN & Ripley BD (2003) Modern Applied Statistics with S, Fourth Edition. Springer Science & Business Media, ISBN: 0387954570, 9780387954578. https://www.mimuw.edu.pl/~pokar/StatystykaMgr/Books/VenablesRipley_ModernAppliedStatisticsS02.pdf
Orlov K (2013) Answer to "Algebra of LDA. Fisher discrimination power of a variable and Linear Discriminant Analysis". CrossValidated, accessed 2019-07-26. https://stats.stackexchange.com/a/83114/68743
Greenacre M (2013) "Contribution Biplots". Journal of Computational and Graphical Statistics, 22(1): 107–122. https://amstat.tandfonline.com/doi/full/10.1080/10618600.2012.702494
MASS::lda()
, from which lda_ord()
is adapted
# Anderson iris species data centroid
iris_centroid <- t(apply(iris[, 1:4], 2, mean))

# unstandardized discriminant coefficients: the discriminant axes are linear
# combinations of the centered variables
iris_lda <- lda_ord(iris[, 1:4], iris[, 5], axes.scale = "unstandardized")
# linear combinations of centered variables
print(sweep(iris_lda$means, 2, iris_centroid, "-") %*% get_cols(iris_lda))
# discriminant centroids
print(get_rows(iris_lda, elements = "active"))
# unstandardized coefficient LDA biplot
iris_lda %>%
  as_tbl_ord() %>%
  augment_ord() %>%
  ggbiplot() +
  theme_bw() +
  geom_rows_point(aes(color = grouping), elements = "score", alpha = 1/3) +
  geom_rows_point(aes(color = grouping), size = 3) +
  geom_cols_vector(aes(label = name), color = "#888888", size = 3) +
  scale_color_brewer(type = "qual", palette = 2) +
  ggtitle("Unstandardized coefficient biplot of iris LDA") +
  expand_limits(y = c(-3, 5))

# standardized discriminant coefficients: permit comparisons across the
# variables
iris_lda <- lda_ord(iris[, 1:4], iris[, 5], axes.scale = "standardized")
# standardized variable contributions to discriminant axes
iris_lda %>%
  as_tbl_ord() %>%
  augment_ord() %>%
  fortify(.matrix = "cols") %>%
  dplyr::mutate(variable = name) %>%
  tidyr::gather(discriminant, coefficient, LD1, LD2) %>%
  ggplot(aes(x = discriminant, y = coefficient, fill = variable)) +
  geom_bar(position = "dodge", stat = "identity") +
  labs(y = "Standardized coefficient", x = "Linear discriminant") +
  theme_bw() +
  coord_flip()
# standardized coefficient LDA biplot
iris_lda %>%
  as_tbl_ord() %>%
  augment_ord() %>%
  ggbiplot() +
  theme_bw() +
  geom_rows_point(aes(color = grouping), elements = "score", alpha = 1/3) +
  geom_rows_point(aes(color = grouping), size = 3) +
  geom_cols_vector(aes(label = name), color = "#888888", size = 3) +
  scale_color_brewer(type = "qual", palette = 2) +
  ggtitle("Standardized coefficient biplot of iris LDA") +
  expand_limits(y = c(-2, 3))

# variable contributions (de-sphered discriminant coefficients): recover the
# inner product relationship with the centered class centroids
iris_lda <- lda_ord(iris[, 1:4], iris[, 5], axes.scale = "contribution")
# symmetric square root of within-class covariance
C_W_eig <- eigen(cov(iris[, 1:4] - iris_lda$means[iris[, 5], ]))
C_W_sqrtinv <-
  C_W_eig$vectors %*% diag(1/sqrt(C_W_eig$values)) %*% t(C_W_eig$vectors)
# product of matrix factors (scores and loadings)
print(get_rows(iris_lda, elements = "active") %*% t(get_cols(iris_lda)))
# "asymmetric" square roots of Mahalanobis distances between variables
print(sweep(iris_lda$means, 2, iris_centroid, "-") %*% C_W_sqrtinv)
# contribution LDA biplot
iris_lda %>%
  as_tbl_ord() %>%
  augment_ord() %>%
  ggbiplot() +
  theme_bw() +
  geom_rows_point(aes(color = grouping), elements = "score", alpha = 1/3) +
  geom_rows_point(aes(color = grouping), size = 3) +
  geom_cols_vector(aes(label = name), color = "#888888", size = 3) +
  scale_color_brewer(type = "qual", palette = 2) +
  ggtitle("Contribution biplot of iris LDA") +
  expand_limits(y = c(-2, 3.5))
Represent log-ratios between variables based on their values on a population of cases.
lra(x, compositional = FALSE, weighted = TRUE)

## S3 method for class 'lra'
print(x, nd = length(x$sv), n = 6L, ...)

## S3 method for class 'lra'
screeplot(x, main = deparse1(substitute(x)), ...)

## S3 method for class 'lra'
biplot(
  x,
  choices = c(1L, 2L),
  scale = c(0, 0),
  main = deparse1(substitute(x)),
  var.axes = FALSE,
  ...
)

## S3 method for class 'lra'
plot(x, main = deparse1(substitute(x)), ...)
x |
A numeric matrix or rectangular data set. |
compositional |
Logical; whether to normalize rows of |
weighted |
Logical; whether to weight rows and columns by their sums. |
nd |
Integer; number of shared dimensions to include in print. |
n |
Integer; number of rows of each factor to print. |
main , var.axes , ...
|
Parameters passed to other plotting methods (in the
case of |
choices |
Integer; length-2 vector specifying the components to plot. |
scale |
Numeric; values between 0 and 1 that control how inertia is
conferred unto the points: Row ( |
Log-ratio analysis (LRA) is based on a double-centering of log-transformed data, usually weighted by row and column totals. The technique is suitable for positive-valued variables on a common scale (e.g. percentages). The distances between variables' coordinates (in the full-dimensional space) are their pairwise log-ratios. The distances between cases' coordinates are called their log-ratio distances, and the total variance is the weighted sum of their squares.
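The double-centering step can be sketched in base R for the unweighted case. This is an illustration of the technique only: it uses plain row and column means rather than weights, and it is not claimed to reproduce the scaling that `lra()` applies to its returned coordinates.

```r
# positive-valued variables (three arrest rates from USArrests)
x <- as.matrix(USArrests[, c(1L, 2L, 4L)])

# log-transform, then double-center with unweighted row and column means
lx <- log(x)
y <- sweep(lx, 1, rowMeans(lx), "-")
y <- sweep(y, 2, colMeans(y), "-")

# singular value decomposition of the double-centered matrix
s <- svd(y)
row_std <- s$u                  # row standard coordinates
col_std <- s$v                  # column standard coordinates
row_prin <- s$u %*% diag(s$d)   # row principal coordinates

# differences between columns of the double-centered matrix are the
# pairwise log-ratios, centered at their means
lr_12 <- lx[, 1] - lx[, 2]
centered_lr_12 <- lr_12 - mean(lr_12)
```

Because the right singular vectors are orthonormal, distances between rows of `row_prin` equal distances between rows of the double-centered matrix, which is what makes the row-principal biplot a faithful picture of the cases.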
LRA is not implemented in standard R distributions but is a useful member of the ordination toolkit. This is a minimal implementation following Greenacre's (2010) exposition in Chapter 7.
Given a data matrix,
lra()
returns a list of class "lra"
containing five elements:
sv: The singular values.
row.coords: The matrix of row standard coordinates.
column.coords: The matrix of column standard coordinates.
row.weights: The weights used to scale the row coordinates.
column.weights: The weights used to scale the column coordinates.
Greenacre MJ (2010) Biplots in Practice. Fundacion BBVA, ISBN: 978-84-923846. https://www.fbbva.es/microsite/multivariate-statistics/biplots.html
# U.S. 1973 violent crime arrests
head(USArrests)
# row and column subsets
state_examples <- c("Hawaii", "Mississippi", "North Dakota")
arrests <- c(1L, 2L, 4L)

# pairwise log-ratios of violent crime arrests for two states
arrest_pairs <- combn(arrests, 2L)
arrest_ratios <-
  USArrests[, arrest_pairs[1L, ]] / USArrests[, arrest_pairs[2L, ]]
colnames(arrest_ratios) <- paste(
  colnames(USArrests)[arrest_pairs[1L, ]], "/",
  colnames(USArrests)[arrest_pairs[2L, ]],
  sep = ""
)
arrest_logratios <- log(arrest_ratios)
arrest_logratios[state_examples, ]

# non-compositional log-ratio analysis
(arrests_lra <- lra(USArrests[, arrests]))
screeplot(arrests_lra)
biplot(arrests_lra, scale = c(1, 0))

# compositional log-ratio analysis
(arrests_lra <- lra(USArrests[, arrests], compositional = TRUE))
biplot(arrests_lra, scale = c(1, 0))
These methods extract data from, and attribute new data to,
objects of class "cancor_ord"
. This is a class introduced in this package
to identify objects returned by cancor_ord()
, which wraps
stats::cancor()
.
## S3 method for class 'cancor_ord'
as_tbl_ord(x)

## S3 method for class 'cancor_ord'
recover_rows(x)

## S3 method for class 'cancor_ord'
recover_cols(x)

## S3 method for class 'cancor_ord'
recover_inertia(x)

## S3 method for class 'cancor_ord'
recover_coord(x)

## S3 method for class 'cancor_ord'
recover_conference(x)

## S3 method for class 'cancor_ord'
recover_supp_rows(x)

## S3 method for class 'cancor_ord'
recover_supp_cols(x)

## S3 method for class 'cancor_ord'
recover_aug_rows(x)

## S3 method for class 'cancor_ord'
recover_aug_cols(x)

## S3 method for class 'cancor_ord'
recover_aug_coord(x)
x |
An ordination object. |
The canonical coefficients (loadings) are obtained directly from the
underlying singular value decomposition and constitute the active elements.
If canonical scores are returned, then they and the structure correlations
are made available as supplementary elements. ordr takes rows and columns
from the intraset correlations $xstructure
and $ystructure
, on which no
inertia is conferred; the interset correlations can be obtained by
conferring inertia onto these.
A biplot of the canonical coefficients can be interpreted as approximating
the inner product matrix between the two data sets, inversely weighted by
their variances. The canonical scores and structure coefficients are
available as supplementary points if returned by
cancor_ord()
. These can be
used to create biplots of the case scores as linear combinations of loadings
(the coefficients, in standard coordinates, overlaid with the scores) or of
intraset and interset correlations with respect to either data set (the
correlations with inertia conferred entirely onto rows or onto columns).
Greenacre (1984) and ter Braak (1990) describe these families, though ter
Braak recommends against the first.
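The relationships among coefficients, scores, and structure correlations can be sketched with `stats::cancor()` alone. This is an illustration in base R, not the recovery code used for `"cancor_ord"` objects; the object names are chosen here for the example.

```r
x <- as.matrix(LifeCycleSavings[, c("pop15", "pop75")])
y <- as.matrix(LifeCycleSavings[, c("sr", "dpi", "ddpi")])
cc <- stats::cancor(x, y)

# canonical scores: centered data multiplied by canonical coefficients
# (only the first two columns of ycoef pair with canonical correlations)
x_scores <- sweep(x, 2, cc$xcenter, "-") %*% cc$xcoef
y_scores <- sweep(y, 2, cc$ycenter, "-") %*% cc$ycoef[, 1:2]

# correlations between paired scores recover the canonical correlations
score_cor <- diag(cor(x_scores, y_scores))

# intraset structure correlations: variables with their own set's scores
x_structure <- cor(x, x_scores)
# interset structure correlations: variables with the other set's scores
xy_structure <- cor(x, y_scores)

# interset correlations are the intraset ones rescaled by the
# canonical correlations
xy_from_x <- x_structure %*% diag(cc$cor)
```

The last step mirrors the statement that interset correlations are obtained from the intraset correlations by conferring inertia onto them.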
The recovery generics recover_*()
return core model components, distribution of inertia,
supplementary elements, and intrinsic metadata; but they require methods for each model class to
tell them what these components are.
The generic as_tbl_ord()
returns its input wrapped in the 'tbl_ord'
class. Its methods determine what model classes it is allowed to wrap. It
then provides 'tbl_ord' methods with access to the recoverers and hence to
the model components.
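The division of labor between generics, class-specific methods, and the wrapper can be illustrated with a toy S3 example. This is only a sketch of the general pattern; every name in it (toy_recover_rows, toy_svd, as_toy_tbl, toy_tbl) is hypothetical and none is part of ordr.

```r
# Hypothetical toy names throughout; none of these are ordr functions.

# a generic that knows nothing about any particular model class
toy_recover_rows <- function(x) UseMethod("toy_recover_rows")

# a method telling the generic where a base::svd() list keeps its row factor
toy_recover_rows.toy_svd <- function(x) x$u

# a wrapper that only accepts classes for which a recoverer exists
as_toy_tbl <- function(x) {
  if (!inherits(x, "toy_svd")) stop("no recovery method for this class")
  structure(list(model = x), class = "toy_tbl")
}

# fit, tag with the toy class, wrap, and recover through the generic
fit <- structure(svd(as.matrix(USArrests)), class = "toy_svd")
wrapped <- as_toy_tbl(fit)
dim(toy_recover_rows(wrapped$model))
```

Supporting a new model class in this pattern means writing one method per generic rather than changing the wrapper or its downstream tools.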
Greenacre MJ (1984) Theory and applications of correspondence analysis. London: Academic Press, ISBN 0-12-299050-1. http://www.carme-n.org/?sec=books5
ter Braak CJF (1990) "Interpreting canonical correlation analysis through biplots of structure correlations and weights". Psychometrika 55(3), 519–531. doi:10.1007/BF02294765
Other methods for singular value decomposition-based techniques: methods-correspondence, methods-lda, methods-lra, methods-mca, methods-prcomp, methods-svd
Other models from the stats package: methods-cmds, methods-factanal, methods-kmeans, methods-lm, methods-prcomp, methods-princomp
# data frame of life-cycle savings across countries
class(LifeCycleSavings)
head(LifeCycleSavings)
savings_pop <- LifeCycleSavings[, c("pop15", "pop75")]
savings_oec <- LifeCycleSavings[, c("sr", "dpi", "ddpi")]

# canonical correlation analysis with scores and correlations included
savings_cca <- cancor_ord(savings_pop, savings_oec, scores = TRUE)
savings_cca <- augment_ord(as_tbl_ord(savings_cca))
head(get_cols(savings_cca))
head(get_cols(savings_cca, elements = "score"))
get_rows(savings_cca, elements = "structure")
get_cols(savings_cca, elements = "structure")

# biplot of interset and intraset correlations with the population data
# NB: `contour = TRUE` is not automatically set as in `geom_density_2d()`
savings_cca %>%
  confer_inertia("cols") %>%
  ggbiplot(aes(label = name, color = .matrix)) +
  theme_bw() + theme_scaffold() +
  geom_unit_circle() +
  geom_rows_density_2d(elements = "score", color = "grey", contour = TRUE) +
  geom_rows_vector(arrow = NULL, elements = "structure") +
  geom_cols_vector(arrow = NULL, elements = "structure", linetype = "dashed") +
  geom_rows_text(elements = "structure", hjust = "outward") +
  geom_cols_text(elements = "structure", hjust = "outward") +
  scale_color_brewer(limits = c("rows", "cols"), type = "qual") +
  expand_limits(x = c(-1, 1), y = c(-1, 1))

# situate country scores along financial variables
savings_cca %>%
  confer_inertia("rows") %>%
  ggbiplot(aes(label = name)) +
  theme_scaffold() +
  geom_cols_axis(elements = "active") +
  geom_rows_text(elements = "score")
These methods extract data from, and attribute new data to,
objects of class "cmds_ord". This is a class introduced in this package
to identify objects returned by cmdscale_ord(), which wraps
stats::cmdscale().
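For orientation, the computation that stats::cmdscale() performs can be sketched in base R as an eigendecomposition of the double-centered squared distances. This is an illustrative sketch only; the object names are assumptions, and it makes no claim about how cmdscale_ord() stores its results.

```r
# Base-R sketch: classical MDS as an eigendecomposition
d2 <- as.matrix(UScitiesD)^2
n <- nrow(d2)

# double-center the squared distances
centering <- diag(n) - matrix(1 / n, n, n)
b <- -0.5 * centering %*% d2 %*% centering

# the two leading eigenpairs give the planar coordinates
e <- eigen(b, symmetric = TRUE)
coords <- e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]))

# compare with stats::cmdscale(), up to the sign of each axis
mds <- stats::cmdscale(UScitiesD, k = 2)
all.equal(abs(unname(mds)), abs(coords))
```

Because the double-centered matrix is symmetric, its row and column factors coincide, which is why the row and column coordinates of an MDS ordination are equivalent.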
## S3 method for class 'cmds_ord'
as_tbl_ord(x)
## S3 method for class 'cmds_ord'
recover_rows(x)
## S3 method for class 'cmds_ord'
recover_cols(x)
## S3 method for class 'cmds_ord'
recover_inertia(x)
## S3 method for class 'cmds_ord'
recover_coord(x)
## S3 method for class 'cmds_ord'
recover_conference(x)
## S3 method for class 'cmds_ord'
recover_aug_rows(x)
## S3 method for class 'cmds_ord'
recover_aug_cols(x)
## S3 method for class 'cmds_ord'
recover_aug_coord(x)
x |
An ordination object. |
The recovery generics recover_*()
return core model components, distribution of inertia,
supplementary elements, and intrinsic metadata; but they require methods for each model class to
tell them what these components are.
The generic as_tbl_ord()
returns its input wrapped in the 'tbl_ord'
class. Its methods determine what model classes it is allowed to wrap. It
then provides 'tbl_ord' methods with access to the recoverers and hence to
the model components.
Other methods for eigen-decomposition-based techniques: methods-eigen, methods-factanal, methods-princomp
Other models from the stats package: methods-cancor, methods-factanal, methods-kmeans, methods-lm, methods-prcomp, methods-princomp
# 'dist' object (matrix of road distances) of large American cities
class(UScitiesD)
print(UScitiesD)

# use multidimensional scaling to infer artificial planar coordinates
UScitiesD %>%
  cmdscale_ord(k = 2) %>%
  as_tbl_ord() %>%
  print() -> usa_mds

# recover (equivalent) matrices of row and column artificial coordinates
get_rows(usa_mds)
get_cols(usa_mds)

# augment ordination with point names
(usa_mds <- augment_ord(usa_mds))

# reorient biplot to conventional compass
usa_mds %>%
  negate_ord(c(1, 2)) %>%
  ggbiplot() +
  geom_cols_text(aes(label = name), size = 3) +
  ggtitle("MDS biplot of distances between U.S. cities")
These methods extract data from, and attribute new data to,
objects of class "correspondence"
from the MASS
package.
## S3 method for class 'correspondence'
as_tbl_ord(x)
## S3 method for class 'correspondence'
recover_rows(x)
## S3 method for class 'correspondence'
recover_cols(x)
## S3 method for class 'correspondence'
recover_inertia(x)
## S3 method for class 'correspondence'
recover_conference(x)
## S3 method for class 'correspondence'
recover_coord(x)
## S3 method for class 'correspondence'
recover_aug_rows(x)
## S3 method for class 'correspondence'
recover_aug_cols(x)
## S3 method for class 'correspondence'
recover_aug_coord(x)
x |
An ordination object. |
The recovery generics recover_*()
return core model components, distribution of inertia,
supplementary elements, and intrinsic metadata; but they require methods for each model class to
tell them what these components are.
The generic as_tbl_ord()
returns its input wrapped in the 'tbl_ord'
class. Its methods determine what model classes it is allowed to wrap. It
then provides 'tbl_ord' methods with access to the recoverers and hence to
the model components.
Other methods for singular value decomposition-based techniques: methods-cancor, methods-lda, methods-lra, methods-mca, methods-prcomp, methods-svd
Other models from the MASS package: methods-lda, methods-mca
# data frame of school absenteeism in rural New South Wales
data(quine, package = "MASS")
class(quine)
head(quine)

# use correspondence analysis to construct row and column profiles
(quine_ca <- MASS::corresp(~ Age + Eth, data = quine))
(quine_ca <- as_tbl_ord(quine_ca))

# recover row and column profiles
get_rows(quine_ca)
get_cols(quine_ca)

# augment profiles with names, masses, distances, and inertias
(quine_ca <- augment_ord(quine_ca))
These methods extract data from, and attribute new data to,
objects of class "eigen" returned by base::eigen() when the parameter
only.values is set to FALSE, or of class "eigen_ord" returned by
eigen_ord().
## S3 method for class 'eigen'
as_tbl_ord(x)
## S3 method for class 'eigen'
recover_rows(x)
## S3 method for class 'eigen'
recover_cols(x)
## S3 method for class 'eigen'
recover_inertia(x)
## S3 method for class 'eigen'
recover_coord(x)
## S3 method for class 'eigen'
recover_conference(x)
## S3 method for class 'eigen_ord'
recover_aug_rows(x)
## S3 method for class 'eigen_ord'
recover_aug_cols(x)
## S3 method for class 'eigen'
recover_aug_coord(x)
## S3 method for class 'eigen_ord'
as_tbl_ord(x)
## S3 method for class 'eigen_ord'
recover_rows(x)
## S3 method for class 'eigen_ord'
recover_cols(x)
## S3 method for class 'eigen_ord'
recover_inertia(x)
## S3 method for class 'eigen_ord'
recover_coord(x)
## S3 method for class 'eigen_ord'
recover_conference(x)
## S3 method for class 'eigen_ord'
recover_aug_rows(x)
## S3 method for class 'eigen_ord'
recover_aug_cols(x)
## S3 method for class 'eigen_ord'
recover_aug_coord(x)
x |
An ordination object. |
base::eigen() usually returns an object of class "eigen", which contains
the numerical eigendecomposition without annotations such as row and column
names. To facilitate downstream analysis, eigen_ord() returns a modified
'eigen' object with row names taken (if available) from the original data and
column names indicating the integer index of each eigenvector.
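The loss of names under base::eigen() and one way to restore them can be sketched as follows. The helper toy_eigen_named() is hypothetical and written only for illustration; the actual eigen_ord() may differ in its details.

```r
# Hypothetical helper; ordr's eigen_ord() may differ in its details.
toy_eigen_named <- function(x) {
  res <- eigen(x)
  # carry row names over from the input, if it has any
  rownames(res$vectors) <- rownames(x)
  # label eigenvectors by their integer index
  colnames(res$vectors) <- seq_len(ncol(res$vectors))
  res
}

# base::eigen() drops the names of the covariance matrix
dimnames(eigen(ability.cov$cov)$vectors)

# the helper restores them
dimnames(toy_eigen_named(ability.cov$cov)$vectors)
```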
The recovery generics recover_*()
return core model components, distribution of inertia,
supplementary elements, and intrinsic metadata; but they require methods for each model class to
tell them what these components are.
The generic as_tbl_ord()
returns its input wrapped in the 'tbl_ord'
class. Its methods determine what model classes it is allowed to wrap. It
then provides 'tbl_ord' methods with access to the recoverers and hence to
the model components.
Other methods for eigen-decomposition-based techniques: methods-cmds, methods-factanal, methods-princomp
Other models from the base package: methods-svd
# eigendecompose covariance matrix of ability tests
gi_eigen <- eigen(ability.cov$cov)

# recover eigenvectors
get_rows(gi_eigen)
identical(get_cols(gi_eigen), get_rows(gi_eigen))

# wrap as a 'tbl_ord'
as_tbl_ord(gi_eigen)

# same eigendecomposition, preserving names
gi_eigen <- eigen_ord(ability.cov$cov)

# wrap as a 'tbl_ord' and augment with dimension names
augment_ord(as_tbl_ord(gi_eigen))

# decomposition returns pure eigenvectors
get_conference(gi_eigen)
These methods extract data from, and attribute new data to,
objects of class "factanal" as returned by stats::factanal().
## S3 method for class 'factanal'
as_tbl_ord(x)
## S3 method for class 'factanal'
recover_rows(x)
## S3 method for class 'factanal'
recover_cols(x)
## S3 method for class 'factanal'
recover_inertia(x)
## S3 method for class 'factanal'
recover_coord(x)
## S3 method for class 'factanal'
recover_conference(x)
## S3 method for class 'factanal'
recover_supp_rows(x)
## S3 method for class 'factanal'
recover_aug_rows(x)
## S3 method for class 'factanal'
recover_aug_cols(x)
## S3 method for class 'factanal'
recover_aug_coord(x)
x |
An ordination object. |
Factor analysis of a data matrix relies on an eigendecomposition of its
correlation matrix, whose eigenvectors (up to weighting) comprise the
variable loadings. For this reason, both row and column recoverers retrieve
the loadings, and inertia is evenly distributed between them. When computed
and returned by stats::factanal(), the case scores are accessible as
supplementary elements. Redistribution of inertia commutes through both
score calculations.
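As a point of reference, the components of a 'factanal' object that this description refers to can be inspected directly in base R. This sketch assumes only stats::factanal() and the swiss data; the names fa_loadings and fa_scores are illustrative.

```r
# Base-R sketch: where loadings and scores live in a 'factanal' object
swiss_fa <- stats::factanal(~ ., factors = 2L, data = swiss,
                            scores = "regression")

# variable loadings: one row per variable, one column per factor
fa_loadings <- unclass(swiss_fa$loadings)
dim(fa_loadings)

# case scores: present only because `scores` was requested
fa_scores <- swiss_fa$scores
dim(fa_scores)

# uniquenesses: one per variable, the variance not explained by the factors
swiss_fa$uniquenesses
```

The loadings have one row per variable, while the scores have one row per case, which is consistent with the scores being treated as supplementary rather than active elements.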
The recovery generics recover_*()
return core model components, distribution of inertia,
supplementary elements, and intrinsic metadata; but they require methods for each model class to
tell them what these components are.
The generic as_tbl_ord()
returns its input wrapped in the 'tbl_ord'
class. Its methods determine what model classes it is allowed to wrap. It
then provides 'tbl_ord' methods with access to the recoverers and hence to
the model components.
Other methods for eigen-decomposition-based techniques: methods-cmds, methods-eigen, methods-princomp
Other models from the stats package: methods-cancor, methods-cmds, methods-kmeans, methods-lm, methods-prcomp, methods-princomp
# data frame of Swiss fertility and socioeconomic indicators
class(swiss)
head(swiss)

# perform factor analysis
swiss_fa <- factanal(~ ., factors = 2L, data = swiss, scores = "regression")

# wrap as a 'tbl_ord' object
(swiss_fa <- as_tbl_ord(swiss_fa))

# recover loadings
get_rows(swiss_fa, elements = "active")
get_cols(swiss_fa)

# recover scores
head(get_rows(swiss_fa, elements = "score"))

# augment column loadings with uniquenesses
(swiss_fa <- augment_ord(swiss_fa))

# symmetric biplot
swiss_fa %>%
  ggbiplot() +
  theme_bw() +
  geom_cols_vector(aes(color = uniqueness, label = name)) +
  expand_limits(x = c(-2, 2.5), y = c(-1.5, 2))
These methods extract data from, and attribute new data to,
objects of class "kmeans" as returned by stats::kmeans().
## S3 method for class 'kmeans'
as_tbl_ord(x)
## S3 method for class 'kmeans'
recover_rows(x)
## S3 method for class 'kmeans'
recover_cols(x)
## S3 method for class 'kmeans'
recover_coord(x)
## S3 method for class 'kmeans'
recover_aug_rows(x)
## S3 method for class 'kmeans'
recover_aug_cols(x)
## S3 method for class 'kmeans'
recover_aug_coord(x)
x |
An ordination object. |
The recovery generics recover_*()
return core model components, distribution of inertia,
supplementary elements, and intrinsic metadata; but they require methods for each model class to
tell them what these components are.
The generic as_tbl_ord()
returns its input wrapped in the 'tbl_ord'
class. Its methods determine what model classes it is allowed to wrap. It
then provides 'tbl_ord' methods with access to the recoverers and hence to
the model components.
Other methods for idiosyncratic techniques: methods-lm
Other models from the stats package: methods-cancor, methods-cmds, methods-factanal, methods-lm, methods-prcomp, methods-princomp
# data frame of Anderson iris species measurements
class(iris)
head(iris)

# compute 3-means clustering on scaled iris measurements
set.seed(5601L)
iris %>%
  subset(select = -Species) %>%
  scale() %>%
  kmeans(centers = 3) %>%
  print() -> iris_km

# visualize clusters using PCA
iris %>%
  subset(select = -Species) %>%
  prcomp() %>%
  as_tbl_ord() %>%
  mutate_rows(cluster = iris_km$cluster) %>%
  ggbiplot() +
  geom_rows_point(aes(color = factor(as.character(as.integer(cluster)),
                                     levels = as.character(seq(3L))))) +
  scale_color_brewer(type = "qual", name = "cluster")

# wrap as a 'tbl_ord' object
(iris_km_ord <- as_tbl_ord(iris_km))

# augment everything with names, observations with cluster assignment
(iris_km_ord <- augment_ord(iris_km_ord))

# summarize clusters with standard deviation
iris_km_ord %>%
  tidy() %>%
  transform(sdev = sqrt(withinss / size))

# discriminate between clusters 2 and 3
iris_km_ord %>%
  ggbiplot(aes(x = `2`, y = `3`), color = factor(.cluster)) +
  geom_jitter(stat = "rows", aes(shape = cluster), width = .2, height = .2) +
  geom_cols_axis(aes(color = `1`, label = name),
                 text.size = 2, text_dodge = .1,
                 size = 3, label.alpha = .5) +
  scale_x_continuous(expand = expansion(mult = .8)) +
  scale_y_continuous(expand = expansion(mult = .5)) +
  ggtitle(
    "Measurement loadings onto clusters 2 and 3",
    "Color indicates loadings onto cluster 1"
  )
These methods extract data from, and attribute new data to,
objects of class "lda" and "lda_ord" as returned by MASS::lda() and
lda_ord().
## S3 method for class 'lda'
as_tbl_ord(x)
## S3 method for class 'lda_ord'
as_tbl_ord(x)
## S3 method for class 'lda'
recover_rows(x)
## S3 method for class 'lda_ord'
recover_rows(x)
## S3 method for class 'lda'
recover_cols(x)
## S3 method for class 'lda_ord'
recover_cols(x)
## S3 method for class 'lda'
recover_inertia(x)
## S3 method for class 'lda_ord'
recover_inertia(x)
## S3 method for class 'lda'
recover_coord(x)
## S3 method for class 'lda_ord'
recover_coord(x)
## S3 method for class 'lda'
recover_conference(x)
## S3 method for class 'lda_ord'
recover_conference(x)
## S3 method for class 'lda'
recover_aug_rows(x)
## S3 method for class 'lda_ord'
recover_aug_rows(x)
## S3 method for class 'lda'
recover_aug_cols(x)
## S3 method for class 'lda_ord'
recover_aug_cols(x)
## S3 method for class 'lda'
recover_aug_coord(x)
## S3 method for class 'lda_ord'
recover_aug_coord(x)
## S3 method for class 'lda'
recover_supp_rows(x)
## S3 method for class 'lda_ord'
recover_supp_rows(x)
x |
An ordination object. |
See lda-ord for details.
The recovery generics recover_*()
return core model components, distribution of inertia,
supplementary elements, and intrinsic metadata; but they require methods for each model class to
tell them what these components are.
The generic as_tbl_ord()
returns its input wrapped in the 'tbl_ord'
class. Its methods determine what model classes it is allowed to wrap. It
then provides 'tbl_ord' methods with access to the recoverers and hence to
the model components.
Other methods for singular value decomposition-based techniques: methods-cancor, methods-correspondence, methods-lra, methods-mca, methods-prcomp, methods-svd
Other models from the MASS package: methods-correspondence, methods-mca
# data frame of Anderson iris species measurements
class(iris)
head(iris)

# default (unstandardized discriminant) coefficients
lda_ord(iris[, 1:4], iris[, 5]) %>%
  as_tbl_ord() %>%
  print() -> iris_lda

# recover centroid coordinates and measurement discriminant coefficients
get_rows(iris_lda, elements = "active")
head(get_rows(iris_lda, elements = "score"))
get_cols(iris_lda)

# augment ordination with centroid and measurement names
augment_ord(iris_lda)
These methods extract data from, and attribute new data to,
objects of class "lm", "glm", and "mlm" as returned by stats::lm()
and stats::glm().
## S3 method for class 'lm'
as_tbl_ord(x)
## S3 method for class 'lm'
recover_rows(x)
## S3 method for class 'lm'
recover_cols(x)
## S3 method for class 'lm'
recover_coord(x)
## S3 method for class 'lm'
recover_aug_rows(x)
## S3 method for class 'lm'
recover_aug_cols(x)
## S3 method for class 'lm'
recover_aug_coord(x)
## S3 method for class 'glm'
recover_aug_rows(x)
## S3 method for class 'mlm'
recover_rows(x)
## S3 method for class 'mlm'
recover_cols(x)
## S3 method for class 'mlm'
recover_coord(x)
## S3 method for class 'mlm'
recover_aug_rows(x)
## S3 method for class 'mlm'
recover_aug_cols(x)
## S3 method for class 'mlm'
recover_aug_coord(x)
x |
An ordination object. |
The recovery generics recover_*()
return core model components, distribution of inertia,
supplementary elements, and intrinsic metadata; but they require methods for each model class to
tell them what these components are.
The generic as_tbl_ord()
returns its input wrapped in the 'tbl_ord'
class. Its methods determine what model classes it is allowed to wrap. It
then provides 'tbl_ord' methods with access to the recoverers and hence to
the model components.
Other methods for idiosyncratic techniques: methods-kmeans
Other models from the stats package: methods-cancor, methods-cmds, methods-factanal, methods-kmeans, methods-prcomp, methods-princomp
# Motor Trend design and performance data
head(mtcars)

# regression analysis of performance measures on design specifications
mtcars_centered <- scale(mtcars, scale = FALSE)
mtcars_centered %>%
  as.data.frame() %>%
  lm(formula = mpg ~ wt + cyl) %>%
  print() -> mtcars_lm

# wrap as a 'tbl_ord' object
(mtcars_lm_ord <- as_tbl_ord(mtcars_lm))

# augment everything with names, predictors with observation stats
augment_ord(mtcars_lm_ord)

# calculate influences as the squares of weighted residuals
mutate_rows(augment_ord(mtcars_lm_ord), influence = wt.res^2)

# regression biplot with performance isolines
mtcars_lm_ord %>%
  augment_ord() %>%
  mutate_cols(center = attr(mtcars_centered, "scaled:center")[name]) %>%
  mutate_rows(influence = wt.res^2) %T>% print() %>%
  ggbiplot(aes(x = wt, y = cyl, intercept = `(Intercept)`)) +
  #theme_biplot() +
  geom_origin(marker = "circle", radius = unit(0.02, "snpc")) +
  geom_rows_point(aes(color = influence)) +
  geom_cols_vector() +
  geom_cols_isoline(aes(center = center), by = .5, hjust = -.1) +
  ggtitle(
    "Weight isolines with data colored by importance",
    "Regressing gas mileage onto weight and number of cylinders"
  )
These methods extract data from, and attribute new data to,
objects of class "lra", a class introduced in this package to organize
the singular value decomposition of a double-centered log-transformed data
matrix output by lra().
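The decomposition named here can be sketched in base R for the unweighted case. This is an illustration under stated assumptions: it uses only log(), sweep(), and svd(), and ordr's lra() may apply weights or scaling that this sketch omits.

```r
# Base-R sketch of the unweighted case; ordr's lra() may weight or scale
# differently
x <- log(as.matrix(USArrests[, c("Murder", "Assault", "Rape")]))

# double-center: remove row means, then column means of the result
x_dc <- sweep(x, 1, rowMeans(x))
x_dc <- sweep(x_dc, 2, colMeans(x_dc))

# singular value decomposition of the double-centered log matrix
s <- svd(x_dc)

# double-centering removes one dimension, so the last singular value
# is numerically zero
s$d
```

With three arrest variables, the double-centered matrix has rank at most two, so a planar biplot can represent it without loss.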
## S3 method for class 'lra'
as_tbl_ord(x)
## S3 method for class 'lra'
recover_rows(x)
## S3 method for class 'lra'
recover_cols(x)
## S3 method for class 'lra'
recover_inertia(x)
## S3 method for class 'lra'
recover_coord(x)
## S3 method for class 'lra'
recover_conference(x)
## S3 method for class 'lra'
recover_aug_rows(x)
## S3 method for class 'lra'
recover_aug_cols(x)
## S3 method for class 'lra'
recover_aug_coord(x)
x |
An ordination object. |
The recovery generics recover_*()
return core model components, distribution of inertia,
supplementary elements, and intrinsic metadata; but they require methods for each model class to
tell them what these components are.
The generic as_tbl_ord()
returns its input wrapped in the 'tbl_ord'
class. Its methods determine what model classes it is allowed to wrap. It
then provides 'tbl_ord' methods with access to the recoverers and hence to
the model components.
Other methods for singular value decomposition-based techniques: methods-cancor, methods-correspondence, methods-lda, methods-mca, methods-prcomp, methods-svd
# data frame of violent crime arrests in the United States
class(USArrests)
head(USArrests)

# get state abbreviation data
state <- data.frame(
  name = state.name,
  abb = state.abb
)

# compute (non-compositional, unweighted) log-ratio analysis
USArrests %>%
  subset(select = -UrbanPop) %>%
  lra() %>%
  as_tbl_ord() %>%
  print() -> arrests_lra

# augment log-ratio profiles with names and join state abbreviations
arrests_lra %>%
  augment_ord() %>%
  left_join_rows(state, by = "name") %>%
  print() -> arrests_lra

# recover state and arrest profiles
head(get_rows(arrests_lra))
get_cols(arrests_lra)

# initially, inertia is conferred on neither factor
get_conference(arrests_lra)

# row-principal biplot
arrests_lra %>%
  confer_inertia("rows") %>%
  ggbiplot(aes(color = .matrix), sec.axes = "cols", scale.factor = 1/20) +
  scale_color_manual(values = c("tomato4", "turquoise4")) +
  theme_bw() + theme_biplot() +
  geom_rows_text(aes(label = abb), size = 3, alpha = .75) +
  geom_cols_polygon(fill = NA, linetype = "dashed") +
  geom_cols_text(aes(label = name, size = weight), fontface = "bold") +
  scale_size_area(guide = "none") +
  ggtitle(
    "Violent crime arrest rates",
    "Non-compositional LRA"
  ) +
  coord_scaffold() +
  guides(color = "none")
# data frame of violent crime arrests in the United States class(USArrests) head(USArrests) # get state abbreviation data state <- data.frame( name = state.name, abb = state.abb ) # compute (non-compositional, unweighted) log-ratio analysis USArrests %>% subset(select = -UrbanPop) %>% lra() %>% as_tbl_ord() %>% print() -> arrests_lra # augment log-ratio profiles with names and join state abbreviations arrests_lra %>% augment_ord() %>% left_join_rows(state, by = "name") %>% print() -> arrests_lra # recover state and arrest profiles head(get_rows(arrests_lra)) get_cols(arrests_lra) # initially, inertia is conferred on neither factor get_conference(arrests_lra) # row-principal biplot arrests_lra %>% confer_inertia("rows") %>% ggbiplot(aes(color = .matrix), sec.axes = "cols", scale.factor = 1/20) + scale_color_manual(values = c("tomato4", "turquoise4")) + theme_bw() + theme_biplot() + geom_rows_text(aes(label = abb), size = 3, alpha = .75) + geom_cols_polygon(fill = NA, linetype = "dashed") + geom_cols_text(aes(label = name, size = weight), fontface = "bold") + scale_size_area(guide = "none") + ggtitle( "Violent crime arrest rates", "Non-compositional LRA" ) + coord_scaffold() + guides(color = "none")
These methods extract data from, and attribute new data to,
objects of class "mca"
from the MASS package.
## S3 method for class 'mca'
as_tbl_ord(x)

## S3 method for class 'mca'
recover_rows(x)

## S3 method for class 'mca'
recover_cols(x)

## S3 method for class 'mca'
recover_inertia(x)

## S3 method for class 'mca'
recover_conference(x)

## S3 method for class 'mca'
recover_coord(x)

## S3 method for class 'mca'
recover_supp_rows(x)

## S3 method for class 'mca'
recover_aug_rows(x)

## S3 method for class 'mca'
recover_aug_cols(x)

## S3 method for class 'mca'
recover_aug_coord(x)
x |
An ordination object. |
Multiple correspondence analysis (MCA) relies on a singular value
decomposition (SVD) of the indicator matrix of a table of several
categorical variables, scaled by its column totals.
MASS::mca() returns the SVD factors as the row weights $fs, on which the
inertia is conferred, and the column coordinates $cs. The row coordinates
$rs are accessible as supplementary elements.
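The decomposition described above can be sketched in base R. This is a minimal illustration and not the MASS::mca() implementation: the object names (indicator, col_totals, x_scaled) are hypothetical, and dividing each column by the square root of the number of variables times its column total is an assumed reading of "scaled by its column totals".

```r
# expand the UC Berkeley admissions table to one row per applicant
tab <- as.data.frame(UCBAdmissions)
df <- tab[rep(seq_len(nrow(tab)), tab$Freq), c("Admit", "Gender", "Dept")]

# indicator matrix: one 0/1 column per level of each categorical variable
indicator <- do.call(cbind, lapply(df, function(f) {
  sapply(levels(f), function(l) as.numeric(f == l))
}))

# scale columns by their totals (assumed scaling, see lead-in)
p <- ncol(df)
col_totals <- colSums(indicator)
x_scaled <- sweep(indicator, 2, sqrt(p * col_totals), "/")

# singular value decomposition of the scaled indicator matrix
s <- svd(x_scaled)
round(s$d, 4)
```

Under this assumed scaling the first singular value equals 1 and corresponds to a trivial dimension, so only the subsequent singular vectors describe associations among the factor levels.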
The recovery generics recover_*()
return core model components, distribution of inertia,
supplementary elements, and intrinsic metadata; but they require methods for each model class to
tell them what these components are.
The generic as_tbl_ord()
returns its input wrapped in the 'tbl_ord'
class. Its methods determine what model classes it is allowed to wrap. It
then provides 'tbl_ord' methods with access to the recoverers and hence to
the model components.
Other methods for singular value decomposition-based techniques: methods-cancor, methods-correspondence, methods-lda, methods-lra, methods-prcomp, methods-svd
Other models from the MASS package: methods-correspondence, methods-lda
# table of admissions and rejections from UC Berkeley
class(UCBAdmissions)
ucb_admissions <- as.data.frame(UCBAdmissions)
ucb_admissions <-
  ucb_admissions[rep(seq(nrow(ucb_admissions)), ucb_admissions$Freq), -4L]
head(ucb_admissions)
# perform multiple correspondence analysis
ucb_admissions %>%
  MASS::mca() %>%
  as_tbl_ord() %>%
  # augment profiles with names, masses, distances, and inertias
  augment_ord() %>%
  print() -> admissions_mca
# recover row and column coordinates and row weights
head(get_rows(admissions_mca, elements = "score"))
get_cols(admissions_mca)
head(get_rows(admissions_mca))
# column-standard biplot of factor levels
admissions_mca %>%
  ggbiplot() +
  theme_bw() +
  theme_biplot() +
  geom_origin() +
  #geom_rows_point(stat = "unique") +
  geom_cols_point(aes(color = factor, shape = factor)) +
  geom_cols_text_repel(aes(label = level, color = factor),
                       show.legend = FALSE) +
  scale_color_brewer(palette = "Dark2") +
  scale_size_area(guide = "none") +
  labs(color = "Factor level", shape = "Factor level")
These methods extract data from, and attribute new data to,
objects of class "prcomp"
as returned by stats::prcomp()
.
## S3 method for class 'prcomp'
as_tbl_ord(x)

## S3 method for class 'prcomp'
recover_rows(x)

## S3 method for class 'prcomp'
recover_cols(x)

## S3 method for class 'prcomp'
recover_inertia(x)

## S3 method for class 'prcomp'
recover_coord(x)

## S3 method for class 'prcomp'
recover_conference(x)

## S3 method for class 'prcomp'
recover_aug_rows(x)

## S3 method for class 'prcomp'
recover_aug_cols(x)

## S3 method for class 'prcomp'
recover_aug_coord(x)
x |
An ordination object. |
The recovery generics recover_*()
return core model components, distribution of inertia,
supplementary elements, and intrinsic metadata; but they require methods for each model class to
tell them what these components are.
The generic as_tbl_ord()
returns its input wrapped in the 'tbl_ord'
class. Its methods determine what model classes it is allowed to wrap. It
then provides 'tbl_ord' methods with access to the recoverers and hence to
the model components.
Emily Paul
Other methods for singular value decomposition-based techniques: methods-cancor, methods-correspondence, methods-lda, methods-lra, methods-mca, methods-svd
Other models from the stats package: methods-cancor, methods-cmds, methods-factanal, methods-kmeans, methods-lm, methods-princomp
# data frame of Anderson iris species measurements
class(iris)
head(iris)
# compute scaled row-principal components of scaled measurements
iris[, -5] %>%
  prcomp(scale = TRUE) %>%
  as_tbl_ord() %>%
  print() -> iris_pca
# recover observation principal coordinates and measurement standard coordinates
head(get_rows(iris_pca))
get_cols(iris_pca)
# augment measurements with names and scaling parameters
(iris_pca <- augment_ord(iris_pca))
These methods extract data from, and attribute new data to,
objects of class "princomp"
as returned by stats::princomp()
.
## S3 method for class 'princomp'
as_tbl_ord(x)

## S3 method for class 'princomp'
recover_rows(x)

## S3 method for class 'princomp'
recover_cols(x)

## S3 method for class 'princomp'
recover_inertia(x)

## S3 method for class 'princomp'
recover_coord(x)

## S3 method for class 'princomp'
recover_conference(x)

## S3 method for class 'princomp'
recover_supp_rows(x)

## S3 method for class 'princomp'
recover_aug_rows(x)

## S3 method for class 'princomp'
recover_aug_cols(x)

## S3 method for class 'princomp'
recover_aug_coord(x)
x |
An ordination object. |
Principal components analysis (PCA), as performed by stats::princomp(),
relies on an eigenvalue decomposition (EVD) of the covariance matrix
of a data set X.
stats::princomp() returns the EVD factor V as the loadings $loadings.
The scores $scores are obtained as XV, with X centered,
and are accessible as supplementary elements.
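The relationship between the covariance matrix, the loadings, and the scores can be checked in base R. This is a minimal sketch using only the stats package; the object names (x, s, evd, v) are chosen for illustration, and the comparison of loadings ignores signs because eigenvectors are only determined up to negation.

```r
# centered (but unscaled) data matrix
x <- scale(as.matrix(iris[, -5]), center = TRUE, scale = FALSE)
n <- nrow(x)

# covariance matrix with divisor n, as used by stats::princomp()
s <- crossprod(x) / n
evd <- eigen(s, symmetric = TRUE)

# principal components analysis of the same data
pca <- princomp(iris[, -5])
v <- unclass(pca$loadings)

# component variances are the eigenvalues of the covariance matrix
same_variances <- all.equal(unname(pca$sdev^2), evd$values)
# loadings are the eigenvectors, up to sign
same_loadings <- all.equal(abs(unname(v)), abs(evd$vectors))
# scores are the centered data multiplied by the loadings
same_scores <- all.equal(unname(pca$scores), unname(x %*% v))
```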
The recovery generics recover_*()
return core model components, distribution of inertia,
supplementary elements, and intrinsic metadata; but they require methods for each model class to
tell them what these components are.
The generic as_tbl_ord()
returns its input wrapped in the 'tbl_ord'
class. Its methods determine what model classes it is allowed to wrap. It
then provides 'tbl_ord' methods with access to the recoverers and hence to
the model components.
Emily Paul, John Gracey
Other methods for eigen-decomposition-based techniques: methods-cmds, methods-eigen, methods-factanal
Other models from the stats package: methods-cancor, methods-cmds, methods-factanal, methods-kmeans, methods-lm, methods-prcomp
# data frame of Anderson iris species measurements
class(iris)
head(iris)
# compute unscaled row-principal components of unscaled measurements
iris[, -5] %>%
  princomp() %>%
  as_tbl_ord() %>%
  print() -> iris_pca
# recover observation principal coordinates and measurement standard coordinates
head(get_rows(iris_pca))
get_cols(iris_pca)
# augment measurement coordinates with names and scaling parameters
(iris_pca <- augment_ord(iris_pca))
These methods extract data from, and attribute new data to,
objects of class "svd_ord"
returned by svd_ord()
.
## S3 method for class 'svd_ord'
as_tbl_ord(x)

## S3 method for class 'svd_ord'
recover_rows(x)

## S3 method for class 'svd_ord'
recover_cols(x)

## S3 method for class 'svd_ord'
recover_inertia(x)

## S3 method for class 'svd_ord'
recover_coord(x)

## S3 method for class 'svd_ord'
recover_conference(x)

## S3 method for class 'svd_ord'
recover_aug_rows(x)

## S3 method for class 'svd_ord'
recover_aug_cols(x)

## S3 method for class 'svd_ord'
recover_aug_coord(x)
x |
An ordination object. |
The recovery generics recover_*()
return core model components, distribution of inertia,
supplementary elements, and intrinsic metadata; but they require methods for each model class to
tell them what these components are.
The generic as_tbl_ord()
returns its input wrapped in the 'tbl_ord'
class. Its methods determine what model classes it is allowed to wrap. It
then provides 'tbl_ord' methods with access to the recoverers and hence to
the model components.
Other methods for singular value decomposition-based techniques: methods-cancor, methods-correspondence, methods-lda, methods-lra, methods-mca, methods-prcomp
Other models from the base package: methods-eigen
# matrix of U.S. personal expenditure data
class(USPersonalExpenditure)
print(USPersonalExpenditure)
# singular value decomposition into row and column coordinates
USPersonalExpenditure %>%
  svd_ord() %>%
  as_tbl_ord() %>%
  print() -> spend_svd
# recover matrices of row and column coordinates
get_rows(spend_svd)
get_cols(spend_svd)
# augment with row and column names
augment_ord(spend_svd)
# initial matrix decomposition confers no inertia to coordinates
get_conference(spend_svd)
Negate the coordinates of a subset of ordination axes in both row and column singular vectors.
get_negation(x)

revert_negation(x)

negate_ord(x, negation = NULL)

negate_to_first_orthant(x, .matrix)
x |
A tbl_ord. |
negation |
Integer vector of coordinates to negate. |
.matrix |
A character string partially matched (lowercase) to several
indicators for one or both matrices in a matrix decomposition used for
ordination. The standard values are |
For purposes of comparison and visualization, it can be useful to negate the
(already artificial) coordinates of an ordination, either by fixed criteria
or to better align with another basis (matrix) of coordinates. negate_ord()
allows the user to negate specified coordinates of an ordination.
get_negation() accesses the negations of an ordination, an integer vector
of 1s and -1s stored as a "negate" attribute.
negate_ord()
and negate_to_first_orthant()
return a tbl_ord with
certain axes negated but the wrapped model unchanged. get_negation()
returns the current negations. revert_negation()
returns the tbl_ord
without any manual negations.
A tbl_ord; the wrapped model is unchanged.
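Why negation is harmless can be seen in base R: flipping the sign of the same axes in both matrix factors leaves their product, and hence the approximated data, unchanged. The sketch below uses base::svd() directly rather than a tbl_ord; the negation vector of 1s and -1s mirrors the description above, and the object names are illustrative.

```r
# singular value decomposition of a small data matrix
x <- USPersonalExpenditure
s <- svd(x)
rows <- s$u %*% diag(s$d)   # row factor carrying the inertia
cols <- s$v                 # column factor

# negate the second axis in both factors
negation <- c(1, -1, 1, 1, 1)
rows_neg <- sweep(rows, 2, negation, "*")
cols_neg <- sweep(cols, 2, negation, "*")

# the reconstruction of the data is unaffected by the negation
unchanged <- all.equal(rows %*% t(cols), rows_neg %*% t(cols_neg))
```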
(pca <- ordinate(iris, cols = 1:4, prcomp))
ggbiplot(pca) +
  geom_rows_point() +
  geom_cols_vector()
# manually negate second coordinate
(pca_neg <- negate_ord(pca, 2))
ggbiplot(pca_neg) +
  geom_rows_point() +
  geom_cols_vector()
# NB: 'prcomp' method takes precedence; negations are part of the wrapper
biplot(pca)
biplot(pca_neg)
# negate to the first orthant
(pca_orth <- negate_to_first_orthant(pca, "v"))
get_negation(pca_orth)
This is a convenience function to fit an ordination model to a data object, wrap the result as a tbl_ord, and annotate this output with metadata from the model and possibly from the data.
ordinate(x, model, ...)

## Default S3 method:
ordinate(x, model, ...)

## S3 method for class 'array'
ordinate(x, model, ...)

## S3 method for class 'table'
ordinate(x, model, ...)

## S3 method for class 'data.frame'
ordinate(x, model, cols, augment, ...)

## S3 method for class 'dist'
ordinate(x, model, ...)
x |
A data object to be passed to the |
model |
An ordination function whose output is coercible to class
'tbl_ord', or a symbol or character string (handled by |
... |
Additional arguments passed to |
cols |
< |
augment |
< |
The default method fits the specified model to the provided data object,
wraps the result as a tbl_ord, and augments this output with any intrinsic
metadata from the model via augment_ord()
.
The default method is used for most classes, though this may change in future. The data.frame method allows the user to specify which columns to include in the model and which columns to use to annotate the output.
An augmented tbl_ord.
# LRA of arrest data
ordinate(USArrests, cols = c(Murder, Rape, Assault), lra)
# CMDS of inter-city distance data
ordinate(UScitiesD, cmdscale_ord, k = 3L)
# PCA of iris data
ordinate(iris, princomp, cols = -Species, augment = c(Sepal.Width, Species))
ordinate(iris, cols = 1:4, ~ prcomp(., center = TRUE, scale. = TRUE))
# CA of hair & eye color data
haireye <- as.data.frame(rowSums(HairEyeColor, dims = 2L))
ordinate(haireye, MASS::corresp, cols = everything())
# FA of Swiss social data
ordinate(swiss, model = factanal, factors = 2L, scores = "Bartlett")
# LDA of iris data
ordinate(iris, ~ lda_ord(.[, 1:4], .[, 5]))
# CCA of savings data
ordinate(
  LifeCycleSavings[, c("pop15", "pop75")],
  # second data set must be handled as an additional parameter to `model`
  y = LifeCycleSavings[, c("sr", "dpi", "ddpi")],
  model = cancor_ord, scores = TRUE
)
In addition to geometric element layers (geoms) that are based on
base ggplot2 layers like geom_point() but restricted to a matrix factor,
such as geom_rows_point(), ordr introduces ggproto
classes for some additional geometric elements commonly used in biplots.
The factor-specific geoms invoke the statistical transformation layers
(stats) stat_rows()
and stat_cols()
, which specify the matrix factor.
Because each ggplot layer consists of only one stat and one geom, this
necessitates that ggproto classes for new stats must also come in *Rows
and *Cols
flavors.
See ggplot2::ggplot2-ggproto and ggplot2::ggproto for explanations
of base ggproto classes in ggplot2 and how to create new ones.
Adapt stats 'prcomp' and 'princomp' methods for plot()
,
screeplot()
, and biplot()
generics to 'tbl_ord' objects.
## S3 method for class 'tbl_ord'
plot(x, main = deparse(substitute(x)), ...)

## S3 method for class 'tbl_ord'
screeplot(x, main = deparse(substitute(x)), ...)

## S3 method for class 'tbl_ord'
biplot(x, main = deparse(substitute(x)), ...)
x |
A 'tbl_ord' object. |
main |
A main title for the plot, passed to other methods (included to enable parsing of object name). |
... |
Additional arguments passed to other methods. |
These methods defer to any plot()
and biplot()
methods for the original,
underlying model classes of 'tbl_ord' objects. If none are found, then, following
the examples of stats::plot.prcomp()
and stats::plot.princomp()
,
plot.tbl_ord()
calls on stats::screeplot()
to produce a scree plot of the
decomposition of variance in the singular value decomposition. Similarly
following stats::biplot.prcomp()
and stats::biplot.princomp()
,
biplot.tbl_ord()
produces a biplot of both rows and columns, using text
labels when available and markers otherwise, with rows and columns
distinguished by color and no additional annotation (e.g. vectors). The
biplot confers inertia according to get_conference()
unless the proportions
do not sum to 1, in which case it produces a symmetric biplot (inertia
conferred equally to rows and columns).
Nothing, but a plot is produced on the current graphics device.
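The conference of inertia mentioned above can be illustrated in base R. The sketch assumes that a conference is a pair of exponents applied to the singular values in the row and column factors; the helper confer() is hypothetical and is not part of ordr. When the pair sums to 1, the product of the two factors recovers the decomposed matrix, whichever way the inertia is split.

```r
# singular value decomposition of the scaled iris measurements
x <- scale(as.matrix(iris[, -5]))
s <- svd(x)

# hypothetical helper: confer inertia with exponents p = c(rows, cols)
confer <- function(s, p) {
  list(
    rows = s$u %*% diag(s$d^(p[1])),
    cols = s$v %*% diag(s$d^(p[2]))
  )
}

# row-principal biplot: all inertia on the rows
row_principal <- confer(s, c(1, 0))
# symmetric biplot: inertia conferred equally on rows and columns
symmetric <- confer(s, c(0.5, 0.5))

# both choices reconstruct the same matrix
rec_rows <- row_principal$rows %*% t(row_principal$cols)
rec_symm <- symmetric$rows %*% t(symmetric$cols)
```

Only the relative scaling of the two point clouds differs between the two biplots; their inner products, which approximate the data, are the same.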
# note: behavior depends on installed packages with class-specific methods

# class 'prcomp'
iris_pca <- prcomp(iris[, -5L], scale = TRUE)
iris_pca_ord <- as_tbl_ord(iris_pca)
plot(iris_pca)
plot(iris_pca_ord)
screeplot(iris_pca)
screeplot(iris_pca_ord)
biplot(iris_pca)
biplot(iris_pca_ord)

# class 'correspondence'
haireye_ca <- MASS::corresp(rowSums(HairEyeColor, dims = 2L), nf = 2L)
haireye_ca_ord <- as_tbl_ord(haireye_ca)
plot(haireye_ca)
plot(haireye_ca_ord)
# no `screeplot()` method for class 'correspondence'
screeplot(haireye_ca_ord)
biplot(haireye_ca)
biplot(haireye_ca_ord)
Classifications and rankings of U.S. universities for the years 2017–2020.
data(qswur_usa)
A tibble of 13 variables on 612 cases:
year of rankings
institution of higher learning
size category of institution
subject range of institution
research intensity of institution
age classification of institution
status of institution
rank by academic reputation
rank by employer reputation
rank by faculty–student ratio
rank by citations per faculty
rank by international faculty ratio
rank by international student ratio
Ranking data were obtained from the public QS website.
Quacquarelli Symonds (2021) "University Rankings". TopUniversities.com https://www.topuniversities.com/university-rankings.
# subset QS data to rank variables
head(qswur_usa)
qs_ranks <- subset(
  qswur_usa,
  complete.cases(qswur_usa),
  select = 8:13
)
# calculate Kendall correlation matrix
qs_cor <- cor(qs_ranks, method = "kendall")
# calculate eigendecomposition
qs_eigen <- eigen_ord(qs_cor)
# view correlations as cosines of biplot vectors
biplot(x = qs_eigen$vectors, y = qs_eigen$vectors, col = c(NA, "black"))
These functions return information about the matrix factorization underlying an ordination.
recover_rows(x)

recover_cols(x)

## Default S3 method:
recover_rows(x)

## Default S3 method:
recover_cols(x)

## S3 method for class 'data.frame'
recover_rows(x)

## S3 method for class 'data.frame'
recover_cols(x)

get_rows(x, elements = "all")

get_cols(x, elements = "all")

## S3 method for class 'tbl_ord'
as.matrix(x, ..., .matrix, elements = "all")

recover_inertia(x)

## Default S3 method:
recover_inertia(x)

recover_coord(x)

## Default S3 method:
recover_coord(x)

## S3 method for class 'data.frame'
recover_coord(x)

get_coord(x)

get_inertia(x)

## S3 method for class 'tbl_ord'
dim(x)
x |
An object of class 'tbl_ord'. |
elements |
Character vector; the elements of each factor for which to
render graphical elements. One of |
... |
Additional arguments from |
.matrix |
A character string partially matched (lowercase) to several
indicators for one or both matrices in a matrix decomposition used for
ordination. The standard values are |
The recover_*()
S3 methods extract one or both of the
row and column matrix factors that constitute the original ordination. These
are interpreted as the case scores (rows) and the variable loadings
(columns). The get_*()
functions optionally (and by default) include any
supplemental observations (see supplementation).
The recover_*()
functions are generics that require methods for each
ordination class. They are not intended to be called directly but are
exported so that users can query methods("recover_*")
.
get_coord()
retrieves the names of the coordinates shared by the matrix
factors on which the original data were ordinated, and get_inertia()
retrieves a vector of the inertia with these names. dim()
retrieves the
dimensions of the row and column factors, which reflect the dimensions of the
matrix they reconstruct—not the original data matrix. (This matters for
techniques that rely on eigendecomposition, for which the decomposed matrix
is square.)
The recover_*()
functions are generics whose methods return base R
objects retrieved from the model wrapped in the 'tbl_ord' class:
rows
: the row matrix as stored in the model
cols
: the column matrix as stored in the model
inertia
: the vector of eigen-values or squared singular values,
often known by other names depending on the model
coord
: names for the artificial axes, from the model if available
The get_*()
functions (which are not generics) return modifications of
these objects:
rows
: the recovered rows,
adjusted according to any negation of axes or conference of inertia
cols
: the recovered columns,
adjusted according to any negation of axes or conference of inertia
inertia
: the recovered inertia, named by the recovered coordinates
coord
: the recovered coordinates (unmodified)
dim()
returns the dimensions of the decomposed matrix, i.e. the numbers of
rows of recover_rows()
and of recover_cols()
.
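The distinction between the dimensions of the data and those of the decomposed matrix, and the description of inertia as eigenvalues or squared singular values, can be illustrated in base R. This sketch works with base::svd() and base::eigen() output rather than with the recoverers themselves; the object names are illustrative.

```r
# standardized data matrix (150 cases by 4 variables)
x <- scale(as.matrix(iris[, -5]))
n <- nrow(x)

# SVD decomposes the data matrix itself
s <- svd(x)
dim_svd <- c(nrow(s$u), nrow(s$v))

# EVD decomposes the square correlation matrix instead
e <- eigen(cor(iris[, -5]), symmetric = TRUE)
dim_evd <- c(nrow(e$vectors), nrow(e$vectors))

# squared singular values match the eigenvalues up to the factor n - 1
inertia_svd <- s$d^2 / (n - 1)
inertia_evd <- e$values
same_inertia <- all.equal(inertia_svd, inertia_evd)
```

Here dim_svd reflects the 150 cases and 4 variables, whereas dim_evd reflects only the 4 variables on both sides, because the decomposed matrix is square.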
Other generic recoverers: augmentation, conference, supplementation
# example ordination: LRA of U.S. arrests data
arrests_lra <- ordinate(USArrests, cols = c(Murder, Rape, Assault), lra)
# extract matrix factors
as.matrix(arrests_lra, .matrix = "rows")
as.matrix(arrests_lra, .matrix = "cols")
# special named functions
get_rows(arrests_lra)
get_cols(arrests_lra)
# get dimensions of underlying matrix factorization (not of original data)
dim(arrests_lra)
# get names of artificial / latent coordinates
get_coord(arrests_lra)
# get distribution of inertia
get_inertia(arrests_lra)
Construct medians, bags, fences, and outlier specifications for bagplots.
stat_bagplot(
  mapping = NULL,
  data = NULL,
  geom = "bagplot",
  position = "identity",
  fraction = 0.5,
  coef = 3,
  median = TRUE,
  fence = TRUE,
  outliers = TRUE,
  show.legend = NA,
  inherit.aes = TRUE,
  ...
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
fraction |
Fraction of the data to include in the bag. |
coef |
Scale factor of the fence relative to the bag. |
median , fence , outliers
|
Logical indicators whether to include median, fence, and outliers in the composite output. |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
... |
Arguments passed on to
|
A bagplot comprises a single, often filled, depth contour (the "bag") overlaid upon the hull of its union with the data points contained in its scaled expansion from the depth median (the "fence") and a scatterplot of outliers beyond the fence (the "loop"). Rousseeuw et al. (1999) suggest the term "bag-and-bolster plot".
While the depth median can be obtained using stat_center()
, the data
depth values used to compute it are also used to demarcate the bag, so it
is implemented separately in StatBagplot$compute_group()
for efficiency.
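The geometry of the bag and fence can be sketched in base R. Note the substitution: this sketch ranks points by Mahalanobis distance as a simple stand-in for data depth, which is not how StatBagplot computes depth; it only illustrates how the fraction and coef parameters shape the bag and the fence. All object names are illustrative.

```r
# two variables from the petroleum rock data
xy <- as.matrix(rock[, c("area", "shape")])

# stand-in for data depth (assumption): smaller distance means deeper
md <- mahalanobis(xy, colMeans(xy), cov(xy))
center <- xy[which.min(md), ]            # stand-in for the depth median

# bag: convex hull of the deepest fraction of the data
fraction <- 0.5
bag_pts <- xy[md <= quantile(md, fraction), , drop = FALSE]
bag <- bag_pts[chull(bag_pts), , drop = FALSE]

# fence: the bag expanded from the center by the factor coef
coef <- 3
fence <- sweep(coef * sweep(bag, 2, center), 2, center, "+")
```

Points lying outside the fence polygon would be drawn as outliers; the point-in-polygon test is omitted to keep the sketch short.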
ggbiplot()
uses ggplot2::fortify()
internally to produce a single data
frame with a .matrix
column distinguishing the subjects ("rows"
) and
variables ("cols"
). The stat layers stat_rows()
and stat_cols()
simply
filter the data frame to one of these two.
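The idea of a single data frame with a .matrix column can be mimicked in base R. This is only a sketch of the data layout described above, assuming coordinate columns named coord1 and coord2 for illustration; it does not reproduce the output of ggplot2::fortify() on a tbl_ord.

```r
# matrix factors from a singular value decomposition
x <- USPersonalExpenditure
s <- svd(x)

# one data frame per factor, tagged by the matrix it came from
rows <- data.frame(
  coord1 = s$u[, 1], coord2 = s$u[, 2],
  name = rownames(x), .matrix = "rows"
)
cols <- data.frame(
  coord1 = s$v[, 1], coord2 = s$v[, 2],
  name = colnames(x), .matrix = "cols"
)

# a single stacked data frame, as a plot would receive it
fortified <- rbind(rows, cols)

# what a row- or column-specific stat amounts to: a filter on .matrix
only_rows <- fortified[fortified$.matrix == "rows", ]
only_cols <- fortified[fortified$.matrix == "cols", ]
```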
The geom layers geom_rows_*()
and geom_cols_*()
call the corresponding
stat in order to render plot elements for the corresponding factor matrix.
geom_dims_*()
selects a default matrix based on common practice, e.g.
points for rows and arrows for columns.
This statistical transformation is compatible with the convenience function
ord_aes()
.
Some transformations (e.g. stat_center()
) commute with projection to the
lower (1 or 2)-dimensional biplot space. If they detect aesthetics of the
form ..coord[0-9]+
, then ..coord1
and ..coord2
are converted to x
and
y
while any remaining are ignored.
Other transformations (e.g. stat_spantree()
) yield different results in a
lower-dimensional biplot when they are computed before versus after
projection. If the stat layer detects these aesthetics, then the
transformation is performed before projection, and the results in the first
two dimensions are returned as x
and y
.
A small number of transformations (stat_rule()
) are incompatible with
ordination aesthetics but will accept ord_aes()
without warning.
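The difference between transformations that commute with projection and those that do not can be checked in base R. The sketch uses coordinate-wise means as the commuting example and nearest neighbors as a simple stand-in for a distance-based transformation such as a spanning tree; the helper nearest() is hypothetical.

```r
# principal component scores in all four dimensions
scores <- prcomp(iris[, -5], scale. = TRUE)$x

# centroids commute with projection onto the first two axes
center_then_project <- colMeans(scores)[1:2]
project_then_center <- colMeans(scores[, 1:2])
commutes <- all.equal(center_then_project, project_then_center)

# hypothetical helper: index of each point's nearest neighbor
nearest <- function(m) {
  d <- as.matrix(dist(m))
  diag(d) <- Inf
  apply(d, 1, which.min)
}

# distance-based results depend on when the projection happens
nn_full <- nearest(scores)
nn_proj <- nearest(scores[, 1:2])
agreement <- mean(nn_full == nn_proj)
```

The value of agreement is the share of points whose nearest neighbor is the same before and after projection; any value below 1 shows that the order of the two operations matters for such transformations.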
These are calculated during the statistical transformation and can be accessed with delayed evaluation.
component
the component of the composite plot; used internally
Rousseeuw PJ, Ruts I, & Tukey JW (1999) "The Bagplot: A Bivariate Boxplot". The American Statistician, 53(4): 382–387. doi:10.1080/00031305.1999.10474494
Other stat layers: stat_center(), stat_chull(), stat_cone(), stat_depth(), stat_projection(), stat_rule(), stat_scale(), stat_spantree()
# petroleum rock base plot
p <- ggplot(rock, aes(area, shape, size = peri)) + theme_bw()
# scatterplot
p + geom_point()
# NB: Non-standard aesthetics are handled as in version > 3.5.1; see:
# https://github.com/tidyverse/ggplot2/issues/6191
# custom bag fraction, coefficient, and aesthetics
p + stat_bagplot(fraction = .4, coef = 1.5,
                 outlier_gp = list(shape = "asterisk"))
# invisible fence
p + stat_bagplot(fence = FALSE)
Compute geometric centers and spreads for ordination factors
stat_center(
  mapping = NULL,
  data = NULL,
  geom = "point",
  position = "identity",
  show.legend = NA,
  inherit.aes = TRUE,
  ...,
  fun.data = NULL,
  fun = NULL,
  fun.center = NULL,
  fun.min = NULL,
  fun.max = NULL,
  fun.ord = NULL,
  fun.args = list()
)

stat_star(
  mapping = NULL,
  data = NULL,
  geom = "segment",
  position = "identity",
  show.legend = NA,
  inherit.aes = TRUE,
  ...,
  fun.data = NULL,
  fun = NULL,
  fun.center = NULL,
  fun.ord = NULL,
  fun.args = list()
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
... |
Additional arguments passed to |
fun.data |
A function that is given the complete data and should
return a data frame with variables |
fun.center |
Deprecated alias to |
fun.min, fun, fun.max |
Alternatively, supply three individual functions that are each passed a vector of values and should return a single number. |
fun.ord |
Alternatively to the |
fun.args |
Optional additional arguments passed on to the functions. |
A ggproto layer.
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
This statistical transformation is compatible with the convenience function ord_aes().
Some transformations (e.g. stat_center()) commute with projection to the lower (1 or 2)-dimensional biplot space. If they detect aesthetics of the form ..coord[0-9]+, then ..coord1 and ..coord2 are converted to x and y while any remaining are ignored.
Other transformations (e.g. stat_spantree()) yield different results in a lower-dimensional biplot when they are computed before versus after projection. If the stat layer detects these aesthetics, then the transformation is performed before projection, and the results in the first two dimensions are returned as x and y.
A small number of transformations (stat_rule()) are incompatible with ordination aesthetics but will accept ord_aes() without warning.
These are calculated during the statistical transformation and can be accessed with delayed evaluation.
xmin,ymin,xmax,ymax
results of fun.min,fun.max applied to x,y
Other stat layers: stat_bagplot(), stat_chull(), stat_cone(), stat_depth(), stat_projection(), stat_rule(), stat_scale(), stat_spantree()
ggplot(mpg, aes(x = displ, y = cty, shape = drv)) +
  geom_point() +
  stat_center(fun = "median", size = 5, alpha = .5)
ggplot(mpg, aes(x = displ, y = cty, shape = drv, linetype = drv)) +
  stat_center(size = 3) +
  stat_star()
Restrict planar data to the boundary points of its convex hull, or of nested convex hulls containing specified fractions of points.
stat_chull(
  mapping = NULL,
  data = NULL,
  geom = "polygon",
  position = "identity",
  show.legend = NA,
  inherit.aes = TRUE,
  ...
)

stat_peel(
  mapping = NULL,
  data = NULL,
  geom = "polygon",
  position = "identity",
  breaks = c(0.5),
  cut = c("above", "below"),
  show.legend = NA,
  inherit.aes = TRUE,
  ...
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
... |
Additional arguments passed to |
breaks |
A numeric vector of fractions (between |
cut |
Character; one of |
As used in a ggplot2 vignette, stat_chull() restricts a dataset with x and y variables to the points that lie on its convex hull.
Building on this, stat_peel() returns hulls from a convex hull peeling: a subset of sequentially removed hulls containing specified fractions of the data.
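Convex hull peeling can be sketched with base R alone. The helper peel_hulls() below is hypothetical and is not part of ordr; it only illustrates the general idea of repeatedly removing the points on the current convex hull, using grDevices::chull(). How stat_peel() selects hulls to match its breaks is not shown here.

```r
# Hypothetical helper (not an ordr function): peel successive convex hulls
peel_hulls <- function(x, y) {
  remaining <- seq_along(x)
  hulls <- list()
  while (length(remaining) > 0L) {
    # positions (within `remaining`) of the points on the current hull
    on_hull <- grDevices::chull(x[remaining], y[remaining])
    # store the original row indices of those points
    hulls[[length(hulls) + 1L]] <- remaining[on_hull]
    # drop the hull points and continue with the interior
    remaining <- remaining[-on_hull]
  }
  hulls
}

hulls <- peel_hulls(USJudgeRatings$INTG, USJudgeRatings$PREP)
# proportion of the data on or inside each successive hull
prop_within <- rev(cumsum(rev(lengths(hulls)))) / nrow(USJudgeRatings)
```

The first hull always contains all of the data, and each later hull contains a strictly smaller proportion, so a target fraction can be matched to the nearest available hull.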
A ggproto layer.
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
This statistical transformation is compatible with the convenience function ord_aes().
Some transformations (e.g. stat_center()) commute with projection to the lower (1 or 2)-dimensional biplot space. If they detect aesthetics of the form ..coord[0-9]+, then ..coord1 and ..coord2 are converted to x and y while any remaining are ignored.
Other transformations (e.g. stat_spantree()) yield different results in a lower-dimensional biplot when they are computed before versus after projection. If the stat layer detects these aesthetics, then the transformation is performed before projection, and the results in the first two dimensions are returned as x and y.
A small number of transformations (stat_rule()) are incompatible with ordination aesthetics but will accept ord_aes() without warning.
These are calculated during the statistical transformation and can be accessed with delayed evaluation.
hull
the position of breaks that defines each hull
frac
the value of breaks that defines each hull
prop
the actual proportion of data within each hull
Barnett V (1976) "The Ordering of Multivariate Data". Journal of the Royal Statistical Society: Series A (General), 139(3): 318–344. doi:10.2307/2344839
Other stat layers: stat_bagplot(), stat_center(), stat_cone(), stat_depth(), stat_projection(), stat_rule(), stat_scale(), stat_spantree()
ggplot(USJudgeRatings, aes(x = INTG, y = PREP)) +
  geom_point() +
  stat_chull(alpha = .5)
ggplot(USJudgeRatings, aes(x = INTG, y = PREP)) +
  stat_peel(
    aes(alpha = after_stat(hull)),
    breaks = seq(.1, .9, .2), color = "black"
  )
Restrict planar data to the points that lie on its conical hull (other than the origin).
stat_cone(
  mapping = NULL,
  data = NULL,
  geom = "path",
  position = "identity",
  origin = FALSE,
  show.legend = NA,
  inherit.aes = TRUE,
  ...
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
origin |
Logical; whether to include the origin with the transformed
data. Defaults to |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
... |
Additional arguments passed to |
A ggproto layer.
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
This statistical transformation is compatible with the convenience function ord_aes().
Some transformations (e.g. stat_center()) commute with projection to the lower (1 or 2)-dimensional biplot space. If they detect aesthetics of the form ..coord[0-9]+, then ..coord1 and ..coord2 are converted to x and y while any remaining are ignored.
Other transformations (e.g. stat_spantree()) yield different results in a lower-dimensional biplot when they are computed before versus after projection. If the stat layer detects these aesthetics, then the transformation is performed before projection, and the results in the first two dimensions are returned as x and y.
A small number of transformations (stat_rule()) are incompatible with ordination aesthetics but will accept ord_aes() without warning.
Other stat layers: stat_bagplot(), stat_center(), stat_chull(), stat_depth(), stat_projection(), stat_rule(), stat_scale(), stat_spantree()
state_center <- as.data.frame(state.center)
# US hull from the perspective of Florida
fl.center <- state_center[state.abb == "FL", ]
as.data.frame(state.center) |>
  transform(x = x - fl.center$x, y = y - fl.center$y, abbr = state.abb) |>
  subset(abbr != "HI" & abbr != "AK") |>
  ggplot(aes(x, y, label = abbr)) +
  stat_cone(data = \(d) subset(d, abbr != "FL")) +
  geom_text()
# US hull from the perspective of Hawai'i
hi.center <- state_center[state.abb == "HI", ]
as.data.frame(state.center) |>
  transform(x = x - hi.center$x, y = y - hi.center$y, abbr = state.abb) |>
  ggplot(aes(x, y, label = abbr)) +
  geom_path(stat = "cone", data = \(d) subset(d, abbr != "HI")) +
  geom_text()
Estimate data depth using ddalpha::depth.().
stat_depth(
  mapping = NULL,
  data = NULL,
  geom = "contour",
  position = "identity",
  contour = TRUE,
  contour_var = "depth",
  notion = "zonoid",
  notion_params = list(),
  n = 100L,
  show.legend = NA,
  inherit.aes = TRUE,
  ...
)

stat_depth_filled(
  mapping = NULL,
  data = NULL,
  geom = "contour_filled",
  position = "identity",
  contour = TRUE,
  contour_var = "depth",
  notion = "zonoid",
  notion_params = list(),
  n = 100L,
  show.legend = NA,
  inherit.aes = TRUE,
  ...
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
contour |
If |
contour_var |
Character string identifying the variable to contour by.
Can be one of |
notion |
Character; the name of the depth function (passed to
|
notion_params |
List of additional parameters passed via |
n |
Number of grid points in each direction. |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
... |
Arguments passed on to
|
Depth is an extension of the univariate notion of rank to bivariate (and sometimes multivariate) data (Rousseeuw et al., 1999). It comes in several flavors and is the basis for bagplots.
stat_depth() is adapted from ggplot2::stat_density_2d() and returns depth values over a grid in the same format, so it is neatly paired with ggplot2::geom_contour().
A ggproto layer.
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
This statistical transformation is compatible with the convenience function ord_aes().
Some transformations (e.g. stat_center()) commute with projection to the lower (1 or 2)-dimensional biplot space. If they detect aesthetics of the form ..coord[0-9]+, then ..coord1 and ..coord2 are converted to x and y while any remaining are ignored.
Other transformations (e.g. stat_spantree()) yield different results in a lower-dimensional biplot when they are computed before versus after projection. If the stat layer detects these aesthetics, then the transformation is performed before projection, and the results in the first two dimensions are returned as x and y.
A small number of transformations (stat_rule()) are incompatible with ordination aesthetics but will accept ord_aes() without warning.
These are calculated during the statistical transformation and can be accessed with delayed evaluation.
stat_depth() and stat_depth_filled() compute different variables depending on whether contouring is turned on or off. With contouring off (contour = FALSE), both stats behave the same, and the following variables are provided:
depth
the depth estimate
ndepth
depth estimate, scaled to a maximum of 1
With contouring on (contour = TRUE), either ggplot2::stat_contour() or ggplot2::stat_contour_filled() is run after the depth estimate has been obtained, and the computed variables are determined by these stats.
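The relationship between a depth estimate over a grid and its rescaled version can be sketched in base R. This sketch does not call ddalpha and does not reproduce the default "zonoid" notion; it substitutes the Mahalanobis depth, 1 / (1 + squared Mahalanobis distance to the mean), purely because it is available through stats::mahalanobis(). The object names (xy, grid, d2) and the grid size are assumptions chosen for illustration.

```r
# Data and an n-by-n evaluation grid spanning its range
xy <- as.matrix(mtcars[, c("wt", "disp")])
n <- 20L
grid <- expand.grid(
  wt = seq(min(xy[, "wt"]), max(xy[, "wt"]), length.out = n),
  disp = seq(min(xy[, "disp"]), max(xy[, "disp"]), length.out = n)
)

# Mahalanobis depth (a stand-in for the notions offered by ddalpha)
d2 <- stats::mahalanobis(
  as.matrix(grid),
  center = colMeans(xy),
  cov = stats::cov(xy)
)
grid$depth <- 1 / (1 + d2)

# Rescale so that the deepest grid point has value 1
grid$ndepth <- grid$depth / max(grid$depth)
```

A data frame of this shape (two positional columns plus a value column over a regular grid) is the kind of input that contouring operates on.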
Rousseeuw PJ, Ruts I, & Tukey JW (1999) "The Bagplot: A Bivariate Boxplot". The American Statistician, 53(4): 382–387. doi:10.1080/00031305.1999.10474494
Other stat layers: stat_bagplot(), stat_center(), stat_chull(), stat_cone(), stat_projection(), stat_rule(), stat_scale(), stat_spantree()
# base Motor Trends plot
b <- ggplot(mtcars, aes(wt, disp)) + geom_point()
# depth raster
b + geom_raster(stat = "depth", aes(fill = after_stat(depth)))
# depth grid
b + stat_depth(
  geom = "point", contour = FALSE,
  aes(size = after_stat(depth)), n = 20
)
# depth contours
b + geom_contour(stat = "depth", contour = TRUE)
# depth bands
b + geom_contour_filled(stat = "depth_filled", contour = TRUE, alpha = .75)
# contours colored by group
b + stat_depth(aes(color = factor(cyl)))
# custom depth notion
b + stat_depth(
  aes(color = factor(cyl)),
  notion = "halfspace", notion_params = list(exact = TRUE)
)
# contours faceted by group
b + stat_depth_filled(alpha = .75) +
  facet_wrap(facets = vars(factor(cyl)))
# scaled to the unit interval
# FIXME: Some polygons are missing.
b + stat_depth_filled(contour_var = "ndepth", alpha = .75) +
  facet_wrap(facets = vars(factor(cyl)))
Compute projections of vectors from one matrix factor onto those of the other.
stat_projection(
  mapping = NULL,
  data = NULL,
  geom = "segment",
  position = "identity",
  referent = NULL,
  ...,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
referent |
The reference data set; see Details. |
... |
Additional arguments passed to |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
An ordination model of continuous data can be used to predict values along one dimension from those along the other, using the artificial axes as intermediaries. The predictions correspond geometrically to projections of elements of one matrix factor in principal coordinates onto those of the other factor in standard coordinates. In the most familiar setting of PCA biplots, variable (column) values are predicted from case (row) locations along PC1 and PC2. This transformation obtains the axis projections as xend,yend and pairs them with original points x,y to demarcate segments visualizing the projections.
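The geometry of these projections can be sketched in base R. The helper project_onto() below is hypothetical and is not an ordr function; it computes the orthogonal projection of each point onto the line through the origin spanned by a single vector v, and names the result xend and yend only to echo the computed variables described here. The points and the vector are made-up values.

```r
# Hypothetical helper (not an ordr function): project rows of X onto span(v)
project_onto <- function(X, v) {
  X <- as.matrix(X)
  # scalar coefficient of each point along v: (x . v) / (v . v)
  coef <- as.vector(X %*% v) / sum(v^2)
  # projected positions, one row per point
  proj <- outer(coef, v)
  colnames(proj) <- c("xend", "yend")
  proj
}

# Made-up points and axis vector
pts <- cbind(x = c(1, 2, 0), y = c(1, 0, 3))
v <- c(2, 0)
proj <- project_onto(pts, v)

# Each segment from (x, y) to (xend, yend) is orthogonal to v
residual <- pts - proj
as.vector(residual %*% v)
```

Pairing each row of pts with the corresponding row of proj gives the endpoints of the segments that a segment geom would draw.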
WARNING: This layer is appropriate only with axes in standard coordinates (usually confer_inertia(p = "rows")) and predictive calibration (ggbiplot(axis.type = "predictive")).
A ggproto layer.
This statistical transformation is done with respect to reference data passed to referent (ignored if NULL, the default, possibly resulting in empty output). See stat_referent() for more details. This relies on a sleight of hand through a new undocumented LayerRef class and associated ggplot2::ggplot_add() method. As a result, only layers constructed using this stat_*() shortcut will pass the necessary positional aesthetics to the $setup_params() step, making them available to pre-process referent data.
The biplot shortcuts automatically substitute the complementary matrix factor for referent = NULL and will use an integer vector to select a subset from this factor. These uses do not require the passing of mappings described above.
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
This statistical transformation is compatible with the convenience function ord_aes().
Some transformations (e.g. stat_center()) commute with projection to the lower (1 or 2)-dimensional biplot space. If they detect aesthetics of the form ..coord[0-9]+, then ..coord1 and ..coord2 are converted to x and y while any remaining are ignored.
Other transformations (e.g. stat_spantree()) yield different results in a lower-dimensional biplot when they are computed before versus after projection. If the stat layer detects these aesthetics, then the transformation is performed before projection, and the results in the first two dimensions are returned as x and y.
A small number of transformations (stat_rule()) are incompatible with ordination aesthetics but will accept ord_aes() without warning.
These are calculated during the statistical transformation and can be accessed with delayed evaluation.
xend,yend
projections onto (specified) vectors
Other stat layers: stat_bagplot(), stat_center(), stat_chull(), stat_cone(), stat_depth(), stat_rule(), stat_scale(), stat_spantree()
# simplify the Motor Trends data to two predictors legible at aspect ratio 1
mtcars |>
  transform(hp00 = hp/100) |>
  subset(select = c(mpg, hp00, wt)) ->
  subcars
# compute the gradient of `mpg` against these two predictors
lm(mpg ~ hp00 + wt, subcars) |>
  coefficients() |>
  as.list() |> as.data.frame() ->
  grad
# project the data onto the gradient axis (with a reversed gradient vector)
ggplot(subcars, aes(x = hp00, y = wt)) +
  coord_equal() +
  geom_point(shape = "circle open") +
  geom_vector(data = -grad) +
  stat_projection(referent = grad)
Compute statistics with respect to a reference data set with shared positional variables.
stat_referent(
  mapping = NULL,
  data = NULL,
  geom = "blank",
  position = "identity",
  referent = NULL,
  show.legend = NA,
  inherit.aes = TRUE,
  ...
)

## S3 method for class 'LayerRef'
ggplot_add(object, plot, object_name)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
referent |
The reference data set; see Details. |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
... |
Additional arguments passed to |
object |
An object to add to the plot |
plot |
The ggplot object to add |
object_name |
The name of the object to add |
Often in geometric data analysis a statistical transformation applied to data X will also depend on data Y, for example when drawing the projections of vectors X onto vectors Y. The stat layer stat_referent() accepts Y as an argument to the referent parameter and pre-processes it using the existing positional aesthetic mappings to x and y.
The ggproto can be used as a parent to more elaborate statistical transformations, or the stat can be paired with geoms that expect the referent parameter and use it to position their transformations of X. It is paired by default with ggplot2::geom_blank() so as to prevent possibly confusing output.
A ggproto layer.
Other biplot layers: biplot-geoms, biplot-stats, stat_rows()
# simplify the Motor Trends data to two predictors legible at aspect ratio 1
mtcars |>
  transform(hp00 = hp/100) |>
  subset(select = c(mpg, hp00, wt)) ->
  subcars
# compute the gradient of `mpg` against these two predictors
lm(mpg ~ hp00 + wt, subcars) |>
  coefficients() |>
  as.list() |> as.data.frame() ->
  grad
# use the gradient as a reference (to no effect in this basic ggproto)
ggplot(subcars, aes(x = hp00, y = wt)) +
  coord_equal() +
  geom_point() +
  stat_referent(referent = grad)
ggplot(subcars, aes(x = hp00, y = wt)) +
  coord_equal() +
  stat_referent(geom = "point", referent = grad)
These stats merely tell ggplot2::ggplot() which factor of an ordination to pull data from for a plot layer. They are invoked internally by the various geom_*_*() layers.
stat_rows(
  mapping = NULL,
  data = data,
  geom = "point",
  position = "identity",
  subset = NULL,
  elements = "active",
  ...,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_cols(
  mapping = NULL,
  data = data,
  geom = "axis",
  position = "identity",
  subset = NULL,
  elements = "active",
  ...,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
subset |
An integer, logical, or character vector indicating a subset of
rows or columns for which to render graphical elements. NB: Internally, the
|
elements |
Character vector; the elements of each factor for which to
render graphical elements. One of |
... |
Additional arguments passed to |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
A ggproto layer.
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
Other biplot layers: biplot-geoms, biplot-stats, stat_referent()
# FA of Swiss social data
swiss_fa <-
  ordinate(swiss, model = factanal, factors = 2L, scores = "regression")
# active and supplementary elements
get_rows(swiss_fa, elements = "active")
head(get_rows(swiss_fa, elements = "score"))
# biplot using element filters and selection
# (note that filter precedes selection)
ggbiplot(swiss_fa) +
  geom_rows_point(elements = "score") +
  geom_rows_label(aes(label = name), elements = "score", subset = c(1, 4, 18)) +
  scale_alpha_manual(values = c(0, 1), guide = "none") +
  geom_cols_vector(aes(label = name))
Determine axis limits and offset vectors from reference data.
stat_rule(
  mapping = NULL,
  data = NULL,
  geom = "rule",
  position = "identity",
  fun.lower = "minpp",
  fun.upper = "maxpp",
  fun.offset = "minabspp",
  fun.args = list(),
  referent = NULL,
  show.legend = NA,
  inherit.aes = TRUE,
  ...
)

minpp(x, p = 0.1)

maxpp(x, p = 0.1)

minabspp(x, p = 0.1)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
fun.lower, fun.upper, fun.offset |
Functions used to determine the limits
of the rules and the translations of the axes from the projections of
|
fun.args |
Optional additional arguments passed on to the functions. |
referent |
The reference data set; see Details. |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
... |
Additional arguments passed to |
x |
A numeric vector. |
p |
A numeric value; the proportion of a range used as a buffer. |
Biplots with several axes can become cluttered and illegible. When this happens, Gower, Gardner–Lubbe, & le Roux (2011) recommend translating the axes to a new point of intersection away from the origin and adjusting the axis markers accordingly. The axes then converge in a region of the plot offset from most position markers or other elements. An alternative solution, implemented in the bipl5 package (https://github.com/RuanBuys/bipl5), is to translate each axis orthogonally away from the origin, which preserves the axis markers. This is the technique implemented here.
Separately, axes that fill the plotting window are uninformative when they exceed the range of the plotted position markers projected onto them. They may even be misleading, suggesting that linear relationships extrapolate outside the data range. In these cases, Gower and Harding (1988) recommend using finite ranges determined by the data projection onto each axis.
Three functions control these operations: fun.offset
computes the
orthogonal distance of each axis from the origin, and fun.lower
and
fun.upper
compute the distance along each axis of the endpoints to the
(offset) origin. All three functions depend on what position data is to be offset
from or limited to, which must be passed manually to the referent
parameter.
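The two operations described above can be sketched in base R. This is only an illustration under stated assumptions: the helpers `buffered_min()` and `buffered_max()` are hypothetical stand-ins written for this sketch, not the package's `minpp()`, `maxpp()`, or `minabspp()`, and the reference points and axis direction are made up.

```r
# Hypothetical helpers for illustration only; they are NOT the package's
# minpp()/maxpp()/minabspp() definitions.
buffered_min <- function(x, p = 0.1) min(x) - p * diff(range(x))
buffered_max <- function(x, p = 0.1) max(x) + p * diff(range(x))

# Made-up reference points (rows) and one axis direction in the plot plane.
pts <- cbind(x = c(-2, -1, 0, 1, 3), y = c(1, -1, 0, 2, -2))
axis_dir <- c(3, 4) / 5    # unit vector along the axis
axis_perp <- c(-4, 3) / 5  # unit vector orthogonal to the axis

# Signed positions of the points along the axis and orthogonal to it.
along <- drop(pts %*% axis_dir)
across <- drop(pts %*% axis_perp)

# Finite limits of the rule along the axis, with a 10% buffer.
lower <- buffered_min(along)
upper <- buffered_max(along)

# Orthogonal offset that places the axis just beyond the points on one side.
offset <- buffered_min(across)

# Endpoints of the offset, truncated axis in plot coordinates.
endpoints <- rbind(
  lower = lower * axis_dir + offset * axis_perp,
  upper = upper * axis_dir + offset * axis_perp
)
endpoints
```

The limits come from projections onto the axis itself, while the offset comes from projections onto its orthogonal complement, which is why the reference data must be available to both computations.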
A ggproto layer.
This statistical transformation is done with respect to reference data passed
to referent
(ignored if NULL
, the default, possibly resulting in empty
output). See stat_referent()
for more details. This relies on a sleight of
hand through a new undocumented LayerRef
class and associated
ggplot2::ggplot_add()
method. As a result, only layers constructed using
this stat_*()
shortcut will pass the necessary positional aesthetics to the
$setup_params()
step, making them available to pre-process referent
data.
The biplot shortcuts automatically substitute the complementary matrix factor
for referent = NULL
and will use an integer vector to select a subset from
this factor. These uses do not require the mapping passage.
ggbiplot()
uses ggplot2::fortify()
internally to produce a single data
frame with a .matrix
column distinguishing the subjects ("rows"
) and
variables ("cols"
). The stat layers stat_rows()
and stat_cols()
simply
filter the data frame to one of these two.
The geom layers geom_rows_*()
and geom_cols_*()
call the corresponding
stat in order to render plot elements for the corresponding factor matrix.
geom_dims_*()
selects a default matrix based on common practice, e.g.
points for rows and arrows for columns.
These are calculated during the statistical transformation and can be accessed with delayed evaluation.
axis
unique axis identifier (integer)
lower,upper
distances to endpoints from origin (before offset)
yintercept,xintercept
intercepts (possibly Inf
) of offset axis
Gower JC, Gardner–Lubbe S, & le Roux NJ (2011) Understanding Biplots. Wiley, ISBN: 978-0-470-01255-0. https://www.wiley.com/go/biplots
Gower JC & Harding SA (1988) "Nonlinear biplots". Biometrika 75(3): 445–455. doi:10.1093/biomet/75.3.445
Other stat layers:
stat_bagplot()
,
stat_center()
,
stat_chull()
,
stat_cone()
,
stat_depth()
,
stat_projection()
,
stat_scale()
,
stat_spantree()
# stack loss gradient
stackloss |>
  lm(formula = stack.loss ~ Air.Flow + Water.Temp + Acid.Conc.) |>
  coef() |>
  as.list() |>
  as.data.frame() |>
  subset(select = c(Air.Flow, Water.Temp, Acid.Conc.)) ->
  coef_data
# gradient rule with respect to two predictors
stackloss_centered <- scale(stackloss, scale = FALSE)
stackloss_centered |>
  ggplot(aes(x = Acid.Conc., y = Air.Flow)) +
  coord_square() +
  geom_origin() +
  geom_point(aes(size = stack.loss, alpha = sign(stack.loss))) +
  scale_size_area() +
  scale_alpha_binned(breaks = c(-1, 0, 1)) +
  stat_rule(
    geom = "axis",
    data = coef_data,
    referent = stackloss_centered,
    fun.offset = \(x) minabspp(x, p = .5)
  )
# NB: `geom_rule(stat = "rule")` would fail to pass positional aesthetics.
Multiply artificial coordinates by a scale factor
stat_scale(
  mapping = NULL,
  data = NULL,
  geom = "point",
  position = "identity",
  show.legend = NA,
  inherit.aes = TRUE,
  ...,
  mult = 1
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
to improve the display. The
|
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
... |
Additional arguments passed to |
mult |
Numeric value used to scale the coordinates. |
A ggproto layer.
ggbiplot()
uses ggplot2::fortify()
internally to produce a single data
frame with a .matrix
column distinguishing the subjects ("rows"
) and
variables ("cols"
). The stat layers stat_rows()
and stat_cols()
simply
filter the data frame to one of these two.
The geom layers geom_rows_*()
and geom_cols_*()
call the corresponding
stat in order to render plot elements for the corresponding factor matrix.
geom_dims_*()
selects a default matrix based on common practice, e.g.
points for rows and arrows for columns.
This statistical transformation is compatible with the convenience function
ord_aes()
.
Some transformations (e.g. stat_center()
) commute with projection to the
lower (1 or 2)-dimensional biplot space. If they detect aesthetics of the
form ..coord[0-9]+
, then ..coord1
and ..coord2
are converted to x
and
y
while any remaining are ignored.
Other transformations (e.g. stat_spantree()
) yield different results in a
lower-dimensional biplot when they are computed before versus after
projection. If the stat layer detects these aesthetics, then the
transformation is performed before projection, and the results in the first
two dimensions are returned as x
and y
.
A small number of transformations (stat_rule()
) are incompatible with
ordination aesthetics but will accept ord_aes()
without warning.
Other stat layers:
stat_bagplot()
,
stat_center()
,
stat_chull()
,
stat_cone()
,
stat_depth()
,
stat_projection()
,
stat_rule()
,
stat_spantree()
d <- data.frame(x = c(1, 0), y = c(0, 1))
ggplot(d, aes(x, y)) +
  geom_point(size = 3) +
  geom_vector(stat = "scale", mult = 2)
This stat layer identifies the pairs among
points that form a minimum spanning tree, then calculates the segments
between these pairs in the two dimensions
x
and y
.
stat_spantree(
  mapping = NULL,
  data = NULL,
  geom = "segment",
  position = "identity",
  engine = "mlpack",
  method = "euclidean",
  show.legend = NA,
  inherit.aes = TRUE,
  ...
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
to improve the display. The
|
engine |
A single character string specifying the package implementation
to use; |
method |
Passed to |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
... |
Additional arguments passed to |
A minimum spanning tree (MST) on the point cloud X is a minimal
connected graph on X
with the smallest possible sum of distances (or
dissimilarities) between linked points. These layers call
stats::dist()
to
calculate a distance/dissimilarity object and an engine from mlpack,
vegan, or ade4 to calculate the MST. The result is formatted with
position aesthetics readable by ggplot2::geom_segment()
.
An MST calculated on x
and y
reflects the distances among the points in
the reduced-dimension plane of the biplot. In contrast, one
calculated on the full set of coordinates reflects distances in
higher-dimensional space. Plotting this high-dimensional MST on the
2-dimensional biplot provides a visual cue as to how faithfully two
dimensions can encapsulate the "true" distances between points (Jolliffe,
2002).
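The contrast between a planar MST and a full-coordinate MST can be sketched without any of the engines named above. The helper `prim_mst()` below is a hypothetical base-R implementation of Prim's algorithm written for this illustration; it is not the code used by mlpack, vegan, or ade4, and the choice of the iris data is arbitrary.

```r
# Prim's algorithm on a distance object; returns a two-column matrix of
# edges given as row indices of the linked points. Assumes at least 2 points.
prim_mst <- function(d) {
  d <- as.matrix(d)
  n <- nrow(d)
  in_tree <- c(TRUE, rep(FALSE, n - 1L))
  edges <- matrix(NA_integer_, nrow = n - 1L, ncol = 2L)
  for (k in seq_len(n - 1L)) {
    # Distances from tree vertices (rows) to non-tree vertices (columns).
    cross <- d[in_tree, !in_tree, drop = FALSE]
    idx <- arrayInd(which.min(cross), dim(cross))
    from <- which(in_tree)[idx[1L]]
    to <- which(!in_tree)[idx[2L]]
    edges[k, ] <- c(from, to)
    in_tree[to] <- TRUE
  }
  edges
}

# Principal component scores of the scaled iris measurements.
scores <- prcomp(iris[, 1:4], scale. = TRUE)$x

# MST on all four coordinates versus on the first two only.
mst_full <- prim_mst(dist(scores))
mst_plane <- prim_mst(dist(scores[, 1:2]))

# Segments of the full-coordinate MST, positioned in the plane of PC1 and PC2.
segments_full <- data.frame(
  x = scores[mst_full[, 1], 1], y = scores[mst_full[, 1], 2],
  xend = scores[mst_full[, 2], 1], yend = scores[mst_full[, 2], 2],
  row.names = NULL
)

# Share of full-coordinate edges that also appear in the planar MST.
edge_key <- function(e) paste(pmin(e[, 1], e[, 2]), pmax(e[, 1], e[, 2]))
mean(edge_key(mst_full) %in% edge_key(mst_plane))
```

A share close to 1 would suggest that the first two coordinates preserve most nearest-neighbor structure; long or crossing segments in `segments_full` mark places where the plane distorts the higher-dimensional distances.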
A ggproto layer.
ggbiplot()
uses ggplot2::fortify()
internally to produce a single data
frame with a .matrix
column distinguishing the subjects ("rows"
) and
variables ("cols"
). The stat layers stat_rows()
and stat_cols()
simply
filter the data frame to one of these two.
The geom layers geom_rows_*()
and geom_cols_*()
call the corresponding
stat in order to render plot elements for the corresponding factor matrix.
geom_dims_*()
selects a default matrix based on common practice, e.g.
points for rows and arrows for columns.
This statistical transformation is compatible with the convenience function
ord_aes()
.
Some transformations (e.g. stat_center()
) commute with projection to the
lower (1 or 2)-dimensional biplot space. If they detect aesthetics of the
form ..coord[0-9]+
, then ..coord1
and ..coord2
are converted to x
and
y
while any remaining are ignored.
Other transformations (e.g. stat_spantree()
) yield different results in a
lower-dimensional biplot when they are computed before versus after
projection. If the stat layer detects these aesthetics, then the
transformation is performed before projection, and the results in the first
two dimensions are returned as x
and y
.
A small number of transformations (stat_rule()
) are incompatible with
ordination aesthetics but will accept ord_aes()
without warning.
These are calculated during the statistical transformation and can be accessed with delayed evaluation.
xend,yend,x,y
endpoints of tree branches (segments)
Jolliffe IT (2002) Principal Component Analysis, Second Edition. Springer Series in Statistics, ISSN 0172-7397. doi:10.1007/b98835 https://link.springer.com/book/10.1007/b98835
Other stat layers:
stat_bagplot()
,
stat_center()
,
stat_chull()
,
stat_cone()
,
stat_depth()
,
stat_projection()
,
stat_rule()
,
stat_scale()
UScitiesD |>
  cmdscale() |>
  as.data.frame() |>
  tibble::rownames_to_column(var = "city") ->
  us_mds
ggplot(us_mds, aes(-V1, -V2, label = city)) +
  stat_spantree() +
  geom_label()
These functions attach supplementary rows or columns to an ordination object.
recover_supp_rows(x)

## Default S3 method:
recover_supp_rows(x)

recover_supp_cols(x)

## Default S3 method:
recover_supp_cols(x)
x |
An object of class 'tbl_ord'. |
The recover_supp_*()
S3 methods produce matrices of
supplemental rows or columns of a tbl_ord object from the object itself.
The motivating example is linear discriminant analysis, which produces a
natural biplot of class discriminant centroids and variable axes but is
usually supplemented with case discriminant scores. The supplementary values
are augmented with an .element
column whose value indicates their source
and can be incorporated into a tidied form. If no supplementary
rows of a factor are produced, the functions return NULL
.
Matrices having the same numbers of columns as returned by
recover_rows()
and recover_cols()
, or else NULL
.
Other generic recoverers:
augmentation
,
conference
,
recoverers
These functions wrap ordination objects in the class tbl_ord, create tbl_ords directly from matrices, and test for the class and basic structure.
as_tbl_ord(x)

## S3 method for class 'tbl_ord'
as_tbl_ord(x)

make_tbl_ord(rows = NULL, cols = NULL, ...)

is_tbl_ord(x)

is.tbl_ord(x)

valid_tbl_ord(x)

un_tbl_ord(x)
x |
An ordination object. |
rows , cols
|
Matrices to be used as factors of a tbl_ord. |
... |
Additional elements of a custom tbl_ord. |
The tbl_ord class wraps around a range of ordination classes, making
available a suite of ordination tools that specialize to each original object
class. These tools include format()
and fortify()
methods, which
facilitate the print()
method and the ggbiplot()
function.
No default method is provided for as_tbl_ord()
, despite most defined
methods being equivalent (simply appending 'tbl_ord' to the vector of object
classes). This prevents objects for which other methods are not defined from
being re-classed as tbl_ords.
The function make_tbl_ord()
creates a tbl_ord structured as a list of two
matrices, u
and v
, which must have the same number of columns and the
same column names.
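As a rough illustration of the shape `make_tbl_ord()` expects, the sketch below builds two factors from a singular value decomposition in base R. The data set, the choice to let the row factor carry the singular values, and the coordinate labels `SV1`, `SV2`, ... are all assumptions made for this example; the call to `make_tbl_ord()` follows the usage shown above and only runs if the package is installed.

```r
# Row and column factors from an SVD of a centered data matrix.
x <- scale(as.matrix(USArrests), center = TRUE, scale = FALSE)
s <- svd(x)
u <- s$u %*% diag(s$d)  # row factor, here carrying the singular values
v <- s$v                # column factor

# Shared coordinate names; the "SV" labels are arbitrary.
coord_names <- paste0("SV", seq_along(s$d))
dimnames(u) <- list(rownames(x), coord_names)
dimnames(v) <- list(colnames(x), coord_names)

# The two conditions stated above: equal column counts and equal column names.
stopifnot(ncol(u) == ncol(v), identical(colnames(u), colnames(v)))

# With the factors in this shape, they can be wrapped directly.
if (requireNamespace("ordr", quietly = TRUE)) {
  usa_ord <- ordr::make_tbl_ord(rows = u, cols = v)
  print(ordr::is_tbl_ord(usa_ord))
}
```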
is_tbl_ord()
checks an object x
for the tbl_ord class; valid_tbl_ord()
additionally checks for consistency between recover_coord(x)
and the
columns of recover_rows(x)
and recover_cols(x)
, using the recoverers.
un_tbl_ord()
removes attributes associated with the tbl_ord class in order
to restore an object that was originally passed to as_tbl_ord
.
A tbl_ord (as*()
, make*()
), an S3-class model object that can be
wrapped as one (un*()
), or a logical value (is*()
, valid*()
).
# illustrative ordination: FA of Swiss social data
swiss_fa <- factanal(swiss, factors = 3L, scores = "regression")
print(swiss_fa)

# add the 'tbl_ord' wrapper
swiss_fa_ord <- as_tbl_ord(swiss_fa)
# inspect wrapped model
is_tbl_ord(swiss_fa_ord)
print(swiss_fa_ord)
valid_tbl_ord(swiss_fa_ord)
# unwrap the model
un_tbl_ord(swiss_fa_ord)

# create a 'tbl_ord' directly from row and column factors
# (missing inertia & other attributes)
swiss_fa_ord2 <- make_tbl_ord(rows = swiss_fa$scores, cols = swiss_fa$loadings)
# inspect wrapped factors
is_tbl_ord(swiss_fa_ord2)
print(swiss_fa_ord2)
valid_tbl_ord(swiss_fa_ord2)
# unwrap factors
un_tbl_ord(swiss_fa_ord2)
Omit Cartesian coordinate visual aids.
theme_scaffold()

theme_biplot()
Geometric data analysis concerns the intrinsic geometry of data. Analyses often use artificial or arbitrary coordinate systems that carry no useful interpretation but instead serve as scaffolding, especially for graphical elements like axes that represent other variables (Gardner, 2001). In such cases, the visual aids (tick marks and labels, grid lines) used to recover the coordinates of the row and column markers would add unnecessary clutter and should be omitted. This partial theme updates the current theme by removing these elements. The biplot theme is an alias included for convenience and backward compatibility.
A ggplot theme.
Gardner S (2001) Extensions of biplot methodology to discriminant analysis with applications of non-parametric principal components. PhD thesis, Stellenbosch University. http://hdl.handle.net/10019.1/52264
These functions return tibbles that summarize
an object of class 'tbl_ord'. tidy()
output contains one row per
artificial coordinate and glance()
output contains one row for the whole
ordination.
## S3 method for class 'tbl_ord'
tidy(x, ...)

## S3 method for class 'tbl_ord'
glance(x, ...)

## S3 method for class 'tbl_ord'
fortify(model, data, ..., .matrix = "dims", elements = "all")
x , model
|
An object of class 'tbl_ord'. |
... |
Additional arguments allowed by generics; currently ignored. |
data |
Passed to generic methods; currently ignored. |
.matrix |
A character string partially matched (lowercase) to several
indicators for one or both matrices in a matrix decomposition used for
ordination. The standard values are |
elements |
Character vector; which elements of each factor for which to
render graphical elements. One of |
Three generics popularized by the ggplot2 and broom packages make use of the augmentation methods:
The generics::tidy()
method
summarizes information about model components, which here are the
artificial coordinates created by ordinations. The output can be passed to
ggplot2::ggplot()
to generate scree plots.
The returned columns are
name
: (the name of) the coordinate
other columns extracted from the model, usually a single additional column of the singular or eigen values
inertia
: the multidimensional variance
prop_var
: the proportion of inertia
quality
: the cumulative proportion of variance
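The relationship among the last three columns can be sketched in base R. This is an approximation for illustration, not the method's source: it assumes that the variance along each principal coordinate can stand in for its inertia. The package may scale inertia by a constant factor, which would change `inertia` but not `prop_var` or `quality`.

```r
# PCA of the scaled iris measurements.
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Assumption: variance along each coordinate stands in for inertia.
inertia <- pca$sdev^2

scree <- data.frame(
  name = factor(colnames(pca$x), levels = colnames(pca$x)),
  sdev = pca$sdev,                          # column extracted from the model
  inertia = inertia,
  prop_var = inertia / sum(inertia),        # proportion of inertia
  quality = cumsum(inertia) / sum(inertia)  # cumulative proportion
)
scree
```

A table of this shape is what a scree plot consumes: `name` on one axis and `prop_var` or `quality` on the other.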
The generics::glance()
method
reports information about the entire model, here always treated as one of a
broader class of ordination models.
The returned columns are
rank
: the rank of the ordination model, i.e. the number of ordinates
n.row
,n.col
: the dimensions of the decomposed matrix
inertia
: the total inertia in the ordination
prop.var.*
: the proportion of variance in the first 2 ordinates
class
: the class of the wrapped model object
The ggplot2::fortify()
method
augments and collapses row and/or column data, depending on .matrix
and
elements
, into a single tibble, in preparation for ggplot2::ggplot()
.
Its output resembles that of generics::augment()
, though rows in the
output may correspond to rows, columns, or both of the original data. If
.matrix
is passed "rows"
, "cols"
, or "dims"
(for both), then
fortify()
returns a tibble whose fields are obtained, in order, via
get_*()
, recover_aug_*()
, and annotation_*()
.
The tibble is assigned a "coordinates"
attribute whose value is obtained
via get_coord()
. This facilitates some downstream functionality that relies
on more than those coordinates used as position aesthetics in a biplot, in
particular stat_spantree()
.
A tibble.
augmentation methods that must interface with tidiers.
# illustrative ordination: PCA of iris data
iris_pca <- ordinate(iris, ~ prcomp(., center = TRUE, scale. = TRUE), seq(4L))

# use `tidy()` to summarize distribution of inertia
tidy(iris_pca)
# this facilitates scree plots
tidy(iris_pca) %>%
  ggplot(aes(x = name, y = prop_var)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent) +
  labs(x = NULL, y = "Proportion of variance")

# use `fortify()` to prepare either matrix factor for `ggplot()`
fortify(iris_pca, .matrix = "V") %>%
  ggplot(aes(x = name, y = PC1)) +
  geom_col() +
  coord_flip() +
  labs(x = "Measurement")
iris_pca %>%
  fortify(.matrix = "U") %>%
  ggplot(aes(x = PC1, fill = Species)) +
  geom_histogram() +
  labs(y = NULL)
# ... or to prepare both for `ggbiplot()`
fortify(iris_pca)

# use `glance()` to summarize the model as an ordination
glance(iris_pca)
# this enables comparisons to other models
rbind(
  glance(ordinate(subset(iris, Species == "setosa"), prcomp, seq(4L))),
  glance(ordinate(subset(iris, Species == "versicolor"), prcomp, seq(4L))),
  glance(ordinate(subset(iris, Species == "virginica"), prcomp, seq(4L)))
)
These *_ord
functions wrap core R functions with modifications
for use with 'tbl_ord' methods. Some parameters are hidden from the user
and fixed at the values these methods require, some matrix outputs are
given row or column names to be used by them, and new '*_ord' S3 class
attributes are added to enable them.
eigen_ord(x, symmetric = isSymmetric.matrix(x))

svd_ord(x, nu = min(dim(x)), nv = min(dim(x)))

cmdscale_ord(d, k = 2, add = FALSE)

cancor_ord(x, y, xcenter = TRUE, ycenter = TRUE, scores = FALSE)
x |
a numeric or complex matrix whose spectral decomposition is to be computed. Logical matrices are coerced to numeric. |
symmetric |
if |
nu |
the number of left singular vectors to be computed.
This must be between |
nv |
the number of right singular vectors to be computed.
This must be between |
d |
a distance structure such as that returned by |
k |
the maximum dimension of the space which the data are to be
represented in; must be in |
add |
logical indicating if an additive constant |
y |
numeric matrix ( |
xcenter |
logical or numeric vector of length |
ycenter |
analogous to |
scores |
Logical; whether to return canonical scores and structure correlations. |
The following table summarizes the wrapped functions:
Original function | Hide params | New params | Add names | New class |
---|---|---|---|---|
base::eigen() | Yes | No | Yes | Yes |
base::svd() | Yes | No | Yes | Yes |
stats::cmdscale() | Yes | No | No | Yes |
stats::cancor() | No | Yes | No | Yes |
By default, cancor_ord()
returns the same data as stats::cancor()
: the
canonical correlations ($cor
), the canonical coefficients ($xcoef
and
$ycoef
), and the variable means ($xcenter
, $ycenter
). If scores = TRUE
, then cancor_ord()
also returns the scores $xscores
and $yscores
calculated from the (appropriately centered) data and the coefficients and
the intraset structure correlations $xstructure
and $ystructure
between
these and the data. These modifications are inspired by the cancor()
function in candisc, though two caveats should be noted: First, the
canonical coefficients (hence the canonical scores) are scaled by
compared to these, though the intraset structure correlations are the same.
Second, the interset structure correlations are not returned, as these may
be obtained by conferring inertia unto the intraset ones.
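The scores and structure correlations described above can be reproduced approximately with `stats::cancor()` alone. The sketch below is an illustration under assumptions, not the body of `cancor_ord()`: the data set is arbitrary, and the object names `xscores`, `xstructure`, and `x_interset` are chosen here to mirror the returned elements rather than taken from the source code.

```r
# Two sets of variables measured on the same cases.
x <- as.matrix(LifeCycleSavings[, c("pop15", "pop75")])
y <- as.matrix(LifeCycleSavings[, c("sr", "dpi", "ddpi")])
cc <- stats::cancor(x, y)
k <- length(cc$cor)  # number of canonical dimensions

# Canonical scores: centered data times canonical coefficients.
xscores <- scale(x, center = cc$xcenter, scale = FALSE) %*% cc$xcoef
yscores <- scale(y, center = cc$ycenter, scale = FALSE) %*% cc$ycoef

# Intraset structure correlations: variables against their own set's scores.
xstructure <- cor(x, xscores[, seq_len(k), drop = FALSE])

# Interset structure correlations, obtained by conferring the canonical
# correlations onto the intraset ones.
x_interset <- xstructure %*% diag(cc$cor, nrow = k)

# Direct computation for comparison: variables against the other set's scores.
x_interset_direct <- cor(x, yscores[, seq_len(k), drop = FALSE])
all.equal(x_interset, x_interset_direct, check.attributes = FALSE)
```

The comparison at the end checks the claim that the interset correlations need not be stored separately: each column of the intraset matrix is rescaled by the corresponding canonical correlation.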
Objects slightly modified from the outputs of the original functions, with new '*_ord' classes.
# glass composition data from one furnace
glass_banias <- subset(
  glass,
  Context == "L.15;B.166",
  select = c("SiO2", "Na2O", "CaO", "Al2O3", "MgO", "K2O")
)
# eigendecomposition of a covariance matrix
(glass_cov <- cov(glass_banias))
eigen_ord(glass_cov)
# singular value decomposition of a data matrix
svd_ord(glass_banias)
# classical multidimensional scaling of a distance matrix
cmdscale_ord(dist(glass_banias))

# canonical correlation analysis with trace components
glass_banias_minor <- subset(
  glass,
  Context == "L.15;B.166",
  select = c("TiO2", "FeO", "MnO", "P2O5", "Cl", "SO3")
)
# impute half of detection threshold
glass_banias_minor$TiO2[[1L]] <- 0.5
cancor_ord(glass_banias, glass_banias_minor)

# calculate canonical scores and structure correlations
glass_cca <-
  cancor_ord(glass_banias[, 1:3], glass_banias_minor[, 1:3], scores = TRUE)
# scores
glass_cca$xscores
# intraset correlations
glass_cca$xstructure
# interset correlations
glass_cca$xstructure %*% diag(glass_cca$cor)