Statistical Computing and Graphics Concepts to Master
Difference between a functional language with live access to data and a procedural or macro language
Meaning of a variable, and the most commonly used S object for storing the values of a variable
What constitutes legal names for S objects
The most common data types in S
How to compose S commands for doing arithmetic on constants and variables
Names of the most commonly used algebraic and statistical functions
Difference between character constants and names of objects
Purpose and names of commonly used attributes of S objects
When to use braces { }, brackets [ ], and parentheses ( )
How to create, use, and subset vectors
Making systematic sequences of integers and floating point numerics
How to create logical expressions for checking equality and inequality, unions, and intersections of conditions
Notation for missing values and how to check for them
How to invoke functions having more than one argument
Purpose and components of data frames
How to subset rows and/or columns of data frames
How to extract or access a single variable from a data frame
Purpose of a factor variable and how to create one
How to make functions in add-on S libraries available for execution
Purpose of the data.dump and data.restore functions.
Methods for carrying out an overall inspection of the quality and completeness of data in an imported dataset.
Understand the search list.
Understand the special purpose of search position one.
Know how to make the variables contained in a data frame available for computation without using the name of the data frame as a prefix to a variable name.
Know the general setup for using the Hmisc upData function to change, add, or delete variables in a data frame.
Understand some of the ways for repeating analyses over various data subgroups (stratified analyses).
Know how to create basic derived variables and how to recode categories of categorical variables.
Know an easy way to categorize a continuous variable into intervals.
Know the order in which the following should be done: data import, annotation and other changes to variables, analyses, and attaching data frames.
Know how to use S functions to compute probabilities from a few of the most commonly used statistical distributions. For discrete distributions such as the binomial, know how to compute the probability of a specific value and how to compute cumulative probabilities.
Know how to use S statistical functions for the normal, t, F, and chi-square distributions to compute P values for statistical tests.
How to get a Spearman test of association, Wilcoxon two-sample test, and Kruskal-Wallis test, as special cases of a more general rank test, using a single S function.
Know the syntax for an S statistical formula.
Know in general terms the capabilities of the summary.formula function.
Understand that in many cases stratifying on a continuous variable by categorizing into intervals does not require creating a new variable.
Know the major elements of graphical perception.
Know how best to make a graph so that people can perceive differences.
Understand Weber's law and its ramifications for graphical design.
Know causes of common optical illusions in statistical graphics.
Know the ordering of perception tasks by how well humans interpret information from them.
Understand problems caused by pop charts.
Know how to determine a good aspect ratio for graphing curves.
Know how dot charts overcome problems with other types of charts.
Know how to best summarize distributional characteristics of data for graphics.
Know the best types of graphs for representing various types of data, and understand why these types are preferred.
Understand various methods for conditioning on other variables.
Know how to interpret and when to use the following types of plots:
rug plots (one-dimensional scatterplots)
histograms
density plots
empirical cumulative distribution plots
box plots
scatter plots
thermometer plots
bubble plots
scatterplot matrices
Last Modified: 31 Oct 2001
-- FrankHarrell - 29 Feb 2004