Barry Grant < http://thegrantlab.org/bggn213/ >
2019-04-30 (15:18:13 PDT on Tue, Apr 30)

Background

The goal of this hands-on session is for you to explore a complete analysis using the unsupervised learning techniques covered in the last class. You'll extend what you've learned by combining PCA as a preprocessing step to clustering using data that consist of measurements of cell nuclei of human breast masses. This expands on our RNA-Seq analysis from last day.

The data itself comes from the Wisconsin Breast Cancer Diagnostic Data Set first reported by K. P. Benne and O. L. Mangasarian: "Robust Linear Programming Discrimination of Two Linearly Inseparable Sets".

Values in this data set describe characteristics of the cell nuclei present in digitized images of a fine needle aspiration (FNA) of a breast mass.

FNA is a type of biopsy procedure where a very thin needle is inserted into an area of abnormal tissue or cells with a guide of CT scan or ultrasound monitors (Figure 1). The collected sample is then transferred to a pathologist to study it under a microscope and examine whether cells in the biopsy are normal or not.

For example radius (i.e. mean of distances from center to points on the perimeter), texture (i.e. standard deviation of gray-scale values), and smoothness (local variation in radius lengths). Summary information is also provided for each group of cells including diagnosis (i.e. benign (not cancerous) and and malignant (cancerous)).

1. Exploratory data analysis

Preparing the data

Before we can begin our analysis we first have to download and import our data correctly into our R session.

For this we can use the read.csv() function to read the CSV (comma-separated values) file containing the data (available from our class website: WisconsinCancer.csv )

Assign the result to an object called wisc.df.

# Save your input data file to a new 'data' directory
fna.data <- "data/WisconsinCancer.csv"

# Complete the following code to input the data and store as wisc.df
wisc.df <- ___
  • Q1.. What type of object is returned from the read.csv() function?
  • Q2. How many observations (i.e. patients) are in this dataset?
  • Q3. How many of the observations have a malignant diagnosis?
  • Q4. How many variables/features in the data are suffixed with _mean?

The functions class(), dim(), nrow(), table(), length() and grep() may be useful for answering the first 4 questions above.

Examine your input data to ensure column names are set correctly. The id and diagnosis columns will not be used for most of the following steps (you can use the View() or head() functions here).

wisc.df

Next use as.matrix() to convert the other features (i.e. columns) of the data (in columns 3 through 32) to a matrix. Store this in a variable called wisc.data.

  • Q5.. Why do you think we are using the indices 3:32 here?

What is in the last column in this data set? Did you check your data properly?

  • Note that this dataset went through excel and was deposited just like this.
  • Take-home: Always carefully check your data even when it comes from experts.
# Convert the features of the data: wisc.data
wisc.data <- as.matrix( ___ )

If you have not already done so, now would be a good time to assign the row names of wisc.data the values currently contained in the id column of wisc.df. While not strictly required, this will help you keep track of the different observations throughout the modeling process.

# Set the row names of wisc.data
row.names(wisc.data) <- wisc.df$id
#head(wisc.data)

Finally, setup a separate new vector called diagnosis that contains the data from the diagnosis column of the original dataset. We will use this later to check our results as we aim to discriminate benign from malignant cancerous cells.

# Create diagnosis vector for later 
diagnosis <- ___ 

2. Principal Component Analysis

Performing PCA

The next step in your analysis is to perform principal component analysis (PCA) on wisc.data.

It is important to check if the data need to be scaled before performing PCA. Recall two common reasons for scaling data include:

  • The input variables use different units of measurement.
  • The input variables have significantly different variances.

Check the mean and standard deviation of the features (i.e. columns) of the wisc.data to determine if the data should be scaled. Use the colMeans() and apply() functions like you've done before.

# Check column means and standard deviations
colMeans(wisc.data)

apply(wisc.data,2,sd)
  • Q6. True or False? The data need to be scaled before PCA

Do the mean and standard deviation values of each column look similar?

# Check column means and standard deviations
colMeans(wisc.data)

apply(wisc.data,2,sd)

The goal with scaling is to make the variables comparable. Generally (as in this case) variables are scaled to have a standard deviation of one and a mean of zero. Such scaling is particular recommended when variables are measured using different scales or when the original mean and/or standard deviation of variables is largely different. Note that this standardization is widely used in the context of gene expression data analysis before PCA and clustering analysis. - Note how the apply() function will apply the sd() function over the columns (2) of the input wisc.data matrix, See the ?apply and ask Barry for further clarification if you are unsure what this means here. - You could also use the round() function to help with the readability of your results above. - Take-home: You can apply any function over the rows or columns of a dataset without having to write explicit loops.

Execute PCA with the prcomp() function on the wisc.data, scaling if appropriate, and assign the output model to wisc.pr.

# Perform PCA on wisc.data by completing the following code
wisc.pr <- prcomp( ___ )

Inspect a summary of the results with the summary() function.

# Look at summary of results
summary(wisc.pr)
## Importance of components:
##                           PC1    PC2     PC3     PC4     PC5     PC6
## Standard deviation     3.6444 2.3857 1.67867 1.40735 1.28403 1.09880
## Proportion of Variance 0.4427 0.1897 0.09393 0.06602 0.05496 0.04025
## Cumulative Proportion  0.4427 0.6324 0.72636 0.79239 0.84734 0.88759
##                            PC7     PC8    PC9    PC10   PC11    PC12
## Standard deviation     0.82172 0.69037 0.6457 0.59219 0.5421 0.51104
## Proportion of Variance 0.02251 0.01589 0.0139 0.01169 0.0098 0.00871
## Cumulative Proportion  0.91010 0.92598 0.9399 0.95157 0.9614 0.97007
##                           PC13    PC14    PC15    PC16    PC17    PC18
## Standard deviation     0.49128 0.39624 0.30681 0.28260 0.24372 0.22939
## Proportion of Variance 0.00805 0.00523 0.00314 0.00266 0.00198 0.00175
## Cumulative Proportion  0.97812 0.98335 0.98649 0.98915 0.99113 0.99288
##                           PC19    PC20   PC21    PC22    PC23   PC24
## Standard deviation     0.22244 0.17652 0.1731 0.16565 0.15602 0.1344
## Proportion of Variance 0.00165 0.00104 0.0010 0.00091 0.00081 0.0006
## Cumulative Proportion  0.99453 0.99557 0.9966 0.99749 0.99830 0.9989
##                           PC25    PC26    PC27    PC28    PC29    PC30
## Standard deviation     0.12442 0.09043 0.08307 0.03987 0.02736 0.01153
## Proportion of Variance 0.00052 0.00027 0.00023 0.00005 0.00002 0.00000
## Cumulative Proportion  0.99942 0.99969 0.99992 0.99997 1.00000 1.00000
  • Q7. From your results, what proportion of the original variance is captured by the first principal components (PC1)?
  • Q8. How many principal components (PCs) are required to describe at least 70% of the original variance in the data?
  • Q9. How many principal components (PCs) are required to describe at least 90% of the original variance in the data?

Interpreting PCA results

Now you will use some visualizations to better understand your PCA model. A common visualization for PCA results is the so-called biplot.

However, you will often run into some common challenges with using biplots on real-world data containing a non-trivial number of observations and variables. Here we will need to look at some alternative visualizations. You are encouraged to experiment with additional visualizations before moving on to the next section

Create a biplot of the wisc.pr using the biplot() function.

  • Q10. What stands out to you about this plot? Is it easy or difficult to understand? Why?

It is quite crap really! This default plot is only helpful for relatively small datasets in my experience.

biplot(___)

Rownames are used as the plotting character for biplots like this one which can make trends rather hard to see. In fact, this plot is very poor. So lets generate our own scatter plot of each observation along principal components 1 and 2 (i.e. a plot of PC1 vs PC2 available as the first two columns of wisc.pr$x) and color the points by the diagnosis (available in the diagnosis vector you created earlier).

  • Q11. Generate a PC1 vs PC2 plot as described above colored by diagnosis. What do the points in this plot represent? What are the red points? Where did this coloring come from and why were these colors chosen by R?
# Scatter plot observations by components 1 and 2
plot( ___ , col = ___ , 
     xlab = "PC1", ylab = "PC2")

Note that each point in this plot is a patient and they are colored by their diagnosis (red for malignamnt (i.e. cancerous) and black for benign (i.e. not cancerous)). The spread and aparent seperation of cancer from non cancer we see from this analysis is striking and demands our further attention!

  • Q12. Generate a similar plot for principal components 1 and 3. What do you notice about these plots?
# Repeat for components 1 and 3
plot(wisc.pr$x[, ___ ], col = diagnosis, 
     xlab = "PC1", ylab = "PC3")

Because principal component 2 explains more variance in the original data than principal component 3, you can see that the first plot has a cleaner cut separating the two subgroups.

Overall, the plots indicate that principal component 1 is capturing a separation of malignant from benign samples. This is an important and interesting result worthy of further exploration - as we will do in the next sections!

Variance explained

In this section, you will produce scree plots showing the proportion of variance explained as the number of principal components increases. The data from PCA must be prepared for these plots, as there is not a built-in function in base R to create them directly from the PCA model.

As you look at these plots, ask yourself if there's an 'elbow' in the amount of variance explained that might lead you to pick a natural number of principal components. If an obvious elbow does not exist, as is typical in some real-world datasets, consider how else you might determine the number of principal components to retain based on the scree plot.

Calculate the variance of each principal component by squaring the sdev component of wisc.pr (i.e. wisc.pr$sdev^2). Save the result as an object called pr.var.

# Calculate variance of each component
pr.var <- ___
head(pr.var)
## [1] 13.281608  5.691355  2.817949  1.980640  1.648731  1.207357

Calculate the variance explained by each principal component by dividing by the total variance explained of all principal components. Assign this to a variable called pve and create a plot of variance explained for each principal component.

# Variance explained by each principal component: pve
pve <- ___ / ___

# Plot variance explained for each principal component
plot(pve, xlab = "Principal Component", 
     ylab = "Proportion of Variance Explained", 
     ylim = c(0, 1), type = "o")

# Alternative scree plot of the same data, note data driven y-axis
barplot(pve, ylab = "Precent of Variance Explained",
     names.arg=paste0("PC",1:length(pve)), las=2, axes = FALSE)
axis(2, at=pve, labels=round(pve,2)*100 )

Optional: There are quite a few CRAN packages that are helpful for PCA. This includes the factoextra package. Feel free to explore this package. For example:

## ggplot based graph
#install.packages("factoextra")
library(factoextra)
fviz_eig(wisc.pr, addlabels = TRUE)

Communicating PCA results

In this section we will check your understanding of the PCA results, in particular the loadings and variance explained. The loadings, represented as vectors, explain the mapping from the original features to the principal components. The principal components are naturally ordered from the most variance explained to the least variance explained.

  • Q13. For the first principal component, and using two significant figures , what is the component of the loading vector (i.e. wisc.pr$rotation[,1]) for the feature radius_mean?

  • Q14. For the first principal component, and using two significant figures, what is the component of the loading vector (i.e. wisc.pr$rotation[,1]) for the feature smoothness_se?

  • Q15. Which original variable contributes most to PC1? "

We can consider the influence of each of the original variables upon the principal components (typically known as loading scores). This information can be obtained from the prcomp() returned $rotation component. For example:

# Access the contribution of one variable on PC1 
wisc.pr$rotation["concave.points_mean",1]

# Sort the loadings values by their absolute contribution to PC1
sort( abs(wisc.pr$rotation[,1]) )

A larger value here represents a larger contribution. Barry will discuss how PCA determines this contribution value in a moment - ask him now if he has not already!

N.B. Once you get this far please let Barry know as we will discuss colectively how PCA works under hood. This is a good time to take a short break til the rest of the class catches up ;-)

3. Hierarchical clustering

Hierarchical clustering of case data

The goal of this section is to do hierarchical clustering of the observations. Recall from our last class that this type of clustering does not assume in advance the number of natural groups that exist in the data.

As part of the preparation for hierarchical clustering, the distance between all pairs of observations are computed. Furthermore, there are different ways to link clusters together, with single, complete, and average being the most common linkage methods.

Scale the wisc.data data and assign the result to data.scaled.

# Scale the wisc.data data: data.scaled
data.scaled <- ___(wisc.data)

Calculate the (Euclidean) distances between all pairs of observations in the new scaled dataset and assign the result to data.dist.

data.dist <- ___(data.scaled)

Create a hierarchical clustering model using complete linkage. Manually specify the method argument to hclust() and assign the results to wisc.hclust.

wisc.hclust <- ___(data.dist, ___)

Results of hierarchical clustering

Let's use the hierarchical clustering model you just created to determine a height (or distance between clusters) where a certain number of clusters exists.

  • Q16. Using the plot() and abline() functions, what is the height at which the clustering model has 4 clusters?
plot(___)
abline(___, col="red", lty=2)

Selecting number of clusters

In this section, you will compare the outputs from your hierarchical clustering model to the actual diagnoses. Normally when performing unsupervised learning like this, a target variable (i.e. known answer or labels) isn't available. We do have it with this dataset, however, so it can be used to check the performance of the clustering model.

When performing supervised learning - that is, when you're trying to predict some target variable of interest and that target variable is available in the original data - using clustering to create new features may or may not improve the performance of the final model.

This exercise will help you determine if, in this case, hierarchical clustering provides a promising new feature.

Use cutree() to cut the tree so that it has 4 clusters. Assign the output to the variable wisc.hclust.clusters.

wisc.hclust.clusters <- ___

We can use the table() function to compare the cluster membership to the actual diagnoses.

table(wisc.hclust.clusters, diagnosis)
##                     diagnosis
## wisc.hclust.clusters   B   M
##                    1  12 165
##                    2   2   5
##                    3 343  40
##                    4   0   2

Here we picked four clusters and see that cluster 1 largely corresponds to malignant cells (with diagnosis values of 1) whilst cluster 3 largely corresponds to benign cells (with diagnosis values of 0).

Before moving on, explore how different numbers of clusters affect the ability of the hierarchical clustering to separate the different diagnoses.

  • Q17. Can you find a better cluster vs diagnoses match by cutting into a different number of clusters between 2 and 10? How would you determine what better is in this context?

We can think in terms of TRUE positive, TRUE negative, FALSE positive and FALSE negative as we discussed back in an earlier class. Also recall that we discussed the concepts of sensitivity, specificity and ROC/AUC analysis.

  • Sensitivity refers to a test's ability to correctly detect ill patients who do have the condition. In our example here the sensitivity is the total number of samples in the cluster identified as predominantly malignant (cancerous) divided by the total number of known malignant samples.

  • Specificity relates to a test's ability to correctly reject healthy patients without a condition. In our example specificity is the proportion of benign (not cancerous) samples in the cluster identified as predominantly benign that are known to be benign.

  • Take-home. There is a judgment call here on your part: Which of your analysis procedures resulted in a clustering model with the best specificity? How about sensitivity? Which would you prefer to optimize for in this particular case?

4. OPTIONAL: K-means clustering

K-means clustering and comparing results

As you now know from the last class, there are two main types of clustering: hierarchical and k-means.

In this section, you will create a k-means clustering model on the Wisconsin breast cancer data and compare the results to the actual diagnoses and the results of your hierarchical clustering model. If you are running a little behind feel free to skip ahead to section 5 otherwise if you find you are flying through things please take some time to see how each clustering model performs in terms of separating the two diagnoses and how the clustering models compare to each other.

Create a k-means model on wisc.data, assigning the result to wisc.km. Be sure to create 2 clusters, corresponding to the actual number of diagnosis. Also, remember to scale the data (with the scale() function and repeat the algorithm 20 times (by setting setting the value of the nstart argument appropriately). Running multiple times such as this will help to find a well performing model.

wisc.km <- kmeans(___, centers= ___, nstart= ___)

Use the table() function to compare the cluster membership of the k-means model (wisc.km$cluster) to the actual diagnoses contained in the diagnosis vector.

table(___, ___)
  • Q18. How well does k-means separate the two diagnoses? How does it compare to your hclust results?

Use the table() function to compare the cluster membership of the k-means model (wisc.km$cluster) to your hierarchical clustering model from above (wisc.hclust.clusters). Recall the cluster membership of the hierarchical clustering model is contained in wisc.hclust.clusters object.

table(___, ___)
##                     
## wisc.hclust.clusters   1   2
##                    1  17 160
##                    2   0   7
##                    3 363  20
##                    4   0   2

Looking at the second table you generated, it looks like clusters 1, 2, and 4 from the hierarchical clustering model can be interpreted as the cluster 1 equivalent from the k-means algorithm, and cluster 3 can be interpreted as the cluster 2 equivalent.

5. Combining methods

Clustering on PCA results

In this final section, you will put together several steps you used earlier and, in doing so, you will experience some of the creativity and open endedness that is typical in unsupervised learning.

Recall from earlier sections that the PCA model required significantly fewer features to describe 70%, 80% and 95% of the variability of the data. In addition to normalizing data and potentially avoiding over-fitting, PCA also uncorrelates the variables, sometimes improving the performance of other modeling techniques.

Let's see if PCA improves or degrades the performance of hierarchical clustering.

Using the minimum number of principal components required to describe at least 90% of the variability in the data, create a hierarchical clustering model with the linkage method="ward.D2". We use Ward's criterion here because it is based on multidimensional variance like principal components analysis. Assign the results to wisc.pr.hclust.

grps <- cutree(wisc.pr.hclust, k=2)
table(grps)
## grps
##   1   2 
## 216 353
table(grps, diagnosis)
##     diagnosis
## grps   B   M
##    1  28 188
##    2 329  24
plot(wisc.pr$x[,1:2], col=grps)

plot(wisc.pr$x[,1:2], col=diagnosis)

Note the color swap here as the hclust cluster 1 is mostly "M" and cluster 2 is mostly "B" as we saw from the results of calling table(grps, diagnosis). To match things up we can turn our groups into a factor and reorder the levels so cluster 2 comes first

g <- as.factor(grps)
levels(g)
## [1] "1" "2"
g <- relevel(g,2)
levels(g)
## [1] "2" "1"
# Plot using our re-ordered factor 
plot(wisc.pr$x[,1:2], col=g)

OPTIONAL: Lets be fancy and look in 3D with the rgl package we learned about in a previous class. Feel free to skip this optional step if you have difficulty installing the rgl package on windows machines. On mac you will likely need to install the XQuartz package from here: https://www.xquartz.org

library(rgl)
plot3d(wisc.pr$x[,1:3], xlab="PC 1", ylab="PC 2", zlab="PC 3", cex=1.5, size=1, type="s", col=grps)

To include the interactive rgl plot in your HTML renderd lab report you can add the R code rglwidget(width = 400, height = 400) after you call the plot3d() function. It will look just like the plot above. Try rotating and zooming on this 3D plot.

## Use the distance along the first 7 PCs for clustering i.e. wisc.pr$x[, 1:7]
wisc.pr.hclust <- hclust(___, method="ward.D2")

Cut this hierarchical clustering model into 2 clusters and assign the results to wisc.pr.hclust.clusters.

wisc.pr.hclust.clusters <- cutree(wisc.pr.hclust, k=2)

Using table(), compare the results from your new hierarchical clustering model with the actual diagnoses.

  • Q19. How well does the newly created model with four clusters separate out the two diagnoses?
# Compare to actual diagnoses
table(___, diagnosis)
##                        diagnosis
## wisc.pr.hclust.clusters   B   M
##                       1  28 188
##                       2 329  24
  • Q20. How well do the k-means and hierarchical clustering models you created in previous sections (i.e. before PCA) do in terms of separating the diagnoses? Again, use the table() function to compare the output of each model (wisc.km$cluster and wisc.hclust.clusters) with the vector containing the actual diagnoses.
table(___, diagnosis)
table(___, diagnosis)
##    diagnosis
##       B   M
##   1 343  37
##   2  14 175
##                     diagnosis
## wisc.hclust.clusters   B   M
##                    1  12 165
##                    2   2   5
##                    3 343  40
##                    4   0   2

6. Sensitivity/Specificity

Sensitivity refers to a test's ability to correctly detect ill patients who do have the condition. In our example here the sensitivity is the total number of samples in the cluster identified as predominantly malignant (cancerous) divided by the total number of known malignant samples.

Specificity relates to a test's ability to correctly reject healthy patients without a condition. In our example specificity is the proportion of benign (not cancerous) samples in the cluster identified as predominantly benign that are known to be benign.

  • Q21. Which of your analysis procedures resulted in a clustering model with the best specificity? How about sensitivity?

7. Prediction

We will use the predict() function that will take our PCA model from before and new cancer cell data and project that data onto our PCA space.

#url <- "new_samples.csv"
url <- "https://tinyurl.com/new-samples-CSV"
new <- read.csv(url)
npc <- predict(wisc.pr, newdata=new)
npc
##            PC1       PC2        PC3        PC4       PC5        PC6
## [1,]  2.576616 -3.135913  1.3990492 -0.7631950  2.781648 -0.8150185
## [2,] -4.754928 -3.009033 -0.1660946 -0.6052952 -1.140698 -1.2189945
##             PC7        PC8       PC9       PC10      PC11      PC12
## [1,] -0.3959098 -0.2307350 0.1029569 -0.9272861 0.3411457  0.375921
## [2,]  0.8193031 -0.3307423 0.5281896 -0.4855301 0.7173233 -1.185917
##           PC13     PC14      PC15       PC16        PC17        PC18
## [1,] 0.1610764 1.187882 0.3216974 -0.1743616 -0.07875393 -0.11207028
## [2,] 0.5893856 0.303029 0.1299153  0.1448061 -0.40509706  0.06565549
##             PC19       PC20       PC21       PC22       PC23       PC24
## [1,] -0.08802955 -0.2495216  0.1228233 0.09358453 0.08347651  0.1223396
## [2,]  0.25591230 -0.4289500 -0.1224776 0.01732146 0.06316631 -0.2338618
##             PC25         PC26         PC27        PC28         PC29
## [1,]  0.02124121  0.078884581  0.220199544 -0.02946023 -0.015620933
## [2,] -0.20755948 -0.009833238 -0.001134152  0.09638361  0.002795349
##              PC30
## [1,]  0.005269029
## [2,] -0.019015820
plot(wisc.pr$x[,1:2], col=g)
points(npc[,1], npc[,2], col="blue", pch=16, cex=3)
text(npc[,1], npc[,2], c(1,2), col="white")

  • Q22. Which of these new patients should we prioritize for follow up based on your results?

8. OPTIONAL

PCA of protein structure data

Visit the Bio3D-web PCA app http://bio3d.ucsd.edu and explore how PCA of large protein structure sets can provide considerable insight into major features and trends with clear biological mechanistic insight. Note that the final report generated from this app contains all the R code required to run the analysis yourself - including PCA and clustering. We will delve more into this type of analysis in the next class.

About this document

Here we use the sessionInfo() function to report on our R systems setup at the time of document execution.

sessionInfo()
## R version 3.5.1 (2018-07-02)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.6
## 
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] rgl_0.100.19     factoextra_1.0.5 ggplot2_3.1.1    webex_0.9.0     
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.1              later_0.8.0            
##  [3] pillar_1.3.1            compiler_3.5.1         
##  [5] plyr_1.8.4              ggpubr_0.2             
##  [7] tools_3.5.1             digest_0.6.18          
##  [9] jsonlite_1.6            evaluate_0.13          
## [11] tibble_2.1.1            gtable_0.3.0           
## [13] pkgconfig_2.0.2         rlang_0.3.4            
## [15] shiny_1.3.2             crosstalk_1.0.0        
## [17] ggrepel_0.8.0           yaml_2.2.0             
## [19] xfun_0.6                withr_2.1.2            
## [21] stringr_1.4.0           dplyr_0.8.0.1          
## [23] knitr_1.22              htmlwidgets_1.3        
## [25] webshot_0.5.1           manipulateWidget_0.10.0
## [27] grid_3.5.1              tidyselect_0.2.5       
## [29] glue_1.3.1              R6_2.4.0               
## [31] rmarkdown_1.12          purrr_0.3.2            
## [33] magrittr_1.5            promises_1.0.1         
## [35] scales_1.0.0            htmltools_0.3.6        
## [37] assertthat_0.2.1        xtable_1.8-4           
## [39] mime_0.6                colorspace_1.4-1       
## [41] httpuv_1.5.1            labeling_0.3           
## [43] miniUI_0.1.1.1          stringi_1.4.3          
## [45] lazyeval_0.2.2          munsell_0.5.0          
## [47] crayon_1.3.4