Machine Learning and Self Organizing Maps (SOMs) on Cancer Data


Now that I've got a few new tools in the toolbelt (like the "kohonen" library), I went back to some older data and took another run at it. Below is the cancer data: "This breast cancer database was obtained from the University of Wisconsin Hospitals, Madison from Dr. William H. Wolberg. He assessed biopsies of breast tumours for 699 patients up to 15 July 1992; each of nine attributes has been scored on a scale of 1 to 10, and the outcome is also known."

It's interesting to contrast the Kohonen clusters with the GINI fit, comparing V1 to V2.

The original posting is here: https://dreamtolearn.com/ryan/data_analytics_viz/48


## palette running from blue (low values) to red (high values)
coolBlueHotRed <- function(n, alpha = 1) {
  rainbow(n, end = 4/6, alpha = alpha)[n:1]
}


library(MASS)  # assumption: `biopsy` is the data frame shipped with MASS

cdata <- biopsy

## assumption: MASS::biopsy layout (ID, V1..V9, class); drop the ID column
cdata$ID <- NULL

## now swap out benign and malignant for 0 and 1 (1 is malignant)
cdata$class <- as.numeric(cdata$class == "malignant")
names(cdata)[1] <- "clmp_thicknss"
names(cdata)[2] <- "cell_uniformity"
names(cdata)[3] <- "cell_shp_unifrm"
names(cdata)[4] <- "marginal_adhesion"
names(cdata)[5] <- "sngl_elptcl_cell_sz"
names(cdata)[6] <- "bare_nuclei"
names(cdata)[7] <- "bland_chromatin"
names(cdata)[8] <- "normal_nucleoli"
names(cdata)[9] <- "mitoses"
names(cdata)[10] <- "Malignant"

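library(kohonen)  # provides som() and somgrid()

## centre and scale every column, including the Malignant flag
data.sc <- scale(cdata)

## 8 x 4 hexagonal map; the default plot shows the codebook vectors per node
data.som <- som(data.sc, grid = somgrid(8, 4, "hexagonal"))
plot(data.som, palette.name = coolBlueHotRed, main = "Cancer Data - 699 Samples")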
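
One way to read the map against the known outcome is to count how many samples land in each node and what share of them are malignant. The sketch below is not the code from the listing above: it assumes the `biopsy` data frame from MASS (columns ID, V1..V9, class), drops the rows with missing values, and trains only on the nine scored attributes so the outcome stays out of the map. The names `bx`, `features`, `fit` and `malignant_share` are made up for this example.

```r
library(MASS)     # biopsy (assumed data source)
library(kohonen)  # som(), somgrid()

set.seed(42)  # SOM initialisation is random; fix the seed for repeatability

# Assumption: MASS::biopsy layout (ID, V1..V9, class); drop rows with missing values
bx <- na.omit(biopsy)
features <- scale(as.matrix(bx[, paste0("V", 1:9)]))
malignant <- as.numeric(bx$class == "malignant")

# Train on the nine scored attributes only, keeping the label out of the map
fit <- som(features, grid = somgrid(8, 4, "hexagonal"))

# unit.classif holds the winning node index for every sample
node <- factor(fit$unit.classif, levels = 1:(8 * 4))
samples_per_node <- as.vector(table(node))
malignant_share <- as.vector(tapply(malignant, node, mean))  # NA for empty nodes

print(data.frame(node = 1:32,
                 samples = samples_per_node,
                 malignant_share = round(malignant_share, 2)))

# Colour each node by its share of malignant samples
plot(fit, type = "property", property = malignant_share,
     main = "Share of malignant samples per node")
```

If the map separates the two outcomes well, most nodes should show a share close to 0 or close to 1, with only a few mixed nodes in between.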


About the Author

Ryan Anderson

Hi! I like to play with data, analytics and hack around with robots and gadgets in my garage. Lately I've been learning about machine learning.

About this blog

Data Analytics & Visualization Blog - Generating insights from Data since 2013

Created: July 25, 2014

