Chapter 01 - Introduction

1. Introduction

Read Bret Victor’s essay, Up and Down the Ladder of Abstraction.

Data science is a process of abstraction. To explain or predict a real phenomenon, the process starts with framing the problem and acquiring and refining the data, and then moves between three layers of abstraction: transformation (data abstraction), visualisation (visual abstraction) and modelling (symbolic abstraction). The three layers work together to build a truer, or closer, representation of the real phenomenon.

See talk on Visualising Multi-Dimensional Data

Data visualisation (data-vis) helps us understand the portrait and shape of the data. The science of data-vis for exploratory data analysis is well developed, for both static graphics (scatter plot matrices, glyph-based approaches, geometric transforms like parallel coordinates) and interactive graphics (layering, brushing and linking, projections and tours). However, the power of visualisation is rarely used to better understand the models we build. Model evaluation is still largely restricted to numerical summaries.

Data Transformation  — — — —  Model Building
    (Tidy Data)                              
         │                                 
         │                                 
         │                                 
 Visual Exploration   
     (Data-Vis)
  

Extending visualisation to model building can be a powerful way to improve our understanding of the model. We can use visualisation to help turn the implicit knowledge in the data and in our heads into explicit knowledge in the model.

Model visualisation (model-vis) can help us understand the shape of the model and compare it to the shape of the data. It allows us to see how well the model fits and where the fit can be improved. It also helps us understand the parameters of the model: how the model changes when the parameters change, and how the parameters change when the input data changes.

Data Transformation  — — — —  Model Building
    (Tidy Data)                (Tidy Model) 
         │                          │
         │                          │
         │                          │
 Visual Exploration          Model Exploration
     (Data-Vis)                 (Model-Vis)
  

Model-Vis Approach

[0] Visualise the data space
[1] Visualise the predictions in the data space
[2] Visualise the errors in model fitting
[3] Visualise the model with different parameters
[4] Visualise the model fitted to different input datasets
[5] Visualise the model with different feature dimensions
[6] Visualise the entire model space
[7] Visualise the many models together
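
The first three steps of this list can be sketched for the simplest possible case, a straight-line fit to one feature. This is a minimal sketch under stated assumptions, not a prescribed method: the data is synthetic, the helper names `make_data`, `fit_line` and `plot_model_vis` are hypothetical, and matplotlib is assumed as the plotting library.

```python
import random

def make_data(n=50, seed=0):
    """Synthetic (x, y) pairs: a straight line plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 1.5) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def plot_model_vis(xs, ys, slope, intercept):
    """Draw steps [0] to [2]: data, predictions in data space, errors."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing
    preds = [slope * x + intercept for x in xs]
    resid = [y - p for y, p in zip(ys, preds)]
    fig, (ax_fit, ax_err) = plt.subplots(1, 2, figsize=(10, 4))
    # [0] the data space
    ax_fit.scatter(xs, ys, alpha=0.6, label="data")
    # [1] the model's predictions drawn in the same data space
    x_line = [min(xs), max(xs)]
    y_line = [slope * x + intercept for x in x_line]
    ax_fit.plot(x_line, y_line, color="red", label="model")
    ax_fit.set_xlabel("x")
    ax_fit.set_ylabel("y")
    ax_fit.legend()
    # [2] the errors (residuals) of the fit
    ax_err.scatter(xs, resid, alpha=0.6)
    ax_err.axhline(0, color="grey")
    ax_err.set_xlabel("x")
    ax_err.set_ylabel("residual")
    return fig

xs, ys = make_data()
slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
```

Calling `plot_model_vis(xs, ys, slope, intercept)` returns a figure with the fit on the left and the residuals on the right. Any visible pattern in the residual panel points to structure in the data that the model has not captured.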
