Data visualisation with ggplot2

Data visualization

  • Important aspect of data science -> communicate information clearly and efficiently to the community.

  • Powerful tool to discover patterns in the data.

  • Makes complex data more accessible -> reveals the data.

  • Bad graphics can be a reason for paper rejection!

  • A picture is worth a thousand words.
    • Always, always, always plot the data!
    • When possible, replace tables with figures, which are usually more compelling.

What is a good graph?

Data-ink ratio

The data-ink ratio is the proportion of a graphic's ink that presents actual data, relative to the total amount of ink used in the entire display.

\[ \text{data-ink ratio} = \frac{\text{data-ink}}{\text{total ink used to print the graphic}} \]

The data-ink ratio should be kept as high as possible.
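
As a rough illustration of raising the data-ink ratio in ggplot2, the sketch below draws the same scatter plot twice: once with the default theme and once with most of the non-data ink (grey background, minor grid lines) removed. The built-in `mtcars` dataset and the object names `p_default` and `p_lean` are assumptions chosen for the example, not part of these notes, and this is only one of several ways to strip non-data ink.

```r
library(ggplot2)

# Assumption: the built-in mtcars dataset stands in for real data.
# Default theme: grey panel background plus major and minor grid lines.
p_default <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Leaner version: same points, but the background fill and the minor
# grid lines are dropped, so more of the remaining ink shows data.
p_lean <- p_default +
  theme_minimal() +
  theme(panel.grid.minor = element_blank())
```

Printing `p_default` and `p_lean` side by side shows the same data with less decoration in the second plot.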



How to lie with graphs

It is easy to exaggerate effects or distort reality with graphs.
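
One common way this happens is a truncated y-axis. The sketch below is a minimal example under stated assumptions: the two-row data frame `df` is made up for illustration, and `y_min`, `p_honest` and `p_misleading` are hypothetical names. The two values differ by only 2%, yet when the axis starts at 99 instead of 0 the visible part of bar B is three times as tall as that of bar A.

```r
library(ggplot2)

# Hypothetical data: two groups whose values differ by only 2%.
df <- data.frame(group = c("A", "B"), value = c(100, 102))

# Honest version: bars start at zero, so bar height is proportional to value.
p_honest <- ggplot(df, aes(x = group, y = value)) +
  geom_col()

# Misleading version: zoom the y-axis to 99-103 without changing the data.
y_min <- 99
p_misleading <- p_honest +
  coord_cartesian(ylim = c(y_min, 103), expand = FALSE)

# Ratio of the values themselves versus ratio of the visible bar heights.
true_ratio    <- df$value[2] / df$value[1]                       # 102 / 100
visible_ratio <- (df$value[2] - y_min) / (df$value[1] - y_min)   # 3 / 1
```

Comparing `true_ratio` with `visible_ratio` gives a quick check on how much an axis choice inflates the apparent effect.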