Data acquisition and visualisation
Autumn 2015
This page is also available in Icelandic.
This three-week course is a hands-on introduction to data acquisition and visualisation aimed at undergraduate students in Reykjavík University’s computer science department. It covers the complete process of finding data sources, collecting and cleaning data, using best practices to visualise data, and selecting the tools suitable for each step.
You will put these concepts into practice in a project where you make a static or interactive data visualisation. In the first two weeks, alongside lectures on data visualisation and acquisition, you will develop your project’s concept and framework incrementally. The final week is devoted to refining your project with support from the lecturers.
Course syllabus
Classes are held from 09:00 until 12:00 in room M102 between the 26th of November and the 16th of December. The first lecture is on Thursday the 26th, when the full schedule will be announced.
Visualisation
The slides for this lecture are available as PDFs: part one and part two.
- Brief overview of the history of charts
- Best of charts
- Worst of charts
- Techniques and tips on designing tables and charts
- Useful chart types (and the rest)
- Telling a story with data
- Pre-attentive processing
- Gestalt theory
- Chart junk
- Transforming a poor chart
- Compare, compare, compare
- Font usage
- Colour usage
Data acquisition — the world as an API
The slides for this lecture are available as a PDF.
- Getting data (scraping, sniffing, asking)
- Cleaning data (methods, tools, formats)
- Understanding data (tools, maths)
- Best practices
- Data acquisition examples
Digital mapping
The slides for this lecture are available as a PDF.
- Projections
- Scales
- Colour schemes
- Data classification
- Data normalisation
- Legends
- Base maps
- Cartograms
- Map localisation
- Data sources
- Tools
Course assessment
The course will be assessed via a question paper, a group presentation, and a written report. Your grade is composed as follows:
Question paper (10%)
Presentation (80% in total)
- Concept (8%)
- Implementation and methods (16%)
- Quality of visualisations (32%)
- Quality and content of the presentation (24%)
Report (10%)
- Abstract
- Answer: Who? What? Why?
- Critique your successes and failures
- Conclusion
- Bibliography and citations
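As a rough illustration of how the weights above combine, the following Python sketch computes a weighted final grade. The weights are copied from the breakdown; the 0 to 10 scoring scale, the component names, and the example scores are assumptions made for illustration only and are not part of the official grading procedure.

```python
# Weights in percent, copied from the grade breakdown above.
# The dictionary keys are hypothetical names chosen for this sketch.
WEIGHTS = {
    "question_paper": 10,
    "concept": 8,
    "implementation_and_methods": 16,
    "quality_of_visualisations": 32,
    "presentation_quality_and_content": 24,
    "report": 10,
}


def final_grade(scores):
    """Return the weighted average of the component scores.

    Assumes every component is scored on the same scale (0 to 10 here).
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / total_weight


# Hypothetical scores, invented purely to demonstrate the calculation.
example_scores = {
    "question_paper": 8,
    "concept": 7,
    "implementation_and_methods": 9,
    "quality_of_visualisations": 8,
    "presentation_quality_and_content": 6,
    "report": 7,
}

print(final_grade(example_scores))
```

The four presentation components together carry 80% of the grade, so the quality of the visualisations alone (32%) outweighs the question paper and the report combined.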
You are required to attend, participate actively, and complete the question paper, presentation, and report. Both the presentation and the report must be in English. The expected workload for students is 150 hours.
Deadlines
You must hand in the completed question paper no later than 14 December. Presentations will be held on 15 and 16 December (you will be able to choose the day and time). Final presentation materials must be made available to the lecturers before your presentation and remain available for at least thirty days. Your written report must be handed in no later than 16 December.
The deadlines are not flexible: anything submitted after its deadline will not be graded.
Teaching methods
The course consists of:
- Lectures
- Practical work in groups
- Preparing and presenting a group presentation
- Writing a report
- Recommended reading
- Recommended podcasts
The lectures will cover the course syllabus, while the practical work will allow student groups to find, prepare, and visualise data to use in their group presentation. During the practical work the lecturers will be on hand to answer questions and give guidance. The written report (approximately 1,000 words) will summarise the data analysis and visualisation. The recommended reading and podcasts support the concepts covered in the lectures.
Learning outcomes
Knowledge
- Milestones in the history of data visualisation
- Frequent problems in data visualisation
- Best practices and common pitfalls in data visualisation
- Limits of human visual perception
- Process of acquiring data and preparing it for use
- Principles of high-quality digital maps
- Tools for creating charts and maps, and for collecting data
Skills
- Select suitable chart types, colours, fonts, and layouts
- Select suitable interactive data visualisation methods
- Select suitable map projections, scales, and colour schemes
- Select suitable tools for data collection
- Acquire data from various online sources
Competencies
- Take raw data and explain it to your audience using charts and maps
- Apply manual and automatic methods to collect data
- Apply manual and automatic methods to clean malformed data
Books and podcasts
Edward R. Tufte. 1983. The Visual Display of Quantitative Information. Graphics Press. A copy of this book is available in the university library.
Roger D. Peng and Elizabeth Matsui. 2015. The Art of Data Science. Leanpub.
Jonathan Gray et al. (eds.). 2012. Data Journalism Handbook. O’Reilly.
Hadley Wickham. 2014. Tidy Data. Journal of Statistical Software.
Stephen Few. 2014. Graph Selection Matrix. Perceptual Edge.
Stephen Few. 2010. Our Irresistible Fascination with All Things Circular. Perceptual Edge.
BBC More or Less podcast.
Data Stories podcast.