Advanced Data Analysis (ADA) course (code 27084)

Lecturer: Dr. Paolo Coletti. Office E203 - Office hours:


ADA course syllabus A.Y. 2017/18
ADA course content A.Y. 2017/18


To follow this course properly, each student needs previous knowledge of these topics:

Course content


How to study for this course

This course is different from most of the courses you are used to: it is much more technical than theoretical and it is strictly sequential, so you have to adapt your study strategy. First of all, either you attend all the lessons (or make up for a missed lesson by immediately watching the corresponding videos or reading the book), or it is really not worth coming to the next one: you will disturb your neighbours by asking for help, not realizing that your lack of understanding is due to your missing knowledge and not to my speed. Moreover, after each lesson you must slowly repeat on your own everything done in class, to be sure you have fully grasped the concepts before the next lesson.

For the exam, the main difference from other courses is that you have to train much more than study. The content of this course is easy and does not need extensive study; however, it is only with practice that you become skilled enough to know immediately what to do without wasting time.


The exam is divided into three parts.

The first part (25% of the final grade) consists of written exercises on relational database architecture and lasts 40 minutes, although this may vary according to the complexity of the exercises. It is held in a standard classroom with all students together. This part is completely on paper and "closed book": no paper or electronic help or tool is allowed.

The second part (25% of the final grade, file handling errors count negatively towards the final grade) consists of exercises on Access and lasts approximately 20 minutes, depending on the length and complexity of the exercises. It is held in a computer room in turns of 25 students each. This part is totally open book: you may use any written or electronic document (including books, previous exams with solutions, your personal handwritten or electronic notes and my slides). You may not, however, use any communication program or device.

The third part (50% of the final grade, file handling errors count negatively towards the final grade) consists of statistical data analysis and graph creation with R and lasts approximately 40 minutes, depending on the length and complexity of the exercises. It follows the same rules as the second part. It is held in a computer room in turns of 25 students each. If there are more than 25 students, turns will appear on this website as soon as enrolment is closed; if you have specific timetable needs, please write an email to Dr. Coletti as soon as possible, before the timetable appears. This part is totally open book: you may use any written or electronic document (including books, previous exams with solutions, your personal handwritten or electronic notes and my slides). You may not, however, use any communication program or device.

Important warning: the crucial point of the practical parts, in particular Access, is time. If you never practice with the programs, you will still be able to do the exam, but you will waste a lot of time looking for commands or wondering what you should do, and you will not finish in time. Only with exercises will you be fast enough to complete it in the indicated time. It is a good idea to practice against the clock on the previous exams that you find below.

The exam grade is a weighted average of the three parts; file handling errors do not enter the average but count negatively. Active participation in class, in particular contributions that improve the course (spotting errors or giving very good suggestions), can slightly increase the final grade, by up to 3/30. It is not necessary for every part to be sufficient to pass the exam: only the weighted average must be sufficient. As the exam is indivisible from an administrative point of view, it is not possible to split it or to keep one part and redo another: all parts must be taken in the same session. Only the homework grade lasts until the last session of the academic year.
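
As a rough illustration of the weighting described above, the following Python sketch computes the weighted average of three part grades. The function name, the assumption that every part is graded on the same scale, and the sample grades out of 30 are all made up for this example. File handling penalties and the participation bonus are left out, because the text does not say exactly how they are applied.

```python
# Weights of the three exam parts, taken from the description above.
WEIGHTS = {"database": 0.25, "access": 0.25, "r": 0.50}


def weighted_grade(database, access, r):
    """Weighted average of the three part grades (all on the same scale)."""
    return (WEIGHTS["database"] * database
            + WEIGHTS["access"] * access
            + WEIGHTS["r"] * r)


# Hypothetical grades out of 30: a weaker database part is offset by a stronger R part.
print(weighted_grade(database=16, access=20, r=26))  # prints 22.0
```

Because the R part weighs as much as the other two together, one point gained there moves the average twice as much as one point gained in either of the other parts.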

Homework 1

Students who are present on the presentation days (see below) may do a homework which replaces the database and Access parts of the exam, following these rules.
0. The homework can be done in groups of 1, 2, 3 or 4 students; you decide the group's composition. As the purpose of group work is to exchange knowledge through reciprocal teaching, students who already have experience with databases (for example students who have already passed my exams 27006 25213 25253 40027 or students with a degree in computer science or engineering) may not be in the same group.
1. Each group must submit to Dr. Paolo Coletti via email the group composition and a database proposal by 26 October 2017. It is strongly suggested to submit it well before; do not wait until the last moment. The database proposal must contain the database description and a draft version of the schema, containing the main tables, relations and fields involved. The proposal is part of the homework's evaluation. I expect your database to be different from the ones of previous years, the ones used in class and the many I have given in exams over the years. However, if you still want to choose a database similar to these, I want your structure to be different. Looking at my solution and then deliberately twisting my structure may well be suicidal: it is much better not to look at my solution at all.
2. Dr. Paolo Coletti answers with the crucial corrections and suggestions.
3. The group builds a database in Access. The database must contain at least

Students in the group
Junction tables
Fields (not counting foreign keys)
Validation rules
Table validation rules
Different summary queries
Different non-selection queries
Different queries with left/right join
Different 2 queries involving at least
2 tables
3 tables
4 tables
5 tables
Forms with locking using at least 2 tables
Reports based on a query with grouping

There must be at least one query for each of the following: containing a condition, containing a formula, containing a function, asking the user for a value, based on another query. Tables must use the required and index features. The database must be filled with enough data for queries, forms and reports to return something meaningful.
4. The group writes the database documentation.
5. The database and documentation must be submitted to Dr. Paolo Coletti by 13 November 2017, together with your preference for the presentation day and time. Preferences are satisfied in order of arrival: the earlier you submit, the more choice you have. Database and documentation are part of the homework's evaluation.
6. The entire group must present the database on its presentation day, 24-25 November 2017, no exceptions (except with a medical certificate). The presentation is rather brief and straightforward. Dr. Paolo Coletti chooses who presents which part and who answers each question. Presentation and answers are part of the homework's evaluation.
7. Attendance at other groups' presentations is mandatory for all students who do the homework, at least on your presentation day, no exceptions. During the presentations other students are kindly invited to offer observations and questions. Do not worry about lowering your colleagues' evaluation by pointing out mistakes: any error that comes out which I had not spotted myself will not be considered in the evaluation. You receive a small increase in the exam's final grade for each competent and correct observation.
8. Homework grades are, unless a tragic presentation is made, the same for all the students of the group. The grade is based on: proposal, database, documentation, presentation, answers.
9. If your homework grade is at least 60%, you automatically use this grade as the database+Access grade and you do not receive the papers for those parts during the exam. This grade lasts until the last session of the current academic year. If you prefer NOT to use your homework grade in an exam session, you MUST write an email to Dr. Paolo Coletti AT LEAST 7 days before, no exceptions.

What not to do:
- submit a draft in a hurry just to do something: you are reducing your homework grade
- submit partially (only database for example): I do not make puzzles of different submissions, everything missing when I correct will be considered missing
- resubmit with modifications: they will not be accepted if I have already corrected
- plan to work on the last days: if something goes wrong, you might be unable to submit everything
- ignore my lessons because you already know databases and Access: first of all, you are putting the other members of your group in trouble, because if I see things I have not explained I will ask the others to do them in class. Then you are wasting an opportunity to learn something new, or at least to learn different points of view. Finally, you may use structures and techniques which are commonly used but against which I gave good reasons during my lessons, and thus reduce your group's grade.

Homework 2

The second homework is a continuous activity which requires regular attendance and constant commitment, following these rules:
0. It can only be done individually.
1. Each student must attend each lesson from November to January. You may skip only 1 lesson, no exceptions, even with a medical certificate.
2. After the second lesson on R (currently planned for 8th November), each student must find a suitable dataset, different from the one used in class, and submit it to me before the next R lesson (currently planned for 15th November). It must be an RData file with:
- no more than 20 variables
- at least 3 binary nominal variables, already set to factor with appropriate labels
- at least 3 non-binary nominal variables, already set to factor with appropriate labels
- at least 3 scale variables
- at least 3 ordinal variables, already set to ordered with appropriate labels
- at least 50 cases
- all missing cases handled appropriately
- a codebook in a separate file (Word, PDF or text).
The quality of this dataset and codebook is part of the evaluation.

3. After each lesson, you submit the same content as the lesson but applied to your dataset, with the same kind of comments and considerations I made in class. The deadline is 24 hours before the next lesson. If the next lesson is on the very next day, the deadline is the start of that lesson.
4. At the beginning of each lesson, a random person briefly presents his or her dataset and the rest of the class repeats the content of the previous lesson using that dataset. Then another random person is selected to present this task publicly. A person who has no computer in class can still work with others or simply discuss how to do everything and, if selected, use the classroom's computer.
5. After this is done, if you want to leave and not attend my part of the lesson, you may do so without penalty.
6. Your evaluation depends on the quality of your dataset and codebook, on the homework you regularly submit, on the brief presentation of your dataset and on the presentation of the previous lesson on another person's dataset.
7. If your evaluation is at least 60%, you automatically use this grade as the R grade and you do not receive the paper for that part during the exam. This grade lasts until the last session of the current academic year. If you prefer NOT to use your homework grade in an exam session, you MUST write an email to Dr. Paolo Coletti AT LEAST 7 days before, no exceptions.
8. Since current regulations require a mandatory final written part, students who use both homeworks must come to the exam session anyway, where they receive the written question "How does a statistical test work? Explain it using also a brief numerical example different from the one done in class.", which is worth 5% of the grade. It is closed book and repeats my lesson on the theory of statistical tests. You do not need to write out all the values of the variable or variables involved (just a few cases, to give me an idea of what they are about), so you can invent whatever coherent statistics you need. You have 60 minutes, but I expect you to finish in a much shorter time.

There is no 60% minimum per exercise at the exams; this rule applies only to homeworks.
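
The dataset requirements in rule 2 of Homework 2 amount to a checklist of counts, which can be verified before submitting. The sketch below is written in Python purely to illustrate the counting; the actual file must still be prepared as an RData file in R. The function name, the four kind labels and the sample variables are hypothetical, and the check covers only the numeric thresholds, not factor labels, missing values or the codebook.

```python
from collections import Counter

# Minimum number of variables of each kind, as listed in rule 2.
MIN_PER_KIND = {"binary": 3, "nominal": 3, "scale": 3, "ordinal": 3}
MAX_VARIABLES = 20
MIN_CASES = 50


def unmet_requirements(kinds, n_cases):
    """Return a list of problems; an empty list means all counts are fine.

    kinds: dict mapping each variable name to "binary", "nominal",
           "scale" or "ordinal" (labels chosen for this sketch).
    n_cases: number of cases (rows) in the dataset.
    """
    problems = []
    if len(kinds) > MAX_VARIABLES:
        problems.append(f"too many variables: {len(kinds)} > {MAX_VARIABLES}")
    counts = Counter(kinds.values())
    for kind, minimum in MIN_PER_KIND.items():
        if counts[kind] < minimum:
            problems.append(f"only {counts[kind]} {kind} variables, need {minimum}")
    if n_cases < MIN_CASES:
        problems.append(f"only {n_cases} cases, need {MIN_CASES}")
    return problems


# Hypothetical dataset description: one ordinal variable and two cases short.
example = {
    "smoker": "binary", "employed": "binary", "married": "binary",
    "region": "nominal", "job": "nominal", "transport": "nominal",
    "age": "scale", "income": "scale", "height": "scale",
    "education": "ordinal", "satisfaction": "ordinal",
}
print(unmet_requirements(example, n_cases=48))
```

For the sample description the function reports two problems: only 2 ordinal variables and only 48 cases.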

Differences from A.Y. 2016/17

Please, if you followed the course in A.Y. 2016/17 take note of these important differences:

Study resources

Lessons' slides
Videos as support to attendance
Books as support to attendance
Further readings
Precourse   Go down here


Relational database architecture
Relational databases Go down here Databases course book (this book includes the theoretical part, as well as the practical part covered by the videos)
• Allen G. Taylor, Database Development For Dummies, For Dummies, 2000, ISBN 978 0764507526


Advanced Access

Go down here

• Sams Teach Yourself Microsoft Office Access 2003 in 24 Hours, Alison Balter, ISBN 0-6723-2545-4

Statistical analysis with R Statistical Analysis with R Go down here

Data analysis course book (chapters 1-6, 9-14)

To begin:
• Ploner Alexander, Introduction to R & R Commander, 2011, available on
For data analysis:
• Natasha A. Karp, R commander an Introduction, 2010, available on
• Chang G. Andy and Kerns G. Jay, Introduction To Probability And Statistics Using R/Rcommander With Ipsur Plug-In, 2010, available on
For statistical tests: Advanced Statistics coursebook


Files and programs used in class Last updated
MyFarm database used in Databases course book  
Library, studentsandexams, studentsandexams2 databases  

Northwind 2003 database for Access 2007 and 2010 and 2013

Files and datasets distributed in class 16 November 2015
R-portable, portable and already configured 7 August 2015
R Commander menus in Italian and English (for students with R Commander in Italian) 18 April 2013

Videos of lessons

Warning: videos are large, your best choice is watching them online from YouTube.

If you really want to download them, be sure to have enough space and enough time available. Do not save them on your UNIBZ disk space or you will make your account go over quota! Right click and choose Save Target As... and save them either on a USB pendrive or on your personal computer. If you have a Mac and the video does not open with QuickTime, try installing and using VLC.

precourse 01.avi
116 MB
YouTube Precourse on unibz network and file handling.
precourse 02.avi
24 MB
YouTube Update for 2016
107 MB

Single table database in normal form, primary key, information redundancy, empty fields, one-to-many and many-to-one relations, foreign key.

83 MB

One-to-one relations, many-to-many relations, junction table, temporal databases.

122 MB

Junction tables for more relations, details table, foreign keys with more relations, orphans and referential integrity, hierarchical structure, process structure. Suggestions for database design.

63 MB

Northwind database overview. Access overview, Saving operations. Tables, field types, primary key. Queries, query wizard, design view, sorting, criteria.

65 MB

Using other fields for criteria, asking for values, virtual fields, expression builder, functions DateDiff, DateAdd, Year, Between.


Summary queries, examples, “where” and “group by”
Reports: structure and examples. Exporting and printing reports.
Tables: Fields and fields’ features. Validation rules, table validation rules. Like operator.
Importing tables from Excel. Building new tables. Relations: relationships diagram, building relations, referential integrity, Lookup Wizard, mandatory and non-mandatory value lists. Forms: structure and examples, subform, locking form and subform.


Non-selection queries. Left/right joins. Database documentation.

(78 MB)
YouTube Questionnaires. Variables: scale, nominal, ordinal, Likert scale. Missing values: NA, NaN. R overview: portable version, installing packages, loading packages, R commander, saving script, output, workspace, loading workspace.
(174 MB)
YouTube Descriptive statistics for one nominal variable, for one scale variable. Graphs for one nominal variable: column plot, pie chart, radar graph, bar plot, line plot, area plot, 3D. R: color palette, bar plot, pie chart. Graphs for one scale variable: histogram, box plot, plot case by case. R: histogram, box plot, index plot.
(91 MB)
YouTube Descriptive statistics for two variables: contingency table, row and column percentage, statistics by groups, Pearson correlation, Spearman correlation. Graphs for two variables: clustered column plot, stacked column plot, 3D column plot, boxplots by groups, histograms by groups, scatterplot, mathematical graph. Three variables: surface plot, bubble chart, scatterplot by groups.
(113 MB)
YouTube Restrict data set, remove cases with missing values, binning, recode variables, massive recoding, compute new variables. Basic vector operations.
(47 MB)
YouTube Statistical tests: sample and population, hypotheses, significance. Student t test for one variable. Chi-square test for one dimensional contingency table.
(109 MB)
YouTube Chi-square test for a two-dimensional contingency table, Student's t test for two populations, prerequisite, Mann-Whitney test, ANOVA, Kruskal-Wallis test, correlations' tests, when testing difference of two scale variables, Student's t test for two paired variables, Wilcoxon signed-rank test. Normality: histogram with normal curve, QQ-plot, skewness and excess Kurtosis, Shapiro Wilk test.
(88 MB)
YouTube Additional video for A.Y. 2016/17. Checking normality prerequisite, one-sided tests, sign test, subsetting while removing categories.

This short video illustrates how to reach the unibz network folder \\ubz01fst (which contains course_coletti and your own personal files) using VPN when you are connecting from outside the university or via wifi.
This procedure is not part of the exam material.


Before the exam:

  1. if you use your own computer, make sure that it works perfectly and everything is installed;
  2. check that you are able to locate the \\\Courses directory. You will not be helped during the exam on this topic;
  3. whether you use a unibz computer or your own, make sure that you are able to use the computers in classroom A518 and that your unibz account is properly configured and working correctly, in particular that you are able to save files correctly on your Desktop. Coming to one of my office hours before the exam can be helpful if you are not sure of this;
  4. do not come to the practical test without ever having logged in to your account to check everything;
  5. if you have to do R exercises and are planning to use unibz computers, have R portable copied and uncompressed on your own USB pendrive.

Frequently Asked Questions

Q: Where can I get Windows, Word, Excel, Access, 7-Zip?
A: You can download 7-Zip from ; for Mac, the compression program I suggest is Keka. Windows 10 and Office 2016 for Mac can be downloaded from , entering with your UNIBZ login, and legally installed on your own computer. For Office 2016 for Windows, follow the installation instructions for Office 365; you will then find a link to install Office 2016 on your computer.

Q: How can I have Windows, Office in English?
A: It is chaos. Every version and every edition of these programs has different ways and rules to change the language, and for some it is not possible. I suggest searching Google for "how to change menu language for" followed by the program and the version you need (Windows 10, Office 2016, Office 2016 for Mac).

Q: I have a Mac. May I study using it?
A: If you have a Mac, you have several alternatives. The suggested solution is 3.
1) You can partition your hard disk in two parts and install Windows (the copy offered by the university) in the other part. There are many tutorials on the web on how to do it, for example see . The ISO images for Windows and Office with unibz license are here: . In this way you will have a perfect copy of Windows and Office, including Access, as installed at unibz.
2) You can install Windows on a virtual machine using the free program VirtualBox. In case you are interested, the instructions are here: , and the ISO images for Windows and Office with unibz license are here: .
3) You can come to the lessons with your Mac, install R on it, and use Access (which does not exist for Mac) on unibz computers.

Q: Where can I get R and R Commander? How do I install it?
A: For Windows users, you can download the portable, preconfigured R-portable directly from this link; you do not need to install or configure it. In any case, R is installed on all computers in room A518, but on a couple of them R Commander does not work correctly. If you really want to install it on your computer, you can download R from . To install R Commander, in the Packages menu choose Install package(s) (tick "Install dependencies" if you see it on a Mac) and install Rcmdr with all its dependencies, then in the Packages menu load the Rcmdr package and, if asked, install the remaining dependencies; once it is installed on your personal computer or on any unibz computer, you do not need to install it again.
For Mac users, unfortunately there is no R-portable (if you find one, please tell me!). You have to install R. Then you have to install R Commander, which will probably ask you to install XQuartz. In case you do not see the menus of R Commander while it is running, simply kill XQuartz and restart R Commander.

Q: How can I change R language on my own computer?
A: If you use the portable version you do not have this problem. Otherwise, you will!
For Windows users: unfortunately, R installs itself in your location's language, regardless of your current language and of what you choose! To change it, you must first have permission to edit all the files of your R installation: go to the R directory (usually C:\Program Files\R) and select Properties, then remove the Read-only attribute and, in the Security tab, assign all permissions to your user. Then go to the etc subdirectory (probably C:\Program Files\R\R-2.15.2\etc), open Notepad and drag the file called Rconsole onto it. Change the language row to "language = en" and save the file.
If it still does not open in English, or if you cannot manage these operations, try the following more complex alternative: find the directory where R is installed (usually C:\Program Files\R), go to share\locale and rename your language directory to anything else. For example, if your R is in Italian you have to go to something like C:\Program Files\R\R-2.15.2\share\locale and rename directory "it" to "it_blablabla". You must do the same for Rcmdr in the C:\Program Files\R\R-2.15.2\library\Rcmdr\po directory.
For Mac users: here R seems to install itself in your operating system's language. Changing the R language is possible using a trick similar to the Windows one: Finder -> App -> right click on R icon -> Show package content. Open Resources and rename it.lproj or de.lproj to something else. Unfortunately, I have not found a similar trick for R Commander, and removing your language for XQuartz does not help. The only solution so far seems to be changing the language of the entire Mac; if you find a better one, tell me.

Q: How can I reach network folder \\ubz01fst from outside unibz or connected via wifi?
A: For Windows users: if you are connected to wifi ScientificNetwork, try typing \\ in any Explorer address bar and see whether you reach it. You need to provide your login and password, but you also need to tell your computer that you are using a different domain, so you have to type unibz\loginname instead of simply loginname. If this fails or if you are not connected to ScientificNetwork, then you need to install VPN. There is a specific video above.
For Mac users: if you are connected to wifi ScientificNetwork, Finder -> Go -> Connect to server -> smb:// . You need to provide your login and password, but you also need to tell your computer that you are using a different domain, so you have to type unibz\loginname instead of simply loginname.

Q: May I make an appointment to talk with you?
A: I do not make personal appointments. I have office hours explicitly dedicated to this task, which I try to spread evenly over the semester, with a higher frequency before exams. In any case all the office hours are open to everybody, so you may also come to the office hours of my other courses, see the list on . Obviously you may always ask me questions via email (please state clearly which course you are talking about and what your problem is).

Q: When will the next exam be? Can you give me a hint on the exam's date because I have to catch a plane? Can you move the exam's date? Can you fix the exam's date on the week I suggest?
A: Please stop writing me emails on this topic. The exam date appears on your timetable as soon as it is official. If you have something to say about it, contact your students' representative, who is the only one who can submit requests on the students' behalf.

Q: I cannot enrol online for technical or administrative reasons, or I forgot to enrol, or it is my third attempt and I cannot enrol. Can I do the exam anyway?
A: No, I may not let non-enrolled students take part in the exam. Do not ask me to do illegal things! Ask the secretary whether there is something they can do.

Q: May I do the exam with my computer?
A: Sure. But beware: (1) you must be able to navigate the Internet and to enter directory \\\Courses\Course_Coletti. Do not wait until the day before the exam to check it. (2) You are responsible for any differences in the versions and configurations of your programs and for any specific program missing from your computer.
In any case you will have a unibz desktop computer in front of you.

Q: May I do the exam using Windows in a different language?
A: Yes, sure.

Q: Will the exam be similar to the other exams on this website?
A: Sure, except for the exams from before A.Y. 2015/16, when most of the exam changed.

Q: I lost a file during the practical exam because I did not save it correctly. What may you do?
A: Absolutely nothing. With time spent on exercises you should know the unreliability level of your programs, and how often you have to save.

Q: My files were not copied correctly at the end of the practical exam. What may I do?
A: Checking that the copy is correct, and practicing file copy even during the exam, is your task and is an official prerequisite for this course.

Q: Hey, the exam time is not enough! I could not even finish it. If only I had had another 5 minutes I would have done it much better!
A: Sorry, but you are wrong, as I allow more than twice the time needed. Look at the important warning after the exam's explanation: the fact that the exam time was not enough for you is instead an indication that you must do many more exercises to be efficient and fast enough. On the other hand, if you have documented medical problems that slow you down, write an email to me to have more time.

Previous exams

Warning: videos are large, your best choice is watching them online from YouTube.

If you really want to download them, be sure to have enough space and enough time available. Do not save them on your UNIBZ disk space or you will make your account go over quota! Right click and choose Save Target As... and save them either on a USB pendrive or on your personal computer. If you have a Mac and the video does not open with QuickTime, try installing and using VLC.

Exam link
Solution link
Video solution
Video solution
Winter 2017 17
exam 17
suggested solution 17


Winter 2017 16
exam 16
suggested solution 16


Autumn 2016 15
exam 15
suggested solution 15


Summer 2016 14
exam 14
suggested solution 14


Winter 2016 13
exam 13
suggested solution 13

(80 MB)

prototype for AY 2015/16 12
exam 12
suggested solution 12

(103 MB)

prototype for AY 2015/16 11
exam 11
suggested solution 11

(119 MB)

prototype for AY 2015/16 10
exam 10
suggested solution 10

(91 MB)

Old exams: they are useful only for database design and, in part, for R
Autumn 2015 09
old exam
exam 09
suggested solution 09

(51 MB)

Summer 2015 08
old exam
exam 08
suggested solution 08

(51 MB)

Winter 2015 07
old exam
exam 07
suggested solution 07

(56 MB)

Autumn 2014 06
old exam
exam 06
suggested solution 06

(95 MB)

Winter 2014 05
old exam
exam 05
suggested solution 05

(84 MB)

Autumn 2013 04
old exam
exam 04
suggested solution 04

(63 MB)

Summer 2013 03
old exam
exam 03
suggested solution 03

(67 MB)

prototype 02
old exam
exam 02
suggested solution 02

(67 MB)

prototype 01
old exam
exam 01
suggested solution 01


(67 MB)

This page is maintained by Paolo Coletti.

