After some intense weeks of exams, exercises, and projects, we suddenly had a shorter week with fewer hours. Phew! That was much needed, for several reasons. First, we had more time to digest all the knowledge and information we had received. We could also dedicate some hours to the Capstone Project, whose first presentation (the EDA part) is just around the corner!
As you have probably heard or read, the Data Science Bootcamp is a rollercoaster, and the learning curve is amazing. When you suddenly stop for a while, like this week, you realize how many concepts and techniques you have learned in such a short period of time.
Last week, Juanjo, the Maths & Stats Professor, uploaded the fourth exam, which has to be completed in two weeks. That was a nice present from him, as each of the last three exams was due within a week. I like this methodology, as it lets us recap on our own everything we’ve done in the last few lectures while absorbing the mathematical background of the models and algorithms that will come later in the Bootcamp. I can’t wait to learn all that stuff. So exciting!
The Bootcamp has completely changed and recast our data mindsets. Even though we are only in week seven of the part-time journey, we now look at data differently. Excel is put away in the drawer (for a while) and now Python is connecting our neurons, challenging us to extract, filter, replace, group, join, apply functions, model, and predict! It’s impressive! Pablo Monfort is the man responsible for this great ‘Pythonic’ take-off.
This week we started the Data Visualization Module. Pablo showed us how to create useful graphical representations of our pre-processed data. It’s amazing how simple and intuitive the required code is, thanks to the libraries available. The most difficult part is understanding your data set, conceptualizing it, and representing only what is useful, in the most informative way. What you observe in those histograms, scatterplots, etc., is what will guide you later in your analysis.
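To give an idea of how little code a histogram and a scatterplot need, here is a minimal sketch. It assumes matplotlib as the plotting library (the one used in class may differ), and the temperature and consumption values are randomly generated placeholders, not data from the Bootcamp.

```python
import random

import matplotlib
matplotlib.use("Agg")  # draw off-screen so the sketch also runs without a display
import matplotlib.pyplot as plt

random.seed(0)

# Made-up sample: 200 hypothetical temperature and energy consumption readings
temperature = [random.gauss(20, 5) for _ in range(200)]
consumption = [50 + 2.0 * t + random.gauss(0, 8) for t in temperature]

fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: how is a single variable distributed?
ax_hist.hist(consumption, bins=20)
ax_hist.set_xlabel("Consumption (made-up units)")
ax_hist.set_ylabel("Number of readings")
ax_hist.set_title("Distribution of consumption")

# Scatterplot: how do two variables move together?
ax_scatter.scatter(temperature, consumption, s=10)
ax_scatter.set_xlabel("Temperature (made-up units)")
ax_scatter.set_ylabel("Consumption (made-up units)")
ax_scatter.set_title("Consumption vs. temperature")

fig.tight_layout()
```

Calling fig.savefig with a file name (or plt.show() in a notebook) displays the result. The plotting calls are the easy part; deciding which two or three charts actually say something about the data is where the time goes.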
We also had some interesting classes with Jean-Christophe on Data Acquisition, laying the groundwork for dealing with real raw data extracted from any source: websites, PDFs, Excel, JSON files, etc. We went through some exercises on the usual challenge facing a data engineer: how to generate the best output dataframe for analysis from any messy data set. We learned some tricks for working quickly and effectively to deliver the best possible result.
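As an illustration of that raw-data-to-dataframe step, here is a small sketch assuming pandas. The JSON records, field names, and cleaning choices are invented for the example and are not taken from the class exercises.

```python
import json

import pandas as pd

# Hypothetical raw input: nested fields, numbers stored as text,
# a missing value, and a duplicated record
raw = """
[
  {"id": 1, "customer": {"name": "Ana", "city": "Madrid"}, "amount": "12.50"},
  {"id": 2, "customer": {"name": "Luis", "city": null}, "amount": "7.00"},
  {"id": 2, "customer": {"name": "Luis", "city": null}, "amount": "7.00"}
]
"""

records = json.loads(raw)

# Flatten the nested "customer" object into customer.name / customer.city columns
df = pd.json_normalize(records)
df = df.rename(columns={"customer.name": "name", "customer.city": "city"})

# Convert text to numbers and make the missing city explicit
df["amount"] = df["amount"].astype(float)
df["city"] = df["city"].fillna("unknown")

# Drop the repeated record and renumber the rows
df = df.drop_duplicates().reset_index(drop=True)

print(df)
```

The order of the steps is a design choice: filling missing values before removing duplicates keeps the comparison between rows simple, but other orders can work just as well depending on the data.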
The week finished with Group Alignment and Data Lab classes, where Daniel and Filippo got us on track for the Capstone. My group is dealing with the Energía Independiente project. We have just received the dataset and started diving into it using all we have learned during these fantastic weeks in the Bootcamp. I’m so happy to be on this journey—and the best is yet to come!