Data Engineering at Scale - How to efficiently design and implement platforms for big data analytics
Big Data is being collected all around us. Many businesses that provide high value to consumers at no additional cost exist only because of data analytics platforms that extract valuable information from huge amounts of structured and unstructured data flowing at high velocity, commonly called Big Data. Moreover, many societal, non-commercial applications (e.g., in health, education, poverty, environment, and city sustainability) can be built upon the efficient storage and processing of this new kind of data. In this lecture, we will look at state-of-the-art techniques and technologies to collect, store, process, and analyze Big Data, focusing on the arsenal and daily challenges of Data Engineers who operate at scale and in a constantly changing environment, not only from a technological perspective but also from a software development and team leadership perspective. We will then focus on several practical Big Data applications that can be built using a recently proposed approach for Big Data Warehousing (the storage and processing of Big Data for structured descriptive and predictive analytics), in contexts such as smart cities, manufacturing, retail, and sensor-based analytics, among others.
Carlos is a Senior Data Engineer at Adidas and an Invited Professor at the University of Minho. He has more than 7 years of combined experience in the area of Big Data, including previous data engineering positions, scientific research, and teaching. His main topics of interest include Big Data, Data Warehousing, Efficient Data Modeling and Storage, and Interactive and Real-time Data Analytics Platforms. He loves working with Hadoop, Spark, Presto, NoSQL, Kafka, and many other Big Data technologies. He is the (co)author of several scientific and technical publications related to these topics and technologies, and he is constantly looking to push the current boundaries of knowledge by actively publishing and contributing to the open-source community. https://www.linkedin.com/in/carlosfmscosta/