I have been using Airflow for a long time. It has always been my favorite scheduler for our workflow management system. Whenever I discuss “building a scheduler”, the word “Airflow” immediately pops into my head.
At first, my Airflow was running in a Docker container with CeleryExecutor. It worked fine as a scheduler for our Data Engineering team. After a few months, our scheduler needed to serve more users and handle heavier workloads. So I had to look for something that could meet our requirements: stability, scalability, and multi-tenant user support. …
** This article covers the most recent exam syllabus. I took the exam on 13th June 2020. **
In this guide, I will explain what the Azure Data Scientist Associate certification is, the content of the exam, how to prepare, and some key takeaways from the course.
I spent roughly 50 hours learning, starting from almost zero knowledge of Azure, to get the certificate. About my background: I have 2 years of experience working as a Data Scientist, and I’m now a full-time Data Engineer with 3+ years of experience. Having been on both sides, I’m familiar with everything from setting up infrastructure for large-scale data to…
I have been working with Python for some time. Most of the time I build machine learning APIs and spend time playing with algorithms, so I haven’t taken the analysis part seriously. I taught myself data analysis through courses, videos, etc., and there came a time when I thought I had to do something with it. Well, practising is always better than studying.
I have been searching for a dataset for a while: one that is not too simple, not too complicated, and fun to play around with. …