The Journey to Data Science from Scratch
# Introduction
It’s a rare opportunity to be part of a well-rounded machine learning and data analysis team at my company, one with sufficient resources and a solid pipeline. I’m taking this chance to learn data science, and whenever there’s something I don’t understand, I can ask my colleagues directly. The scope of data science is vast and I don’t have a clear direction yet, so I’m just going to learn and see what sparks come from this journey. Let the side hustle begin!
Plan
- Revisit the pieces I missed from my university statistics course (2 weeks)
  - Distributions and various testing methods
  - R: learning it while reviewing
- Review linear algebra (3 weeks)
  - I’ve completely forgotten everything
- Review Andrew Ng’s machine learning course (1 week)
  - When I took this course it used Octave, but the concepts should still apply
- Complete the Deep Learning Specialization on Coursera (6 weeks)
- Finish reading the data science books I purchased earlier (3 weeks)
- Follow the curriculum at fast.ai while diving into research papers
  - I previously completed courses 1 to 4 but forgot everything
- Data engineering
  - Airflow
  - Kafka
  - I particularly want to learn these two
The timed items above add up to 15 weeks, so this plan should fill about 100 days. I might look for some additional data analysis courses, since the plan leans heavily toward machine learning and deep learning. In any case, I’m documenting everything I think of here so I don’t forget it.
Goals
I primarily want to explore the fun of integrating front-end development with data science, for example by working with ml.js or TensorFlow.js, which sounds really exciting.
Additionally, I have some ideas that need data science to back them up, so I want to use this time to strengthen my knowledge in that area. My earlier notes were scattered everywhere; now I can hardly find them, and I’ve forgotten a lot. This time, I’ll document everything thoroughly on my blog.