In Part 1 of this series I talked about how I learnt programming and computer science on the job (essentially picking things up as I went in a rather hodge-podge manner), and my dissatisfaction with how that process left me with knowledge gaps. I ended by concluding that, despite it being in vogue now to constantly talk about reskilling ourselves, being a truly effective self-directed learner is hard! Learning something properly requires a lot of thinking, analysis and critical engagement, all of which are hard skills to master. Moreover, for a subject like computer science, it’s not enough…

TLDR: Self-directed learning is hard!

“There is so much information on the internet now,” is a phrase I often hear. “There are tutorials, there are online courses, and there is Stack Overflow, a great forum for getting your questions answered. Why would anyone need a formal degree anymore? All the knowledge you need is online and available.”

This is a question that I wrestled with from the beginning of my career until the day I started a Master’s degree in Computer Science. Now that I am a couple of semesters into my course, I can say that I’ve experienced both…


  • 6:30am — Wake up, tidy up, make coffee, start reviewing lectures / tackling assignments
  • 7:30am — yoga routine
  • 9:00am — start full-time work
  • 12:00pm — lunch for an hour, squeezing in about 30 minutes to review what I went over in the morning.
  • 6:00pm — dinner and a short nap
  • 7:00pm — 10:30pm — study


  • some time in the morning to study after breakfast, then head out for a bit, then start studying again in the afternoon and evening
  • same thing for Sunday

The timetable above may look quite packed, but it’s a schedule that seems to work for…

This post (and subsequent ones) is my attempt at reflecting on my experiences after a couple of semesters studying for the NUS Masters in Computing, Computer Science Specialisation. Personally, I need to write about an experience before I feel like I’ve digested it properly, so I’m writing things out so my brain can make sense of things.

On a broader level, I’m also writing this for the benefit of other people who may be considering the same thing. Before I started this course, I was surprised by how little I could find on the web about other people’s experiences —…

Posted as part of the NUS CS5346 Information Visualization course.

Machine learning models can be opaque, sometimes troublingly so. Certain classes of models, such as random forests and deep neural networks, provide no clear path to understanding how a model’s inputs influence its outputs. This opacity has real-world implications. At a time when machine learning models are making consequential decisions, for example whether or not to approve a housing loan, citizens lack a way to appeal and negotiate a decision if a company is unable to explain how its model came to its conclusion. …

dvc add

dvc add is most suitable when you want to commit large files at the start of your project. Models, large text files, or folders of images are good candidates for this command.

In the beginning, when I tried implementing DVC, I was a little over-enthusiastic. I would dvc add as many datasets as I thought needed tracking — raw data, intermediate data, and any reference files floating around. It was only later, when I started implementing pipelines, that this method showed its flaws. …
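As a sketch, the basic dvc add workflow looks something like this (assuming a Git repository where dvc init has already been run; the data/raw folder is a hypothetical example):

```shell
# Track a large folder with DVC instead of Git.
# dvc add hashes the data, stores it in DVC's cache,
# and writes a small pointer file, data/raw.dvc
dvc add data/raw

# Commit the pointer file (not the data itself) to Git;
# dvc add also updates data/.gitignore so Git ignores the raw folder
git add data/raw.dvc data/.gitignore
git commit -m "Track raw dataset with DVC"
```

Git then versions only the lightweight .dvc pointer file, while DVC manages the large data behind it.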

While Part 1 of Human-Computer Interactions in Machine-Learning Applications talked about how we might structure model outputs, this post discusses the reverse: how we might process inputs from the user. Together, inputs and outputs (as shown in the chart below) make human-computer interaction a two-way, not one-way, street. It’s important to consider both sides of the dialogue.

Humans and machines act not in isolation, but in concert together

A big reason why we might care about how a user responds to our model’s output is that we care about how our model is performing. If our model successfully predicts a user’s intent, then we know we have built a good…

And how can it be improved so our machine learning model trains better?

Most of the time, we can’t answer these questions. The usual metrics we use to measure how well our model is performing — from ROC curves to F1 scores — measure a model’s aggregate performance across the whole dataset. Try to ask which subsets of the data are causing problems, or what patterns in the data are problematic, and our toolbox comes up empty.
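A quick sketch of the problem: in the synthetic example below, a model does well on one segment of the data and badly on another, yet the aggregate F1 score still looks healthy. The segment labels and counts are made up for illustration.

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical dataset: 90 samples from segment A, 10 from segment B
y_true = np.concatenate([np.ones(45), np.zeros(45),   # segment A
                         np.ones(5), np.zeros(5)])    # segment B
# Segment A predictions are nearly perfect; segment B's are mostly wrong
y_pred = np.concatenate([np.ones(43), np.zeros(2),    # A positives: 43/45 right
                         np.zeros(43), np.ones(2),    # A negatives: 43/45 right
                         np.zeros(4), np.ones(1),     # B positives: 1/5 right
                         np.ones(4), np.zeros(1)])    # B negatives: 1/5 right
segment = np.array(["A"] * 90 + ["B"] * 10)

overall = f1_score(y_true, y_pred)
per_segment = {s: f1_score(y_true[segment == s], y_pred[segment == s])
               for s in ["A", "B"]}

print(f"overall F1: {overall:.2f}")        # 0.88 — looks fine
print({s: round(f, 2) for s, f in per_segment.items()})
# segment A scores ~0.96, but segment B collapses to 0.20
```

The aggregate score hides the failing segment entirely; you only see it by slicing the evaluation by subgroup, which standard metric reports don't do for you.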


I have personal experience with this problem: our text categorisation model had disappointingly low per-category F1 scores, yet our AUC scores were somehow…

The more I work on building machine learning applications, the more I focus on intentionally designing the interface that stands between a model’s final predictions and the way they are presented.

In the Google Play Store, explanations are included for why an app is being recommended.

Presentation drives behaviour

Presentation affects perceptions, and hence drives and directs how users respond and behave. How we present a model’s predictions carries a heavy responsibility worth thinking about. For example, when I receive a product recommendation, a developer could choose to include a snippet explaining that I got this recommendation because my friends have also liked the product. …

I remember the first time I ran a deep learning model on a powerful GPU (an NVIDIA GTX 1080). The model zipped through each training epoch so fast, I felt like I had just switched from driving a sedan to riding in a sports car. 🚙

The training speed was exhilarating; experimenting with different models went a lot faster than normal. But since that project, accelerated deep learning has been a rare luxury. Compute time on a good GPU can be expensive. …


