Blog posts

2025

Precomputed Embeddings vs. Real-Time Retrieval (RAG)

2 minute read

Published: March 01, 2025

Large Language Models (LLMs) rely on efficient retrieval strategies to generate accurate, context-aware responses. The two primary approaches are:

Fine-Tune GenAI Models

7 minute read

Published: February 27, 2025

Fine-tuning Generative AI (GenAI) models allows us to adapt pre-trained models for specific tasks, styles, or datasets while maintaining efficiency. Instead of training large models from scratch, fine-tuning enables customization with lower computational costs and faster adaptation to new domains.

GenAI Models Quality Evaluations: Text and Image

4 minute read

Published: February 23, 2025

Evaluating Generative AI (GenAI) models is challenging due to their complex and diverse outputs across different modalities (text, image, and multimodal generation). Unlike traditional supervised learning models, where direct comparison with ground truth labels is feasible, GenAI models often require implicit evaluation techniques to assess quality, coherence, and usability.

From Text Transformer to Vision Transformer Model

4 minute read

Published: February 21, 2025

Transformer models have revolutionized large language models (LLMs) and are now widely used across multimodal AI applications, including text generation, conversational AI, and vision-based models. These models have set a new standard for natural language understanding, reasoning, and content generation by leveraging attention mechanisms to capture long-range dependencies and contextual relationships.

Delve into the Attention Mechanism

6 minute read

Published: February 15, 2025

The Attention Mechanism is the core idea behind modern large language models (LLMs). It allows models to focus on important words in a sentence while ignoring irrelevant details.

Understanding K-Means Clustering: A Step-by-Step Guide

2 minute read

Published: February 01, 2025

1. What is K-Means Clustering?

K-Means is an unsupervised learning algorithm used for clustering data points into groups based on similarity. It is widely used in data segmentation, customer profiling, and image compression.

Simple Decision Trees and Their Implementation

4 minute read

Published: January 30, 2025

A decision tree is a machine learning model that makes decisions by recursively splitting data based on the best feature, forming a tree-like structure.

Reinforcement Learning with the Snake Game

4 minute read

Published: January 24, 2025

Reinforcement Learning (RL) is like training a pet: the agent (learner) explores its environment, takes actions, and gets rewards or penalties. Over time, it learns which actions lead to better rewards and avoids actions that result in penalties.

Classification Evaluation: ROC and AUC Calculation

3 minute read

Published: January 23, 2025

ROC AUC is a key evaluation metric for binary classification models. It measures how well a model distinguishes between positive and negative classes.
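One way to read "distinguishes between positive and negative classes" is that AUC equals the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counted as half. The sketch below computes it that way; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not taken from the post.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This pairwise form is O(n_pos * n_neg), so it suits small examples only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two negatives, two positives.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)  # 3 of the 4 pairs are ranked correctly
```

A value of 1.0 means every positive outranks every negative, and 0.5 is what random scoring would give on average.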

Naive Bayes Theory and Example on Spam Email Detection

6 minute read

Published: January 20, 2025

In this blog post, we’ll explore Naive Bayes, a simple yet powerful algorithm used for classification tasks like spam detection. We’ll break down the theory, provide intuitive examples, and show you how to implement it from scratch in Python. Whether you’re new to machine learning or preparing for an interview, this guide will help you understand Naive Bayes in a simple and concise way.

Understand the Poisson Distribution

3 minute read

Published: January 13, 2025

The Poisson probability distribution is used to model the number of times an event happens in a fixed period of time or space. It is useful when events occur independently and at a constant average rate. This distribution is widely applied in areas like call centers, traffic flow, biology, and machine learning.
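For an average rate λ per interval, the probability of seeing exactly k events is P(X = k) = λ^k e^(-λ) / k!. A minimal sketch of that formula follows; the helper name `poisson_pmf` and the rate of 3 events per interval are assumptions chosen for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events when the average rate is lam per interval."""
    if k < 0 or lam <= 0:
        raise ValueError("k must be >= 0 and lam must be > 0")
    return lam ** k * math.exp(-lam) / math.factorial(k)


# Hypothetical example: a call center averaging 3 calls per minute.
rate = 3.0
p_no_calls = poisson_pmf(0, rate)                          # equals exp(-3)
p_at_most_two = sum(poisson_pmf(k, rate) for k in range(3))
```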

Understanding Stochastic Gradient Descent (SGD)

2 minute read

Published: January 11, 2025

Stochastic Gradient Descent (SGD) is an optimization algorithm used in machine learning and deep learning to minimize the loss function. Unlike standard Gradient Descent, which computes the gradient using the entire dataset, SGD updates the model one sample at a time, making it more efficient for large datasets.
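The one-sample-at-a-time update can be sketched on a tiny linear regression problem. This is an illustrative sketch under stated assumptions: the function name `sgd_linear_fit`, the squared-error loss, the learning rate, and the toy data are all hypothetical choices, not the post's own implementation.

```python
import random


def sgd_linear_fit(xs, ys, lr=0.05, epochs=200, seed=0):
    """Fit y = w * x + b by updating on one sample at a time.

    Uses the per-sample loss 0.5 * (prediction - y) ** 2, whose gradient is
    err * x with respect to w and err with respect to b.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)          # visit samples in a fresh random order
        for i in order:
            err = (w * xs[i] + b) - ys[i]
            w -= lr * err * xs[i]   # step using this single sample's gradient
            b -= lr * err
    return w, b


# Hypothetical noiseless data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = sgd_linear_fit(xs, ys)
```

Full-batch gradient descent would instead average the gradient over all five points before taking each step; here the parameters move after every point.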

Selection of the Loss Functions for Logistic Regression

5 minute read

Published: January 10, 2025

When training a machine learning model, choosing the right loss function is critical to ensuring effective learning. In linear regression, Mean Squared Error (MSE) is commonly used, but for logistic regression, it becomes problematic.

2020

Interviewing and Networking Tips for MLE New Grads

6 minute read

Published: October 23, 2020

Even as a Ph.D. candidate with three internships and closely related research experience, my job hunt took longer than I expected, especially during the pandemic.

Rebecca Li
