Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
About me
Posts
Precomputed Embeddings vs. Real-Time Retrieval (RAG)
Published:
Large Language Models (LLMs) rely on efficient retrieval strategies to generate accurate, context-aware responses. The two primary approaches are precomputed embeddings and real-time retrieval (RAG).
Fine-Tune GenAI Models
Published:
Fine-tuning Generative AI (GenAI) models can be categorized into two main approaches:
- Non-Parametric Fine-Tuning: Modifying model behavior without changing its parameters (e.g., ICL, RAG).
- Parametric Fine-Tuning: Updating the model’s internal parameters (e.g., Full Fine-Tuning, LoRA).
GenAI Models Quality Evaluations: Text and Image
Published:
Evaluating Generative AI (GenAI) models is challenging due to their complex and diverse outputs across different modalities (text, image, and multimodal generation). Unlike traditional supervised learning models, where direct comparison with ground truth labels is feasible, GenAI models often require implicit evaluation techniques to assess quality, coherence, and usability.
From Text Transformer to Vision Transformer Model
Published:
Transformer models have revolutionized large language models (LLMs) and are now widely used across multimodal AI applications, including text generation, conversational AI, and vision-based models. These models have set a new standard for natural language understanding, reasoning, and content generation by leveraging attention mechanisms to capture long-range dependencies and contextual relationships.
Delve into the Attention Mechanism
Published:
The Attention Mechanism is the core idea behind modern large language models (LLMs). It allows models to focus on important words in a sentence while ignoring irrelevant details.
Understanding K-Means Clustering: A Step-by-Step Guide
Published:
1. What is K-Means Clustering?
K-Means is an unsupervised learning algorithm used for clustering data points into groups based on similarity. It is widely used in data segmentation, customer profiling, and image compression.
Simple Decision Trees and Their Implementation
Published:
A decision tree is a machine learning model that makes decisions by recursively splitting data based on the best feature, forming a tree-like structure.
Reinforcement Learning with the Snake Game
Published:
Reinforcement Learning (RL) is like training a pet: the agent (learner) explores its environment, takes actions, and gets rewards or penalties. Over time, it learns which actions lead to better rewards and avoids actions that result in penalties.
Classification Evaluation: ROC and AUC Calculation
Published:
ROC AUC is a key evaluation metric for binary classification models. It measures how well a model distinguishes between positive and negative classes.
Naive Bayes Theory and Example on Spam Email Detection
Published:
In this blog post, we’ll explore Naive Bayes, a simple yet powerful algorithm used for classification tasks like spam detection. We’ll break down the theory, provide intuitive examples, and show you how to implement it from scratch in Python. Whether you’re new to machine learning or preparing for an interview, this guide will help you understand Naive Bayes in a simple and concise way.
Understand the Poisson Distribution
Published:
The Poisson probability distribution is used to model the number of times an event happens in a fixed period of time or space. It is useful when events occur independently and at a constant average rate. This distribution is widely applied in areas like call centers, traffic flow, biology, and machine learning.
Understanding Stochastic Gradient Descent (SGD)
Published:
Stochastic Gradient Descent (SGD) is an optimization algorithm used in machine learning and deep learning to minimize the loss function. Unlike standard Gradient Descent, which computes the gradient using the entire dataset, SGD updates the model one sample at a time, making it more efficient for large datasets.
Selection of the Loss Functions for Logistic Regression
Published:
When training a machine learning model, choosing the right loss function is critical to ensuring effective learning. In linear regression, Mean Squared Error (MSE) is commonly used, but for logistic regression, it becomes problematic.
Interviewing and Networking Tips for MLE New Grads
Published:
Even as a Ph.D. candidate with three internships and closely related research experience, I found that job hunting took longer than I expected, especially under the impact of the pandemic.
Portfolio
A cross-library image augmentation module for Deep Learning Training
Published:
A versatile image augmentation framework incorporating 300+ operations from 8 popular libraries
Feature Reduction to Classifiers
Published:
Performance study of PCA, LDA, and their kernel versions as feature reduction for SVM, ML, KNN, and GMM classifiers
Compressive Image Recovery
Published:
Low-cost, high-efficiency seismic image recovery and optimal sampling recommendation
Deep Eraser
Published:
An object-oriented “eraser” for images and videos
Pick-up Drop-off Design
Published:
Use reinforcement learning to design a route for a delivery driver
Zero-human-effort Segmentation
Published:
A fully automatic iterative deep learning framework for cell segmentation with noisy labels
Pixel Translator
Published:
Convert gray images of border/vein to RGB leaf images using cGAN
Hierarchical Spatial Pattern Analysis on Neuronal Neighborhood
Published:
A robust method to detect & profile injury-caused alterations to brain tissue at the multi-cellular scale
Large Scale Image Registration
Published:
Accelerated large-scale image alignment by 10× with uniform keypoint control and multiprocessing
Multiplex Channels Denoising and Deblurring by Wavelet Transform
Published:
Wavelet analysis for recovering useful information from damage such as noise and blur
Publications
A Simplified Normalization Operation for Perfect Reconstruction from a Modified STFT
Published in IEEE International Conference on Signal Processing (ICSP), 2014
An improved version of the short-time Fourier transform (STFT).
Phasetime: Deep Learning Approach to Detect Nuclei in Time Lapse Phase Images
Published in Journal of Clinical Medicine, 2019
A Mask R-CNN approach to nuclei segmentation in time-lapse phase images in nanowells.
Attenuating Random Noise in Seismic Data by a Deep Learning Approach
Published in arXiv preprint, 2019
Attenuate Gaussian noise by residual neural networks.
Swell-noise attenuation: A deep learning approach.
Published in The Leading Edge, 2019
The full manuscript of Swell-noise attenuation by residual neural networks.
Seismic Compressive Sensing by Generative Inpainting Network: Toward An Optimized Acquisition Survey
Published in The Leading Edge, 2019
The full manuscript on compressive image recovery and non-uniform sampling recommendation from my summer internship project at Anadarko.
Generative Inpainting Network Applications on Seismic Image Compression and Non-Uniform Sampling
Published in Workshop on Neural Information Processing Systems (NIPS): Solving Inverse Problems with Deep Networks, 2019
The preliminary results on compressive image recovery and non-uniform sampling recommendation from my summer internship project at Anadarko.
Few Is Enough: Task-Augmented Active Meta-Learning for Brain Cell Classification
Published in Medical Image Computing and Computer Assisted Intervention (MICCAI), 2020
An active meta-learning approach to cell classification using very little training data.
Comprehensive Cell Phenotyping Method for Whole-Brain Tissue Mapping Using Highly Multiplexed Immunofluorescence Imaging
Published in Nature Communications, 2021
Our lab’s complete pipeline for whole-brain analysis, including my main thesis topic of cell nuclei segmentation.
ARIA: Adversarially Robust Image Attribution for Content Provenance
Published in CVPR, 2022
Internship mentoring project at Adobe related to the Coalition for Content Provenance and Authenticity (C2PA) initiative, led by Adobe, which is setting up an industry standard to address misleading information.