News
Serving Multiple Users at Once: How Continuous Batching Keeps LLM Inference Efficient
2+ hour ago (759+ words) Making developers awesome at machine learning. In the previous article, we saw how a language model processes a prompt during prefill, then generates tokens one at a time during decode, and uses a KV cache to avoid repeated computation. In the…
Train, Serve, and Deploy a Scikit-learn Model with FastAPI
1+ mon, 1+ week ago (935+ words) In this article, you will learn how to train a Scikit-learn classification model, serve it with FastAPI, and deploy it to FastAPI Cloud. Topics we will cover include: …
AI Agent Memory Explained in 3 Levels of Difficulty
1+ mon, 1+ week ago (1113+ words) In this article, you will learn how AI agent memory works across working memory, external memory, and scalable memory architectures for building agents that improve over time. Topics we will cover include: …
Getting Started with Zero-Shot Text Classification
1+ mon, 1+ week ago (776+ words) In this article, you will learn how zero-shot text classification works and how to apply it using a pretrained transformer model. Topics we will cover include: …
5 Techniques for Efficient Long-Context RAG
1+ mon, 2+ week ago (491+ words) In this article, you will learn how to build efficient long-context retrieval-augmented generation (RAG) systems using modern techniques that address attention limitations and cost challenges. Topics we will cover include: …
From Prompt to Prediction: Understanding Prefill, Decode, and the KV Cache in LLMs
2+ mon, 8+ hour ago (834+ words) In the previous article, we saw how a language model converts logits into probabilities and samples the next token. But where do these logits come from? In this tutorial, we take a hands-on approach…
7 Machine Learning Trends to Watch in 2026
1+ mon, 4+ week ago (1661+ words) In this article, you will learn how machine learning is evolving in 2026 from prediction-focused systems into deeply integrated, action-oriented systems that drive real-world workflows. Topics we will cover include: Let's not waste any more…
Top 5 Reranking Models to Improve RAG Results
1+ mon, 3+ week ago (347+ words) In this article, you will learn how reranking improves the relevance of results in retrieval-augmented generation (RAG) systems by going beyond what retrievers alone can achieve. Topics we will cover include: …
Beyond Vector Search: Building a Deterministic 3-Tiered Graph-RAG System
1+ mon, 2+ week ago (924+ words) In this article, you will learn how to build a deterministic, multi-tier retrieval-augmented generation system using knowledge graphs and vector databases. Topics we will cover include: …
7 Essential Python Itertools for Feature Engineering
2+ mon, 17+ hour ago (875+ words) In this article, you will learn how to use Python's itertools module to simplify common feature engineering tasks with clean, efficient patterns. Topics we will cover include: …