Posts

Showing posts from September, 2025

Introduction to Bloom Filter

A Bloom filter is a probabilistic data structure used to test whether an element is possibly in a set or definitely not in a set. It’s designed to be memory-efficient and very fast, but it allows false positives (saying an element might be in the set when it’s not) while guaranteeing no false negatives (if it says something is not in the set, it really isn’t). How it works: You start with a bit array (all bits set to 0). You have k different hash functions. To add an item: Compute its k hashes. Set the corresponding k positions in the bit array to 1. To check if an item is in the set: Compute its k hashes. If all those positions are 1 → item might be in the set (possible false positive). If any position is 0 → item is definitely not in the set. Example: Suppose you insert "cat". Hash functions map "cat" to positions [3, 7, 12], so you set those bits. Later, you check "dog". Its hashes are [3, 12...
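The add and check steps described in this excerpt can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the class name `BloomFilter`, the sizes `m=64` and `k=3`, and the trick of simulating k hash functions by salting SHA-256 with an index are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a bit array of m bits and k hash positions per item."""

    def __init__(self, m: int = 64, k: int = 3) -> None:
        self.m = m
        self.k = k
        self.bits = [0] * m  # the bit array starts with every bit set to 0

    def _positions(self, item: str) -> list[int]:
        # Assumption: k "different" hash functions are simulated by
        # prefixing the item with an index before hashing it with SHA-256.
        positions = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.m)
        return positions

    def add(self, item: str) -> None:
        # Set the k positions for this item to 1.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # True means "possibly in the set"; False means "definitely not".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
bf.add("cat")
print(bf.might_contain("cat"))  # True: an added item is never reported missing
print(bf.might_contain("dog"))  # usually False, but a false positive is possible
```

Real deployments pick m and k from the expected number of items and the acceptable false-positive rate; the small values here only keep the example readable.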

Understanding the Hash Ring in Consistent Hashing

If you’ve ever looked into distributed systems or scalable caching, you’ve probably heard the term hash ring. It’s at the heart of consistent hashing, the technique used by systems such as DynamoDB, Cassandra, and Riak, and by many Memcached client libraries. (Redis Cluster takes a related but different approach: it maps keys to 16,384 fixed hash slots rather than to points on a ring.) In this post, we’ll break down what a hash ring is, how it works, and why it matters. What Is a Hash Ring? A hash ring is a conceptual circle that represents the entire range of hash values. Imagine the numbers from 0 to 2³² − 1 arranged in a circle, so that the largest value wraps around to 0. Nodes (servers): Each server in your cluster is hashed to a point on the ring based on its identifier (e.g., IP address). Keys (data items): Each key you want to store (e.g., cache key, user ID) is also hashed to a point on the ring. Assignment rule: A key is assigned to the first server found clockwise from its hash position. https://ably.com/blog/implementing-efficient-consistent-hashing Adding a Node When you add a new node: Hash the node’...
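The assignment rule from this excerpt (a key belongs to the first server found clockwise from its hash position) can be sketched as follows. The names `ring_hash` and `HashRing`, the use of SHA-256 truncated to 32 bits, and the example IP addresses are assumptions for illustration; the sketch also leaves out virtual nodes and replication.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    # Assumption: SHA-256 truncated to 32 bits, giving a point in 0 .. 2**32 - 1.
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HashRing:
    def __init__(self, nodes=()):
        self._points = []  # sorted hash positions of all nodes
        self._owner = {}   # hash position -> node identifier
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        point = ring_hash(node)
        if point in self._owner:
            raise ValueError(f"hash collision for node {node!r}")
        self._owner[point] = node
        bisect.insort(self._points, point)

    def get_node(self, key: str) -> str:
        if not self._points:
            raise ValueError("ring has no nodes")
        # First node position at or after the key's position (clockwise).
        idx = bisect.bisect_left(self._points, ring_hash(key))
        # Past the last node, wrap around to the first one on the ring.
        return self._owner[self._points[idx % len(self._points)]]


ring = HashRing(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(ring.get_node("user:42"))  # one of the three addresses above
```

Keeping the node positions in a sorted list makes each lookup a binary search, and adding a node only takes over keys on the arc just before its position.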

Best Practices for Access, ID, and Refresh Tokens

When building modern applications with OAuth 2.0 and OpenID Connect, developers often get confused about where to store tokens. This decision is crucial because tokens stored in the wrong place can be stolen through XSS or abused through CSRF. Let’s break it down. Understanding the Tokens Access Token → Proves what you can do. It authorizes the client to call APIs on behalf of the user. ID Token → Proves who you are. It carries identity information (name, email, roles) about the authenticated user. Refresh Token (optional) → Used to obtain new access tokens without making the user log in again. Each token has a different lifespan and security sensitivity, so the storage strategy must match the risk. Storage on the User Side (Frontend) 1. Web Applications Best practice: Keep tokens in memory (JavaScript variables). Safer persistence: Use HTTP-only, Secure cookies if tokens must survive page reloads. Avoid: localStorage and sessionStorage for...
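As a rough illustration of the HTTP-only, Secure cookie option mentioned in this excerpt, the sketch below builds a `Set-Cookie` header value on the server side with Python's standard `http.cookies` module. The helper name `build_token_cookie`, the cookie name `refresh_token`, the token value, and the `SameSite=Strict`, `Path`, and `Max-Age` settings are assumptions for the example, not recommendations taken from the post.

```python
from http.cookies import SimpleCookie


def build_token_cookie(name: str, token: str, max_age_seconds: int) -> str:
    """Return a Set-Cookie header value for a token (hypothetical helper)."""
    cookie = SimpleCookie()
    cookie[name] = token
    cookie[name]["httponly"] = True      # not readable from JavaScript
    cookie[name]["secure"] = True        # only sent over HTTPS
    cookie[name]["samesite"] = "Strict"  # assumption: limits cross-site sends
    cookie[name]["path"] = "/"
    cookie[name]["max-age"] = max_age_seconds
    return cookie[name].OutputString()


header_value = build_token_cookie("refresh_token", "opaque-token-value", 3600)
print("Set-Cookie:", header_value)
```

Because the cookie is flagged HttpOnly, page scripts cannot read it, which is the property that makes it safer than localStorage against XSS; the browser still attaches it to requests automatically, so CSRF protections remain relevant.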

Understanding CQRS with Real-Life Examples

1. CQRS in an Online Store (Real-Life Example) In this diagram, we see an online store architecture applying CQRS: API Gateway: The entry point for all client requests. It routes incoming calls to the right backend service. Frontend Service: Connects to various backend services depending on the operation. Products Service (Write Side): Responsible for updating product information (e.g., adding new products, changing prices, updating stock). Products Search Service (Read Side): Maintains a denormalized, read-optimized view of product data so customers can quickly search and filter products. Reviews Service: Handles user reviews separately, but it also feeds into the search service so that reviews can be displayed alongside products. Notice that the Products Service and the Products Search Service do not share the same database. Instead, updates from the Products Service trigger events (via message queues or change notifications), and the Search Service...
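The split between the write side and the read side can be sketched in memory. This is only an illustration under stated assumptions: the event type `ProductChanged`, the method names, the `deliver_events` helper, and the use of `queue.Queue` as a stand-in for a message queue are hypothetical, and the sketch assumes the read side rebuilds its view by consuming those events.

```python
from dataclasses import dataclass
from queue import Queue


@dataclass(frozen=True)
class ProductChanged:
    """Hypothetical event published by the write side."""
    product_id: int
    name: str
    price: float


class ProductsService:
    """Write side: owns its own store and publishes an event per update."""

    def __init__(self, events: Queue) -> None:
        self._db: dict[int, dict] = {}
        self._events = events

    def save_product(self, product_id: int, name: str, price: float) -> None:
        self._db[product_id] = {"name": name, "price": price}
        self._events.put(ProductChanged(product_id, name, price))


class ProductsSearchService:
    """Read side: keeps a separate, denormalized view built from events."""

    def __init__(self) -> None:
        self._view: dict[int, dict] = {}

    def apply(self, event: ProductChanged) -> None:
        self._view[event.product_id] = {
            "id": event.product_id,
            "name": event.name,
            "price": event.price,
        }

    def search(self, term: str) -> list[dict]:
        term = term.lower()
        return [row for row in self._view.values() if term in row["name"].lower()]


def deliver_events(events: Queue, read_side: ProductsSearchService) -> None:
    # Stand-in for a message-queue consumer: drain pending events.
    while not events.empty():
        read_side.apply(events.get())


events = Queue()
write_side = ProductsService(events)
read_side = ProductsSearchService()

write_side.save_product(1, "Coffee Mug", 9.50)
print(read_side.search("mug"))  # [] until the event has been delivered
deliver_events(events, read_side)
print(read_side.search("mug"))  # now contains the Coffee Mug row
```

The empty result before delivery shows the trade-off of this design: the read model is eventually consistent with the write model rather than updated in the same transaction.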

Database Partitioning vs Sharding

🔹 What is Database Partitioning? Partitioning means splitting a large table into smaller pieces (partitions) to make queries faster and maintenance easier. Importantly, all partitions live inside the same database instance. Types of partitioning: Horizontal partitioning: Splits data by rows (e.g., orders by year). Vertical partitioning: Splits data by columns (e.g., frequently used vs. rarely used columns). ✅ Good for performance tuning when data still fits on one server. 🔹 Partitioning in Action (PostgreSQL Example) Imagine an orders table with millions of rows. We want to split it by year. Step 1: Create a partitioned table (PostgreSQL requires the primary key of a partitioned table to include the partition key, hence the composite key) CREATE TABLE orders ( order_id BIGSERIAL, customer_id INT NOT NULL, order_date DATE NOT NULL, amount NUMERIC(10, 2) NOT NULL, PRIMARY KEY (order_id, order_date) ) PARTITION BY RANGE (order_date); Step 2: Create partitions CREATE TABLE orders_2023 PARTITION OF orders FOR VALUES FROM ('2023-01-01') TO ( ...

Global Server Load Balancing in the Azure Ecosystem

What is GSLB? Global Server Load Balancing (GSLB) is the practice of distributing user traffic across multiple regions or data centers. Unlike local load balancers that only manage traffic within a region, GSLB handles global routing, ensuring: Low latency – users are directed to the nearest endpoint. High availability – automatic failover if one region becomes unavailable. Disaster recovery – support for active-active or active-passive architectures. Geo-compliance – traffic can be routed based on geographic or regulatory requirements. GSLB in the Azure Ecosystem Azure provides multiple services to implement both regional load balancing and global server load balancing. Understanding their roles is key to designing a resilient and scalable cloud architecture. 1. Local Load Balancing (Regional) Azure Load Balancer (L4): Handles TCP/UDP traffic distribution within a region. Best for non-HTTP workloads such as gaming, VoIP, or real-time messaging. Azure Application Gateway ...

Understanding the OSI Model with a Real HTTP Request Example

Networking is everywhere, from browsing Google to streaming Netflix, but behind the scenes, a lot is happening to make sure data travels safely and reliably. To explain this, engineers often use the OSI Model (Open Systems Interconnection Model): a framework that splits network communication into seven layers. In this post, we’ll break down the OSI layers, see what software or hardware handles them, and then walk through a real-life example of an HTTP request to Google. 🔹 The 7 Layers of the OSI Model Physical Layer → Transmits raw bits as electrical signals, radio waves, or light pulses. Devices: Cables, Wi-Fi, NICs, hubs. Data Link Layer → Moves frames between devices on the same network. Uses MAC addresses. Devices: Switches, NIC drivers. Network Layer → Routes packets between different networks. Uses IP addresses. Devices: Routers, firewalls. Transport Layer → Provides end-to-end delivery between applications, reliably when TCP is used. Breaks data into segments. Handled...

REST vs RPC vs GraphQL: Choosing the Right API Style

APIs are the glue that connects modern applications. Whether you’re building a public-facing service or a complex microservices ecosystem, the design style of your API plays a huge role in performance, usability, and maintainability. Three common approaches dominate today’s landscape: REST, RPC, and GraphQL. Each comes from a different philosophy, and each has strengths and trade-offs. Let’s compare them. 1. REST (Representational State Transfer) REST treats everything as a resource and interacts with those resources using standard HTTP methods. It’s resource-oriented and built on the web’s existing semantics. Example: GET /users/123 → fetch user 123 PUT /users/123 → update user 123 GET /users/123/orders → fetch orders for user 123 ✅ Pros Standardized and predictable Widely supported by tools and libraries Leverages HTTP features (caching, status codes, headers) Great for resource-centric systems ❌ Cons ...