deploying my first real backend to ec2 accidentally turned into a deep dive into reverse proxies, load balancers, scaling, replicas, private ips, and what a "server" actually is.
Suprim Khatri
FullStack Developer · May 15, 2026
i deployed a go backend to aws ec2 for the first time recently. nginx, certbot, systemd, github actions — the usual first real deployment stack. everything worked.
and then i made the mistake of asking: wait... what actually is a reverse proxy?
that single question somehow spiraled into load balancers, horizontal scaling, replicas, private ips, distributed systems, dns routing, and eventually database replication architecture. the weird part is i realized i had been using terms like server, instance, load balancer, and scaling without actually having a clean mental model for them.
i kept thinking of a server as some special kind of machine. turns out a server is mostly just a computer running software that serves requests over a network. that's it.
my ec2 instance is basically a virtual linux computer running inside aws infrastructure, accessible over the internet. conceptually:
Physical AWS hardware
↓
Virtualization layer
↓
My Ubuntu/Amazon Linux VM
which is honestly not that different from running ubuntu in virtualbox on a windows laptop, except aws handles electricity, cooling, networking, uptime, and hardware failures for you.
i understood nginx mechanically. requests come in, nginx forwards them to my go app. but i couldn't understand why it was called a reverse proxy.
then it clicked. a normal proxy represents the client:
Client → Proxy → Internet
a reverse proxy represents the server:
Internet → Reverse Proxy → Backend Server
the backend is hidden behind nginx. the client thinks nginx is the server. that was the first major "ohhhhhh" moment.
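to make the "client thinks the proxy is the server" idea concrete, here is a minimal sketch in go using the standard library's httputil.NewSingleHostReverseProxy. this is not how nginx is implemented, and the function name fetchThroughProxy and the throwaway test servers are my own assumptions for illustration. the client only ever calls the proxy's address, yet gets the backend's response.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// fetchThroughProxy starts a throwaway backend, puts a reverse proxy in
// front of it, and returns what a client sees when it calls the proxy.
// (hypothetical helper, only for this illustration)
func fetchThroughProxy() (string, error) {
	// Stand-in for the go app that normally sits behind nginx.
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello from the hidden backend")
	}))
	defer backend.Close()

	target, err := url.Parse(backend.URL)
	if err != nil {
		return "", err
	}

	// The reverse proxy: the only address the client ever talks to.
	proxy := httptest.NewServer(httputil.NewSingleHostReverseProxy(target))
	defer proxy.Close()

	// The client requests the proxy URL, never the backend URL.
	resp, err := http.Get(proxy.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := fetchThroughProxy()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(body)
}
```

both servers listen on random local ports, so the example cleans up after itself and needs nothing running beforehand.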
then i heard "load balancers send requests to the least busy server" and immediately got confused. my thought process was basically: wait... how can nginx know which server is busy? if every ec2 instance has its own nginx, aren't they all independent?
which was actually the correct confusion. the missing piece was that there is usually one common load balancer in front — not five nginx instances magically communicating, but:
Internet
↓
Central Load Balancer
↓
┌────┼────┐
↓    ↓    ↓
VM1  VM2  VM3
that central load balancer sees all traffic first, which means it knows active connections, it knows failures, it knows response times, because all requests pass through it.
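the "least busy" part can be sketched in a few lines of go. this is a simplified model under my own assumptions (the balancer and backend types and the vm names are hypothetical), not what nginx or an aws load balancer actually runs. the point is only that counting in-flight requests is easy when every request passes through one place.

```go
package main

import (
	"fmt"
	"sync"
)

// backend is one server behind the load balancer.
type backend struct {
	addr   string
	active int // requests currently in flight
}

// balancer tracks in-flight requests per backend. It can only do this
// because every request goes through it first.
type balancer struct {
	mu       sync.Mutex
	backends []*backend
}

// newBalancer builds a balancer; this sketch assumes at least one address.
func newBalancer(addrs ...string) *balancer {
	b := &balancer{}
	for _, a := range addrs {
		b.backends = append(b.backends, &backend{addr: a})
	}
	return b
}

// acquire picks the backend with the fewest active requests and counts
// the new request against it. Ties go to the earliest backend in the list.
func (b *balancer) acquire() *backend {
	b.mu.Lock()
	defer b.mu.Unlock()
	best := b.backends[0]
	for _, be := range b.backends[1:] {
		if be.active < best.active {
			best = be
		}
	}
	best.active++
	return best
}

// release marks one request on be as finished.
func (b *balancer) release(be *backend) {
	b.mu.Lock()
	defer b.mu.Unlock()
	be.active--
}

func main() {
	lb := newBalancer("vm1", "vm2", "vm3")
	first := lb.acquire()
	second := lb.acquire()
	third := lb.acquire()
	lb.release(second) // vm2 finishes its request early
	fourth := lb.acquire()
	fmt.Println(first.addr, second.addr, third.addr, fourth.addr)
}
```

the fourth request goes back to vm2 because it is the only idle backend, whereas plain round-robin would have wrapped around to vm1.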
this was the part my brain got stuck on for the longest. i kept imagining api.example.com somehow fanning out to four servers with no clear mechanism in between.
the answer turned out to be simple: the domain points to the load balancer, not the backend servers.
api.example.com
↓ DNS
Public IP of Load Balancer
↓
Load Balancer chooses backend
backend servers are usually private. the client never even sees them. that was the second huge mental unlock.
another thing that confused me: what ips do we even put in nginx config? i initially assumed public ec2 ips, but in real setups it's usually private ips.
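upstream backend {
    # send each request to the server with the fewest active connections
    least_conn;
    # private addresses of the three backend instances
    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
}
those are internal aws network addresses. which means the architecture becomes: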
Internet
↓
Public Load Balancer
↓
Private AWS Network
↓
Backend Servers
way more secure. and finally i understood why backend servers often don't even need public ips.
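if you want to check whether an address is in a private range, go's standard library can tell you. a small sketch, assuming go 1.17 or newer (that is when net.IP.IsPrivate was added). the helper name isPrivateAddr is mine, and 8.8.8.8 is only there as a well-known public address for contrast.

```go
package main

import (
	"fmt"
	"net"
)

// isPrivateAddr reports whether s parses as an IP address in a private
// range (for IPv4: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).
// Anything that is not a valid IP is reported as not private.
func isPrivateAddr(s string) bool {
	ip := net.ParseIP(s)
	return ip != nil && ip.IsPrivate()
}

func main() {
	// The first three are the upstream addresses from the nginx config;
	// the last one is a public address for comparison.
	for _, addr := range []string{"10.0.1.10", "10.0.1.11", "10.0.1.12", "8.8.8.8"} {
		fmt.Printf("%-10s private=%v\n", addr, isPrivateAddr(addr))
	}
}
```

addresses in those ranges are not routable on the public internet, which is why only something inside the same private network, like the load balancer, can reach the backends directly.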
for some reason i subconsciously treated nginx like magical infrastructure. but nginx is literally just another running program (a master process plus a few worker processes), started by systemd, listening on ports 80/443, waiting for requests. same idea as:
http.ListenAndServe(":8080", nil)
except specialized for networking, proxying, load balancing, and connection handling.
this was the funniest realization. first, app servers become bottlenecks, so we add more app servers. then load balancers become bottlenecks, so we add more load balancers. then databases become bottlenecks, so we add replicas. then replicas need balancing, so we add database proxies and load balancers.
distributed systems basically became:
"this thing is overloaded"
↓
"put another thing in front of it"
repeated forever.
i used to hear primary database, read replicas, replication, and sharding without really visualizing it. eventually it clicked:
        Backend
        /     \
   Writes     Reads
      ↓         ↓
Primary DB   Replica DBs
writes go to the primary. replicas continuously sync by replaying the primary's write-ahead log (wal) in postgres, or an equivalent replication stream in other databases. then another realization hit me: if there are multiple replicas, something has to choose which replica serves reads. which means databases need load balancing too, the same architectural pattern again.
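the read/write split can be sketched as a tiny router. this is a toy model under my own assumptions: the router type and the 10.0.2.x addresses are hypothetical, it uses simple round-robin for reads, and it is not safe for concurrent use. real database proxies also deal with replication lag, failover, and health checks.

```go
package main

import "fmt"

// router sends writes to the primary and spreads reads across replicas.
// Not safe for concurrent use; a real one would guard next with a lock.
type router struct {
	primary  string
	replicas []string
	next     int // index of the replica that serves the next read
}

// forWrite always returns the primary, the only node that accepts writes.
func (r *router) forWrite() string {
	return r.primary
}

// forRead rotates through the replicas round-robin. With no replicas
// configured it falls back to the primary.
func (r *router) forRead() string {
	if len(r.replicas) == 0 {
		return r.primary
	}
	addr := r.replicas[r.next]
	r.next = (r.next + 1) % len(r.replicas)
	return addr
}

func main() {
	// Hypothetical private addresses, for illustration only.
	r := &router{
		primary:  "10.0.2.10:5432",
		replicas: []string{"10.0.2.11:5432", "10.0.2.12:5432"},
	}
	fmt.Println("write ->", r.forWrite())
	for i := 0; i < 3; i++ {
		fmt.Println("read  ->", r.forRead())
	}
}
```

the third read wraps back around to the first replica, which is the same "something in front picks a target" pattern as the load balancer, just one layer down.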
the most important thing i learned wasn't nginx or ec2. it was this: large systems are mostly layers of traffic routing.
dns routes to load balancers. load balancers route to backend servers. backends route to database proxies. database proxies route to replicas and shards. and every layer exists because one machine eventually becomes insufficient.
before this deployment, "horizontal scaling" was just a buzzword, "reverse proxy" was something i copied from tutorials, and "load balancer" was vague cloud magic. now i can actually visualize the flow.
that's the difference. not memorizing terminology — building the mental model underneath it.
and weirdly enough, all of this started because i deployed one tiny go backend on a t3.micro and kept asking: wait... but how does that actually work?