
How Files Get Uploaded in Production (S3, Presigned URLs & CDN)

Srikanth Badavath · April 2026 · 14 min read

Most developers learn file upload by sending the file to their backend, saving it to disk or a database, and calling it done. That works for a demo. In production with millions of users uploading videos, images, and documents, it becomes a catastrophic bottleneck — your backend becomes the world's most expensive file forwarder.

Production systems never touch the file bytes in the backend. Here is exactly how they do it.

The Naive Way (and Why It Fails)

When you upload a file through a traditional backend, this is what happens:

sequenceDiagram
  participant U as User / Browser
  participant B as App Server (Backend)
  participant D as Database
  U->>B: POST /upload (entire file in body)
  Note over B: 🔴 File bytes pass through<br/>your server RAM & network
  B->>B: Save file to disk / process
  B->>D: Store file path
  B->>U: 200 OK

The problems compound fast:

- Memory: every in-flight upload buffers file bytes in your server's RAM, so a handful of large uploads can exhaust an instance.
- Bandwidth: every byte is paid for twice (user to server, then server to storage).
- Timeouts: large files blow past request timeouts on load balancers and app servers.
- Scaling: upload traffic forces you to scale expensive app servers to do a job object storage already does better.

Rule: Your app server should process metadata — never file bytes. Object storage is purpose-built for binary data at scale.

What Is Object Storage (S3)?

Amazon S3 (Simple Storage Service) and its equivalents — Google Cloud Storage, Azure Blob Storage, Cloudflare R2 — are flat-namespace key-value stores for binary blobs. There is no folder hierarchy, no filesystem, no inode table. A file is just a key (like uploads/user-42/avatar.jpg) pointing to bytes.
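To make the flat-namespace idea concrete, here is a toy sketch: "folders" are nothing but key prefixes you filter on, which is exactly how S3's ListObjectsV2 `Prefix` parameter behaves. (The `Map` here is a stand-in for the bucket, not a real client.)

```javascript
// Toy model of a flat object store: one Map, keys are full paths.
// There are no real directories; 'uploads/user-42/' is just a string prefix.
const store = new Map();

store.set('uploads/user-42/avatar.jpg', Buffer.from('...jpeg bytes...'));
store.set('uploads/user-42/cover.png',  Buffer.from('...png bytes...'));
store.set('uploads/user-99/avatar.jpg', Buffer.from('...jpeg bytes...'));

// "Listing a folder" is really a prefix filter over all keys
function listByPrefix(prefix) {
  return [...store.keys()].filter((k) => k.startsWith(prefix));
}

console.log(listByPrefix('uploads/user-42/'));
// both of user-42's keys, none of user-99's
```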

Provider     | Service       | Egress Cost | Free Tier
AWS          | S3            | ~$0.09/GB   | 5GB storage, 20K GET
Google Cloud | Cloud Storage | ~$0.12/GB   | 5GB storage
Cloudflare   | R2            | $0 egress   | 10GB storage
Backblaze    | B2            | ~$0.01/GB   | 10GB storage

S3 can handle 5,500 GET requests/second per prefix and objects up to 5TB. No server you run will match that.

The Core Idea: Presigned URLs

A presigned URL is a time-limited, pre-authorized URL that lets a client upload or download directly to/from S3 — without any AWS credentials in the browser.

Think of it like a valet ticket. The valet company (your backend) issues a ticket (presigned URL) that allows exactly one specific car (file) to be parked (uploaded) for exactly 15 minutes. The valet doesn't need to drive the car through a checkpoint — the ticket itself carries the authorization.
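The authorization really does live in the URL itself: a SigV4 presigned URL carries the credential scope, timestamp, expiry, and signature as query parameters. A sketch of pulling one apart (the bucket name and signature below are dummy placeholders, not real values):

```javascript
// A presigned URL with dummy values; the shape is what matters here.
const presigned =
  'https://example-bucket.s3.us-east-1.amazonaws.com/uploads/user-42/avatar.jpg' +
  '?X-Amz-Algorithm=AWS4-HMAC-SHA256' +
  '&X-Amz-Credential=AKIAEXAMPLE%2F20260401%2Fus-east-1%2Fs3%2Faws4_request' +
  '&X-Amz-Date=20260401T120000Z' +
  '&X-Amz-Expires=900' +
  '&X-Amz-SignedHeaders=host' +
  '&X-Amz-Signature=deadbeef';

const params = new URL(presigned).searchParams;
console.log(params.get('X-Amz-Expires'));   // -> '900', the 15-minute window
console.log(params.get('X-Amz-Signature')); // the HMAC that S3 verifies on the PUT
```

Anyone holding this URL can perform exactly one operation (PUT to that key) until `X-Amz-Expires` elapses, which is why the expiry is kept short.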

sequenceDiagram
  participant U as User / Browser
  participant B as App Server
  participant S as S3 / Object Storage
  participant D as Database
  U->>B: POST /get-upload-url {filename, contentType}
  Note over B: Validates user auth<br/>Generates presigned URL<br/>(never touches file bytes)
  B->>S: GeneratePresignedURL(key, expiry=15min)
  S-->>B: presigned URL
  B-->>U: { uploadUrl, fileKey }
  U->>S: PUT presignedUrl (file bytes directly)
  Note over S: ✅ File stored in S3<br/>Backend never sees bytes
  S-->>U: 200 OK
  U->>B: POST /confirm-upload { fileKey }
  B->>D: INSERT file_url INTO uploads
  B-->>U: { fileUrl, success: true }

The backend signs the URL using your AWS secret key (stored server-side, never exposed), then hands the signed URL to the client. S3 validates the signature on the PUT request — your server is completely out of the upload path.

Upload Flow: Full Code

Step 1 — Backend generates the presigned URL (Node.js)

const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');
const { getSignedUrl } = require('@aws-sdk/s3-request-presigner');
const { v4: uuidv4 } = require('uuid');

const s3 = new S3Client({
  region: process.env.AWS_REGION,       // e.g. 'us-east-1'
  credentials: {
    accessKeyId:     process.env.AWS_ACCESS_KEY_ID,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
  },
});

app.post('/get-upload-url', authenticate, async (req, res) => {
  const { filename, contentType } = req.body;

  // Validate content type — never trust the client
  const allowed = ['image/jpeg', 'image/png', 'image/webp', 'video/mp4'];
  if (!allowed.includes(contentType)) {
    return res.status(400).json({ error: 'File type not allowed' });
  }

  // Build a unique, collision-proof key
  const ext  = filename.split('.').pop();
  const key  = `uploads/${req.user.id}/${uuidv4()}.${ext}`;

  const command = new PutObjectCommand({
    Bucket:      process.env.S3_BUCKET,
    Key:         key,
    ContentType: contentType,
    // Note: a presigned PUT cannot enforce a size range; to cap upload size
    // server-side, use a presigned POST policy with content-length-range
    // ContentLength: 10 * 1024 * 1024,   // signs an exact length, not a max
  });

  // URL expires in 15 minutes
  const uploadUrl = await getSignedUrl(s3, command, { expiresIn: 900 });

  res.json({ uploadUrl, key });
});

Step 2 — Frontend uploads directly to S3 (JavaScript)

async function uploadFile(file) {
  // 1. Ask your backend for a presigned URL
  const { uploadUrl, key } = await fetch('/get-upload-url', {
    method:  'POST',
    headers: { 'Content-Type': 'application/json' },
    body:    JSON.stringify({
      filename:    file.name,
      contentType: file.type,
    }),
  }).then(r => r.json());

  // 2. PUT the file directly to S3 — backend is NOT involved
  const upload = await fetch(uploadUrl, {
    method:  'PUT',
    headers: { 'Content-Type': file.type },
    body:    file,   // raw File object, no FormData needed
  });

  if (!upload.ok) throw new Error('Upload to S3 failed');

  // 3. Tell your backend the upload is confirmed
  const result = await fetch('/confirm-upload', {
    method:  'POST',
    headers: { 'Content-Type': 'application/json' },
    body:    JSON.stringify({ key }),
  }).then(r => r.json());

  return result.fileUrl;   // permanent CDN URL
}
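Before even asking the backend for a URL, it is worth rejecting obviously bad files in the browser. This is a UX nicety that mirrors, but never replaces, the server-side check; the allowed types and 10MB cap below are the same assumptions used in the backend example above:

```javascript
const ALLOWED_TYPES = ['image/jpeg', 'image/png', 'image/webp', 'video/mp4'];
const MAX_BYTES = 10 * 1024 * 1024; // 10MB, matching whatever the server enforces

// Pure pre-check: returns an error string, or null if the file looks OK.
// UX nicety only; the server must re-validate, since clients can lie.
function preflightCheck({ name, type, size }) {
  if (!ALLOWED_TYPES.includes(type)) return `Type ${type} not allowed`;
  if (size > MAX_BYTES) return `File too large (${size} bytes)`;
  if (!name.includes('.')) return 'Filename has no extension';
  return null;
}

console.log(preflightCheck({ name: 'a.jpg', type: 'image/jpeg', size: 1024 })); // null
console.log(preflightCheck({ name: 'a.exe', type: 'application/x-msdownload', size: 1024 }));
```

Call it at the top of `uploadFile(file)` and bail early on a non-null result, saving a round trip to the backend for files that would be rejected anyway.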

Step 3 — Backend confirms and stores the URL (Node.js)

app.post('/confirm-upload', authenticate, async (req, res) => {
  const { key } = req.body;

  // Verify the object actually exists in S3 before trusting the client
  const { HeadObjectCommand } = require('@aws-sdk/client-s3');
  try {
    await s3.send(new HeadObjectCommand({
      Bucket: process.env.S3_BUCKET,
      Key:    key,
    }));
  } catch {
    return res.status(400).json({ error: 'File not found in storage' });
  }

  // Build the permanent CDN URL
  const fileUrl = `https://${process.env.CDN_DOMAIN}/${key}`;

  // Persist in database
  await db.query(
    'INSERT INTO user_files (user_id, key, url) VALUES ($1, $2, $3)',
    [req.user.id, key, fileUrl]
  );

  res.json({ fileUrl, success: true });
});

Same flow in Python (FastAPI + boto3)

import boto3, uuid, os
from fastapi import FastAPI, Depends, HTTPException
from pydantic import BaseModel

s3 = boto3.client(
    's3',
    region_name          = os.environ['AWS_REGION'],
    aws_access_key_id    = os.environ['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key= os.environ['AWS_SECRET_ACCESS_KEY'],
)

app = FastAPI()

class UploadRequest(BaseModel):
    filename:    str
    content_type: str

@app.post('/get-upload-url')
def get_upload_url(payload: UploadRequest, user=Depends(get_current_user)):
    allowed = {'image/jpeg', 'image/png', 'image/webp', 'video/mp4'}
    if payload.content_type not in allowed:
        raise HTTPException(400, 'File type not allowed')

    ext = payload.filename.rsplit('.', 1)[-1]
    key = f"uploads/{user.id}/{uuid.uuid4()}.{ext}"

    url = s3.generate_presigned_url(
        'put_object',
        Params={
            'Bucket':      os.environ['S3_BUCKET'],
            'Key':         key,
            'ContentType': payload.content_type,
        },
        ExpiresIn=900,   # 15 minutes
    )
    return {'upload_url': url, 'key': key}

Tracking Upload Progress

Because the client uploads directly to S3, the standard fetch API gives you no progress events. Use XMLHttpRequest instead:

function uploadWithProgress(file, presignedUrl, onProgress) {
  return new Promise((resolve, reject) => {
    const xhr = new XMLHttpRequest();

    xhr.upload.addEventListener('progress', (e) => {
      if (e.lengthComputable) {
        const pct = Math.round((e.loaded / e.total) * 100);
        onProgress(pct);   // update your progress bar
      }
    });

    xhr.addEventListener('load',  () => (xhr.status >= 200 && xhr.status < 300) ? resolve() : reject(xhr.status));
    xhr.addEventListener('error', reject);

    xhr.open('PUT', presignedUrl);
    xhr.setRequestHeader('Content-Type', file.type);
    xhr.send(file);
  });
}

// Usage
await uploadWithProgress(file, uploadUrl, (pct) => {
  progressBar.style.width = pct + '%';
  label.textContent = pct + '%';
});

Delivery Flow: CDN Caching

Uploading is only half the story. When users download a file, you don't want every request hitting S3 either. That's where a CDN (Content Delivery Network) comes in.

flowchart TD
  U([User requests file]) --> CH{CDN Cache?}
  CH -- Cache HIT --> SU([Serve to User<br/>~5ms from edge node])
  CH -- Cache MISS --> OS[(S3 Object Storage)]
  OS --> CDN[CDN caches the file<br/>at edge node]
  CDN --> SU2([Serve to User])
  style CH fill:#1a1a2e,stroke:#00c2ff
  style SU fill:#0d6e4e,stroke:#00e676,color:#fff
  style SU2 fill:#0d6e4e,stroke:#00e676,color:#fff
  style OS fill:#1877F2,stroke:#00c2ff,color:#fff

A cache hit serves the file from a CDN edge node geographically close to the user — often sub-10ms. A cache miss fetches from S3 once, then caches it so every future request is a hit.
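You can observe hits and misses yourself: CloudFront reports the outcome in the `X-Cache` response header (values like "Hit from cloudfront" and "Miss from cloudfront"). A small helper to classify a response, written over a plain headers object so it stays pure and testable:

```javascript
// CloudFront sets X-Cache to e.g. 'Hit from cloudfront', 'Miss from cloudfront',
// or 'RefreshHit from cloudfront'. Classify a plain headers object
// (lowercased keys, as fetch's response.headers exposes them).
function cdnCacheStatus(headers) {
  const xcache = (headers['x-cache'] || '').toLowerCase();
  if (xcache.includes('miss')) return 'MISS';
  if (xcache.includes('hit'))  return 'HIT';
  return 'UNKNOWN'; // no CDN in front, or a different CDN's header scheme
}

console.log(cdnCacheStatus({ 'x-cache': 'Hit from cloudfront' }));  // 'HIT'
console.log(cdnCacheStatus({ 'x-cache': 'Miss from cloudfront' })); // 'MISS'
```

Fetching the same file twice from a cold edge node and comparing the two headers is a quick way to confirm your distribution is actually caching.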

CloudFront in front of S3 (AWS)

# terraform config — CloudFront distribution in front of S3
resource "aws_cloudfront_distribution" "cdn" {
  origin {
    domain_name = aws_s3_bucket.uploads.bucket_regional_domain_name
    origin_id   = "s3-uploads"

    s3_origin_config {
      origin_access_identity = aws_cloudfront_origin_access_identity.oai.cloudfront_access_identity_path
    }
  }

  default_cache_behavior {
    target_origin_id       = "s3-uploads"
    viewer_protocol_policy = "redirect-to-https"
    cached_methods         = ["GET", "HEAD"]
    allowed_methods        = ["GET", "HEAD"]

    forwarded_values {
      query_string = false
      cookies { forward = "none" }
    }

    # TTLs: 1 day by default, up to 1 year. Safe because each upload gets a
    # unique key, so a given URL's content never changes.
    min_ttl     = 0
    default_ttl = 86400
    max_ttl     = 31536000
  }

  restrictions {
    geo_restriction { restriction_type = "none" }
  }

  viewer_certificate {
    cloudfront_default_certificate = true
  }
}

Once CloudFront is set up, your fileUrl becomes https://d1abc.cloudfront.net/uploads/user-42/photo.jpg instead of the raw S3 URL. Users never hit S3 directly.

Large Files: Multipart Upload

Presigned URLs work for files up to 5GB in a single PUT. For larger files, or over unreliable networks, S3's multipart upload splits the file into parts (minimum 5MB each, except the last) that are uploaded independently, optionally in parallel, and reassembled by S3.
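On the client, splitting is plain `Blob.slice` arithmetic. A sketch of computing the part boundaries (5MB parts; only the last part may be smaller, per S3's multipart rules):

```javascript
const PART_SIZE = 5 * 1024 * 1024; // 5MB, S3's minimum for every part except the last

// Compute [start, end) byte ranges; in the browser you would then do
// file.slice(start, end) and PUT each slice to its presigned part URL.
function partRanges(totalBytes, partSize = PART_SIZE) {
  const ranges = [];
  for (let start = 0; start < totalBytes; start += partSize) {
    ranges.push({
      partNumber: ranges.length + 1,
      start,
      end: Math.min(start + partSize, totalBytes),
    });
  }
  return ranges;
}

// A 12MB file yields parts of 5MB, 5MB, 2MB
console.log(partRanges(12 * 1024 * 1024));
```

`ranges.length` tells you how many presigned part URLs to request from the backend, and a failed part can be retried alone without restarting the whole upload.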

sequenceDiagram
  participant C as Client
  participant B as Backend
  participant S as S3
  C->>B: Initiate multipart upload
  B->>S: CreateMultipartUpload
  S-->>B: uploadId
  B-->>C: uploadId + presigned URLs per part
  par Upload in parallel
    C->>S: PUT Part 1 (5MB)
    C->>S: PUT Part 2 (5MB)
    C->>S: PUT Part 3 (remainder)
  end
  S-->>C: ETag per part
  C->>B: Complete { uploadId, parts + ETags }
  B->>S: CompleteMultipartUpload
  S-->>B: Final file URL
  B-->>C: fileUrl

// Multipart upload — Node.js backend generating all part URLs at once
const {
  CreateMultipartUploadCommand,
  UploadPartCommand,
  CompleteMultipartUploadCommand,
} = require('@aws-sdk/client-s3');

app.post('/start-multipart', authenticate, async (req, res) => {
  const { key, contentType, partCount } = req.body;

  // Initiate
  const { UploadId } = await s3.send(new CreateMultipartUploadCommand({
    Bucket:      process.env.S3_BUCKET,
    Key:         key,
    ContentType: contentType,
  }));

  // Generate a presigned URL for each part
  const partUrls = await Promise.all(
    Array.from({ length: partCount }, (_, i) =>
      getSignedUrl(s3, new UploadPartCommand({
        Bucket:     process.env.S3_BUCKET,
        Key:        key,
        UploadId,
        PartNumber: i + 1,
      }), { expiresIn: 3600 })
    )
  );

  res.json({ uploadId: UploadId, partUrls });
});
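The flow finishes with `CompleteMultipartUploadCommand`, which requires every part's ETag paired with its PartNumber, sorted ascending. Since parallel PUTs complete in arbitrary order, it helps to normalize the client's payload first. A pure helper building the command's params (the endpoint wiring around it follows the same pattern as `/start-multipart` above):

```javascript
// CompleteMultipartUploadCommand expects:
//   { Bucket, Key, UploadId, MultipartUpload: { Parts: [{ ETag, PartNumber }, ...] } }
// with Parts sorted by PartNumber. Parts arrive from the client in whatever
// order the parallel PUTs finished, so sort and reshape before sending.
function buildCompleteParams(bucket, key, uploadId, parts) {
  return {
    Bucket:   bucket,
    Key:      key,
    UploadId: uploadId,
    MultipartUpload: {
      Parts: [...parts]
        .sort((a, b) => a.partNumber - b.partNumber)
        .map((p) => ({ ETag: p.etag, PartNumber: p.partNumber })),
    },
  };
}

// In the /complete-multipart handler (sketch):
//   await s3.send(new CompleteMultipartUploadCommand(buildCompleteParams(...)));
console.log(buildCompleteParams('my-bucket', 'uploads/u1/big.mp4', 'abc123', [
  { partNumber: 2, etag: '"e2"' },
  { partNumber: 1, etag: '"e1"' },
]).MultipartUpload.Parts);
```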

Security Considerations

Risk | Mitigation
Client uploads a malicious file (e.g. .exe) | Validate ContentType server-side; serve files with safe Content-Type / Content-Disposition headers; virus-scan on upload via a Lambda trigger
Client uploads to another user's key | Always prefix the key with user.id; generate the key server-side, never trust a client-provided key
Presigned URL leaked / reused | Keep expiry short (15 min); note that a presigned URL is NOT single-use and can be replayed until it expires, so verify the upload (HeadObject) and treat each key as write-once
S3 bucket accidentally public | Enable "Block All Public Access" at the account level; serve through CloudFront only
Unbounded file size | Set content-length-range in a presigned POST policy; or enforce limits at CloudFront with WAF

S3 Bucket Policy — deny direct public access, allow CloudFront only

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCloudFrontOnly",
      "Effect": "Allow",
      "Principal": {
        "Service": "cloudfront.amazonaws.com"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::your-bucket-name/*",
      "Condition": {
        "StringEquals": {
          "AWS:SourceArn": "arn:aws:cloudfront::ACCOUNT_ID:distribution/DIST_ID"
        }
      }
    }
  ]
}

Post-Upload Processing with Lambda

A common pattern: trigger an AWS Lambda function automatically when a file lands in S3 — for image resizing, video transcoding, virus scanning, or thumbnail generation.

flowchart LR
  U([User]) -->|PUT presigned| S3[(S3 Bucket)]
  S3 -->|S3 Event Notification| L[Lambda Function]
  L -->|Resize / Transcode| S3T[(S3 thumbnails/)]
  L -->|Update status| DB[(Database)]
  L -->|Notify user| SQS[SQS / SNS]
  style S3 fill:#1877F2,stroke:#00c2ff,color:#fff
  style L fill:#ff9900,stroke:#ffb74d,color:#000
  style DB fill:#420177,stroke:#b39ddb,color:#fff

# Lambda triggered on S3 PutObject — Python image resizer
import io
import boto3
from urllib.parse import unquote_plus  # S3 event keys are URL-encoded
from PIL import Image

s3 = boto3.client('s3')

def handler(event, context):
    for record in event['Records']:
        src_bucket = record['s3']['bucket']['name']
        src_key    = unquote_plus(record['s3']['object']['key'])

        # Download original
        obj  = s3.get_object(Bucket=src_bucket, Key=src_key)
        img  = Image.open(io.BytesIO(obj['Body'].read()))

        # Create thumbnail (convert to RGB first, since JPEG cannot store alpha)
        img = img.convert('RGB')
        img.thumbnail((200, 200))
        buf = io.BytesIO()
        img.save(buf, format='JPEG', quality=85)
        buf.seek(0)

        # Upload thumbnail to a different prefix
        thumb_key = src_key.replace('uploads/', 'thumbnails/')
        s3.put_object(
            Bucket=src_bucket,
            Key=thumb_key,
            Body=buf,
            ContentType='image/jpeg',
        )

    return {'statusCode': 200}
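Wiring the trigger is a one-time piece of infrastructure config. A hedged Terraform sketch, reusing the `aws_s3_bucket.uploads` name from the CloudFront example and assuming the function above is defined as `aws_lambda_function.thumbnailer`:

```hcl
# Fire the Lambda on every new object under uploads/
resource "aws_s3_bucket_notification" "on_upload" {
  bucket = aws_s3_bucket.uploads.id

  lambda_function {
    lambda_function_arn = aws_lambda_function.thumbnailer.arn
    events              = ["s3:ObjectCreated:*"]
    filter_prefix       = "uploads/"   # don't re-trigger on thumbnails/ writes
  }
}

# S3 must be allowed to invoke the function
resource "aws_lambda_permission" "allow_s3" {
  statement_id  = "AllowS3Invoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.thumbnailer.function_name
  principal     = "s3.amazonaws.com"
  source_arn    = aws_s3_bucket.uploads.arn
}
```

The `filter_prefix` matters: the function writes its output under `thumbnails/`, and without the prefix filter each thumbnail write would trigger the Lambda again in an infinite loop.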

Complete Production Architecture

flowchart TD
  subgraph Client
    B([Browser / Mobile App])
  end
  subgraph Backend
    AS[App Server<br/>Node.js / Python]
    DB[(PostgreSQL<br/>Stores metadata + URLs)]
  end
  subgraph AWS
    S3[(S3 Bucket<br/>Object Storage)]
    CF[CloudFront CDN<br/>Global Edge Cache]
    LB[Lambda<br/>Post-process / Resize]
    SQS[SQS Queue<br/>Async notifications]
  end
  B -->|1 - Request presigned URL| AS
  AS -->|2 - GeneratePresignedURL| S3
  S3 -->|3 - Presigned URL| AS
  AS -->|4 - Return URL| B
  B -->|5 - PUT file directly| S3
  B -->|6 - Confirm upload| AS
  AS -->|7 - Verify + store URL| DB
  S3 -->|Event trigger| LB
  LB -->|Thumbnail / transcode| S3
  LB -->|Notify| SQS
  B -->|Future downloads| CF
  CF -->|Cache miss → fetch| S3
  style S3 fill:#1877F2,stroke:#00c2ff,color:#fff
  style CF fill:#FF9900,stroke:#ffb74d,color:#000
  style LB fill:#FF9900,stroke:#ffb74d,color:#000
  style DB fill:#420177,stroke:#b39ddb,color:#fff
  style AS fill:#0d6e4e,stroke:#00e676,color:#fff

Naive Backend Upload vs Presigned URL

Factor | Backend Upload (naive) | Presigned URL (production)
Backend memory per upload | Entire file in RAM | 0 bytes
Backend bandwidth cost | User→Server + Server→S3 | User→S3 directly
Max file size | Server timeout / RAM | 5GB (single PUT), 5TB (multipart)
Scalability | Bottleneck at backend | S3 scales horizontally across key prefixes
Concurrency | Limited by backend instances | Effectively unlimited (S3 handles it)
Upload speed | Two hops | One hop, direct to S3
AWS credentials in browser | N/A | Never — signed server-side

60-Second Summary

Upload Flow:
1. Client asks backend for a presigned URL → backend generates it using AWS credentials (never exposed to client)
2. Client PUTs the file directly to S3 using that URL — backend never sees the bytes
3. Client tells backend "upload done" → backend does a HeadObject check, then stores the CDN URL in the database

Delivery Flow:
All reads go through CloudFront CDN. Cache hit = served from edge node in milliseconds. Cache miss = fetched from S3 once, then cached for future requests.

Post-processing:
S3 event triggers Lambda automatically on upload — resize images, transcode video, scan for malware, generate thumbnails — all asynchronously, zero impact on upload latency.