File Upload and Download APIs: Multipart, Presigned URLs, and Chunked Transfers
File handling is where many otherwise well-designed APIs cut corners. The result is integrations that work fine for small files and break under production conditions: uploads timing out on slow connections, downloads failing midway through large transfers, clients with no way to resume an interrupted operation. File upload and download have well-established patterns that handle these conditions correctly; most of them require explicit design choices rather than falling out of the defaults.
Simple Upload: When It Is Sufficient
For files under a few megabytes that users upload infrequently, a direct multipart form upload to your API is the simplest approach. The client sends a multipart/form-data POST request with the file as one part and any metadata as additional parts or as a separate JSON field:
POST /documents HTTP/1.1
Content-Type: multipart/form-data; boundary=----boundary

------boundary
Content-Disposition: form-data; name="file"; filename="report.pdf"
Content-Type: application/pdf

[binary file data]
------boundary
Content-Disposition: form-data; name="metadata"
Content-Type: application/json

{"title": "Q1 Report", "category": "finance"}
------boundary--
The API server receives the file, processes it (validates type and size, runs any transforms), stores it, and returns the resulting resource in the response. This is simple to implement and simple to use.
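As a concrete client-side illustration, here is a minimal sketch in Python using the requests library. The endpoint URL is hypothetical, mirroring the example request above:

import json
import requests

# Hypothetical endpoint matching the multipart example above.
url = "https://api.example.com/documents"

with open("report.pdf", "rb") as f:
    resp = requests.post(
        url,
        files={
            # Each part is (filename, file object or string, content type).
            "file": ("report.pdf", f, "application/pdf"),
            # A filename of None sends this part as a plain form field.
            "metadata": (None,
                         json.dumps({"title": "Q1 Report", "category": "finance"}),
                         "application/json"),
        },
    )
resp.raise_for_status()
print(resp.json())  # the created document resource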
The limitations appear at scale. Large files buffered during upload put pressure on application server memory. Slow client connections hold server resources open for extended periods. If the upload fails midway through a 500MB file, the client must start over. For most file handling use cases beyond small documents and images, direct upload is insufficient.
Presigned URLs: The Scalable Pattern
Presigned URLs move file storage off the API server entirely. The client asks the API for permission to upload a file; the API returns a time-limited URL pointing directly to object storage (S3, GCS, Azure Blob Storage). The client uploads directly to that URL. The API server never touches the file bytes.
The flow:
POST /upload-tokens HTTP/1.1
Content-Type: application/json

{"filename": "dataset.csv", "content_type": "text/csv", "size_bytes": 52428800}

HTTP/1.1 200 OK

{
  "upload_url": "https://storage.googleapis.com/bucket/dataset-uuid.csv?X-Goog-Signature=...",
  "expires_at": "2026-05-02T11:30:00Z",
  "file_id": "file_f4e8b2d1"
}
The client PUTs the file directly to the upload_url; the API server is never in the data path. When the transfer finishes, the client notifies the API that the upload is done:
POST /documents HTTP/1.1
Content-Type: application/json

{"file_id": "file_f4e8b2d1", "title": "Q1 Dataset"}
The API confirms the file exists in storage, runs validation, and creates the document record.
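Server-side, generating the time-limited URL is a single SDK call. A minimal sketch using boto3 against S3 (the flow above happens to show a GCS URL, but the mechanics are identical; the bucket name and expiry window here are assumptions):

import boto3

s3 = boto3.client("s3")

def create_upload_url(key: str, content_type: str) -> str:
    # Time-limited URL the client PUTs the file bytes to directly.
    return s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": "my-uploads", "Key": key, "ContentType": content_type},
        ExpiresIn=900,  # 15-minute window, matching expires_at above
    )

The client then uploads with a plain PUT, e.g. requests.put(upload_url, data=open("dataset.csv", "rb"), headers={"Content-Type": "text/csv"}).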
The advantages are substantial. Application servers are not involved in the data transfer — upload bandwidth and memory pressure move to the storage layer, which is designed for it. Uploads can be large without impacting API server resources. Client-to-storage transfer uses the storage provider’s infrastructure, which is often faster and more reliable than routing through an application server. The presigned URL expires after a short window, limiting the exposure of a leaked URL.
Chunked Upload for Large Files
For very large files — gigabytes rather than megabytes — neither direct upload nor single-request presigned URL upload is reliable. Any network interruption requires starting over. On slow or unreliable connections, the probability of completing a multi-gigabyte upload in a single uninterrupted request approaches zero.
Chunked upload breaks the file into smaller pieces that can be uploaded independently. If one chunk fails, only that chunk needs to be retried. If the entire upload is interrupted, it can be resumed from the last successful chunk.
The implementation follows an initiate-upload-complete pattern. The client initiates an upload session and receives an upload session ID. It then uploads chunks in order, each identified by its position (byte range), using the Content-Range header:
PUT /upload-sessions/sess_abc123 HTTP/1.1
Content-Range: bytes 0-5242879/52428800
Content-Length: 5242880

[5MB of binary data]

HTTP/1.1 308 Resume Incomplete
Range: bytes=0-5242879
308 Resume Incomplete is the response for a successfully received chunk that is not the final one: it confirms the range received and implicitly tells the client what to send next. (Strictly, HTTP standardized 308 as Permanent Redirect; the Resume Incomplete reading is a convention popularized by Google's resumable upload protocol, unambiguous within the context of an upload session.) When the final chunk is uploaded, the server responds with 200 or 201 indicating the upload is complete.
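The client-side upload loop is short. A sketch in Python, assuming the session endpoint above and a 5MB chunk size:

import os
import requests

CHUNK_SIZE = 5 * 1024 * 1024  # 5MB, matching the example above

def upload_in_chunks(session_url: str, path: str):
    total = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset < total:
            data = f.read(CHUNK_SIZE)
            end = offset + len(data) - 1
            resp = requests.put(
                session_url,
                data=data,
                headers={"Content-Range": f"bytes {offset}-{end}/{total}"},
            )
            if resp.status_code == 308:            # chunk accepted, more expected
                offset = end + 1
            elif resp.status_code in (200, 201):   # final chunk, upload complete
                return resp
            else:
                resp.raise_for_status()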
If the upload is interrupted, the client can query the session to find the highest confirmed byte offset and resume from there:
PUT /upload-sessions/sess_abc123 HTTP/1.1
Content-Range: bytes */52428800
Content-Length: 0

HTTP/1.1 308 Resume Incomplete
Range: bytes=0-20971519
The client sends only the remaining bytes. Google’s resumable upload protocol follows this pattern and is widely implemented as a reference.
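In code, the resume query maps to a small helper that asks the server what it has and returns the next byte offset to send; a sketch under the same assumptions as the upload loop above:

import requests

def next_offset(session_url: str, total: int) -> int:
    # Empty-body status check: "Content-Range: bytes */<total>"
    resp = requests.put(
        session_url,
        headers={"Content-Range": f"bytes */{total}", "Content-Length": "0"},
    )
    if resp.status_code in (200, 201):
        return total                       # upload already complete
    rng = resp.headers.get("Range")        # e.g. "bytes=0-20971519"
    if rng:
        return int(rng.rsplit("-", 1)[-1]) + 1
    return 0                               # nothing received yet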
Download: Range Requests and Partial Content
Downloads have a parallel concern: large files on unreliable connections need resumable download support. HTTP provides this through range requests.
The client includes a Range header specifying the byte range it wants:
GET /documents/file_f4e8b2d1/content HTTP/1.1
Range: bytes=10485760-20971519
If the server supports range requests, it responds with 206 Partial Content and the requested byte range:
HTTP/1.1 206 Partial Content
Content-Range: bytes 10485760-20971519/52428800
Content-Length: 10485760
Accept-Ranges: bytes

[10MB of binary data]
Accept-Ranges: bytes in the response (on any request, not just range requests) signals that the server supports partial content. Clients that support resumable downloads check for this header before attempting resumption.
Range requests also enable parallel download: a client can request different byte ranges simultaneously from multiple connections and assemble the pieces locally. This is how download managers and some browsers accelerate large file downloads.
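Resuming an interrupted download follows directly from these headers: check how many bytes are already on disk, request the rest, and fall back to a full download if the server ignores the Range header. A minimal sketch in Python:

import os
import requests

def resume_download(url: str, path: str):
    have = os.path.getsize(path) if os.path.exists(path) else 0
    headers = {"Range": f"bytes={have}-"} if have else {}
    with requests.get(url, headers=headers, stream=True) as resp:
        if resp.status_code == 206:
            mode = "ab"        # server honored the range; append
        elif resp.status_code == 200:
            mode = "wb"        # no range support; restart from byte zero
        else:
            resp.raise_for_status()
            return
        with open(path, mode) as f:
            for chunk in resp.iter_content(chunk_size=64 * 1024):
                f.write(chunk)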
Content-Type Validation and Security
File uploads are a security surface. Accept only the content types your application is designed to handle, validate them server-side rather than trusting the client-provided content type, and never serve user-uploaded content from the same origin as your application without isolation.
Malicious file uploads exploit mismatches between what an API accepts and what browsers will execute. An HTML file uploaded as text/plain and served from your API’s origin will execute as HTML if a browser retrieves it. Serve user-uploaded content from a separate origin (a storage bucket with a distinct domain) rather than from your application server. Never set Content-Type based on the client’s claim — detect it from the file’s actual bytes using a library like libmagic.
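With the python-magic bindings for libmagic, that detection is a few lines. The allow-list here is an assumption; tailor it to the types your application actually handles:

import magic  # python-magic, bindings for libmagic

ALLOWED_TYPES = {"application/pdf", "text/csv", "image/png"}  # assumed allow-list

def validate_content_type(leading_bytes: bytes) -> str:
    # Detect the MIME type from the bytes themselves,
    # ignoring whatever Content-Type the client claimed.
    detected = magic.from_buffer(leading_bytes, mime=True)
    if detected not in ALLOWED_TYPES:
        raise ValueError(f"unsupported content type: {detected}")
    return detected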
Validate file size on initiation, before any bytes are transferred. A request claiming a 100GB upload should be rejected immediately with a 413 Payload Too Large, not after 100GB has been received.
File handling done right is substantially more work than file handling done quickly. The patterns — presigned URLs, chunked upload, range downloads, content validation — each exist because a simpler approach fails under real conditions. The investment pays off the first time a user on a slow connection successfully uploads a large file that would have failed with a direct upload.