Sending Data: Forms, Files, and Payloads

So far, we've been asking servers for things. But the web is interactive - we need to send data too. Every time you log in, post a comment, upload a profile picture, or update your settings, you're sending data to a server.

The fascinating part? There's no special protocol for this. No complex binary format. Just the same HTTP conversation we've been learning, with your data riding along as cargo.

And here's where it gets interesting: servers often trust that the data you're sending is what you claim it is...

Data in the Request

While GET asks for things, POST changes things. It's how you tell the server: "Hey, I have something for you."

POST /login HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 29

username=admin&password=12345

But all HTTP methods can send data. The difference is in intent, convention, and where that data typically lives:

GET: "Give me something" - data usually in URL parameters (?search=cats&page=2)
POST: "Here's something new" - data usually in body (create a comment, submit a form)
PUT: "Replace whatever's at this URL with this data" - data in body (update entire profile)
PATCH: "Just change these specific parts" - data in body (update email address only)
DELETE: "Remove this thing" - data in URL parameters or body (specify what to delete)

Each verb is a different claim about what you want to happen.

GET Sends Data Too

Don't let anyone tell you GET doesn't send data - it absolutely does! Every time you search, filter, or paginate, you're likely sending data via query parameters:

GET /search?q=hacking&category=books&sort=newest HTTP/1.1
Host: bookstore.com

That's three pieces of data (q, category, sort) traveling with your request. Query parameters are just as much "data" as anything in a POST body.
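As a rough illustration, Python's standard urllib.parse module can pull those three parameters back out of the request target and rebuild it. Only the URL from the example above is used; the variable names are just for this sketch.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Request target from the example above.
target = "/search?q=hacking&category=books&sort=newest"

parts = urlsplit(target)
params = dict(parse_qsl(parts.query))   # split on & and =, then percent-decode
print(params)

# Going the other way: build a query string from a dict.
rebuilt = parts.path + "?" + urlencode(params)
print(rebuilt)
```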

HTTP methods express intent, not technical capability. Any method can carry data anywhere - in the URL, headers, or body. But conventions exist for good reasons, and breaking them confuses both servers and other developers. Most servers will ignore GET request bodies, but some APIs use them for complex queries. DELETE with a body is more common in APIs that need additional context for the deletion.

From HTML Forms to HTTP Requests

When you fill out a form on a webpage and click submit, your browser performs a bit of magic. It takes all those input fields and transforms them into the body of an HTTP request. Understanding this transformation is key to seeing how data really travels.

Consider this simple login form:

<form action="/login" method="POST">
  <input type="text" name="username" value="alice">
  <input type="password" name="password" value="secret123">
  <input type="submit" value="Log In">
</form>

When you click "Log In", the browser reads every input field that has a name attribute (the submit button here has none, so it is skipped), takes the name and value of each, and constructs a request body:

POST /login HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 33

username=alice&password=secret123

The browser automatically:
- Sets the method to POST (from the form's method attribute)
- Sets the path to /login (from the form's action attribute)
- Adds the Content-Type: application/x-www-form-urlencoded header
- Calculates the Content-Length
- Builds the body by joining name=value pairs with &

It's not magic - it's just string concatenation with some encoding rules.
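To make the "string concatenation" point concrete, here is a minimal sketch of roughly what the browser does for a simple form. The function name build_form_request is hypothetical, and a real browser adds many more headers than this.

```python
from urllib.parse import urlencode

def build_form_request(action, host, fields):
    """Assemble a raw HTTP/1.1 POST roughly the way a browser would for a simple form."""
    body = urlencode(fields)                   # name=value pairs joined with &
    body_bytes = body.encode("utf-8")
    headers = [
        f"POST {action} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body_bytes)}",  # counted in bytes, not characters
    ]
    # Headers and body are separated by one blank line (CRLF CRLF).
    return "\r\n".join(headers) + "\r\n\r\n" + body

request = build_form_request(
    "/login", "example.com", {"username": "alice", "password": "secret123"}
)
print(request)
```

Computing Content-Length from the encoded body, rather than typing it by hand, is the easy way to keep the header and the payload in agreement.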

Content-Type

The Content-Type header is your promise to the server about what format your data is in. It's like placing a label on a container: "This jar contains pickles, not jam" - so whoever opens it knows what to expect inside.

But here's the crucial part: servers use the Content-Type header to decide which parser to invoke. Tell the server you're sending JSON, and it'll fire up the JSON parser. Say it's XML, and the XML parser springs into action. This seemingly simple header is actually a critical routing decision that determines how your data will be interpreted.
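A hedged sketch of that routing decision, using only Python's standard library. The parse_body function is hypothetical and far simpler than what a real framework does, but the shape is the same: the header picks the parser.

```python
import json
from urllib.parse import parse_qs

def parse_body(content_type, body):
    """Pick a parser based only on the declared Content-Type (hypothetical server logic)."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return json.loads(body)
    if media_type == "application/x-www-form-urlencoded":
        # parse_qs returns a list per field; keep the first value for simplicity.
        return {name: values[0] for name, values in parse_qs(body).items()}
    if media_type == "text/plain":
        return body
    raise ValueError(f"unsupported Content-Type: {media_type}")

print(parse_body("application/json", '{"username": "alice"}'))
print(parse_body("application/x-www-form-urlencoded", "username=alice+smith"))
```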

Common Content Types

application/x-www-form-urlencoded

The default for HTML forms. Data looks like URL parameters:

POST /login HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 44

username=alice+smith&password=my%26secret%21

Notice the encoding: spaces become +, & becomes %26, ! becomes %21. That's because & is used as a delimiter between fields - if your actual data contains an &, it better be encoded as %26 or the server will think it's seeing a new field!
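The encoding rules can be checked with Python's urllib.parse. This sketch encodes the same two fields, decodes them again, and then shows what a parser sees when the & is left unencoded. The variable names are illustrative only.

```python
from urllib.parse import urlencode, parse_qs

fields = {"username": "alice smith", "password": "my&secret!"}

body = urlencode(fields)
print(body)  # username=alice+smith&password=my%26secret%21

# Decoding reverses the process and recovers the original values.
decoded = parse_qs(body)
print(decoded)

# Leave the & unencoded and the parser splits the password in two.
# The stray "secret!" has no "=", so parse_qs drops it by default.
broken = parse_qs("username=alice+smith&password=my&secret!")
print(broken)
```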

application/json

The modern favorite. Clean, structured, explicit:

POST /api/login HTTP/1.1
Host: example.com
Content-Type: application/json
Content-Length: 45

{"username":"alice","password":"secret&safe"}

No encoding needed for that & - JSON has its own rules.

text/plain

Sometimes you just want to send raw text:

POST /note HTTP/1.1
Host: example.com
Content-Type: text/plain
Content-Length: 30

Remember to buy milk & cookies

When Content-Type Lies

What happens when your Content-Type says one thing but your data is another?

POST /api/users HTTP/1.1
Host: example.com
Content-Type: application/json
Content-Length: 49

<user><name>admin</name><role>admin</role></user>

You claimed JSON but sent XML. Some servers will reject it. Others will try to be "helpful" and parse it anyway. When parsers disagree about what they're looking at, interesting things can happen. A security filter expecting JSON might let the XML pass through unchecked, while the application logic behind it happily processes the XML data.

Content-Type confusion is the beginning of many beautiful exploits: any time two components parse the same request differently, you can sometimes slip data past one to reach the other.
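Here is a toy model of that disagreement. Both functions are hypothetical: filter_allows stands in for a security filter that only understands JSON and fails open, and backend_role stands in for a lenient application that sniffs the body instead of trusting the header.

```python
import json
import xml.etree.ElementTree as ET

def filter_allows(content_type, body):
    """Hypothetical filter: inspects JSON only, and fails open on parse errors."""
    if content_type == "application/json":
        try:
            data = json.loads(body)
        except ValueError:
            return True   # the flaw: unparseable input is waved through
        return not (isinstance(data, dict) and data.get("role") == "admin")
    return True

def backend_role(body):
    """Hypothetical lenient backend: guesses the format from the first character."""
    if body.lstrip().startswith("<"):
        return ET.fromstring(body).findtext("role")
    return json.loads(body).get("role")

xml_body = "<user><name>admin</name><role>admin</role></user>"

print(filter_allows("application/json", '{"role": "admin"}'))  # blocked by the filter
print(filter_allows("application/json", xml_body))             # passes the filter
print(backend_role(xml_body))                                  # backend still sees the role
```

The filter and the backend each behave "reasonably" on their own; the problem only appears when they are chained.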

Hidden Form Fields

HTML forms often include hidden fields. They're invisible to the average user but look like any other field in the HTTP request:

<form action="/checkout" method="POST">
  <input type="hidden" name="price" value="19.99">
  <input type="hidden" name="product_id" value="12345">
  <input type="text" name="quantity" value="1">
  <input type="submit" value="Buy Now">
</form>

When submitted, ALL fields (hidden or not) become part of the request body:

POST /checkout HTTP/1.1
Host: shop.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 39

price=19.99&product_id=12345&quantity=1

But here's the thing: that "hidden" field is only hidden in the browser's UI. In the HTTP request, it's just another parameter. And since you can craft HTTP requests directly...

POST /checkout HTTP/1.1
Host: shop.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 38

price=0.01&product_id=12345&quantity=1

This is why data validation must happen server-side. The value attributes in hidden fields are merely suggestions to anyone fluent in HTTP!
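A minimal sketch of what server-side validation could look like for this checkout. The CATALOG dictionary, the checkout_total function, and the quantity limit of 100 are all assumptions made for illustration; the point is only that the price comes from the server's own data, never from the request.

```python
from decimal import Decimal
from urllib.parse import parse_qs

# Hypothetical server-side catalog: the only trusted source of prices.
CATALOG = {"12345": Decimal("19.99")}

def checkout_total(body):
    """Compute the order total from a form body without trusting the client's price."""
    form = {name: values[0] for name, values in parse_qs(body).items()}
    product_id = form.get("product_id", "")
    if product_id not in CATALOG:
        raise ValueError("unknown product")
    quantity = int(form.get("quantity", ""))   # raises ValueError on non-numeric input
    if not 1 <= quantity <= 100:
        raise ValueError("quantity out of range")
    # The client's "price" field is deliberately ignored.
    return CATALOG[product_id] * quantity

print(checkout_total("price=0.01&product_id=12345&quantity=1"))
```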

The Content-Length Header: Precision Matters

Content-Length tells the server exactly how many bytes (not characters!) are in your body. Get it wrong, and strange things happen.

POST /comment HTTP/1.1
Host: example.com
Content-Type: text/plain
Content-Length: 5

Hello, World!

We claimed 5 bytes but sent 13. What happens? Perhaps:

Server reads Hello and stops
The rest (, World!) might be interpreted as the start of the next request
This is the foundation of HTTP Request Smuggling

Too long?

Content-Length: 100

Hello

The server waits... and waits... for 95 more bytes that never come. Eventually it times out. Or worse, it might read part of your next request as this request's body.

A single mismatched Content-Length can desynchronize an entire HTTP conversation. When proxies and servers disagree about where requests end, you can sometimes smuggle a second request inside the first.
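The too-short case can be simulated in a few lines. In this sketch io.BytesIO stands in for the TCP connection and read_body is a hypothetical helper; a real server reads from a socket, but the framing logic is the same.

```python
import io

def read_body(stream, content_length):
    """Read exactly content_length bytes, as a server framing by Content-Length would."""
    return stream.read(content_length)

# The connection carries 13 bytes of body, but the header claimed 5.
connection = io.BytesIO(b"Hello, World!")
body = read_body(connection, 5)
leftover = connection.read()   # what the server would try to parse as the next request

print(body)      # b'Hello'
print(leftover)  # b', World!'

# The too-long case: only 5 bytes exist, so 95 are missing.
# BytesIO simply returns what it has; a real socket would block until timeout.
short = read_body(io.BytesIO(b"Hello"), 100)
print(100 - len(short))  # 95
```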

📙 Why Servers Trust

You might wonder: if trusting client data is dangerous, why do servers do it? The answer lies in the engineering realities of the web:

Performance Constraints: Deep validation is expensive. Checking every claim, parsing every field, validating every relationship - it all takes CPU cycles. When you're handling thousands of requests per second, even milliseconds matter.

Backwards Compatibility: The web is built on decades of conventions. Servers must handle requests from ancient browsers, IoT devices, command-line tools, and cutting-edge applications. Being too strict breaks legitimate use cases.

Complexity Spiral: Full validation requires understanding context. Is "0.01" a valid price? Depends on the product. Is "admin" a valid username? Depends on the system. The validation logic can become more complex than the application itself.

Developer Optimism: It's human nature to assume normal use cases. Developers test happy paths - real users filling real forms. The idea that someone would manually craft a request with a negative quantity or a price of "banana" seems absurd... until it happens.

So servers trust by necessity, validating what they must and hoping for the best on the rest. This gap between necessary trust and perfect validation? That's where security researchers live.

File Uploads

When a form includes a file upload, the browser still sends a normal POST request, but it changes how the body is structured to accommodate binary data alongside text fields.

A Real Multipart/form-data Example

POST /upload HTTP/1.1
Host: example.com
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryabc123
Content-Length: 681

------WebKitFormBoundaryabc123
Content-Disposition: form-data; name="username"

coolstudent
------WebKitFormBoundaryabc123
Content-Disposition: form-data; name="profile_picture"; filename="photo.jpg"
Content-Type: image/jpeg

(binary image data here)
------WebKitFormBoundaryabc123--

Each part starts with a boundary line: two hyphens followed by the boundary value declared in the Content-Type header, e.g. ------WebKitFormBoundaryabc123
Headers for that part follow, usually including Content-Disposition (with the field name, and sometimes a filename)
A blank line separates headers from the content
The content can be plain text (e.g., coolstudent) or binary data (e.g., an image file)
The final boundary ends with --, signaling the end of the entire message

It's like mailing multiple documents in one envelope, with divider sheets between them.
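The structure is simple enough to assemble by hand. This is a sketch, not a complete implementation: build_multipart is a hypothetical helper, the image bytes are fake, and it does no escaping of quotes in names or filenames.

```python
def build_multipart(boundary, fields, files):
    """Build a multipart/form-data body by hand (a sketch, not a full implementation)."""
    delimiter = b"--" + boundary.encode("ascii")   # each part starts with "--" + boundary
    lines = []
    for name, value in fields.items():
        lines += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"),
            b"",                                   # blank line between headers and content
            value.encode("utf-8"),
        ]
    for name, (filename, content_type, data) in files.items():
        lines += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode("utf-8"),
            f"Content-Type: {content_type}".encode("ascii"),
            b"",
            data,                                  # raw bytes, no encoding applied
        ]
    lines.append(delimiter + b"--")                # closing boundary ends with "--"
    return b"\r\n".join(lines) + b"\r\n"

boundary = "----WebKitFormBoundaryabc123"
body = build_multipart(
    boundary,
    {"username": "coolstudent"},
    {"profile_picture": ("photo.jpg", "image/jpeg", b"\xff\xd8\xff\xe0 fake image bytes")},
)
print(body.decode("latin-1"))
```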

📙 Binary Data in Text Protocol

HTTP/1.1 is a text-based protocol, but it can carry anything - like a text envelope containing a photo. The multipart format provides the structure: text headers describe each part, boundaries separate them, and the actual content (text or binary) sits in between. The framing stays human-readable even when the payload is not.

The Filename Game

That filename="photo.jpg" is just another claim. You can lie:

filename="photo.jpg.php"
filename="photo.php\x00.jpg"
filename="../../../etc/passwd"
filename="photo.jpg"; name="role"; filename="admin"

Each tells a different story. Each might fool a different parser.
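From the server's side, one conservative way to handle a claimed filename might look like the sketch below. The safe_filename function and the ALLOWED_EXTENSIONS policy are assumptions for illustration; it does not address the duplicated-parameter trick, which has to be handled by the multipart parser itself.

```python
import re

ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png"}   # hypothetical upload policy

def safe_filename(claimed):
    """Reduce a client-supplied filename to a conservative basename, or reject it."""
    if "\x00" in claimed:
        raise ValueError("null byte in filename")
    # Treat both slash styles as separators and keep only the last component.
    name = claimed.replace("\\", "/").split("/")[-1]
    # Allow a small set of characters; everything else becomes "_".
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name).lstrip(".")
    # Judge the file by its final extension only.
    extension = name.rsplit(".", 1)[-1].lower() if "." in name else ""
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError("extension not allowed")
    return name

print(safe_filename("../../uploads/photo.jpg"))  # photo.jpg
```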

MIME Type Confusion

The Content-Type in a file upload is another claim:

Content-Disposition: form-data; name="avatar"; filename="shell.jpg"
Content-Type: image/jpeg

<?php system($_GET['cmd']); ?>

We claim it's a JPEG, but it's PHP code. Some servers check the MIME type. Some check the file extension. Some check the actual content. Many fail to check all three properly.
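Checking the actual content usually starts with the file's leading bytes. JPEG files begin with the bytes FF D8 FF, so a minimal content check could look like this. The function name is hypothetical, and a matching signature alone does not prove a file is harmless, since a payload can be appended after a valid header.

```python
JPEG_MAGIC = b"\xff\xd8\xff"   # the first three bytes of a JPEG file

def looks_like_jpeg(data):
    """Check the leading bytes instead of trusting the declared Content-Type."""
    return data.startswith(JPEG_MAGIC)

# The upload from the example above: declared as image/jpeg, actually PHP.
upload = b"<?php system($_GET['cmd']); ?>"

print(looks_like_jpeg(upload))                         # False
print(looks_like_jpeg(b"\xff\xd8\xff\xe0\x00\x10JFIF"))  # True
```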

Character Encoding: The Devil in the Details

When you send café, how many bytes is that? Depends on the encoding:

UTF-8: 5 bytes (é is 2 bytes)
Latin-1: 4 bytes
ASCII: Error (é doesn't exist)

If the client sends UTF-8 but the server decodes the bytes as Latin-1, café becomes cafÃ©. Annoying for text, but what about when it's part of a security check?
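These byte counts, and the garbled result of decoding with the wrong charset, are easy to reproduce in Python:

```python
text = "café"

utf8 = text.encode("utf-8")
latin1 = text.encode("latin-1")
print(len(utf8))    # 5
print(len(latin1))  # 4

# ASCII has no é, so encoding fails outright.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)     # False

# UTF-8 bytes read back as Latin-1: each byte becomes its own character.
mojibake = utf8.decode("latin-1")
print(mojibake)     # cafÃ©
```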

URL encoding adds another layer:

space → %20 or +
/ → %2F
< → %3C

Mix encodings, and you can sometimes slip past filters:

<script> blocked? Try %3Cscript%3E
Still blocked? Try %253Cscript%253E (double encoded)
Or mix: <scr%69pt> (i = %69 in ASCII)
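Why double encoding works is easiest to see with two decoders in a row. Both functions here are hypothetical: naive_filter decodes once before checking, and backend_decode models a component behind it that decodes a second time.

```python
from urllib.parse import unquote

def naive_filter(value):
    """Hypothetical filter: decodes once, then looks for a script tag."""
    return "<script" not in unquote(value).lower()

def backend_decode(value):
    """Hypothetical backend behind the filter that ends up decoding twice."""
    return unquote(unquote(value))

single = "%3Cscript%3E"
double = "%253Cscript%253E"

print(naive_filter(single))    # False: one decode reveals <script>
print(naive_filter(double))    # True: one decode only yields %3Cscript%3E
print(backend_decode(double))  # <script>
print(unquote("<scr%69pt>"))   # <script>
```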

🛡️ Parameter Pollution and Array Games

What happens when you send the same parameter twice?

username=guest&username=admin

Different servers handle this differently:

Some take the first value: guest
Some take the last value: admin
Some combine them: guest,admin
Some create an array: ['guest', 'admin']

This ambiguity is an attacker's playground. Can you be both guest and admin? In some apps, yes.
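Python's parse_qs keeps every occurrence, which makes it a convenient way to see all four interpretations side by side. Which one a given server picks depends on its framework; the variable names below are just labels for this sketch.

```python
from urllib.parse import parse_qs

values = parse_qs("username=guest&username=admin")["username"]

as_array = values            # ['guest', 'admin']
first = values[0]            # 'guest'
last = values[-1]            # 'admin'
joined = ",".join(values)    # 'guest,admin'

print(as_array, first, last, joined)
```

If a filter checks the first value and the application uses the last, the two are no longer talking about the same user.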

Arrays add another layer of fun:

Many frameworks use [] to indicate arrays: role[]=user&role[]=admin becomes ['user', 'admin'].

Nested brackets may also create nested objects: user[name]=alice&user[age]=25 becomes {user: {name: 'alice', age: '25'}}.

role[]=user&role[]=admin
filter[user][status]=active&filter[admin][status]=active

Modern frameworks often parse these into complex data structures. Sometimes they parse more than they should.
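To show how little code it takes to turn bracketed keys into nested structures, here is a simplified sketch of such a parser. The parse_brackets function is hypothetical and does not match any particular framework; it also makes no attempt to handle keys whose shapes conflict.

```python
import re
from urllib.parse import parse_qsl

def parse_brackets(query):
    """Parse role[]=x and user[name]=y style keys into nested lists and dicts."""
    result = {}
    for key, value in parse_qsl(query):
        # "filter[user][status]" -> ["filter", "user", "status"]; "role[]" -> ["role", ""]
        path = [key.split("[", 1)[0]] + re.findall(r"\[([^\]]*)\]", key)
        node = result
        for i, part in enumerate(path[:-1]):
            next_is_list = path[i + 1] == ""       # empty brackets mean "append to a list"
            node = node.setdefault(part, [] if next_is_list else {})
        if path[-1] == "":
            node.append(value)
        else:
            node[path[-1]] = value
    return result

print(parse_brackets("role[]=user&role[]=admin"))
print(parse_brackets("user[name]=alice&user[age]=25"))
print(parse_brackets("filter[user][status]=active&filter[admin][status]=active"))
```

Note that the client, not the server, decides the shape of the resulting structure, which is exactly why over-eager parsing can be risky.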

Every Field is a Claim

Here's the key insight that transforms how you see web forms: every piece of data you send is just a claim the server chooses to believe or not.

That hidden price field? It's not a fact, it's a suggestion. That role=user parameter? It's not an assignment, it's an assertion. That filename? It's not metadata, it's a negotiation.

The server's job is to verify every claim. Your job, as someone learning security, is to understand which claims go unverified.

Because the gap between what clients claim and what servers verify is where the magic happens.