Receive (recv) full request (e.g. curl HTTP)

How should this be done?

I want to receive a (rather long) HTTP request and cannot get it to work. The problem: without flags, recv does not read the whole message, which I assume is normal behavior. From what I understand, the MSG_WAITALL flag makes recv block until everything is received. However, with that flag the call blocks forever (until I Ctrl+C the client (curl) process).
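For illustration, MSG_WAITALL waits for the requested number of bytes, not for the end of an HTTP request. The following is only a minimal sketch, assuming a POSIX system; `waitall_demo` is a hypothetical helper (not part of the code in this post) that uses a local `socketpair` in place of a real client connection:

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <cassert>
#include <cstring>

// Hypothetical demo helper: asks recv() for `want` bytes with MSG_WAITALL
// while the peer sends only `msg`, and returns how many bytes arrived.
long waitall_demo(const char* msg, size_t want) {
    char buf[4096];
    assert(want <= sizeof buf);

    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;

    send(sv[0], msg, std::strlen(msg), 0);
    // Without this shutdown the recv() below would keep blocking, because
    // MSG_WAITALL waits for `want` bytes, an error, a signal, or the peer
    // closing. That is the "hangs until I Ctrl+C curl" situation.
    shutdown(sv[0], SHUT_WR);

    ssize_t n = recv(sv[1], buf, want, MSG_WAITALL);

    close(sv[0]);
    close(sv[1]);
    return static_cast<long>(n);
}
```

Calling `waitall_demo("hello", 4096)` returns only because the sending side was shut down; a client that keeps the connection open while waiting for the response never triggers that.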

Below is a (still lengthy, but fairly minimal) example snippet. Sorry that it mixes C and C++ style; I wanted to avoid introducing errors of my own, so I stuck largely to example code with as few modifications as possible.

JavaScript

The curl request I used:

JavaScript

With MSG_WAITALL my server produces the output:

JavaScript

and hangs. Once I Ctrl+C curl, it continues with:

JavaScript

Without MSG_WAITALL and the same request, the server produces the output:

JavaScript

And curl does not receive the HTTP response. However, a shorter request properly receives the response:

JavaScript

And the server properly received the full request:

JavaScript

I do understand that it is normal that not all TCP packets arrive right away. However, what is the correct way for me to assemble the full request? I also tried non-blocking variants but usually ran into situations where no more data was ready for reading. If necessary, I can produce sample code for this as well.

PS: The problem appears once the request is long enough and the connection is poor enough. I cannot reproduce it on my own machine with requests to localhost, and the request length at which problems start varies with the quality of my connection.

Answer

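HTTP is a protocol; it has structure and rules. Read RFC 2616, particularly Section 4 “HTTP Message”. (RFC 2616 has since been obsoleted by RFC 7230 and later RFC 9112, but the message structure described here is unchanged.)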

Your recv code does nothing to follow the protocol. You can’t just blindly read an arbitrary buffer of data and expect it to contain exactly one complete HTTP request. You have to read the request according to the rules of the protocol. Specifically, you need to:

  • read a CRLF-delimited line of text. This contains the request method, the requested resource, and the HTTP version.

  • then read a variable-length list of CRLF-delimited request headers. The list is terminated by a CRLF CRLF sequence (that is, an empty line).

  • then analyze the request method and headers to determine whether the request has an entity body. If so, the headers will tell you how it is encoded over the connection (see Section 4.4 “Message Length”), so you know how it needs to be read and when you need to stop reading.

  • then process the completed request, and send your response.

  • then close the connection, unless either:

      • the request asked for HTTP 1.0 and Connection: keep-alive is present in the request headers.

      • the request asked for HTTP 1.1+ and Connection: close is not present in the request headers.
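The steps above can be sketched roughly as follows. This is only a minimal illustration, not a complete HTTP parser: it assumes a POSIX system, handles only bodies announced with Content-Length (a request using Transfer-Encoding: chunked would need extra handling), and the names `RequestAssembler`, `header_value`, `read_request` and `keep_alive` are hypothetical helpers invented for this example.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <algorithm>
#include <cctype>
#include <cerrno>
#include <cstdlib>
#include <string>

static std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Value of header `name` in `head` (request line plus headers, ending in
// CRLF CRLF), or "" if absent. Header names are matched case-insensitively.
static std::string header_value(const std::string& head, const std::string& name) {
    size_t pos = to_lower(head).find("\r\n" + to_lower(name) + ":");
    if (pos == std::string::npos) return "";
    pos += name.size() + 3;  // skip CRLF, the name and ':'
    std::string v = head.substr(pos, head.find("\r\n", pos) - pos);
    size_t first = v.find_first_not_of(" \t");
    return first == std::string::npos ? "" : v.substr(first);
}

// Collects received bytes until one full request is buffered.
struct RequestAssembler {
    std::string buf;                        // everything received so far
    size_t body_start = std::string::npos;  // index just past CRLF CRLF
    size_t body_len = 0;                    // from Content-Length, else 0

    // Append newly received bytes; returns true once the request is complete.
    bool feed(const char* data, size_t n) {
        buf.append(data, n);
        if (body_start == std::string::npos) {
            size_t end = buf.find("\r\n\r\n");
            if (end == std::string::npos) return false;  // headers incomplete
            body_start = end + 4;
            std::string len = header_value(buf.substr(0, body_start), "Content-Length");
            body_len = len.empty() ? 0 : std::strtoul(len.c_str(), nullptr, 10);
        }
        return buf.size() >= body_start + body_len;
    }
};

// Calls plain recv() (no MSG_WAITALL) in a loop until the assembler reports a
// full request. Returns false if the peer closed or an error occurred first.
bool read_request(int fd, RequestAssembler& req) {
    char chunk[4096];
    for (;;) {
        ssize_t n = recv(fd, chunk, sizeof chunk, 0);
        if (n < 0 && errno == EINTR) continue;  // interrupted, try again
        if (n <= 0) return false;
        if (req.feed(chunk, static_cast<size_t>(n))) return true;
    }
}

// Keep-alive decision from the last list item; call only after a full request.
bool keep_alive(const RequestAssembler& req) {
    std::string head = req.buf.substr(0, req.body_start);
    std::string line = head.substr(0, head.find("\r\n"));
    std::string conn = to_lower(header_value(head, "Connection"));
    bool http10 = line.size() >= 8 &&
                  line.compare(line.size() - 8, 8, "HTTP/1.0") == 0;
    if (http10) return conn.find("keep-alive") != std::string::npos;
    return conn.find("close") == std::string::npos;
}
```

Because the loop stops as soon as the headers and the announced body length have arrived, it does not depend on how TCP happens to split the request into segments, and it never waits for more bytes than the client is going to send.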

User contributions licensed under: CC BY-SA