HTTP GET request

Hi all!

Is there an example of how to make a simple HTTP GET request using ROOT?

I wrote a script that magically works with root.cern.ch:

TSocket sock("root.cern.ch", 80);
if (!sock.IsValid()) sock.Close();
sock.SetCompressionLevel(0);
sock.Send("GET / http/1.1\nHOST: root.cern.ch\n\n");
const Int_t MaxLen = 1<<17;
char answ[MaxLen];
while (sock.Recv(answ, MaxLen-1, 3) > 0) cout << answ << endl;
sock.Close();

But it doesn’t work for www.cern.ch, where it gives an error:

Error R__unzip_header: error in header
Error in <TMessage::Uncompress>: Inconsistency found in header (nin=0, nbuf=0)
I set the compression level to 0, but it seems to be ignored.

The same code with a fuller set of request headers makes no difference:

{
// Create strings to form GET request
TString SendStr("");
TString SendStr_URL("www.google.com");
TString SendStr_GET("GET ");
TString SendStr_path("/");
TString SendStr_http(" http/1.1\r\n");
TString SendStr_host("Host: ");
SendStr_host.Append(SendStr_URL).Append("\r\n");
TString SendStr_useragent("User-Agent: Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 5.1; en-US)\r\n");
TString SendStr_accept("Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n");
TString SendStr_lang("Accept-Language: en-US,en\r\n");
TString SendStr_encoding("Accept-Encoding: identity\r\n");
TString SendStr_cache("Cache-Control: no-cache\r\n");
TString SendStr_connection("Connection: Keep-Alive\r\n");
// Open socket
TSocket sock((const char*)SendStr_URL, 80);
if (!sock.IsValid()) sock.Close();
sock.SetCompressionLevel(0);
// Form GET request
SendStr.Append(SendStr_GET).Append(SendStr_path).Append(SendStr_http);
SendStr.Append(SendStr_useragent);
SendStr.Append(SendStr_host);
SendStr.Append(SendStr_accept);
SendStr.Append(SendStr_lang);
SendStr.Append(SendStr_encoding);
SendStr.Append(SendStr_cache);
SendStr.Append(SendStr_connection);
SendStr.Append("\r\n");
// Print GET request to stdout
cout << (const char*)SendStr << endl;
// Send GET request
sock.Send((const char*)SendStr);
// Create receiving buffer
const Int_t MaxLen = 1<<17;
char answ[MaxLen];
// Get response to stdout
while (sock.Recv(answ, MaxLen-1, 3) > 0) cout.write(answ, MaxLen);
sock.Close();
}
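
For reference, the request assembled piece by piece above can be built in a few lines. This is only a sketch in plain C++: BuildGetRequest is a hypothetical helper, not part of ROOT. It writes the protocol name in upper case (the HTTP version token is case-sensitive) and ends every line with CRLF. The choice of "Connection: close" is an assumption made here so that the server ends the response by closing the connection.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not a ROOT function): build a minimal HTTP/1.1 GET
// request for the given host and path.
std::string BuildGetRequest(const std::string& host, const std::string& path = "/")
{
    // The request target is expected to be an absolute path such as "/".
    assert(!path.empty() && path[0] == '/');

    std::string req;
    req += "GET " + path + " HTTP/1.1\r\n";  // protocol name in upper case
    req += "Host: " + host + "\r\n";         // Host header is mandatory in HTTP/1.1
    req += "Accept-Encoding: identity\r\n";  // ask for an uncompressed body
    req += "Connection: close\r\n";          // assumption: one request per connection
    req += "\r\n";                           // empty line ends the header block
    return req;
}
```

The resulting string could then be passed to whatever call actually writes the bytes to the socket, for example BuildGetRequest("www.cern.ch").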

Again, it works with root.cern.ch, but not with the main cern.ch site, google.com, etc.

GET / http/1.1
User-Agent: Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 5.1; en-US)
Host: www.google.com
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en
Accept-Encoding: identity
Cache-Control: no-cache
Connection: Keep-Alive


Error R__unzip_header: error in header
Error in <TMessage::Uncompress>: Inconsistency found in header (nin=0, nbuf=0)
 400 Bad Request
Content-Type: text/html; charset=UTF-8
Content-Length: 925
Date: Thu, 14 Nov 2013 15:20:56 GMT
Server: GFE/2.0

<!DOCTYPE html>
<html lang=en>
  <meta charset=utf-8>
  <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">
  <title>Error 400 (Bad Request)!!1</title>
  <style>
    *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}
  </style>
  <a href=//www.google.com/><img src=//www.google.com/images/errors/logo_sm.gif alt=Google></a>
  <p><b>400.</b> <ins>That’s an error.</ins>
  <p>Your client has issued a malformed or illegal request.  <ins>That’s all we know.</ins>

What’s wrong with compression?

I added the following:

// Send GET request
Int_t SentLen = sock.Send((const char*)SendStr);
cout << SendStr.Sizeof() << endl;
cout << SentLen << endl;
cout << sock.GetBytesSent() << endl;

The output was:
292
288
296

When I look at the reply from root.cern.ch, the very beginning reads:

E html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

instead of

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

So the first 8 bytes, <!DOCTYP, are missing. Also, 296 - 288 = 8. It seems that something happens inside TSocket. Can anyone help?

SendRaw does the trick. As I understand it, Send puts the length of the message in bytes at the beginning of what it transmits, and that could be what causes the problem.
Here is the solution: w3.C
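
To illustrate why an 8-byte prefix would produce both symptoms, here is a toy model in plain C++ (no ROOT needed). The header layout, a 4-byte length followed by a 4-byte message kind, is an assumption chosen to match the 8-byte difference observed above. The real TSocket wire format may differ in detail. FrameMessage, RawMessage and StripFrameHeader are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Append a 32-bit value in network (big-endian) byte order.
static void AppendU32(std::string& out, std::uint32_t v)
{
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((v >> shift) & 0xFF));
}

// Hypothetical framed send: [length][kind][payload].
// Assumption: the length field covers the kind field plus the payload.
std::string FrameMessage(const std::string& payload, std::uint32_t kind)
{
    std::string wire;
    AppendU32(wire, static_cast<std::uint32_t>(payload.size() + 4));
    AppendU32(wire, kind);
    wire += payload;
    return wire;
}

// Raw send: the bytes go out exactly as given, which is what an HTTP server expects.
std::string RawMessage(const std::string& payload)
{
    return payload;
}

// A receiver that expects the framing skips the first 8 bytes of whatever arrives.
std::string StripFrameHeader(const std::string& wire)
{
    assert(wire.size() >= 8);
    return wire.substr(8);
}
```

In this model a framed request is 8 bytes longer than the text itself and no longer starts with "GET", so a strict server can reject it as malformed. On the way back, StripFrameHeader applied to an unframed reply that starts with `<!DOCTYPE html` leaves `E html`, the same truncation seen in the reply from root.cern.ch.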