Thursday, July 26, 2007

Playing with Google Wireless Transcoder

Last week I found Google Wireless Transcoder and started playing with it, trying to find an XSS bug in the HTML "transcoder". I shared it with Ronald, RSnake, and .mario.
What the Google Wireless Transcoder does is pretty simple: it takes HTML code and translates it into XHTML Mobile compliant code.
The way it works is a little mysterious (it's made in Java, by the way). It could be something similar to HTML Purifier (this was pointed out by .mario), but I would say that it works as a server-side browser that generates valid XHTML code by reconstructing the DOM (which in fact is not very difficult to do). I think this because some of its errors are very similar to those of other Java browsers, like jrex or jakarta (the GWT supports ftp://, gopher:// and http://, among others). For example:

This is an exception generated by this code:

http://jakarta.apache.org/commons/httpclient/xref/org/apache/commons/httpclient/HttpHost.html

So this makes me believe that they use (at least) HttpHost.java.
They also use BASE64DecoderStream.java.

Anyway, there are some other errors, like this one:

This is an exception generated by Firefox, because GWT returned invalid XHTML. It is interesting because it demonstrates that, in some way, the GWT supports javascript: URIs.
(The website is googlr.com, a mirror of google.com, used to avoid the session generated at google.com.)

We can also see that GWT can be used as a "redirector", like:
http://www.google.com/gwt/n?u=http://www.vidoblog.net/ip/&_gwt_pg=orig

Note the _gwt_pg parameter.
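
To make the shape of these URLs explicit, here is a minimal Python sketch that builds and parses them. The endpoint and the u and _gwt_pg parameters are taken from the URL above; the function names gwt_url and gwt_target are made up for illustration, and the idea that _gwt_pg=orig is what triggers the redirect is only what the example suggests, not documented behaviour.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint as it appears in the example URL above.
GWT_ENDPOINT = "http://www.google.com/gwt/n"

def gwt_url(target, original=False):
    """Build a GWT URL for `target` (hypothetical helper).

    When `original` is True, add _gwt_pg=orig, the parameter seen in
    the "redirector" example.
    """
    params = {"u": target}
    if original:
        params["_gwt_pg"] = "orig"
    return GWT_ENDPOINT + "?" + urlencode(params)

def gwt_target(url):
    """Extract the page a GWT URL points at (the `u` parameter)."""
    query = parse_qs(urlsplit(url).query)
    return query["u"][0]

if __name__ == "__main__":
    print(gwt_url("http://www.vidoblog.net/ip/", original=True))
```

urlencode percent-encodes the target, so the result looks a bit different from the hand-written URL, but it carries the same two parameters.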

We can also temporarily host images; we just need to enter any website that contains images (like google.com):
http://www.google.com/gwt/n?u=www.google.com
and the logo will have a URL similar to:
http://www.google.com/gwt/i?i=01F8441E4_F9610322_4DB7F91D

Another interesting thing that RSnake pointed out is that these "internal proxies" are "logically separated from their internal addresses." Anyway, I found it very interesting that:

http://www.google.com/gwt/n?u=gopher://local.sirdarckcat.net
http://www.google.com/gwt/n?u=gopher://unexistent

return something different from:

http://www.google.com/gwt/n?u=gopher://127.0.0.1
http://www.google.com/gwt/n?u=gopher://localhost
http://www.google.com/gwt/n?u=gopher://localhostABCD

This happens even though local.sirdarckcat.net and localhost (supposedly) both point to 127.0.0.1, while localhostABCD doesn't. Why is gopher://unexistent treated differently from gopher://localhostABCD? Maybe it's a way to stop an attacker from contacting 127.0.0.1.

We could try to enumerate the "alive" hosts with local.sirdarckcat.net:port#, but as far as I tested, all ports return the same response.
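
The comparison above can be automated. The following is a rough Python sketch, not the method I actually used: it builds the probe URLs and groups targets whose responses are byte-identical. The names probe_url and group_by_response are hypothetical, and fetch is a function you supply yourself (any HTTP client that returns the response body as bytes), so the sketch makes no assumption about how GWT answers.

```python
import hashlib
from collections import defaultdict
from urllib.parse import quote

def probe_url(host, port=None, scheme="gopher"):
    """Build a GWT URL asking the transcoder to fetch scheme://host[:port]."""
    target = f"{scheme}://{host}"
    if port is not None:
        target += f":{port}"
    # Keep ':' and '/' readable, as in the hand-written URLs above.
    return "http://www.google.com/gwt/n?u=" + quote(target, safe=":/")

def group_by_response(urls, fetch):
    """Group URLs whose response bodies are identical.

    `fetch` is an assumption: a caller-supplied function that takes a URL
    and returns the body as bytes.
    """
    groups = defaultdict(list)
    for url in urls:
        digest = hashlib.sha256(fetch(url)).hexdigest()
        groups[digest].append(url)
    return list(groups.values())

if __name__ == "__main__":
    hosts = ["local.sirdarckcat.net", "unexistent", "127.0.0.1",
             "localhost", "localhostABCD"]
    for url in (probe_url(h) for h in hosts):
        print(url)
```

If a port ever behaved differently, it would show up as its own group when calling group_by_response on probe_url("local.sirdarckcat.net", port) for a range of ports.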

Something else that was discovered was that GWT parses data URIs.

http://www.google.com/gwt/n?u=data:text/html;base64,PGh0bWw%2BDQo8aGVhZD4NCjx0aXRsZT5IZWxsbyBXb3JsZDwvdGl0bGU%2BDQo8L2hlYWQ%2BDQo8Ym9keT4NCkczDQo8L2JvZHk%2BDQo8L2h0bWw%2B

Pretty amazing: it's the first web proxy (that I've seen) that actually parses them.
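
For anyone who wants to craft their own payloads, here is a small Python sketch of how such a URL is put together and taken apart again. The helper names gwt_data_url and decode_data_url are made up for this example; the URL layout is copied from the one above. Decoding that URL gives a tiny HTML page with the title "Hello World".

```python
import base64
from urllib.parse import quote, unquote

GWT_PREFIX = "http://www.google.com/gwt/n?u="

def gwt_data_url(html):
    """Wrap an HTML string in a base64 data: URI and hand it to GWT."""
    payload = base64.b64encode(html.encode("utf-8")).decode("ascii")
    # '+', '/' and '=' must be percent-encoded inside the query string.
    return GWT_PREFIX + "data:text/html;base64," + quote(payload, safe="")

def decode_data_url(url):
    """Recover the HTML carried by a base64 data: URI inside a GWT URL."""
    payload = url.split(";base64,", 1)[1]
    return base64.b64decode(unquote(payload)).decode("utf-8")

if __name__ == "__main__":
    example = (GWT_PREFIX + "data:text/html;base64,"
               "PGh0bWw%2BDQo8aGVhZD4NCjx0aXRsZT5IZWxsbyBXb3JsZDwvdGl0bGU"
               "%2BDQo8L2hlYWQ%2BDQo8Ym9keT4NCkczDQo8L2JvZHk%2BDQo8L2h0bWw%2B")
    print(decode_data_url(example))
```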

To wrap up, I think that GWT is a great tool with a lot of features (some of them hidden to naive eyes). I think it should be investigated more deeply (for example, the impact of using GWT as an SEO technique, using GWT's PageRank as an inbound link to your site).

Greetz!!