[NTLUG:Discuss] Scriptable, javascript-aware web browser OR virtual operator
David Stanaway
david at stanaway.net
Thu Aug 30 20:08:49 CDT 2007
Interesting way of doing it.
But for me:
bash$ TZ=GMT date
Fri Aug 31 01:04:20 GMT 2007
bash$ locate share/zoneinfo | grep -i mexico
/usr/share/zoneinfo/America/Mexico_City
/usr/share/zoneinfo/Mexico
/usr/share/zoneinfo/Mexico/BajaNorte
/usr/share/zoneinfo/Mexico/BajaSur
/usr/share/zoneinfo/Mexico/General
/usr/share/zoneinfo/posix/America/Mexico_City
/usr/share/zoneinfo/posix/Mexico
/usr/share/zoneinfo/posix/Mexico/BajaNorte
/usr/share/zoneinfo/posix/Mexico/BajaSur
/usr/share/zoneinfo/posix/Mexico/General
/usr/share/zoneinfo/right/America/Mexico_City
/usr/share/zoneinfo/right/Mexico
/usr/share/zoneinfo/right/Mexico/BajaNorte
/usr/share/zoneinfo/right/Mexico/BajaSur
/usr/share/zoneinfo/right/Mexico/General
bash$ TZ=Mexico/General date
Thu Aug 30 20:06:28 CDT 2007
You can set the TZ environment variable in your program and use the
builtin localtime functions to get the time in your desired timezone.
Look in the zoneinfo db for timezone names. The name is the file's path
relative to the root of zoneinfo, e.g. Mexico/General for
/usr/share/zoneinfo/Mexico/General.
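
Something like this is what I mean, as a rough Python sketch. The helper
name format_time_in_zone is made up for illustration, and time.tzset() is
only available on Unix-like systems. Plain POSIX strings such as "UTC0"
work too, even without the zoneinfo files installed.

```python
import os
import time


def format_time_in_zone(zone, epoch=None, fmt="%a %b %d %H:%M:%S %Z %Y"):
    """Format `epoch` (default: now) as local time in `zone`.

    `zone` is a TZ value: a zoneinfo name such as "Mexico/General"
    or a POSIX string such as "UTC0". (Hypothetical helper.)
    """
    saved = os.environ.get("TZ")
    os.environ["TZ"] = zone
    time.tzset()  # Unix only: make the C library re-read TZ
    try:
        return time.strftime(fmt, time.localtime(epoch))
    finally:
        # Put the process back the way we found it.
        if saved is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = saved
        time.tzset()


if __name__ == "__main__":
    for zone in ("GMT", "Mexico/General"):
        print(zone, format_time_in_zone(zone))
```

Changing TZ affects the whole process, so the sketch restores the old
value before returning.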
Carl Haddick wrote:
>> Leroy Tennison wrote:
>>
>> > I need to "screen scrape" a generated web page which is generated by
>> > filling in a form on a previous page. I haven't had any success finding
>> > a solution yet so I'm hoping someone here can help.
>>
>
> I never contribute like I should. Please let me know if a script
> example like this is OK as an occasional offering.
>
> The problem got me interested in re-learning http connections in Python.
>
> First I went to www.timezoneconverter.com/cgi-bin/tzc.tzc with Firefox.
> That web page has a form which lets you convert times from one time zone
> representation to another.
>
> I chose to convert Jamaica time, mon, to GMT, but before I hit the
> 'convert time now' button I turned on tcpdump in a root shell to capture
> the http dialog:
>
> tcpdump -n -s 0 -w http.dump tcp port 80
>
> Then, I hit the convert time button. In the root shell window, I
> control-c'd the tcpdump process and chowned http.dump to my user rights.
>
> Opening up http.dump in ethereal (new versions are named something
> else), I found the url encoded 'post' values in the first line after the
> headers.
>
> I also just copied the headers from ethereal. I used the 'follow tcp
> stream' under the 'analyze' pull-down menu.
>
> Using tcpdump saved me from looking through html for the form and all
> the input fields. Easier to copy than create.
>
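
For anyone following along: the url encoded line from the capture can be
turned back into field names and values mechanically. A small sketch in
Python 3 follows. The captured_body string is an assumption, rebuilt from
the field names in Carl's script rather than copied from a real capture,
and decode_form_body is a made-up helper name.

```python
from urllib.parse import parse_qsl

# Assumed example body, reconstructed from the script's fields;
# a real capture would be pasted here instead.
captured_body = (
    "style=1&use_current_datetime=1&month=8&day=30&year=2007"
    "&time=13%3A34%3A39&time_type=24hour&fromzone=Jamaica&tozone=GMT"
    "&Submit.x=92&Submit.y=12&Submit=Convert"
)


def decode_form_body(body):
    """Split an application/x-www-form-urlencoded body into a dict."""
    return dict(parse_qsl(body, keep_blank_values=True))


fields = decode_form_body(captured_body)
for name, value in fields.items():
    # Prints lines that can be pasted into a Python dict literal.
    print("%r: %r," % (name, value))
```
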
> The headers and the url encoded values I plugged into a short Python
> script. The work is done by the calls to httplib.HTTPConnection,
> tzconvert.request, tzconvert.getresponse, and a 'search' using a
> compiled regular expression.
>
> When I run this script from a command line it prints the current time
> in GMT, treating Jamaica as the local zone. Time zone issues aside, it
> does this by surfing a web site and pulling information out of a
> returned web page.
>
> If I were smarter I could no doubt think of far better ways, but, as Ma
> always said, it sucks to be me.
>
> Leroy, if you need help, email me. If the solution is complex I would
> not be available on an unpaid basis, but on the other hand I've lurked
> here for years. Might not hurt to contribute something back.
>
> Script follows.
>
> Regards,
>
> Carl
>
> #!/usr/local/bin/python
>
> import httplib,urllib,re
>
> tzres=re.compile('<b>([0-9]{2}:[0-9]{2}:[0-9]{2} [0-9a-z,\s]+)</b>',re.M|re.I)
>
> postdata=urllib.urlencode({
> 'style':'1',
> 'use_current_datetime':'1',
> 'month':'8',
> 'day':'30',
> 'year':'2007',
> 'time':'13:34:39',
> 'time_type':'24hour',
> 'fromzone':'Jamaica',
> 'tozone':'GMT',
> 'Submit.x':'92',
> 'Submit.y':'12',
> 'Submit':'Convert'
> })
>
> headers={
> 'Host':'www.timezoneconverter.com',
> 'User-Agent':'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8) Gecko/20051111 Firefox/1.5',
> 'Content-type':'application/x-www-form-urlencoded',
> 'Accept':'text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5',
> 'Accept-Language':'en-us,en;q=0.5',
> 'Accept-Encoding':'gzip,deflate',
> 'Accept-Charset':'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
> 'Keep-Alive':'300',
> 'Connection':'keep-alive',
> 'Referer':'http://www.timezoneconverter.com/cgi-bin/tzc.tzc',
> }
> tzconvert=httplib.HTTPConnection('www.timezoneconverter.com')
> tzconvert.request('POST','/cgi-bin/tzc.tzc',postdata,headers)
> verthtml=tzconvert.getresponse()
> if verthtml.status==200 and verthtml.reason=='OK':
>     srch=tzres.search(verthtml.read())
>     if srch:
>         print 'Right now in Jamaica, mon, it\'s %s GMT'%srch.group(1)
>     else:
>         print 'Conversion not found, mon, call it \'island time\'.'
> else:
>     print 'Bad ju-ju in the islands, mon. Bad HTML request.'
>
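
The same idea can be sketched in Python 3, where httplib became
http.client and urllib.urlencode moved to urllib.parse. This is not
Carl's script, just an untested-against-the-live-site approximation:
extract_time and convert_now are made-up names, the sample HTML is
invented to show the regular expression at work, and whether the site
still accepts these form fields is an assumption. The sketch leaves out
the Accept-Encoding header so the server is not invited to compress the
page before the regex sees it.

```python
import http.client
import re
import urllib.parse

# Same pattern as the original script: a time followed by a date, in bold.
TZ_RESULT = re.compile(
    r"<b>([0-9]{2}:[0-9]{2}:[0-9]{2} [0-9a-z,\s]+)</b>", re.I
)


def extract_time(html):
    """Return the first bold time/date string in `html`, or None."""
    match = TZ_RESULT.search(html)
    return match.group(1) if match else None


def convert_now(fromzone="Jamaica", tozone="GMT"):
    """POST the form and return the converted time, or None on failure."""
    body = urllib.parse.urlencode({
        "style": "1", "use_current_datetime": "1",
        "month": "8", "day": "30", "year": "2007",
        "time": "13:34:39", "time_type": "24hour",
        "fromzone": fromzone, "tozone": tozone,
        "Submit.x": "92", "Submit.y": "12", "Submit": "Convert",
    })
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    conn = http.client.HTTPConnection("www.timezoneconverter.com", timeout=10)
    try:
        conn.request("POST", "/cgi-bin/tzc.tzc", body, headers)
        response = conn.getresponse()
        if response.status != 200:
            return None
        return extract_time(response.read().decode("latin-1"))
    finally:
        conn.close()


# Invented sample page fragment, only to exercise the regex offline.
sample = "<td><b>18:34:39 Thursday, August 30, 2007</b></td>"
print(extract_time(sample))
```

Keeping the regex in its own function means the parsing can be checked
against saved pages without touching the network.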