adapting the generated access scripts

Of course, such generated scripts are only the raw version of the scripts you will run in production. Many hidden fields and URLs in the HTML carry so-called session variables, and we have to extract these strings from the HTML and use them during further processing. This is a kind of reverse engineering, as nobody explains to us the concept behind the web pages and how they keep state. Be aware that web servers also encode the user identity into that state; if that state were easy to decode entirely, you could pretend to be any other user without ever authenticating as that user. This did happen in the past.
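To make the extraction step concrete, here is a minimal sketch in Python (the scripts discussed on this page are Perl, but the idea carries over directly). It collects hidden form fields and the query parameters of links, then replays the hidden fields unchanged in the next request body. The class name `SessionStateParser`, the sample page, and all field and parameter names (`sessionToken`, `viewState`, `sid`) are made up for illustration; a real site will use its own names.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlencode, urlsplit


class SessionStateParser(HTMLParser):
    """Collects hidden form fields and the query parameters of links."""

    def __init__(self):
        super().__init__()
        self.hidden_fields = {}  # name -> value of each <input type="hidden">
        self.link_params = []    # (href, {param: value}) for links with a query

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            name = attrs.get("name")
            if name:
                self.hidden_fields[name] = attrs.get("value") or ""
        elif tag == "a" and attrs.get("href"):
            query = urlsplit(attrs["href"]).query
            if query:
                self.link_params.append((attrs["href"], dict(parse_qsl(query))))


# Hypothetical page fragment; the names below are assumptions, not a real site.
SAMPLE_HTML = """
<form action="/account/overview" method="post">
  <input type="hidden" name="sessionToken" value="a1b2c3">
  <input type="hidden" name="viewState" value="step2">
  <input type="text" name="query">
</form>
<a href="/account/statement?sid=a1b2c3&amp;page=1">statement</a>
"""

parser = SessionStateParser()
parser.feed(SAMPLE_HTML)
parser.close()

# Replay the hidden fields unchanged, next to the field we fill in ourselves.
post_body = urlencode({**parser.hidden_fields, "query": "2011-11"})
print(parser.hidden_fields)
print(parser.link_params)
print(post_body)
```

The point of the design is that the script never interprets the session values; it only copies them from one response into the next request, exactly as a browser would.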

Once a new raw Perl script has been generated properly, it needs to be embedded into a proper frame. You might adopt p.pl for that purpose; it serves me well. Until the script works well, I always call p.pl with --nocleanup_tempdir_p, so that I can have a look at the generated HTML. There are hidden HTML form fields and CGI parameters serving as session IDs and so forth: recognize them! Extract them, keep them, and employ them just the way the original HTML does! That is especially the part that a machine won't be able to do by itself, and where web-site developers (or the developers of their frameworks) are more or less "creative" and unpredictable. But that period is also quite interesting and challenging. As my script says:

This script is based on Daniel Stenberg's crawlink.pl, which in turn “… was based on the checklink.pl script I wrote ages ago.” (I = Daniel Stenberg).

crawlink.pl pretty-prints HTML forms.
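As an illustration of what pretty-printing forms can look like, here is a small Python sketch. It is not crawlink.pl's code and does not reproduce its output format; the class `FormPrinter`, the function `pretty_print_forms`, and the sample login form are assumptions made for this example only.

```python
from html.parser import HTMLParser


class FormPrinter(HTMLParser):
    """Renders each form and its input fields as indented text lines."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self.in_form = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.in_form = True
            method = (attrs.get("method") or "get").upper()
            action = attrs.get("action") or ""
            self.lines.append(f"FORM {method} {action}")
        elif tag == "input" and self.in_form:
            kind = (attrs.get("type") or "text").lower()
            name = attrs.get("name") or "-"
            value = attrs.get("value") or ""
            self.lines.append(f"  {kind:<8} {name} = {value!r}")

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def pretty_print_forms(html):
    """Returns a readable text listing of all forms found in the HTML."""
    printer = FormPrinter()
    printer.feed(html)
    printer.close()
    return "\n".join(printer.lines)


# Hypothetical login form; all names are made up for illustration.
SAMPLE_FORM = """
<form action="/login" method="post">
  <input type="hidden" name="sessionToken" value="a1b2c3">
  <input type="text" name="user">
  <input type="password" name="pass">
</form>
"""

listing = pretty_print_forms(SAMPLE_FORM)
print(listing)
```

A listing like this makes hidden fields stand out at a glance, which helps when deciding which values the adapted script has to carry from one request to the next.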


$Date: 2011/11/04 11:32:34 $
Copyright © 2010 Aleph Soft GmbH, Jochen Hayek.