Friday, September 5, 2008

Adventures in Webpage Content Injection.

I run several web sites and recently decided that I wanted to add some common text to the bottom of all the pages. Since I don't generate the content, it would be best if the server did this as it served the pages.

A quick search through Apache's module directory turned up mod_layout, which fit the bill ... in theory, at least. I tried in vain for two days to get mod_layout working on my antiquated Linux server (Fedora Core 3) before deciding that I should really look into my other options -- especially since mod_layout hadn't been updated and, from what I could see in forums, its developer wasn't exactly interested in making it work with Apache 2.0+.

Then I saw a post about how mod_rewrite could be cajoled into doing this sort of task. Essentially, it does this by rewriting the request into a CGI call, passing the originally requested file name as a parameter.

Like so:


RewriteRule /(.*) /wrapper.cgi?file=$1 [nc,l,qsa]
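
To get a feel for what that rule does to a request path, here is a rough Python simulation of just the pattern-and-substitution step. It is only a sketch: `rewrite` is a hypothetical helper, not part of Apache, and the [QSA] and [L] flags are not modelled at all.

```python
import re

# Same pattern as the RewriteRule; IGNORECASE stands in for the [NC] flag.
RULE = re.compile(r"/(.*)", re.IGNORECASE)

def rewrite(path):
    """Hypothetical helper: imitate the rule's substitution on a URL path."""
    match = RULE.search(path)
    if match is None:
        # No match: the path is left alone.
        return path
    # On a match the whole path is replaced, with $1 filled in from group 1.
    return "/wrapper.cgi?file=" + match.group(1)

print(rewrite("/docs/index.html"))
```

So a request for /docs/index.html ends up as a call to wrapper.cgi with file=docs/index.html.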


All the examples were using PHP. My server is old (as mentioned above) and you can't really find RPMs for older Fedora releases. So, since I don't have PHP installed, I looked at my options ... and it immediately struck me that Python would be up to the task.

So, you need to create a Python script named wrapper.cgi which contains:

#!/usr/bin/env python
import os
import re
import urllib

# CGI response header, then the blank line that ends the headers
print "Content-type: text/html"
print
docroot = os.getenv( 'DOCUMENT_ROOT' )
# REQUEST_URI is the originally requested URI, not the rewritten one
fname = docroot + urllib.unquote( os.getenv( 'REQUEST_URI' ) )
buff = open( fname ).read( )

# Locate the opening and closing body tags
mobj = re.compile( '<body[^>]*>', re.IGNORECASE )
mobj2 = re.compile( '</body>', re.IGNORECASE )
obj = mobj.search( buff )
obj2 = mobj2.search( buff )
# Split the page into three parts around the body tags
header = buff[:obj.end( )]               # up to and including <body ...>
body = buff[obj.end( ):obj2.start( )]    # content between the body tags
footer = buff[obj2.start( ):]            # </body> and everything after it

print header
if os.path.exists( docroot + '/header.inc' ):
    print open( docroot + '/header.inc' ).read( )
print body
if os.path.exists( docroot + '/footer.inc' ):
    print open( docroot + '/footer.inc' ).read( )
print footer


So, now I have a framework and method for wrapping all the assorted web pages with some common header & footer code (such as some essential support links, for example).
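
If you want to try the splitting logic without Apache in the loop, it can be pulled out into a standalone function. The following is only a sketch, written for Python 3 rather than the older Python on my server; `inject`, `header_html` and `footer_html` are names made up for this illustration. It slices the body from the end of the opening tag so the tag is not emitted twice, and it returns the page untouched if no body tags are found.

```python
import re

BODY_OPEN = re.compile(r"<body[^>]*>", re.IGNORECASE)
BODY_CLOSE = re.compile(r"</body>", re.IGNORECASE)

def inject(page, header_html="", footer_html=""):
    """Insert header_html right after <body ...> and footer_html right before </body>."""
    open_tag = BODY_OPEN.search(page)
    if open_tag is None:
        return page
    # Only look for the closing tag after the opening one.
    close_tag = BODY_CLOSE.search(page, open_tag.end())
    if close_tag is None:
        return page
    return (
        page[:open_tag.end()]                      # up to and including <body ...>
        + header_html
        + page[open_tag.end():close_tag.start()]   # original body content
        + footer_html
        + page[close_tag.start():]                 # </body> and the rest
    )

sample = "<html><body class='x'><p>hi</p></body></html>"
print(inject(sample, "<div>H</div>", "<div>F</div>"))
```

In the CGI script, the two extra arguments would come from header.inc and footer.inc when those files exist.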

3 comments:

Anonymous said...

Did you look into post-processing via a filter?

http://httpd.apache.org/docs/2.0/mod/mod_ext_filter.html#extfilterdefine

Seems like you could rewrite your CGI script pretty easily as a shell script.

talon74 said...

Hmm...guess my look through the module repositories wasn't that extensive. I was focusing on the pre-processing aspect of this situation, so probably ended up unduly limiting myself.

Anonymous said...

You're better off having learned a bit (more?) about mod_rewrite. No harm, and I totally understand -- Apache and its modules cover a huge number of uses; it's not surprising you didn't find that module first.

And FWIW, I only knew about mod_ext_filter because I was in a similar situation and couldn't use mod_rewrite.