Making use of the ETag and Last-Modified headers
Tue Nov 3 2009
Many PHP pages can make use of these headers. For example, if a user
requests a page with a blog post and the browser sends the
If-Modified-Since header and the If-None-Match header (with some ETag
value), we can do the following:
Get the unix timestamp of the latest comment (if any) and the timestamp
of when the blog post was created.
Take the higher of the two values and call it 'our-modified'.
Convert the If-Modified-Since header value to a unix timestamp and
compare it to the value of 'our-modified'.
If 'our-modified' is not greater than If-Modified-Since, just return a
304 response header and nothing else. We save some processing time and
a lot of bandwidth.
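The steps above can be sketched roughly as follows. This is only a minimal sketch, assuming PHP 7.1 or later; the function names ourModified() and isNotModified() are made up for illustration, and the two timestamps are assumed to have been loaded from the database already.

```php
<?php
// Hypothetical helper: pick the newer of the post timestamp and the
// latest comment timestamp (null when the post has no comments).
function ourModified(int $postCreated, ?int $latestComment): int
{
    return max($postCreated, $latestComment ?? 0);
}

// Hypothetical helper: true when the browser's copy is still current,
// i.e. 'our-modified' is not greater than If-Modified-Since.
function isNotModified(?string $ifModifiedSince, int $ourModified): bool
{
    if ($ifModifiedSince === null || $ifModifiedSince === '') {
        return false; // header not sent: serve the full page
    }
    $since = strtotime($ifModifiedSince);
    if ($since === false) {
        return false; // unparseable date: serve the full page
    }
    return $ourModified <= $since;
}
```

In a page script the header value would come from $_SERVER['HTTP_IF_MODIFIED_SINCE'] (when present), and when isNotModified() returns true the script would call http_response_code(304) and exit before doing any further work.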
We can do the same by comparing the value of the ETag. For that we need
an algorithm to generate the ETag. We can use the name of the class and
the last-modified ('our-modified') value.
Even better is to use the 'id' from the table and a timestamp
together.
For example, the request for a blog post will have the ETag based on
RESOURCE.id + the timestamp of the latest comment for this post (or the
timestamp of the post itself if there are no comments).
The important thing here is to make sure that the ETag is NOT just the
timestamp value, but a combination of the timestamp and some other
things, to make sure that the ETag is unique. After all, if we use just
a timestamp alone, we can have 2 resources with the exact same
timestamp.
It's even possible to have 2 pages that have the same record id and the
same timestamp IF they are from different database tables. For example,
a mailing list message is stored in the LIST_MESSAGE table and may have
the same id as a blog post stored in the RESOURCE table. It's very
unlikely that these 2 will also have the exact same timestamp, but it is
nevertheless possible. To prevent this we are going to also use the
class name. So class name + id + timestamp is a unique combination.
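One possible way to build and check such an ETag is sketched below. The function names makeEtag() and etagMatches() and the '-' separator are assumptions made for illustration, not part of any library; the only fixed points are that an ETag value is sent inside double quotes and that If-None-Match may carry several comma-separated values or '*'.

```php
<?php
// Hypothetical helper: class name + id + timestamp, wrapped in the
// double quotes that the ETag header format requires.
function makeEtag(string $className, int $id, int $ourModified): string
{
    return '"' . $className . '-' . $id . '-' . $ourModified . '"';
}

// Hypothetical helper: true when any value in If-None-Match equals our
// ETag. A leading W/ (weak validator marker) is ignored for comparison.
function etagMatches(?string $ifNoneMatch, string $etag): bool
{
    if ($ifNoneMatch === null || trim($ifNoneMatch) === '') {
        return false; // header not sent: serve the full page
    }
    if (trim($ifNoneMatch) === '*') {
        return true; // '*' matches any current representation
    }
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);
        if (strncmp($candidate, 'W/', 2) === 0) {
            $candidate = substr($candidate, 2);
        }
        if ($candidate === $etag) {
            return true;
        }
    }
    return false;
}
```

With this scheme a LIST_MESSAGE row and a RESOURCE row that share an id and a timestamp still get different ETags, because the class name differs. The header value would come from $_SERVER['HTTP_IF_NONE_MATCH'] when the browser sends it.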
In order for all of this to work we need to always send the
Last-Modified header (and the ETag header) in pages, so that a browser
can record these values and send them back the next time.
The important thing here is to use our server's time and NOT take into
consideration the user's time zone offset. HTTP dates are always
expressed in GMT.
This last condition (our server's timestamp) is very, very
important!
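A small sketch of how the two validator headers could be produced from the server's own timestamp is shown here. The names httpDate() and validatorHeaders() are hypothetical; gmdate() is used so that the user's (or the server's local) time zone never leaks into the header.

```php
<?php
// Hypothetical helper: format a unix timestamp as an HTTP date in GMT.
function httpDate(int $timestamp): string
{
    return gmdate('D, d M Y H:i:s', $timestamp) . ' GMT';
}

// Hypothetical helper: build the header lines to send with a full page.
// $etag is expected to be already wrapped in double quotes.
function validatorHeaders(int $ourModified, string $etag): array
{
    return [
        'Last-Modified: ' . httpDate($ourModified),
        'ETag: ' . $etag,
    ];
}
```

A page script would pass each returned line to header() before any output is sent, for example with foreach (validatorHeaders($ourModified, $etag) as $line) { header($line); }.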
About a PHP-specific issue: by default PHP sets several special headers
that are all designed to prevent browsers from caching pages that use
SESSIONS.
This is actually good when dealing with a logged-in user, but when the
user is not logged in we are still using SESSIONS on all pages, and we
also want the browser to be able to cache the pages.
This is what we need to do. Before the session is started, we must
explicitly call:
session_cache_limiter('public');
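This makes PHP send an Expires date in the future (based on
session.cache_expire) and a "Cache-Control: public" header instead of
the default no-cache headers.
The same thing can be set with ini_set('session.cache_limiter',
'public'). Either way it must happen before session_start(), and ONLY
if we determine that the user is not logged in.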
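The sketch below shows one way this decision might be wired up. It rests on a loud assumption: that the login state can be detected before the session starts, here through a hypothetical 'auth' cookie. The function names are also made up; only session_cache_limiter() and session_start() are real PHP functions.

```php
<?php
// ASSUMPTION: a hypothetical 'auth' cookie marks a logged-in user.
// A real application would use whatever signal it actually has.
function isLoggedIn(array $cookies): bool
{
    return isset($cookies['auth']) && $cookies['auth'] !== '';
}

// 'nocache' is PHP's default limiter; 'public' allows caching.
function chooseCacheLimiter(bool $loggedIn): string
{
    return $loggedIn ? 'nocache' : 'public';
}

// Hypothetical wrapper: the limiter must be set before session_start().
function startSessionWithCaching(array $cookies): void
{
    session_cache_limiter(chooseCacheLimiter(isLoggedIn($cookies)));
    session_start();
}
```

A page would then call startSessionWithCaching($_COOKIE) in place of a bare session_start().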
The Apache server may also have its own directives that send special
no-cache headers.
We should disable them there. Uncommenting the following line in the
Apache configuration disables this behavior for content-negotiated
documents:
#CacheNegotiatedDocs