
The WHERE clause selects only url values that have a period in them, to eliminate pathnames that name files that have no extension. To extract the extension values for the output column list, the inner SUBSTRING_INDEX call strips off any parameter string at the right end of the URL and leaves the rest. This turns a value like /cgi-bin/script.pl?id=43 into /cgi-bin/script.pl. If the value has no parameter part, SUBSTRING_INDEX returns the entire string. The outer SUBSTRING_INDEX call strips everything up to and including the rightmost period from the result, leaving only the extension.
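To make the two-step stripping concrete, here is a minimal Python sketch that mimics the behavior described above. The helper names substring_index and extension are hypothetical, and the delimiters ('?' with a count of 1 for the inner call, '.' with a count of -1 for the outer call) are assumptions inferred from the description rather than copied from the query itself.

```python
def substring_index(s, delim, count):
    """Rough Python stand-in for MySQL's SUBSTRING_INDEX(s, delim, count)."""
    if count == 0 or not delim:
        return ""  # simplification for the degenerate cases
    parts = s.split(delim)
    if count > 0:
        # Keep everything to the left of the count-th delimiter from the left.
        # If there are fewer delimiters than that, the whole string comes back.
        return delim.join(parts[:count])
    # Negative count: keep everything to the right of the
    # count-th delimiter counted from the right.
    return delim.join(parts[count:])


def extension(url):
    """Return the extension of a logged URL, or None if it has no period."""
    if "." not in url:
        return None  # mirrors the WHERE clause, which skips such rows
    path = substring_index(url, "?", 1)     # inner call: drop any parameter string
    return substring_index(path, ".", -1)   # outer call: keep text after the last period


for url in ["/cgi-bin/script.pl?id=43", "/index.html", "/docs/readme"]:
    print(url, "->", extension(url))
```

The first URL loses its ?id=43 parameter string in the inner step and yields pl, the second has no parameter part and yields html, and the third is skipped because it contains no period.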

18.15.6 Other Logging Issues

I've chosen a simple method for hooking Apache to MySQL, which is to write a short script that communicates with MySQL and then tell Apache to write to the script rather than to a file. This works well if you log all requests to a single file, but certainly won't be appropriate for every possible configuration that Apache is capable of. For example, if you have virtual servers defined in your httpd.conf file, you might have separate CustomLog directives defined for each of them. To log them all to MySQL, you can change each directive to write to httpdlog.pl, but then you'll have a separate logging process running for each virtual server. That brings up two issues:

• How do you associate log records with the proper virtual server? One solution is to create a separate log table for each server and modify httpdlog.pl to take an argument that indicates which table to use. Another is to add a virt_host column to the httpdlog table and modify httpdlog.pl to take a hostname argument indicating a server name to write to the virt_host column.

• Do you really want a lot of httpdlog.pl processes running? If you have many virtual servers, you may want to consider using a logging module that installs directly into Apache. Some of these can multiplex logging for multiple virtual hosts through a single connection to the database server, reducing resource consumption for logging activity.

Logging to a database rather than to a file allows you to bring the full power of MySQL to bear on log analysis, but it doesn't eliminate the need to think about space management. Web servers can generate a lot of activity, and log records use space regardless of whether you write them to a file or to a database. One way to save space is to expire records now and then.
For example, to remove log records that are more than a year old, run the following query periodically:

    DELETE FROM httpdlog WHERE dt < DATE_SUB(NOW(),INTERVAL 1 YEAR);

Another option is to archive old records into compressible tables. This requires that you use MyISAM tables so that you can compress them with the myisampack utility. For example, when the date changes from September 2001 to October 2001, you know that Apache won't generate any more records with September dates and that you can move them into another table that will remain static. Create a table named httpdlog_2001_09 that has the same structure as httpdlog (including any indexes). Then transfer September's log records from httpdlog into httpdlog_2001_09 using these queries:

    INSERT INTO httpdlog_2001_09
        SELECT * FROM httpdlog
        WHERE dt >= '2001-09-01' AND dt < '2001-10-01';
    DELETE FROM httpdlog
        WHERE dt >= '2001-09-01' AND dt < '2001-10-01';

Finally, run myisampack on httpdlog_2001_09 to compress it and make it read-only.

This strategy has the potential drawback of spreading log entries over many tables. If you want to treat the tables as a single entity so that you can run queries on your entire set of log records, create a MERGE table that includes them all. Suppose the set of tables includes the current table and tables for September 2001 through April 2002. The statement to create the MERGE table would look like this:

    CREATE TABLE httpdlog_all
    (
        dt      DATETIME NOT NULL,              # request date
        host    VARCHAR(255) NOT NULL,          # client host
        method  VARCHAR(4) NOT NULL,            # request method (GET, PUT, etc.)
        url     VARCHAR(255) BINARY NOT NULL,   # URL path
        status  INT NOT NULL,                   # request status
        size    INT,                            # number of bytes transferred
        agent   VARCHAR(255)                    # user agent
    )
    TYPE = MERGE
    UNION = (httpdlog, httpdlog_2001_09, httpdlog_2001_10,
             httpdlog_2001_11, httpdlog_2001_12, httpdlog_2002_01,
             httpdlog_2002_02, httpdlog_2002_03, httpdlog_2002_04);

The UNION clause should name all the tables that you want to include in the MERGE table. (Newer MySQL releases use ENGINE in place of TYPE; TYPE was removed in MySQL 5.5.) Note that you'll need to drop and recreate the httpdlog_all definition each time you generate a new static monthly log table. Also, if you add an index, you'll need to add it to each of the individual tables, and recreate the MERGE table to include the index definition as well.

Reports run against the httpdlog_all table will be based on all log entries. To produce monthly reports, just refer to the appropriate individual table.

With respect to disk space consumed by web logging activity, be aware that if you have query logging enabled for the MySQL server, each request will be written to the httpdlog table and also to the query log. Thus, you may find disk space disappearing more quickly than you expect, so it's a good idea to have some kind of log rotation or expiration set up for the MySQL server.
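The monthly archiving routine described in this section is mechanical enough to script. The following Python sketch only builds the SQL text for a given month and the UNION clause for the MERGE table; it does not connect to MySQL or run anything. The function names (month_bounds, archive_statements, merge_union) are hypothetical, and the httpdlog_YYYY_MM naming pattern is assumed from the httpdlog_2001_09 example.

```python
from datetime import date


def month_bounds(year, month):
    """Return (first day of the month, first day of the following month)."""
    start = date(year, month, 1)
    if month == 12:
        end = date(year + 1, 1, 1)   # December rolls over into the next year
    else:
        end = date(year, month + 1, 1)
    return start, end


def archive_statements(year, month, live_table="httpdlog"):
    """Build the INSERT and DELETE statements that move one month of records.

    The archive table (for example httpdlog_2001_09) is assumed to exist
    already, with the same structure and indexes as the live table.
    """
    start, end = month_bounds(year, month)
    archive = f"{live_table}_{year:04d}_{month:02d}"
    # Half-open range: from the first of the month up to, but not
    # including, the first of the next month.
    where = f"WHERE dt >= '{start.isoformat()}' AND dt < '{end.isoformat()}'"
    return archive, [
        f"INSERT INTO {archive} SELECT * FROM {live_table} {where};",
        f"DELETE FROM {live_table} {where};",
    ]


def merge_union(archives, live_table="httpdlog"):
    """Build the UNION clause naming the live table plus every archive table."""
    # Zero-padded names sort chronologically as plain strings.
    return "UNION = (" + ", ".join([live_table] + sorted(archives)) + ")"


archive, statements = archive_statements(2001, 9)
for statement in statements:
    print(statement)
print(merge_union([archive, "httpdlog_2001_10"]))
```

Generating the statements from one shared WHERE clause keeps the INSERT and DELETE ranges identical, which matters because a mismatch would either duplicate or lose records. After running the generated statements, you would still run myisampack on the archive table and drop and recreate httpdlog_all by hand or in a separate step.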

Chapter 19. Using MySQL-Based Web Session Management

Section 19.1. Introduction
Section 19.2. Using MySQL-Based Sessions in Perl Applications
Section 19.3. Using MySQL-Based Storage with the PHP Session Manager
Section 19.4. Using MySQL for Session Backing Store with Tomcat