Wednesday, April 18. 2007
Java Runtime Environment Space Eater
If you're looking for some spare disk space, check out your C:\Program Files\Java folder. Mine was full of various folders like jre1.5.0_7, jre1.5.0_11, etc. I'm not sure how they all got there, but I'm guessing they came from installing several versions of OpenOffice over the years. Looking at Add/Remove Programs in the Control Panel, I had something like 8 versions of the Java Runtime Environment, collectively eating up several hundred megabytes. So I cleared out all but the latest and things are nice and tidy for the moment.
Monday, January 8. 2007
Creating a Date Range Array with PHP
I recently needed to come up with an array of dates between two given dates. I typically work in the MySQL-compatible
'YYYY-MM-DD' date format, so I wanted it to use dates like that. So I came up with the following function:
    function createDateRangeArray($strDateFrom,$strDateTo)
    {
        // takes two dates formatted as YYYY-MM-DD and creates an
        // inclusive array of the dates between the from and to dates.

        // could test validity of dates here but I'm already doing
        // that in the main script

        $aryRange=array();

        $iDateFrom=mktime(1,0,0,substr($strDateFrom,5,2),
            substr($strDateFrom,8,2),substr($strDateFrom,0,4));
        $iDateTo=mktime(1,0,0,substr($strDateTo,5,2),
            substr($strDateTo,8,2),substr($strDateTo,0,4));

        if ($iDateTo>=$iDateFrom)
        {
            array_push($aryRange,date('Y-m-d',$iDateFrom)); // first entry
            while ($iDateFrom<$iDateTo)
            {
                // step one calendar day; a fixed 86400 seconds can drift
                // by an hour when the range crosses a DST change
                $iDateFrom=strtotime('+1 day',$iDateFrom);
                array_push($aryRange,date('Y-m-d',$iDateFrom));
            }
        }
        return $aryRange;
    }

Note that the first parameter of my mktime function calls is a 1 due to problems I experienced before with Daylight Savings Time.

Tuesday, December 5. 2006
PHPBB Inactive Member Removal Cron Job
A commenter to a previous article asked exactly how I delete inactive members from a PHPBB forum that I run. So I'll try to explain. This solution runs on Linux/Unix systems...I'm sure it could be done for Windows, but I'll leave the particulars to you.
It's really two separate steps. First, you need a script which will handle the deletion of inactive members. I called mine cron.php. It deletes all inactive PHPBB users who don't activate within 48 hours. It looks like this:

    #!/usr/bin/php -q
    <?php
    // cron job to delete inactive users older than 48 hours
    $db=mysql_connect('server','user','password');
    mysql_select_db('your_phpbb_database_here',$db);
    $strSQL="DELETE phpbb_users u, phpbb_user_group ug, " .
        "phpbb_groups AS g FROM phpbb_users u, " .
        "phpbb_user_group ug, phpbb_groups g WHERE " .
        "u.user_active=0 AND u.user_id>0 AND " .
        "u.user_id=ug.user_id AND ug.group_id=g.group_id " .
        "AND g.group_single_user=1 AND " .
        "FROM_UNIXTIME(u.user_regdate)<" .
        "DATE_SUB(NOW(),INTERVAL 2 DAY);";
    mysql_query($strSQL,$db) or die(mysql_error());
    mysql_close($db);
    ?>

You'll need to make sure the /usr/bin/php points to the location of PHP on your system, and replace the MySQL server name, user, and password with yours.

Now that you have a script, you need to tell the system to run it daily. You can do this with a cron job. If you have command line access to your website, you might be able to do this with "crontab -e". But my webhost has an administrative panel that lets you set up cron jobs on the web. If you can't set up a cron job, you could put the script into a web-accessible folder and periodically call its URL, either manually or through an automated process on your local PC.

This idea works great if the majority of your spam registrations don't activate their account. Usually they just want their spam links in your member list. But I'm finding that more and more spammers are activating and posting, so it remains that we want to stop spammers from registering in the first place. I'm experimenting with another method, which I'll post about when I see some results.

Update 2007-11-28: I replaced my original SQL statement with the SQL in comment #1 below, which I finally tested and it seems to work well.
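As a rough sketch of a variation on the date test in the query above: since that query already treats user_regdate as a Unix timestamp (it wraps it in FROM_UNIXTIME), the 48-hour cutoff could instead be computed in PHP and compared against the raw column. The helper name inactiveUserCutoff() and the 03:15 run time in the crontab comment are made up for illustration, and /path/to/cron.php is a placeholder for wherever your script lives.

```php
<?php
// Hypothetical helper (not part of PHPBB): returns the Unix timestamp
// before which an unactivated registration counts as expired.
function inactiveUserCutoff(int $now, int $graceHours = 48): int
{
    return $now - $graceHours * 3600;
}

// The WHERE clause could then compare the raw column to this value,
// e.g. "... AND u.user_regdate < $cutoff" (an integer, so no quoting).
$cutoff = inactiveUserCutoff(time());
echo 'Would delete inactive users registered before '
    . date('Y-m-d H:i:s', $cutoff) . "\n";

// Example crontab line (added with "crontab -e") that would run the
// script once a day at 03:15; the path is a placeholder:
// 15 3 * * * /path/to/cron.php
```

Keeping the grace period as a parameter makes it easy to try a longer window, say 72 hours, without touching the SQL.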
Wednesday, September 6. 2006
Splitting a large XML file on Linux 2.4
A client recently had a problem processing an XML file with his PHP script. "File too large" was the error, and the data file was over 2 gigabytes in size.
It turns out you can recompile PHP to deal with large files (see Requirements section here). But this blog entry made recompiling sound problematic...you get access to the large file but certain file functions break due to integer overflows. So I wanted to avoid this option.

The Linux command line tools seemed to be able to deal with it OK: stuff like less, head and tail were working on it. So I decided to try to break the XML file into parts. The split command worked but indiscriminately cut right through the middle of a data record. I considered using csplit, but as this file contained over 700,000 data records, I didn't want to deal with that many individual XML files.

I decided to write a Perl script to split the file into blocks of 100,000 records each. It didn't take long to put together and Perl's regular expression matching made handling the records easy on the small test data. For some reason I thought Perl would be OK with the large data file, but when I went to run the script, it too choked on the large file. I would have to recompile Perl to get around it. As the client's box is using a Red Hat package for Perl, I didn't want to mess with it.

Then I had an idea. Since the Linux command line tools were handling the file OK, I wondered if I could trick my script by feeding it one line at a time. Instead of looping an open file in the Perl script, I used a loop like this:

    while( $line = <STDIN> )
    {
        # do stuff
    }

Then I called the script like this:

    cat bigfile.xml | ./split.pl

And it worked!

Sunday, August 27. 2006
Drupal: An image module that uses filemanager module
I'm working on a personal project using Drupal. The image module allows you to let users post images to the site. However, it puts all of its files into a single directory, which is a performance concern if you want to have a lot of images on your site. The filemanager module has support for lots of files, but the image module was not written to use it.
So I decided to try my hand at Drupal coding. I started with image.module and overhauled it to work with filemanager.module. Along the way I found that Drupal 4.7 doesn't offer a good way to maintain certain data through the form preview, so I filed a bug. The workaround I came up with, and also used by a prominent Drupal coder, was to use the $_SESSION variable to maintain that data.

So anyway, here is the new image.module code. Please feel free to try it and let me know what I should fix.

Some notes: the current version, 0.1, requires you to use only the original, preview, and thumbnail sizes, and also requires a database table called image_fm. I also could not come up with an elegant way to handle the case of a single user posting multiple images at the same time, since the $_SESSION variable used only allows for one per user at a time. This may be easy to fix, but I don't fully understand the Drupal form API yet.

Thursday, August 17. 2006
PHPBB Fake Members
It's becoming frustrating to be a PHPBB administrator, at least if you want to keep your memberlist clean. Form bots out there create fake users on your site in the hopes that your memberlist will show their spam URL. It's been an ongoing, and losing, battle to keep them out.
Update 2006-08-22: The fake users keep coming. So I came up with a cron job that runs this query once per day. It will remove inactive PHPBB users older than 48 hours. This gives time for the new users to properly activate.

    DELETE FROM phpbb_users
    WHERE user_active=0 AND user_id>0
    AND FROM_UNIXTIME(user_regdate)<DATE_SUB(NOW(),INTERVAL 2 DAY);

The user_id>0 part is to avoid deleting the Anonymous user, which has a user ID of -1 on my installation.

Thursday, July 27. 2006
Blogspam
My blog has been getting a mountain of trackback spam attempts lately. So far this week there have been over 4,500 POSTs to my blog trackback links (wasting over a half meg of bandwidth!). I turned off trackbacks last fall, but it doesn't stop the hordes of zombies and open proxies from trying. I'm still seriously considering writing my own blog software to deter spam traffic, simply by virtue of the forms and links being unique and unfamiliar to the automated spamming software out there. It would be interesting to see if the spam traffic would drop or they would just keep pounding the site. If only I had the time!
On a positive note, I managed to get one open proxy closed. It was running inadvertently on a server of a municipal government here in the U.S. A message to their webmaster got the ball rolling. The other 99% of the time I bother to notify a domain owner of an open proxy or infected system, my message is ignored.

Wednesday, April 12. 2006
Integrating other sites with PHPBB 2.0.20
In a previous entry, I detailed how I used some code from PHPBB to integrate its session management with my existing website. The idea is to include just enough PHPBB stuff to get PHPBB sessions working, and nothing else. Due to some session code changes introduced in the PHPBB update to version 2.0.20, I had to change the code some. Here is how it looks now:
    define('IN_PHPBB', true);
    $phpbb_root_path = '/somepath/';
    include($phpbb_root_path . 'extension.inc');
    include($phpbb_root_path . 'config.'.$phpEx);

    $ip_sep = explode('.',$_SERVER['REMOTE_ADDR']);
    $user_ip=sprintf('%02x%02x%02x%02x', $ip_sep[0],
        $ip_sep[1], $ip_sep[2], $ip_sep[3]);

    include($phpbb_root_path . 'includes/constants.'.$phpEx);
    include($phpbb_root_path . 'includes/sessions.'.$phpEx);
    include($phpbb_root_path . 'includes/db.'.$phpEx);

    $strSQL = "SELECT config_name, config_value FROM " .
        CONFIG_TABLE . " WHERE config_name IN ('cookie_name', " .
        "'cookie_path', 'cookie_domain', 'cookie_secure', " .
        "'rand_seed', 'session_length');";
    if( !($result = $db->sql_query($strSQL)) )
    {
        die('Could not query config information');
    }
    while ( $row = $db->sql_fetchrow($result) )
    {
        $board_config[$row['config_name']] = $row['config_value'];
    }

    $userdata = array();
    $userdata = session_pagestart($user_ip, PAGE_INDEX);

In addition, I had to copy the dss_rand() function out of PHPBB's includes/functions.php file into my startup script. I think that's preferable to including the whole block of functions, but that's another option. You also have to modify the message_die() call inside dss_rand() because I'm not including that function. I just used PHP's die() function and only included the text of the error, not the PHPBB specific parameters.

Update 2006-07-17: This code is OK for PHPBB 2.0.21 also.

Tuesday, April 4. 2006
Figuring the Start of the Week with PHP
I have a time keeping utility written in PHP and MySQL. To make a query of the hours recorded so far in a given week, I needed to determine the start date of the week, in the MySQL format of 'YYYY-MM-DD HH:MM:SS'. I had been using this code:
    date('Y-m-d H:i:s', mktime(0, 0, 0,
        date('m'), date('d')-date('w'), date('Y')));

It was successfully giving me midnight on Sunday for my code. Until the change to daylight savings time reared its ugly head (why can't we stay on DST all year?). This week that piece of code gave me '2006-04-01 23:00:00', so the rest of my code decided to include some time from Saturday. I worked through a couple variants, but settled on this replacement:

    date('Y-m-d', mktime(1, 0, 0,
        date('m'), date('d')-date('w'), date('Y'))) . ' 00:00:00';

This code adds an hour to the time computation. So the errant Saturday at 11PM gets returned to Sunday at midnight. Other weeks will show 1AM, and the switch back to standard time might show 2AM. Since I always want midnight, the simplest thing seemed to be to drop the time off the date function output entirely and just set it to midnight in the string.

Monday, March 20. 2006
Inbox Spam Update
In my last entry on spam, I mentioned I would move to using POPFile as my mail junk filter. I turned off the ineffective Bayes filtering of SpamAssassin and just left it to toss blatant spam using its other rules. I also disabled Thunderbird's junk filter.
POPFile has done well. So far it has classified nearly 14,000 messages, 85% of which were spam. So that means my local system had to download nearly 12,000 spams before the POPFile classifier could junk them. It would be great to have a system like this on the server to prevent that wasted message downloading. POPFile has been a bit too aggressive in classifying messages as spam. I still have to browse the junk folder now and then to make sure a legitimate message isn't there. On the other hand, very few spams actually show up in the inbox. I'm still contemplating switching to a whitelist and blocking everything else, but I'll save that battle for another day.