DP code release with modern PHP goodness

Today I’m proud to announce a new release of the software that runs pgdp.net: R201701. The last release was a year ago, and I’m trying to hold us to a yearly release cadence (compared to the 9 years the previous release took).

This version contains a slew of small bug fixes and enhancements. The two most notable changes are support for PHP versions > 5.3 and the new Format Preview feature.

This is the first DP release that does not depend on PHP’s Magic Quotes, which allows the code to run on PHP versions > 5.3 up to, but not including, PHP 7.x. This means that the DP code can run on modern operating systems such as Ubuntu 14.04¹ and RHEL/CentOS 7. This is a behind-the-scenes change that end users should never notice.

The most exciting user-visible change in this release is the new Format Preview functionality that assists proofreaders in formatting rounds. The new tool renders formatting via a simple toggle, letting the user see what the formatted page would look like and alerting them if it detects markup problems.

What’s next for the DP code base? We have a smattering of smaller changes coming in over the next few months. The biggest change on the horizon is moving from the deprecated mysql extension to mysqli, which will allow the code to run on PHP 7.x, and moving to phpBB 3.2.

Many thanks to all of the DP volunteers who made this release possible, including developers, squirrels, and the multitude of people who assisted in testing!


¹ Ubuntu 16.04 uses PHP 7.0, but can be configured to use PHP 5.6.

Death to magic quotes

Magic quotes is a misguided feature of PHP that modifies user input to PHP pages so that the input can be used directly in SQL statements. This violates the programming principle of only escaping data when it is necessary and results in all kinds of weird edge cases.
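To make the edge cases concrete, here is a minimal sketch in Python rather than PHP. The `magic_quote` function is a hypothetical stand-in that mimics what PHP's `addslashes()` does (magic quotes applied that kind of escaping to incoming request data), and the `users` table and `store_and_fetch` helper are made up for illustration; none of this is DP code.

```python
import sqlite3


def magic_quote(value: str) -> str:
    """Hypothetical stand-in for PHP's addslashes(): backslash-escape
    single quotes, double quotes, backslashes, and NUL bytes."""
    out = []
    for ch in value:
        if ch in ("'", '"', "\\"):
            out.append("\\" + ch)
        elif ch == "\0":
            out.append("\\0")
        else:
            out.append(ch)
    return "".join(out)


def store_and_fetch(name: str) -> str:
    """Store a raw value with a parameterized query and read it back.
    The database driver handles quoting only where it is needed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")  # illustrative table
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
    row = conn.execute("SELECT name FROM users").fetchone()
    conn.close()
    return row[0]


raw = "O'Brien"
print(magic_quote(raw))               # what every page would see as "input"
print(magic_quote(magic_quote(raw)))  # escaping an already-escaped value
print(store_and_fetch(raw))           # raw value survives a round trip
```

The escaped value is only right inside a hand-built SQL string. Echo it back to the page and the user sees a stray backslash; escape it a second time and the backslashes multiply. Passing the raw value as a query parameter confines the escaping to the one place where it matters.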

This feature was deemed so misguided that it was deprecated in PHP 5.3 and removed entirely from PHP 5.4. The DP code base has relied on magic quotes to function from the beginning of the project in 2000.

I’m very happy to report that after much development and validation effort, we’ve removed the dependency on magic quotes from the DP code base! The work was done over the course of a year, primarily by me with help from jmdyck, and validated by a team of squirrels (shout-out to wfarrell and srjfoo) and other volunteers. It was rolled out in production on November 5th and has been almost 100% bug-free – quite an accomplishment given how much of the code was impacted. A huge thank you to the team who helped make this possible!

The biggest win is our ability to run the DP code on much more recent versions of PHP, all the way up to and including 5.6.¹

RIP magic quotes.


¹ It won’t work on PHP 7.0 or later because the code still relies on the deprecated mysql extension, although I fixed that on a branch last night!

Celebrating a decade at Distributed Proofreaders

Today marks my 10-year “DP birthday”, as we like to say on the site. Ten years ago I joined Distributed Proofreaders and proofread a few pages. Turns out they needed developers more than proofreaders, the former being harder to find, and I was encouraged to dip my toes into the code. It wasn’t long before I was working with jmdyck, one of the lead developers, rewriting the spellcheck function into WordCheck.¹ Thus began my foray into a development style best summed up by the phrase “development by community”.

My involvement with DP has waxed and waned over the years as my free time fluctuated, but I’ve always loved DP’s mission and the great community of people. Although DP is a worldwide organization and its volunteers are geographically dispersed, I’ve had the pleasure of meeting several of them in person, all of whom have been delightful people.

Today I am one of the main developers working to improve our code. I periodically wear a system admin hat (a role we affectionately call ‘squirrels’), helping to keep our site running. Recently I was elected by the community to serve on the umbrella organization’s governing board, where I am the board president.

If you love ebooks, believe public domain materials should be free and widely available, and need an outlet for a bit of OCD, I would encourage you to drop over to pgdp.net and join a wonderful community of people helping to preserve history one page at a time.


¹ Not surprisingly, you’re more interested in having the text of the page you’re transcribing match the image and be consistent throughout the book than in knowing that the way a word was spelled in 1892 differs from the contents of a modern dictionary.

Distributed Proofreaders Foundation Board

Last month Distributed Proofreaders volunteers voted for new board members of the Distributed Proofreaders Foundation (DPF), the 501(c)(3) non-profit organization that governs DP. I was honored to be one of four new board members, serving a 3-year term effective June 1st. I am further honored that my fellow DPF board members elected me board President for the upcoming year.

I look forward to working alongside the other board members and the General Manager as we serve the DP community by providing vision and direction while keeping our hands firmly outside of day-to-day operational issues, which are under the GM’s purview.


[Disclaimer: Thoughts presented in my blog are mine alone, and do not represent the thoughts of the DPF board unless explicitly stated otherwise.]

Introducing the “new” DP logo

Distributed Proofreaders has been around since 2000, well before modern image formats like SVG vector images and PNG raster images were widely supported. The DP logo, therefore, was a GIF available in only the size needed for the website:

Fast forward 15 years and our logo is still 360×68 pixels, with no hope of being used at any larger size, anywhere a square image is needed (like Twitter), or in print. Over the years folks have filled the void by creating new raster images, some of great quality, but never anything that was considered official and never in a vector format from which we could generate raster images of various sizes.

In modern logo development you design the logo in a vector format, such as SVG, and then rasterize it to whatever size you need for the web. In addition to a logo you need what’s commonly called a mark or badge, essentially a square design that readily brings your brand to mind when seen. Marks are often used to link back to your website.

Most large companies go one step further and include much more detail about their brand, including specifying colors, fonts, spacing, when to use which image, and much more. Some branding guidelines from companies you’ve probably heard of:

About two weeks ago I decided DP needed some modern branding assets. Loading up my trusty copy of Inkscape and all of the current images available to me, I created a “new” DP logo in SVG format we could use to create official brand logos. I also created a DP mark in SVG format. Today, we rolled them out.

Introducing the “new” DP logo and mark:

Logo
DP logo

Mark
DP mark

Included in the roll-out is a full branding page providing access to the SVG files as well as PNGs with both white and transparent backgrounds in various sizes. The nice thing about transparent PNGs is that because they have an alpha channel, we only need one PNG to use for all our different themes rather than one GIF per background color.
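The arithmetic behind that is simple enough to sketch. The following Python snippet applies the standard “over” blend to a single pixel; the `composite` helper, the pixel values, and the two theme background colors are invented for illustration and are not taken from the actual DP themes.

```python
def composite(fg, alpha, bg):
    """Blend one RGB foreground pixel over an opaque background.

    alpha is the foreground opacity from 0.0 (fully transparent)
    to 1.0 (fully opaque).
    """
    return tuple(round(alpha * f + (1 - alpha) * b) for f, b in zip(fg, bg))


# Hypothetical half-transparent black pixel, e.g. from a drop shadow.
shadow_pixel = (0, 0, 0)
shadow_alpha = 0.5

# The same stored pixel blends correctly over any theme background.
for theme_bg in [(255, 255, 255), (0, 0, 128)]:
    print(theme_bg, "->", composite(shadow_pixel, shadow_alpha, theme_bg))
```

A GIF only supports fully transparent or fully opaque pixels, so soft edges and shadows have to be pre-blended against a known background color, which is why the old approach needed one image per theme.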

To round out the set, the branding page even includes a black and white logo without a drop-shadow for use in black and white print applications.

DP logo in black and white

Making these was a fun foray back to my design and publishing days. Turns out no one knows what font the original logo is in. It’s likely some variant of Garamond, but none that I could find. Luckily I was able to find Amiri, a free font from Google Fonts that was a pretty close match. That worked for everything except the ‘dp’ in the center of the logo and the core of the mark. Those bits of the logo are very visually striking and the letter shapes in Amiri were too different from the original to use. Fortunately Linda, the general manager, had already done some work in Corel to vectorize those bits into an image for use on Twitter. After combining the two together (and converting the final text to paths for better compatibility) it was a simple matter of adding the drop shadow and exporting some PNGs.

The “new” logo isn’t exactly like the old one, but it’s pretty close and hopefully conveys the most important visual aspects of the original. And now they’re available in easily-consumable formats for virtually any media.

Enabling DP development with a developer VM

Getting started doing development on the DP code can be quite challenging. You can get a copy of the source code quite readily, but creating a system to test any changes gets complicated due to the code dependencies — primarily its tight integration with phpBB.

For a long time now, developers could request an account on our TEST server, which has all the prerequisites installed, including a shared database with loaded data. There are a few downsides to using the TEST server, however. The primary one is that everyone uses the shared database, which significantly limits the changes that can be made without impacting others. Another downside is that you need internet connectivity to do development work.

Having a way to do development locally on your desktop would be ideal. Installations on modern desktops are almost impossible, however, given our current dependency on magic quotes, a “feature” which has us locked on PHP 5.3, a very archaic version that no modern Linux desktop includes.

Environments like this are a perfect use case for virtual machines. While validating the installation instructions on the recent release I set out to create a DP development VM. This ensured that our instructions could be used to set up a fully-working installation of DP as well as produce a VM that others could use.

The DP development VM is a VMware VM running Ubuntu 12.04 LTS with a fully-working installation of DP. It comes pre-loaded with a variety of DP user accounts (proofer, project manager, admin) and even a sample project ready for proofing. The VM is running the R201601 release of DP source directly from the master git repo, so it’s easy to update to newer ‘production’ milestones when they come out. With the included instructions a developer can start doing DP development within minutes of downloading the VM.

I used VMware because it was convenient: I already had Fusion on my Mac, and VMware Player is freely available for Windows and Linux. A better approach would have been VirtualBox¹, as it’s freely available for all platforms. Thankfully it should be fairly straightforward to create a VirtualBox VM from the VMware .vmdk (I leave this as an exercise to another developer).

After I had the VM set up and working I discovered Vagrant while doing some hacking on OpenLibrary. If I had to create the VM again I would probably go the Vagrant route. Although I expect it would take me a lot longer to set up, it would significantly improve the development experience.

It’s too early to know if the availability of the development VM will increase the number of developers contributing to DP, but having yet another tool in the development tool-box can’t hurt.

¹ Although I feel dirty using VirtualBox because it’s owned by Oracle. Granted, I feel dirty using MySQL for the same reason…

A new release of the DP site code, 9 years in the making

Today we released a new version of the Distributed Proofreaders code that runs pgdp.net! The announcement includes a list of what’s changed in the 9 years since the last release as well as a list of contributors, some statistics, and next steps. I’ve been working on getting a new release cut since mid-September so I’m pretty excited about it!

The prior release was in September 2006 and since that time there have been continuous, albeit irregular, updates to pgdp.net, but no package available for folks to download for new installations or to update their existing ones. Instead, enterprising individuals had to pull code from the ‘production’ tag in CVS (yes, seriously).

In the process of getting the code ready for release I noticed that there had been changes to the database on pgdp.net that hadn’t been reflected in the initial DB schema or the upgrade scripts in the code. So even if someone had downloaded the code from CVS they would have struggled to get it working.

As part of cutting the release I walked through the documentation that we provide, including the installation, upgrade, and configuration steps, and realized how much implied knowledge was in there. Much of the release process was me updating the documentation after learning what you were supposed to do.¹ I ended up creating a full DP installation on a virtual machine to ensure the installation steps produced a working system. I’m not saying they’re now perfect, but they are certainly better than before.

Cutting a release is important for multiple reasons, including the ability for others to use code that is known to work. But the most important to me as a developer is the ability to reset dependency versions going forward. The current code, including that released today, continues to work on severely antiquated versions of PHP (4.x up through 5.3) and MySQL (4.x up to 5.1). This was a pseudo design decision in order to allow sites running on shared hosting with no control over their middleware to continue to function. Given how the hosting landscape has changed drastically over the past 9 years, and how really old those versions are, we decided it’s time to change that.

Going forward we’re resetting the requirements to PHP 5.3 (but not later, due to our frustrating dependency on magic quotes) and MySQL 5.1 or later. This will allow us to use modern programming features like classes and exceptions that we couldn’t before.

Now that we have a release behind us, I’m excited to get more developers involved and start making some much-needed sweeping changes, such as removing our dependency on magic quotes and creating a RESTful API to allow programmatic access to DP data. I’m hoping being on git and the availability of a development VM (more on that in a future blog post) will accelerate development.

If you’re looking for somewhere to volunteer as a developer for a literary² great cause, come join us!

¹ A serious hat-tip to all of my tech writer friends who do this on a daily basis!

² See what I did there?