Advice To New PHP Developers From a Slowly Recovering Horrible Programmer
7/8/18 - Clifford Vickrey
Jacob Kaplan-Moss, the creator of the Django Python framework, gave a 2015 talk on a growing crisis in the software development industry. There aren't enough developers, and there's a widespread cultural belief that you have to be a "real" programmer in order to become one. Since being a "real" programmer means knowing how to write compilers, self-driving cars, sentient chatbots, and rocket controllers, it is naturally an expert skill beyond the grasp of mere mortals. And not only does the skill demand such a hypertrophy of left-brained calculus muscle, but it's also A-R-T in the same category as painting, sculpture, and poetry. Don't even bother trying, in other words.
Instead, he correctly argues, programming is like any skill: most of its applications are mundane (like saving form data to a database), and proficiency in it is normally, not bimodally, distributed. If the industry wants to safeguard its long-term health, it had better be more welcoming to beginners and dabblers from other fields, and rid itself of the stereotype of the toxic "rock star" tech genius to whom society must genuflect in silent admiration.
I was in the beginner boat he described. While in graduate school, I quickly felt an idealistic twenty-something's clichéd disillusionment with academia (I was frustrated that I wasn't on the cusp of changing the world with the power of ideas, man), and was looking to try something new, even as a hobby. Web development interested me. At the same time, through a bizarre coincidence, I was tasked with building an interactive social scientific web application, an endeavor for which I had a dearth of qualifications but a surplus of ambition.
But how? As recently as eight years ago, PHP had neither a formal specification nor even accepted best practices. The closest thing it had to a dependency management system for bringing in third-party code was (shudders) PEAR. There was certainly no "Zen of PHP." Worse still, the tutorials available online were objectively awful, as well as absurdly inviting of security flaws. Need to query a database with user input? Just strip it of quotes, tack it onto a MySQL SELECT statement, and call it a day. Want a cool way to work with structured user data? Just unserialize GET parameters into objects. (Remote code execution? What's that?) Want to let a user upload a file? Who cares what the extension or MIME type is, just save it into /var/www/website/public/files. The guides on W3Schools were so bad that the World Wide Web Consortium politely asked the website to publicly disavow any relationship with the W3C. (It didn't.)
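For contrast with the "strip the quotes and concatenate" approach, here is a minimal sketch of a parameterized query using PDO. It assumes the pdo_sqlite extension and an in-memory SQLite database purely so the example is self-contained (the tutorials in question targeted MySQL); the users table and the findUsersByName() helper are made up for illustration.

```php
<?php
// Hypothetical in-memory database and table, used only for illustration.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice')");
$pdo->exec("INSERT INTO users (name) VALUES ('bob')");

/**
 * Look up users by name with a bound parameter instead of string concatenation.
 * (Hypothetical helper, not part of any library.)
 */
function findUsersByName(PDO $pdo, string $name): array
{
    $stmt = $pdo->prepare('SELECT id, name FROM users WHERE name = :name');
    // The value travels separately from the SQL text, so quotes in it are just data.
    $stmt->execute([':name' => $name]);

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$found = findUsersByName($pdo, 'alice');

// A classic injection attempt is treated as a literal (and non-matching) name.
$injected = findUsersByName($pdo, "alice' OR '1'='1");
```

The point of the sketch is only that the user's input never becomes part of the SQL string, so there is nothing to strip.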
With a head full of ill-conceived ideas, pop cultural references, and little else, I set to work building the app in Windows Notepad. It worked. It got great reviews. The New York Times linked to it. It even won an award from my academic discipline's premier professional association. It was so good, in fact, that it may have invited the attention of a third party that ultimately led to the site's amicably negotiated demise, which I am legally forbidden to discuss.
The app, under the hood, was also … something. Load the source code in a modern IDE and prepare yourself for warnings with more red and yellow than McDonald's global branding effort.
What this goes to show is that even terrible code can prove enormously useful and valuable (and in this case, someone independently valued software containing ALAN_ALDA_MODE constants to be worth half a million dollars).
Nonetheless, I was frustrated with and tired of PHP after this effort, and during my time in grad school focused my programming efforts on R and Python, convinced of the orthodoxy that these were "real" languages, and stuck to simple scripts, convinced I wasn't a "real" programmer.
As fate would have it, in the past few years I have found myself happily working professionally as a software developer. I learned that PHP 7 is now quite different from PHP as I remembered it, and is not synonymous with the horrible hacks I glued together until everything "worked." It has undergone something of a renaissance to clean up its act, codify best practices, and offer new ways of writing reusable, interoperable code. It is now, unquestionably, a viable choice for the rapid prototyping and deployment of greenfield web applications, and not just something people are forced to use as legacy code maintainers. If I were starting out again, I think I'd have had a far easier time. I'd have at least wrapped the Constant GARDENER in a class.
In the spirit of constructive reminiscence, I posed myself the question: what advice would I give to my younger self, were he starting out? After handing him Gray's Sports Almanac, I would say the following:
If you want to learn PHP on the web, learn it from reputable sources. PHP: The Right Way is a good primer on best practices, but requires a bit of prior knowledge to understand. (You have to know how to program before you know what a "programming paradigm" is). Laracasts has a good primer to help someone go from "sort of knows what HTML is" to writing a model-view-controller application from start to finish.
Ignore the advice contained in articles like this and this, at least if you're just starting out. Optimization at the micro level, at best, furnishes performance improvements on the order of a few milliseconds every 10,000 requests. And even if you achieve this with arduous levels of effort, later implementations of the language might negate any benefit of micro-optimization, such that your "improvements" in the future actually slow things down. (More to the point: if associative array parsing is the performance bottleneck of your application, you're doing something wrong in the first place). There is also the fact that such optimizations can work at cross-purposes with sane, readable code. If you rewrite every class method to be static because the interpreter resolves their calls slightly faster, and refactor every single array in your code to be objects just to slightly optimize hash table reuse at the C level, you're putting yourself in a world of hurt. If you do have performance issues, there are free and commercial profiler tools available to help you target specific performance problems rather than guessing.
Learn object-oriented patterns, particularly by reading the code of reputable dependencies in your project. For an overview of design patterns in PHP and examples of their use, check out the DesignPatternsPHP repo. Don't overdo it, though. Rethink things slightly if your codebase starts looking like this (my favorite project on GitHub, by the way).
Writing tests gives you the invaluable luxury of introducing changes without panicking about breaking things. So do it! The main benefit of object-oriented patterns is that they engender code that's easier to test, one class or module at a time. If you're not writing tests, there's almost no point in using patterns like dependency injection (which lets you mock up a class's composition for testing) and strategies (which let you test algorithms independently of their invocations inside other classes).
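To make the dependency injection point concrete, here is a small sketch in plain PHP 7.1+ with no test framework. Every name in it (Mailer, SignupService, FakeMailer) is hypothetical; in a real project you would more likely reach for PHPUnit and its mock objects, but a hand-rolled fake shows the idea with fewer moving parts.

```php
<?php
// All names here (Mailer, SignupService, FakeMailer) are hypothetical.

interface Mailer
{
    public function send(string $to, string $subject): void;
}

final class SignupService
{
    private $mailer;

    // The dependency is passed in rather than constructed inside the class.
    public function __construct(Mailer $mailer)
    {
        $this->mailer = $mailer;
    }

    public function register(string $email): bool
    {
        if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
            return false;
        }
        $this->mailer->send($email, 'Welcome!');

        return true;
    }
}

// A hand-rolled test double that records calls instead of sending mail.
final class FakeMailer implements Mailer
{
    public $sent = [];

    public function send(string $to, string $subject): void
    {
        $this->sent[] = [$to, $subject];
    }
}

$mailer = new FakeMailer();
$service = new SignupService($mailer);

$accepted = $service->register('new.user@example.com');
$rejected = $service->register('not-an-email');
```

Because SignupService only knows about the Mailer interface, the test can check what would have been sent without a mail server anywhere in sight. Had the class built its own mailer internally, there would be no seam to swap the fake in.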
Adhere to the PSR-2 coding standard. If you're lazy like I am, write code in whatever format you like, then press Ctrl + Alt + L in PhpStorm and you're done!
Don't participate in the framework wars on the Internet. The question of which framework is "best" has an outsized role in the PHP world. True: all but the simplest web applications need a "framework." There are two things to qualify this statement, though. One is that the definition of "framework" is changing from "monolithic library that essentially replaces the language for you" (a necessary evil in the benighted days of CodeIgniter, when the language was in a comparatively primitive state) to "a bunch of components you glue together with a router and dependency injection container." The second is that, with the community's newfound emphasis on loosely-coupled packages as well as the advent of PHP Standards Recommendations (PSR), which furnish interfaces for interoperable code, the choice of framework is arguably less momentous than it once was. With good design, an application can swap in and swap out libraries from multiple frameworks as required. The upshot: there is no need to become technically or emotionally wedded to a single framework. The endless blog posts and forum threads casting Laravel as either the savior or bane of PHP development are senseless, since A) there are going to be use cases where it shines (medium-complexity apps rapidly developed by agencies) and others where it may not (high-complexity web applications with tons of domain-specific requirements), and B) framework choice is not likely to be the decisive factor in a project's success. Instead, familiarize yourself with the language, and then make a choice based on personal preference and project needs.
Want to write nice apps with as little third-party overhead as possible? Look into Slim. (If Laravel is PHP's answer to Django and Rails, Slim is its answer to Flask and Sinatra, for Python and Ruby respectively.)
(As an aside, it'd be a shame if PHP went the way of Ruby: an excellent language that became synonymous with one particular framework that, for all its virtues, is falling out of fashion and threatening to drag the language down into oblivion with it).
The moral: tech bigotry is a waste of time at best. It's destructive at worst, as when it encourages the ill-advised adoption of shinier, envied platforms with disastrous consequences. From my experience, there isn't that much to gain from migrating an application from PHP to Python, but the managers of one notable startup decided that Python 3 was "more powerful" than PHP on the apparent basis of old blog posts, left a Python debugging tool running on their production server, and exposed gigabytes of source code and user data.
What of the valid criticisms? For the most part, the annoyances associated with the language are present, but not major sources of problems. To give one example: "weak typing" is alleged to be a reason the language is both insecure and unusable. While it's true that statically typed languages perform better because of their greater correspondence to machine code, A) weak comparisons are optional; B) if you want to use weak comparisons when they come in handy (e.g. to test if a variable's "falsey," it's easier to type empty($x) instead of !isset($x) || $x === 0 || $x === 0.0 || $x === '0' || $x === false || $x === '' || $x === []), the comparison table isn't all that hard to memorize; and C) typing bugs almost never appear in competent PHP code. This is especially the case since PHP 7.x's type hints and strict_types declare directive largely eliminate the problem of unintended type coercion.
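Here is a short sketch of what that looks like in practice: loose versus strict comparison, empty() as a falsey check, and a scalar type hint under strict_types. The addInts() function is a made-up example; the comparisons were chosen because, as far as I know, they behave the same way in PHP 7 and PHP 8.

```php
<?php
declare(strict_types=1); // must be the first statement in the file

// Hypothetical function with scalar type hints.
function addInts(int $a, int $b): int
{
    return $a + $b;
}

// Loose (==) vs. strict (===) comparison of two numeric strings.
$looseEqual  = ('1' == '01');   // both are numeric strings, so compared as numbers
$strictEqual = ('1' === '01');  // same type, but different string contents

// empty() covers the usual "falsey" values in one call.
$allEmpty = empty('0') && empty(0) && empty('') && empty([]) && empty(null);

// With strict_types enabled, a numeric string is not silently coerced to int.
try {
    $sum = addInts('1', 2);
    $coerced = true;
} catch (TypeError $e) {
    $coerced = false;
}
```

Note that strict_types only governs scalar type declarations on calls made from the file that declares it. It does not change how == behaves, which is why the habit of reaching for === still matters.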
To give another example: the standard library is alleged to be worthless. Most functions have snake-cased names (array_key_exists) and are easy enough to read. Others, especially the string functions copied from C's standard library, are abominations: strncasecmp gives you a "binary safe case-insensitive string comparison of the first n characters." Function signature consistency is another issue; the most commonly cited example is a side-by-side comparison of strpos($haystack, $needle) and array_search($needle, $haystack). (As an aside: the reason for some of this weirdness is technical and historical). Even worse from a purity standpoint is that (unlike in Python) all functions and SPL classes/interfaces are in the global namespace and implicitly available in every script.
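The argument-order quirk is easier to see side by side. This is only an illustrative snippet using the built-in functions named above; the strings and arrays are arbitrary.

```php
<?php
// strpos() takes the haystack first; array_search() takes the needle first.
$position = strpos('haystack', 'st');            // int offset, or false if absent
$key      = array_search('b', ['a', 'b', 'c']);  // key of the match, or false

// Both can legitimately return 0, so compare against false with ===.
$startsWithHay = (strpos('haystack', 'hay') === 0);
$isMissing     = (strpos('haystack', 'needle') === false);

// strncasecmp() compares only the first $length bytes, ignoring case.
$samePrefix = (strncasecmp('Hello world', 'HELLO there', 5) === 0);
```

Annoying, yes, but an IDE with parameter hints makes the inconsistency a minor tax rather than a source of bugs.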
At the same time, get a sense of where critics of PHP are coming from so that you can get an idea of what PHP's ideal use cases are and are not, and also gain perspectives that will help you develop in any language. Where I find PHP lacking is in certain object-oriented bells and whistles found in other languages (enums ["factors" in R] are one, the absence of which makes data validation a pain sometimes; generics are another, as without them you'll find yourself constantly polluting your code with /** @var SomeObject[] $arrayOfSomeObjects */ docblock tags to stave off IDE warnings). Another weakness of PHP is one of its strengths: the one-process-per-request model, which means that PHP A) does a cold start every time someone hits your website; B) doesn't share any memory with other PHP processes, so you have to persist data in files and databases before the end of the request; and C) shuts down when the request lifecycle ends, so that all the variables you declared and objects you spun up go right into the trash. On the one hand, this model is time-tested, makes it easy for developers to reason about the state of their code, and prevents applications from grinding to a halt if a single process (say) throws an unhandled exception. PHP 7.x running on a warm cache is also super-optimized at a very low level, such that most applications perform just fine, with bottlenecks occurring in the data access layer. On the other hand, it does become a problem for heavily trafficked, file I/O intensive, web-servicey endpoints with high concurrency requirements. PHP would thus be a terrible tool with which to build (say) a stateless microservice in Netflix's stack. Async implementations of PHP are emerging to fulfill this need, but it's too early to say whether they'll catch on, or overcome the fact that user libraries usually don't take memory management into account.
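For what it's worth, the usual PHP 7.x workarounds for the missing enums and generics look something like the sketch below: class constants plus a validation helper standing in for an enum, and docblock annotations standing in for a typed array. ArticleStatus, SomeObject, and labelsOf() are all hypothetical names, and this is one common convention rather than the only way to do it.

```php
<?php
// Hypothetical names throughout: ArticleStatus, SomeObject, labelsOf().

// A poor man's enum: constants plus a strict membership check.
final class ArticleStatus
{
    const DRAFT = 'draft';
    const PUBLISHED = 'published';

    /** @return string[] every allowed value */
    public static function values(): array
    {
        return [self::DRAFT, self::PUBLISHED];
    }

    public static function isValid(string $value): bool
    {
        return in_array($value, self::values(), true);
    }
}

final class SomeObject
{
    public $label;

    public function __construct(string $label)
    {
        $this->label = $label;
    }
}

/**
 * The docblock stands in for a generic array type; PHP itself only sees "array".
 *
 * @param SomeObject[] $objects
 * @return string[]
 */
function labelsOf(array $objects): array
{
    return array_map(function (SomeObject $o): string {
        return $o->label;
    }, $objects);
}

/** @var SomeObject[] $arrayOfSomeObjects */
$arrayOfSomeObjects = [new SomeObject('a'), new SomeObject('b')];
$labels = labelsOf($arrayOfSomeObjects);
```

The annotations keep the IDE quiet, but nothing enforces them at runtime apart from the SomeObject hint on the closure, which is exactly the pain point described above.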
Have fun, and don't compare yourself as a developer to others. Instead, compare your skills to where they were six months ago. If you're willing to learn a little at a time, you'll find that comparison extremely heartening. When you stretch the comparison to six years, though, don't be surprised to find that your vintage code smells like Hawkeye Pierce's bathtub gin.