Introduction to Scaling PHP Applications – Part 1

Zach Gardner | Architecture, Programming, Tutorial

This is the first post in a two-part series on scaling PHP applications. Part one focuses on replacing Apache, while part two goes into more advanced topics such as master-master replication and session storage.


Making a website able to handle high amounts of traffic is one of the cornerstones of modern web applications. It’s a process that takes time, and it’s obvious to everyone when a website isn’t able to handle traffic. A lot of work is needed to make a web application able to handle high loads. I’ve done this specifically with code written in PHP.

I started out developing in PHP when LAMP (Linux + Apache + MySQL + PHP) was all the rage. It was great for quick prototyping and for small applications. At the time, LAMP seemed like the greatest thing since sliced bread. That was until I started writing code that was used by hundreds of people every day. Traffic spikes, high CPU and RAM usage, and manual restarts just seemed like part of being a PHP developer.

PHP has a bad reputation for not holding up well under pressure. After refactoring and refactoring with no success, many PHP developers will feel like calling it quits. If our LAMP model limits us to just a few dozen users, why use PHP at all?

I was in that boat not too long ago, until I came across Steve Corona’s life-changing Scaling PHP to Millions of Users. I found ways to run my code up to 500% faster, and increase uptime to nearly 99%.

In this post, I’ll go over my experience with scaling PHP applications. I’ll describe some of the strategies and pitfalls, and hopefully provide guidance to all those wayward developers who want to swear off PHP completely. My first-hand experience may help clarify areas of Steve’s book. Or, I could inspire someone to take control of their stack and save it from oblivion.

Scaling Goals

When I talk about scaling, I am referring to the ability of an application architecture to gracefully respond to high loads. This involves everything from the DNS to the web server to the version of PHP to the database. Some of the strategies described in this blog are specific to PHP, but others (e.g. Nginx, Percona XtraDB Cluster) can apply to other web applications.

The goals any scaling project should have are:

  1. Maximize performance under high loads
  2. Maximize automatic handling of failure
  3. Minimize the code changes necessary to achieve the previous two goals

Step 1 – Use Nginx for Static File Serving

Taking the first step is usually the hardest. If it fails, the rest of your plans slowly crumble. If it succeeds, the struggles in the next steps at least have some previous success to drive them forward. That is why I chose Nginx as the first step. It’s simple, easy to use, and gives huge performance gains over Apache. It’s so easy that even I can’t screw it up.

In LAMP, Apache has two primary jobs: serving static files and calling mod_php to evaluate PHP scripts. One service doing two jobs could be a pro, but in terms of scaling it is a con. When a user hits Apache with a request for a static file, the forked Apache process that retrieves the file has mod_php built into it. So all that RAM and CPU time used by that fork with mod_php is wasted for that request. That’s waste we can easily cut out.

This area is where Nginx shines. It can serve us static files, and not care that we’re using PHP as our server-side language. When it receives a PHP request, it can send that off to Apache without blocking its other workers. It doesn’t have mod_php in its worker processes, so no waste there either. Nginx workers are orders of magnitude smaller in both RAM and CPU usage than Apache forks. Nginx is event-driven while Apache is process-driven, so we get asynchronous, non-blocking request handling as a bonus.

Putting Nginx on the front with PHP requests passed to Apache on the back is a really easy first step to take in this process. The only caveat I’ve experienced is with gzip and PHP’s flush() function. Nginx needs some special configuration for long-running processes whose output is gzipped. I’ll leave that and reading the Nginx pitfalls/best practices as a homework assignment.
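
To make the split concrete, here is a minimal sketch of what an Nginx server block for this arrangement might look like. It is an illustration rather than a tuned production config, and it rests on assumptions that are not from this post: the hostname example.com, the document root /var/www/example/public, the list of static file extensions, and Apache having been moved to listen on 127.0.0.1:8080.

```nginx
# Sketch only. The hostname, paths, extension list, and Apache port are assumptions.
server {
    listen 80;
    server_name example.com;
    root /var/www/example/public;

    # Static assets are read from disk by Nginx itself, so no
    # mod_php-laden Apache process is involved in serving them.
    location ~* \.(?:css|js|png|jpe?g|gif|ico|svg|woff2?)$ {
        expires 7d;
        try_files $uri =404;
    }

    # Everything else (including PHP) is proxied to Apache + mod_php,
    # assumed here to be listening on the loopback interface.
    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

The forwarded headers matter because Apache would otherwise see every request as coming from 127.0.0.1. This sketch does not address the gzip and flush() caveat mentioned above.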

Step 2 – Use PHP-FPM for PHP Processing

Once you’ve made the first step and have committed to making your architecture better, you can’t wait to see how much better the other steps will make your stack. I know the first time that I ran benchmarks of Nginx vs. Apache, I was stunned. I calculated that the stack I was scaling could handle 200% to 300% more traffic while using fewer resources with Nginx in front.

Now, for my favorite part of this whole thing: getting rid of Apache. I remember back in the day when Apache was the bee’s knees. Sure, it took me a few hours to install due to resolving dozens of dependencies, but it worked and it worked well. Oh what a fool I was.


When it comes to evaluating PHP on a large scale, PHP-FPM is the gold standard. Paired with Nginx, it is the de facto winner for high-load PHP processing. Rather than being somewhat good at hundreds of things like Apache, PHP-FPM is only concerned with one thing: evaluating PHP as quickly as possible.

And it’s really, really good at it. So good that it went from a user-maintained patch to being incorporated by the PHP team into PHP itself. PHP-FPM is a FastCGI server that bundles PHP with each fork. This allows for PHP requests to be evaluated as quickly as possible with as little red tape to go through as possible.
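
Since PHP-FPM speaks FastCGI, Nginx can hand PHP requests to it directly instead of proxying to Apache. The following is a rough sketch of that wiring, not a definitive setup. The hostname, document root, and index.php front controller are assumptions for illustration, and so is PHP-FPM listening on 127.0.0.1:9000 (many installs use a Unix socket instead, so check your pool's listen setting).

```nginx
# Sketch only. The hostname, paths, front controller, and PHP-FPM address are assumptions.
server {
    listen 80;
    server_name example.com;
    root /var/www/example/public;
    index index.php;

    # Serve the file if it exists on disk; otherwise fall back to a
    # front controller (assumes the application has an index.php).
    location / {
        try_files $uri $uri/ /index.php?$query_string;
    }

    # Hand PHP scripts to PHP-FPM over FastCGI.
    location ~ \.php$ {
        try_files $uri =404;   # never pass nonexistent scripts to PHP
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass 127.0.0.1:9000;
    }
}
```

With a block like this in place, Apache no longer takes part in handling requests.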

PHP-FPM has been bundled with PHP since the 5.3 series (specifically 5.3.3). If you’re not on 5.3 or newer yet, get on the bandwagon. It’s faster, more reliable and consistent, and allows for some pretty cool syntactic sugar. The process of upgrading can be hard, especially with a large code base. The advantages, though, make all the time spent worth it.

Along with using PHP-FPM, I’d highly recommend using an opcode cache. These extensions cache the compiled form of each PHP file so that subsequent requests don’t have to parse and compile it again. I’ve tried Zend OPcache, but had issues with segfaults in long-running processes. APC is the best one I’ve used, and I would recommend it to anyone and everyone.


If you’re stuck on the single-server LAMP model, you could stop here and be very happy with your results. We still have no code changes (barring upgrades to a newer version of PHP), and will have anywhere from 300% to 500% better performance over Apache. The next steps are geared more towards a true scaled application, but finishing at this step is a great starting point.

Don’t miss Part Two!

Zach Gardner

Comments 12

  1. I recently switched from Apache+mod_php to Nginx+php_fpm, and while I really like the increased efficiency and simple configuration it offers, I might have to switch to Nginx+php (via a reverse proxy to Apache+mod_php), or go back to Apache+mod_php altogether because of an issue I’ve been having with php_fpm. The issue didn’t show up in testing. Only in production, at scale. The issue is that a super small percentage (way less than 1%) of PHP requests don’t return anything at all. I’m talking like a white page! There are a bunch of 499, 502, and 504 errors in the access/error logs, but no other errors that I’ve found that might lead me to what is causing the issue. I’ve tried everything, but nothing has fixed the problem. It’s uber frustrating!

    1. Post

      Might be that Nginx is out of workers to process the request, and none free up before the timeout hits. Could also be true of PHP-FPM running out of workers.

      Did you do any benchmarks to validate it’s running faster? I’d recommend using siege because it uses less memory than ab. You can find out at what load you start getting 499s, 502s, and 504s. I’d bet it’s not due to either technology but to not having them configured for really high loads.

      I didn’t talk about how to configure Nginx or PHP-FPM for high loads. The Scaling PHP book does talk about it. There are also a lot of Nginx tuning guides out there you can look at. They should also go into what OS settings are needed to handle high loads better.

      1. No, I was silly and did not think to benchmark before switching over.

        I’ve purchased the book, and have tweaked a few things. I’ll report back on my progress!

        1. Post

          Good luck on the process. I’m not sure if the book’s author will continue working on it or not, but it’s a treasure trove of information. I was able to get a lot out of it, even in its beta form.

      2. Disable Apc altogether and use zend opcache. We had the same issue and tracked it down to a bug with Apc not being compatible with the latest versions of php.

        1. Post

          Running APC on 5.4 worked really well for me. With Zend OPcache, I had issues with long-running transactions (e.g. an import script that needed to run for 1.5 hours) causing segfaults.

          Out of curiosity, how did you both stumble on this article? Trying to get the word out about this and the next post.

    2. Post

      It could also be a seg fault. I would define an auto_prepend_file in your php.ini with something like this: I didn’t test it, but it should give you the idea. If you make sure that you write a START file when a request comes in and an END file when the script finishes, you can look for all requests that have a START but no END. If you still have scripts giving non-200’s and all STARTS have an END, then you know your issue isn’t with PHP.

      1. I’ll give that a shot next. Thanks! I’m reading through the book now. It’s excellent! I haven’t implemented all of the suggestions yet, but was able to eliminate the 504 errors I was seeing, so that’s good! I’ll give this a try too…

      2. Update update!

        It turns out it wasn’t nginx or PHP. It was the REST API I was using to communicate between PHP and Node.js. I replaced it with Redis Pub/Sub and the problem went away.

        Thanks for all of your help!

        Also, I never answered your question… I found this blog post through the PHP Weekly e-mail newsletter.
