Originally, I posted this on ServerFault ... but perhaps it is more of a PHP language question.
I have a server with Dual Xeon Quad Core L5420 running at 2.5GHz. I've been optimizing my server, and have come to my final bottleneck: PHP.
My very simple PHP script:
./test.php
<?php print_r(posix_getpwuid(posix_getuid()));
My not-so-scientific-because-they-don't-pay-attention-to-thread-locking-but-scientific-enough-to-give-me-a-reasonable-multithreaded-requests-per-second-result scripts:
./benchmark-php
#!/bin/bash
# Run a PHP script LIMIT times in sequence and print requests per second.
# Usage: ./benchmark-php [limit] [script]
if [ -z "$1" ]; then
    LIMIT=10
else
    LIMIT=$1
fi
if [ -z "$2" ]; then
    SCRIPT="index.php"
else
    SCRIPT=$2
fi
START=$(date +%s.%N)
COUNT=0
while (( COUNT < LIMIT )); do
    php "$SCRIPT" > /dev/null
    COUNT=$((COUNT + 1))
done
END=$(date +%s.%N)
DIFF=$(echo "$END - $START" | bc)
REQS_PER_SEC=$(echo "scale=2; $COUNT / $DIFF" | bc)
echo "$REQS_PER_SEC"
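Run standalone, it looks like this (a hypothetical invocation; the number printed is whatever your hardware produces):
./benchmark-php 100 test.php    # 100 sequential runs of test.php, prints req/s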
./really-benchmark-php
#!/bin/bash
# Launch THREADS copies of benchmark-php in parallel, each doing LIMIT runs
# of SCRIPT, then sum the per-thread requests-per-second figures.
# Usage: ./really-benchmark-php [limit] [threads] [script]
if [ -z "$1" ]; then
    LIMIT=10
else
    LIMIT=$1
fi
if [ -z "$2" ]; then
    THREADS=16
else
    THREADS=$2
fi
if [ -z "$3" ]; then
    SCRIPT="index.php"
else
    SCRIPT=$3
fi
PIDS=""
> results    # truncate the results file (avoids the stray blank line echo '' leaves behind)
for thread in $(seq 1 "$THREADS"); do
    ./benchmark-php "$LIMIT" "$SCRIPT" >> results &
    PIDS="$PIDS $!"
done
for PID in $PIDS; do
    wait "$PID"
done
MATH="0"
for RESULT in $(cat results); do
    MATH="$MATH + $RESULT"
done
echo "$MATH" | bc
The result of running ./really-benchmark-php 100 8 test.php is ~137 requests per second.
Running the same script against a SQLite- or MySQL-powered instance of Drupal returns ~1.5 req/s.
I have both APC and memcache installed, and I have verified that they're running with default settings. (Yes, APC's apc.enable_cli is on, too.) Does someone know the magic "make PHP execute faster" switch?
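Before hunting for more switches, it's worth double-checking what the CLI SAPI actually sees, since it often reads a different php.ini than the web server does. A quick sanity check:
php -i | grep -i apc    # confirm apc.enabled / apc.enable_cli as the CLI sees them
php -r 'var_dump(extension_loaded("apc"));'    # true if the extension is loaded at all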
I have an alternative setup (FPM/FastCGI) that serves ~140 req/s from the MySQL Drupal install... how could that be possible if PHP itself can't even serve 2 req/s from the command line?
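One variable worth isolating here: benchmark-php spawns a brand-new php process for every iteration, so each "request" pays the full interpreter start-up cost, which FPM/FastCGI workers pay only once. A rough way to see how much of each run is just start-up:
time php -r 'exit;'    # bare interpreter start-up, no script at all
time php test.php      # start-up plus the actual script work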
The results from the ab tool feel just as low to me:
static page: ab -n 1000 -c 100 http://x.x.x.x/
Requests per second: 683.71
test php: ab -n 100 -c 5 http://x.x.x.x/
Requests per second: 41.38
drupal-mysql: ab -n 100 -c 10 http://x.x.x.x/drupal/
Requests per second: 0.24
drupal-sqlite: ab -n 100 -c 10 http://x.x.x.x/drupal-test/
Requests per second: 4.92
Drupal Core (unoptimized, uncached, without APC) is terrible for performance/page views per second. I wrote a blog post about this a while ago; perhaps it will help you.
Long story short: use Varnish or some other reverse proxy cache.
Here are some relevant snippets from my post that show the difference in performance.
Test 1 - Get the Starting Benchmark
Run an Apache benchmark:
ab -k -n 100 -c 100 -g step1.txt http://site.com/how-it-works
Okay, so this request totally killed my server. So I decided to reduce the number of requests in order to just figure out a bog-standard requests-per-second baseline. I went with 100 requests at a concurrency level of 2.
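The exact command for that run isn't in the snippet, but it would have been something along these lines (a hypothetical reconstruction):
ab -k -n 100 -c 2 http://site.com/how-it-works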
And came out with this:
Concurrency Level: 2
Time taken for tests: 197.855 seconds
Complete requests: 100
Requests per second: 0.51 [#/sec] (mean)
Time per request: 3957.105 [ms] (mean)
Test 2 - APC Enabled
I then repeated the test but with APC enabled.
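For completeness, APC wasn't bundled with PHP 5.x; a typical install at the time looked something like this (the package name and ini path are assumptions that vary by distro):
pecl install apc                                      # build the extension via PECL
echo 'extension=apc.so' > /etc/php5/conf.d/apc.ini    # enable it; the ini path varies
/etc/init.d/apache2 restart                           # restart so mod_php picks it up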
Concurrency Level: 2
Time taken for tests: 87.270 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Keep-Alive requests: 0
Total transferred: 2138900 bytes
HTML transferred: 2096300 bytes
Requests per second: 1.15 [#/sec] (mean)
Time per request: 1745.396 [ms] (mean)
Time per request: 872.698 [ms] (mean, across all concurrent requests)
As you can see, this is visibly better, but still awful. 1 request per second!? lol. That is horrendous.
Test 3 - Enable Drupal Core Caching
I then enabled Drupal core caching and repeated the Apache benchmark.
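Core caching is a checkbox on Drupal's Performance admin page, but if you have drush it can also be flipped from the shell. A sketch, assuming Drupal 6/7-era variable names:
drush vset cache 1            # enable the page cache
drush vset preprocess_css 1   # aggregate CSS
drush vset preprocess_js 1    # aggregate JS
The benchmark run: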
ab -k -n 100 -c 5 -g test2-c5-k.txt http://site.com/how-it-works
Concurrency Level: 2
Time taken for tests: 23.229 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Keep-Alive requests: 0
Total transferred: 1923002 bytes
HTML transferred: 1880900 bytes
Requests per second: 4.30 [#/sec] (mean)
Time per request: 464.580 [ms] (mean)
Time per request: 232.290 [ms] (mean, across all concurrent requests)
Transfer rate: 80.84 [Kbytes/sec] received
So now I ended up with 4 requests per second, which is significantly better but still generally sucks.
The final step: add a reverse proxy cache into the mix. What do I want to see? I actually don't care; anything must be better than 4 requests per second. If I can get it to around 300 requests per second, then I will be pleased. Anything close to 1000 requests per second and I'll be ecstatic.
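A minimal way to wire that up with Varnish (the port numbers are assumptions; move Apache off :80 first and let Varnish take its place):
varnishd -a :80 -b localhost:8080 -s malloc,256m      # listen on 80, proxy to Apache on 8080
ab -k -n 10000 -c 300 http://site.com/how-it-works    # then benchmark through the cache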
This is what I ended up with:
Concurrency Level: 300
Time taken for tests: 11.706 seconds
Complete requests: 10000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 10000
Total transferred: 190260000 bytes
HTML transferred: 185140000 bytes
Requests per second: 854.29 [#/sec] (mean)
Time per request: 351.168 [ms] (mean)
Overall, pretty impressive. I managed a percentage increase of 167407.84% in the number of page requests I could handle per second.
Start: 0.51
End : 854.29
And additionally, I reduced the page load time per request from 1978ms to 1.17ms (measured across all concurrent requests), which is an overall speed gain of … a lot. A latency decrease of 99.94%. Ouch.
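For the curious, both headline numbers check out; here they are re-derived with bc (the same tool the benchmark scripts above lean on):
echo "scale=4; (854.29 - 0.51) / 0.51 * 100" | bc    # percentage increase: ~167407.84
echo "scale=4; (1978 - 1.17) / 1978 * 100" | bc      # latency decrease: ~99.94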