The Architect's Newspaper is a popular, high-volume WordPress website with over 80k posts. I started working with this site in 2018; my first task was a request to solve long load times and high CPU load.
Eventually, I would end up rebuilding the entire theme and ad system. I created custom plugins that automatically clear the NGiNX and CloudFront caches using WordPress hooks, and that offload the calculation of popular and trending posts and pages. I also built a custom paywall using Lua with NGiNX, which checks for Auth0-authenticated users via cookies and renders different page formats (paywalled vs. full articles) for authenticated and unauthenticated users. This method allowed micro-caching of pages for all visitors.
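To illustrate why this approach keeps pages cacheable for everyone, here is a minimal Python sketch of the idea, not the actual Lua code. It assumes the cache key depends only on which page format a visitor should see, so each URL needs at most two cached copies. The cookie name `an_session` and the `is_valid_session` callback are hypothetical stand-ins for the real Auth0 check.

```python
from http.cookies import SimpleCookie

AUTH_COOKIE = "an_session"  # hypothetical cookie name, not from the real setup


def page_variant(cookie_header: str, is_valid_session) -> str:
    """Return "full" for authenticated visitors and "paywalled" otherwise."""
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    morsel = cookies.get(AUTH_COOKIE)
    # is_valid_session is a hypothetical callback standing in for token validation.
    if morsel is not None and is_valid_session(morsel.value):
        return "full"
    return "paywalled"


def cache_key(path: str, cookie_header: str, is_valid_session) -> str:
    """Build a cache key from the variant and the path, never from the user."""
    return f"{page_variant(cookie_header, is_valid_session)}:{path}"
```

Because the key never contains a user identifier, every authenticated visitor shares one cached copy of a page and every unauthenticated visitor shares the other.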
When AN first reached out, their website was taking anywhere from ten to twenty seconds to load any and every page, and this had apparently been occurring for months. That is an obvious problem, since fast load times are essential to any website's success.
My first step was to check the hosting situation, which turned out to be a popular WordPress-specific hosting platform, on a shared server with about a thousand other websites. This limited the low-level adjustments I could make.
Next, I backed up all website and configuration files, databases, and DNS zone records. I then set up an AWS account for AN, and booted up an Ubuntu EC2 instance powerful enough to handle the CPU usage the site was experiencing at the time. I compiled NGiNX with PageSpeed and HTTP/2 support, set up MariaDB and automatic SSL certificate renewal with Let's Encrypt/certbot, and finished migrating the site away from the previous provider. I then enabled micro-caching for NGiNX, which meant that pages were rendered far less often, bringing the CPU load down substantially.
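For readers unfamiliar with micro-caching, the idea can be sketched in a few lines of Python. This is only an illustration of the concept, not the NGiNX configuration used on the site: a rendered page is kept for a very short time, and every request inside that window is served the stored copy. The class name, the `ttl_seconds` value, and the `render` callback are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class MicroCache:
    """Keep rendered pages for a very short time (typically a few seconds)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._store: Dict[str, Tuple[float, str]] = {}
        self.renders = 0  # counts how often the expensive render actually ran

    def get(self, key: str, render: Callable[[], str]) -> str:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # still fresh: skip rendering entirely
        body = render()
        self.renders += 1
        self._store[key] = (now, body)
        return body
```

Even a very short lifetime helps on a busy site, because the number of renders per page is bounded by the cache lifetime rather than by the number of visitors.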
After the site was up and running again with a lower load time, it was time to delve into the real cause of the issue, which ended up being a plugin – surprise! It was calculating popular and trending posts, on every page, during every visit. That's a lot for a site with 200 users per minute. While SQL caching was enabled, the queries were slightly different every time, so the cache had no effect.
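The cache miss problem is easy to reproduce with a toy example. The sketch below assumes a result cache keyed on the exact query text, and assumes the queries differed because of an embedded time value; the source does not say what actually varied, so the query, table, and column names here are hypothetical.

```python
from typing import Callable, Dict


class QueryCache:
    """A toy result cache keyed on the exact SQL text."""

    def __init__(self) -> None:
        self._results: Dict[str, list] = {}
        self.hits = 0
        self.misses = 0

    def run(self, sql: str, execute: Callable[[str], list]) -> list:
        if sql in self._results:
            self.hits += 1
            return self._results[sql]
        self.misses += 1
        self._results[sql] = execute(sql)
        return self._results[sql]


def trending_query(now_epoch: int) -> str:
    # Hypothetical query: embedding the current second makes each text unique.
    return f"SELECT post_id FROM views WHERE viewed_at > {now_epoch - 86400}"
```

Calling `trending_query` with the current second produces a new string on every request, so the cache never hits. Passing a coarser value, such as the time rounded down to ten minutes, yields identical text and lets the cache do its job.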
Instead of using transients, I opted for a different solution using Google Analytics. We already had the popular and trending pages information in our GA data, so all I needed to do was pull it and make it available to visitor browsers. I created a Python script that would poll Google Analytics every ten minutes and save the trending/popular posts to a static JSON file. I then replaced the plugin modules with my own, using AJAX to pull the JSON files. A Unix timestamp rounded to ten minutes was added to the query parameters so that NGiNX could serve a cached version.
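The two mechanical pieces of this approach can be sketched as follows. This is a minimal illustration rather than the original script: the Google Analytics call is left out entirely, the `t` query parameter and file paths are hypothetical, rounding is assumed to mean rounding down, and the atomic write is an extra precaution so browsers never fetch a half-written file. On the real site the timestamp was added by the AJAX request in the browser; Python is used here only to show the arithmetic.

```python
import json
import os
import tempfile
import time
from typing import Optional

BUCKET_SECONDS = 600  # ten minutes, matching the polling interval


def rounded_timestamp(now: Optional[float] = None) -> int:
    """Round a Unix timestamp down to the start of its ten-minute bucket."""
    current = int(time.time() if now is None else now)
    return current - (current % BUCKET_SECONDS)


def trending_url(base_url: str, now: Optional[float] = None) -> str:
    """Append the rounded timestamp so the URL changes once per bucket."""
    return f"{base_url}?t={rounded_timestamp(now)}"  # "t" is a hypothetical name


def write_json_atomically(path: str, posts: list) -> None:
    """Write the post list to a temp file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(posts, handle)
        os.replace(tmp_path, path)  # atomic when both paths share a filesystem
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Every visitor within the same ten-minute window requests the identical URL, so it can be served from cache, and the URL changes when the next window starts.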
These changes brought the load time down to under 100ms. Over time I would make additional adjustments, putting the entire site behind CloudFront, and eventually moving to an ARM-based EC2 instance to bring costs down even further.
The original hosting provider was charging $900 a month and had offered to "solve" the load time issue for an additional $1,600 – per month.
Not only did I address the performance issues, but I also brought hosting fees down to under $500 a month.