Tracing the root cause of a technical glitch can be fiendishly complex. Here’s why.
A client asked us to fix a glitch with their website’s translation function. Our mission to diagnose and cure the problem illustrates how the complexity of the web makes it difficult to gauge the scope of a solution in advance.
We got a call from a frustrated client.
Their website’s translation function had stopped working. This client has sales offices all over the world, so the problem had consequences. Of course we got right on it.
Our investigation illuminates the difficulty of fixing website problems.
Following our process in this case—diagnosis, testing, re-diagnosis, re-testing, and so on—will shed light on glitchy situations.
Our hope is that an understanding of the complexity of interacting web technologies will mitigate the sense of frustration and uncertainty.
If your website is malfunctioning, it’s still not going to be fun.
But knowledge of what’s involved can strengthen your resolve, focus the effort to solve the problem, and even help avoid investigative dead ends.
HERE’S WHAT HAPPENED
Our client’s site had a drop-down menu for different languages.
When a visitor to the site wanted to switch to a different language, they used the menu to select one.
The menu was powered by a plug-in.
Plug-ins are bits of code added to a website to give it additional capabilities. Website developers who want to add a function to their site can Google around and find third-party plug-ins. (This saves significant development time.) The developers just go out and find, like, a dozen options, each created by a different developer. It’s a digital shopping spree!
Plug-ins bring great joy. Also, heartache.
That’s because when you use a plug-in, you’re introducing code written by somebody else. It sounds risky but hang on—there are so many good reasons to do this! It’s like a developer bakes a cake and she wants to add that super-fancy frosting but she’s a baker, not a . . . frostinger. She realizes the cake is going to be so much better if she gets a frostinger on the job.
Client calls and says the translation function isn’t working. Our developer had done his homework and selected a plug-in known to be reliable. He did a deep dive on the plug-in’s requirements.
The translation plug-in caps the number of translations it performs, based on the number of languages and the number of words. When the client added more content to the website, the site hit that cap and the plug-in stopped translating.
Upgrade the license for the plug-in to allow more words, so it won’t shut down. Problem solved.
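For the curious, here is a rough sketch of how a cap like this can sneak up on a site. It assumes the plug-in counts every source word once per language; the function names and all the numbers are made up for illustration and are not the plug-in's real rules.

```python
def translated_word_count(source_words: int, languages: int) -> int:
    """Words to translate, assuming every source word is counted once per language."""
    return source_words * languages


def is_over_cap(source_words: int, languages: int, word_cap: int) -> bool:
    """True when the site's content exceeds the (hypothetical) licensed cap."""
    return translated_word_count(source_words, languages) > word_cap


# Made-up numbers for illustration only.
WORD_CAP = 200_000
LANGUAGES = 5

before = is_over_cap(source_words=38_000, languages=LANGUAGES, word_cap=WORD_CAP)
after = is_over_cap(source_words=41_000, languages=LANGUAGES, word_cap=WORD_CAP)
print(before, after)
```

In this sketch, a few thousand new words of content, multiplied across five languages, is enough to tip the site over the limit.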
Plug-ins are created by teams other than the team creating the website. The plug-in code and functionality embody the idiosyncratic methods and intentions of the team that created it.
Plug-ins are just one variable. The web is made up of 3.2 zillion variables—a made-up number that grows daily. These are just some of the variables we identified.
- Different browsers
- Different versions of the same browser
- Different operating systems
- Different versions of the same operating system
- Different software languages (PHP, Java, etc.)
- Different versions of the same software language
- Different hosting environments
The number of possible combinations is vast.
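To get a feel for how quickly the combinations pile up, here is a back-of-the-envelope sketch. The counts are deliberately small and entirely hypothetical; the point is only that the total is the product of all of them.

```python
from math import prod

# Hypothetical counts for each variable; real-world numbers are much larger.
variables = {
    "browsers": 4,
    "browser_versions": 5,
    "operating_systems": 3,
    "os_versions": 4,
    "languages": 2,
    "language_versions": 3,
    "hosting_environments": 3,
}

# Every choice multiplies the number of distinct setups a site might meet.
combinations = prod(variables.values())
print(combinations)
```

Even with these modest made-up numbers, the product is 4,320 distinct setups, and no one can test them all by hand.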
Noted philosopher Buzz Lightyear puts it this way: “To infinity—and beyond!” And remember—each component is made by a different team. And then that component is modified by yet another team, who maybe didn’t even know the first team. Just thinking about the complications makes even a good brain hurt.
BACK TO OUR STORY
It felt great to “solve” the issue. But experience counsels caution.
The site worked fine for a while, but then the trouble returned, just like a masked bogeyman in a slasher movie. The client reported getting messages from Google saying “There is an issue with this site.” And the sales team was complaining that the translations looked odd or did not work at all.
Understandably, the client wondered if we had chosen a janky translation tool.
We have dealt with similar situations, and we know the best thing to do is get to work and save the talk for later. We promised to re-pop the hood and find the source of the issue.
ROUNDING UP THE USUAL SUSPECTS
Was it a browser problem?
We checked translation performance in all common browsers: Chrome, Edge, Firefox, etc. It was a roller coaster ride. The translation would appear to work and then—just as we were tossing celebratory confetti—the thing would stop working, as if to spite us and our silly confetti.
Could it be the cache?
A cache is a little container for bits of code. It sits on a chip doing nothing until at some point, an application barges in and says “I’m going to be here for a while and I’ll be using some of my tools repeatedly. I’ll just put them here.” The application could be a browser or a component of a website, or many other things. Whatever it is, it works faster when it can access bits of commonly-used code that it has stashed in the cache—rather than going back to a server somewhere to retrieve it.
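Here is a toy version of the idea, as a minimal sketch rather than how any real cache plug-in is written. The class name and the stand-in "server" are invented for this example. It also shows the catch: once something is stashed, the cache keeps handing out that copy even after the original has changed.

```python
class SimpleCache:
    """A tiny in-memory cache: fetch a value once, then reuse the stored copy."""

    def __init__(self, fetch):
        self._fetch = fetch   # the slow operation, e.g. a trip to a server
        self._store = {}      # key -> previously fetched value

    def get(self, key):
        if key not in self._store:
            self._store[key] = self._fetch(key)
        return self._store[key]

    def clear(self):
        self._store.clear()


# Stand-in for the "server": a dict we can change later.
server = {"greeting": "Hello"}
cache = SimpleCache(lambda key: server[key])

first = cache.get("greeting")    # fetched from the server, then stored
server["greeting"] = "Bonjour"   # the server's copy changes
stale = cache.get("greeting")    # the cache still returns the old copy
cache.clear()
fresh = cache.get("greeting")    # after clearing, the new copy is fetched
print(first, stale, fresh)
```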
Caches are good. But—[sigh]—there are many variants.
Developers create software plug-ins to make caches work. There are many developers. There are many cache plug-ins. ALSO! (Did you think we were done?) Also, there are different versions of PHP, the language used to create the site. A cache plug-in made for one version may not work with an earlier or later version. Again with the headaches.
We checked out the cache plug-in we had installed in the site’s code.
Caches can sometimes store outdated code that conflicts with newer versions of the applications that need to use the cache—or with the languages used to create the site. We thought maybe we had identified the culprit.
We turned off the cache plug-in and sure enough, the site’s translation function started working again.
SECOND SOLUTION, PART 2:
We theorized that the cache plug-in’s uneven behavior might come from it working well with one version of PHP but not another. We had two copies of the site (standard practice) and found that they ran different versions of PHP. (One copy, called the staging site, was for developers to work on without disturbing the live site. The other copy was the live site.)
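A check like the one we did by hand can be sketched in a few lines. This is only an illustration: the helper names are ours, and the version numbers are hypothetical because the specifics don't matter to the story.

```python
def major_minor(version: str) -> tuple[int, int]:
    """Reduce a version string such as '8.1.27' to (8, 1)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def versions_match(staging: str, live: str) -> bool:
    """True when both copies of the site run the same major.minor version."""
    return major_minor(staging) == major_minor(live)


# Hypothetical version numbers, for illustration only.
staging_php = "8.1.27"
live_php = "7.4.33"
print(versions_match(staging_php, live_php))
```

When the two copies disagree, something that tests fine on staging may still misbehave on the live site.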
The ever-evolving nature of the web—the software languages, the hosting environments, the HTML and the CSS—introduces the problem of conflicts between versions.
Even a great improvement comes with a cost. Let’s say all tech wizards agree that some new development—say, HTML5—makes life easier and better for everybody. They know also that moving to the new, better software will cause “version pain,” a term we just coined on the spot.
BACK TO OUR STORY
Everything worked great until it didn’t.
Clearing the cache worked for a while but again—everybody knows the masked bogeyman doesn’t just go away in the early part of the movie. In this case, the problem turned out to be that even though we turned off the cache in the website’s code, there was other cache software in a server out in cyberspace that reintroduced the problem.
Even the cache problem turned out to be a red herring.
It was a perfectly reasonable explanation—but the real explanation only revealed itself later. We know you’re curious so let’s just get to that part.
Another call. Another heartache.
Our client wanted to know why, oh why, this infernal translation function was still not working. Valid question!
We call the plug-in developer. Again. Everybody on both sides is stumped. We throw around ideas. Somebody wonders if Google spiders might have something to do with it.
Turns out Google spiders are indeed part of the problem, but not because they are doing anything wrong. They are doing what spiders do: crawling into the site to see what’s there so Google can add the content to its index. But the plug-in developer had added code to deliberately bar spiders, which sent up a red flare to Google.
Urge the plug-in maker (with vehemence) to disable the spider-barring feature so that it doesn’t create this problem. Plug-in maker does what we suggest. Everybody feels a lot better about everything.
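We never saw the plug-in's actual spider-barring code, so what follows is only a guess at the general shape of such a feature. The function names, the token list, and the on/off switch are all assumptions for illustration.

```python
# Hypothetical list of crawler names to look for in the User-Agent header.
CRAWLER_TOKENS = ("googlebot", "bingbot")


def is_crawler(user_agent: str) -> bool:
    """Guess whether a request comes from a search-engine spider."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def status_for_request(user_agent: str, block_crawlers: bool) -> int:
    """Return 403 if spiders are barred and this looks like one, else 200."""
    if block_crawlers and is_crawler(user_agent):
        return 403
    return 200


googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(status_for_request(googlebot, block_crawlers=True))   # spider is turned away
print(status_for_request(googlebot, block_crawlers=False))  # spider is let in
```

With the switch on, human visitors see a working page while the spider gets the door slammed in its face, which is why the symptom was so hard to spot from a browser.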
The creators of software—even really good software—might not actively share the details of how their software works. These hidden details can cause unanticipated problems.
The same symptom can have a large number of underlying causes. It’s not like technical people go “Oh, you say the frizbit is nixing the oxbot? There is but one explanation!” No. First of all, the “symptom” is usually “It’s not working.” Second, even if the problem is described with specificity, the number of possible sources of the problem is STILL between 185 and 193 [made-up number].
Exhausted by this lengthy tale? This is the abridged version.
We’re not complaining. Solving problems like this is part of our job and we get satisfaction from helping clients when they really need our expertise. If your site has a problem of mysterious origin, call us! We promise to jump on it immediately, even if you’re catching us right after we took the first bite of a delicious sandwich.
Are we whining like “It’s not our fault”? A little—but hear us out.
We pride ourselves on taking responsibility, owning a problem, and not quibbling about how a malfunction came to be, but instead just digging in and solving it. (You now know that “solving it” can be a multi-chapter story.) And yet, as noted at the beginning of this essay/massive-tome-of-many-words, we honestly believe that explaining the complexity of web glitches actually helps our clients deal with the situation. As Sir Francis Bacon put it in 1597, knowledge is power. He wasn’t talking about the web, but it’s still true.