EXADS has multiple JavaScript libraries for rendering ads on a publisher’s website. In this article, we explain the caching strategy used for these JavaScript libraries and how that strategy influences publishers’ sites.
Web Caching (or HTTP Caching) has become one of the most important techniques for increasing a website’s performance and one of the most effective ways to improve the experience of its visitors. It is also part of the core content delivery strategy implemented within the HTTP protocol.
Let's go through the basic concepts of web content caching and see how this is helping EXADS to ensure super fast delivery of ads.
Table of Contents
- Why is Web Caching Important?
- The Web Cache Concept
- Terminology
- Types of Web Caching
- Caching Headers
- Cache-Control Flags
- The Strategy Used For EXADS’s Snippets
- Google PageSpeed Insights
Why is Web Caching Important?
The online user experience can have a direct impact on your brand: it can help you grow your online business or halt its growth. Several components contribute to a great user experience, and Web Caching is a very important factor in optimized performance.
When serving ads, the end user experience should be seamless. Once the ad zone auction has been optimized, delivering the winning ad to the ad zone should add as little load time as possible. Web Caching helps optimize performance and has the following advantages:
Reduced latency
A slow-loading ad is a major cause of user frustration. If the user abandons the website before the ad is served, the Advertisers who bid for the ad zones on that page lose that user: their ads are never seen. This can in turn affect the relationship between the website publisher and the ad network serving the ad.
The speed at which a webpage loads is crucial to a good digital experience. Google ranks fast-loading websites higher, so Web Caching can be key to reducing load times.
Content availability
The instant availability of a website's content to end users across the world is an important component of user experience. A site may fail to load for a user because the network is prone to frequent interruptions or because the site is experiencing intermittent outages. In such cases, Web Caching can cover the gap by still serving end users the cached content.
Avoids network congestion
The internet handles huge amounts of data and heavy traffic 24/7. As a result, bandwidth congestion can be an issue on major networks and can affect load times. Web Caching can greatly reduce network congestion, because the path traveled to fetch cached content is much shorter. Since not all requests are directed to the origin, caching frees up the network and reduces the load on the origin server, helping it serve non-cached content faster.
EXADS SaaS uses Web Caching to ensure clients of its white labeled ad server technology can offer an optimized service, ensuring clients can continue to grow their business and provide a world class end user experience.
The Web Cache Concept
Caching is the term for storing reusable responses to make subsequent requests faster by having them available. There are many different types of caching available, each of which has its own characteristics.
Web Caching, however, is a different type of cache. It is a core design feature of the HTTP protocol meant to minimize network traffic while improving the perceived responsiveness of the system as a whole.
"Caching is a technique that stores a copy of a given resource and serves it back when requested. When a web cache has a requested resource in its store, it intercepts the request and returns a copy of the stored resource instead of redownloading the resource from the originating server."
Source: HTTP caching, MDN Web Docs
It can be applied in a variety of ways and to a variety of types of assets, such as Images (logos, pictures, backgrounds), HTML, CSS, JavaScript, etc.
With no Web Cache in place, every request travels all the way to the origin server and waits for its response, even when the content has not changed since the last request, as seen in the following diagram.
This means that repeat visitors have to re-download the same files every time. That's a massive waste of bandwidth. On the other hand, with the Cache set, this only happens during the first request.
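To make the difference concrete, here is a minimal sketch of the idea. It is not how a browser or CDN is actually implemented: `SimpleCache`, `fake_origin`, and the example URL are hypothetical names used only for illustration.

```python
from typing import Callable, Dict


class SimpleCache:
    """Toy cache sitting in front of a (hypothetical) origin fetch function."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, str] = {}
        self.origin_hits = 0  # how many requests actually reached the origin

    def get(self, url: str) -> str:
        if url not in self._store:
            # Cache miss: only now do we pay for a trip to the origin.
            self.origin_hits += 1
            self._store[url] = self._fetch(url)
        return self._store[url]


def fake_origin(url: str) -> str:
    """Stand-in for a real network request to the origin server."""
    return f"content of {url}"


cache = SimpleCache(fake_origin)
for _ in range(3):
    body = cache.get("https://example.com/ad.js")

print(cache.origin_hits)  # 1: only the first request reached the origin
```

The sketch never expires anything, which a real cache must do; the terminology in the next section covers that part.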
Terminology
There are a few terms that are important to note and some might be unfamiliar. Some of the more common ones are:
- Caching Policy: The rules to apply to the content. Browsers and other caches use these policies to decide whether to store the content, for how long, and when to check it again. The policy is set using HTTP Headers.
- Origin server: The original location of the content. The source of the truth. It is responsible for serving any content that could not be retrieved from a cache but also for setting the caching policy for all the content.
- Freshness: Freshness describes whether an item within a cache is still considered a candidate to serve to a client.
- Stale content: Items in the cache expire according to the cache policy. Expired content is considered “stale”. A new request to the origin server must be made to retrieve the new content, or at least to verify that the cached content is still accurate.
- Validation: Stale items in the cache can be validated to refresh their expiration time. Validation involves checking in with the origin server to see if the cached content still represents the most recent version of an item.
- Invalidation: Invalidation is the process of removing content from the cache before its specified expiration date. This is necessary if the item has been changed on the origin server.
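The terms above can be illustrated with a small sketch. It assumes a simplified model in which freshness depends only on a storage timestamp and a maximum age in seconds; `CacheEntry` and the `/ad.js` key are hypothetical names, and timestamps are passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CacheEntry:
    body: str
    stored_at: float  # time the entry was stored or last validated, in seconds
    max_age: float    # lifetime granted by the caching policy, in seconds

    def is_fresh(self, now: float) -> bool:
        """Freshness: the entry may still be served without asking the origin."""
        return (now - self.stored_at) < self.max_age

    def revalidate(self, now: float) -> None:
        """Validation succeeded: the origin confirmed the content, so the clock restarts."""
        self.stored_at = now


entry = CacheEntry(body="ad script v1", stored_at=0.0, max_age=10800.0)

fresh_after_1h = entry.is_fresh(now=3600.0)    # True: still fresh
fresh_after_4h = entry.is_fresh(now=14400.0)   # False: the entry is now stale

entry.revalidate(now=14400.0)
fresh_after_revalidation = entry.is_fresh(now=14401.0)  # True again

# Invalidation: remove the entry before it expires, e.g. because the origin changed it.
store: Dict[str, CacheEntry] = {"/ad.js": entry}
store.pop("/ad.js", None)
```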
Types of Web Caching
Content can be cached at many different points throughout the delivery chain, such as:
- Browser cache
- Intermediary caching (proxies)
Browser Cache
Web browsers themselves maintain a small cache. Typically, the browser applies a policy that determines which resources should be saved. This type of cache is incredibly useful because the content is stored on the user's own computer, so it loads almost instantly when accessed.
Intermediary caching (proxies)
This storage, also called Proxy Cache, is done on the proxy server, between the client and the origin server. This is a type of shared cache as it serves multiple clients (browsers).
Usually, proxy servers are distributed all over the world. This places the content closer to the users and avoids time-consuming requests to the opposite side of the globe, reducing latency and network traffic.
Caching Headers
The majority of caching behavior is determined by the caching policy, which is set by the content owner. These policies are mainly articulated through the use of specific HTTP headers.
The ones we find more appropriate to pay attention to are:
- Cache-Control: This is the more modern replacement for the Expires header, an old and uncommonly used directive header. We will discuss the specifics of the available options to set with Cache-Control later in this article.
- ETag: The ETag header works as a cache validator. The origin can provide a unique ETag for an item when it initially serves the content, acting as a fingerprint. The client then sends this value back in an “If-None-Match” Request Header when it needs to validate its cached copy. The origin will either tell the cache that the content is the same (a 304 Not Modified response with no body) or send the updated content (with the new ETag).
- Last-Modified: This header specifies the last time that the item was modified. This may be used as part of the validation strategy to ensure fresh content.
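The ETag exchange can be sketched as follows. This is a simplified illustration, not our production logic: `make_etag` and `origin_respond` are hypothetical helpers, the fingerprint is assumed to be a truncated SHA-256 hash of the body, and only a single strong ETag in If-None-Match is handled (real servers also accept lists of ETags and weak validators).

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a fingerprint from the content; ETag values are quoted strings."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def origin_respond(
    body: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) the way an origin might for a conditional request."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The cached copy is still accurate: confirm it without resending the body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


body_v1 = b"console.log('ad snippet v1');"

# First request: no validator yet, so the full content is sent.
first_status, first_headers, _ = origin_respond(body_v1, None)
etag = first_headers["ETag"]

# Revalidation with an unchanged resource: 304 and an empty body.
second_status, _, second_body = origin_respond(body_v1, etag)

# The resource changed on the origin: full response with a new ETag.
body_v2 = b"console.log('ad snippet v2');"
third_status, third_headers, _ = origin_respond(body_v2, etag)
```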
Cache-Control Flags
Several different policies can be set using this header. Some of the Cache-Control options that can be used to dictate content’s caching policy are:
- no-cache: This instruction specifies that any cached content must be re-validated on each request before being served to a client. It marks the content as stale immediately but allows it to use revalidation techniques to avoid re-downloading the entire item again.
- no-store: Indicates that the content cannot be cached in any way. Very useful if the response represents sensitive data.
- public: This marks the content as public, which means that it can be cached by the browser and any intermediary caches.
- private: This marks the content as private. Private content may be stored by the user’s browser, but must not be cached by any intermediate parties. Often used for user-specific data.
- max-age: This setting configures the maximum age, in seconds, for which the content may be cached before it must be revalidated or re-downloaded from the origin server.
- s-maxage: This is very similar to the max-age setting. The difference is that this option is applied only to intermediary caches.
- must-revalidate: This indicates that the freshness information indicated by max-age, s-maxage, or the Expires header must be strictly obeyed. Stale content must not be served without being revalidated first.
- proxy-revalidate: This operates the same as the above setting, but only applies to intermediary proxies. In this case, the user’s browser can potentially be used to serve stale content in the event of a network interruption, but intermediate caches cannot be used for this purpose.
These can be combined in different ways to achieve various caching behaviors.
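As a rough illustration of how a cache might interpret a combination of these flags, the sketch below parses a Cache-Control value and works out how long an intermediary cache could treat the response as fresh. `parse_cache_control` and `shared_cache_ttl` are hypothetical helpers written for this article, and the decision rules are deliberately simplified compared with a real cache.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def shared_cache_ttl(value: str) -> Optional[int]:
    """Seconds an intermediary cache may serve the response without revalidating.

    Returns None when the response must not be stored by an intermediary,
    or when no explicit lifetime is given.
    """
    directives = parse_cache_control(value)
    if "no-store" in directives or "private" in directives:
        return None
    if "no-cache" in directives:
        return 0  # may be stored, but must be revalidated on every request
    for name in ("s-maxage", "max-age"):  # s-maxage wins for intermediaries
        seconds = directives.get(name)
        if seconds:
            return int(seconds)
    return None


print(shared_cache_ttl("public, max-age=10800"))              # 10800
print(shared_cache_ttl("private, max-age=600"))               # None
print(shared_cache_ttl("public, max-age=60, s-maxage=3600"))  # 3600
print(shared_cache_ttl("no-cache"))                           # 0
```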
The Strategy Used for EXADS’s Snippets
We care about our publishers’ websites and we strongly believe that serving ads shouldn’t compromise their website’s performance. With this in mind, the following constraints are applied for all the snippets we serve.
Minification
Although unrelated to caching, minification allows us to deliver the same functionality while using less network bandwidth.
To minify a snippet, an algorithm is run over our JavaScript code that removes everything the browser does not need in order to execute it, such as whitespace and comments, as shown in the following example:
Original code:
Minified code:
This constraint has become a standard practice for page optimization and, in some cases, it can reduce the file size by as much as 30%.
Intermediary Caches
We use a CDN Proxy cache with 99.7% availability that is distributed across the globe, with a presence in Europe, North America, South America, Asia, and Oceania.
Our origin servers are also distributed across the globe, enabling the intermediary to fetch the resource almost immediately.
Asynchronous script
Modern websites make significant use of JavaScript. The performance impact of these files will vary depending on how they are implemented.
Usually, the HTML parsing process (a process done by the Browser to interpret the code and render the page accordingly) looks like this:
Parsing is paused until the script is fetched and executed, and only then resumes.
When using the async attribute, the script is downloaded in parallel while parsing continues, and parsing is interrupted only for the time it takes to execute the script once it is available. The parsing flow would now look like this:
This attribute significantly reduces the time needed to fully display the website to the user, improving the User Experience and preventing our JavaScript from damaging loading performance.
We support async scripts: our snippets can be downloaded in parallel while the browser is parsing the page and are executed as soon as they are available.
3h max-age
Our snippets are regularly maintained to offer new features. A short max-age of 3 hours means cached copies expire quickly, so publishers do not need to update the snippets themselves and see new changes from our Ad Server almost immediately.
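Since max-age is expressed in seconds, a 3-hour lifetime translates into the value sketched below. Only the max-age directive is shown, because the other directives sent alongside it are not covered in this article; `SNIPPET_TTL` is a hypothetical name.

```python
from datetime import timedelta

# Hypothetical constant: the 3-hour lifetime described above.
SNIPPET_TTL = timedelta(hours=3)

max_age_seconds = int(SNIPPET_TTL.total_seconds())
cache_control = f"max-age={max_age_seconds}"

print(cache_control)  # max-age=10800
```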
ETag validator
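As explained in a previous point, the ETag acts as a fingerprint of the content. When a cached snippet expires but has not changed, it can be revalidated instead of downloaded again, which avoids unnecessary bandwidth usage.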
Google PageSpeed Insights
If we analyze a publisher’s website that uses our snippets on PageSpeed Insights, we might see Google flagging the short cache lifetime applied to our snippet.
This is causing some discussion (https://github.com/GoogleChrome/lighthouse/issues/11380) around the Lighthouse project (used by PageSpeed Insights) about their policy being too strict. The threshold used by this audit will require a cache duration of at least 96.5 days.
At EXADS, we decided not to give up the fast-update benefit for publishers and to keep using a 3-hour TTL.
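To put the two figures side by side, the short calculation below takes the 96.5-day threshold and the 3-hour TTL quoted above at face value and compares them in seconds. The variable names are ours and purely illustrative.

```python
from datetime import timedelta

snippet_ttl = timedelta(hours=3)        # the TTL we use
audit_threshold = timedelta(days=96.5)  # the threshold quoted above

snippet_seconds = int(snippet_ttl.total_seconds())
threshold_seconds = int(audit_threshold.total_seconds())
ratio = threshold_seconds // snippet_seconds

print(snippet_seconds)    # 10800
print(threshold_seconds)  # 8337600
print(ratio)              # 772
```

In other words, satisfying the audit would mean caching the snippet roughly 772 times longer, at the cost of publishers waiting months instead of hours for updates.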
We hope that this article has given you a clearer understanding of the caching process and why EXADS SaaS is able to provide a super fast ad serving solution for our clients.