Website Caching
What is Caching?
Caching is the process of storing copies of files in a cache, or temporary storage location, so that they can be accessed more quickly. Technically, a cache is any temporary storage location for copies of files or data, but the term is often used in reference to Internet technologies & website caching.
Web browsers cache HTML files, JavaScript, and images in order to load websites more quickly. DNS servers cache DNS records for faster lookups, and CDN servers cache content to reduce latency.
Caches explained in the context of food supplies
To understand how caches work, consider real-world caches of food and other supplies. When intrepid explorers visited the South Pole in the 1900s, they and their men subsisted on the return journey on the caches of food they had stored along the way. This was much more efficient than waiting for supplies to be delivered from their base camp as they travelled.
Caches on the Internet serve a similar purpose; they temporarily store the ‘supplies’, or content, needed for users to make their journey across the web.
What does a browser cache do?
Every time a user loads a webpage, their browser has to download quite a lot of data in order to display that webpage. To shorten page load times, browsers cache most of the content that appears on the webpage, saving a copy of the webpage’s content on the device’s hard drive. This way, the next time the user loads the page, most of the content is already stored locally and the page will load much more quickly.
Browsers store these files until their time to live (TTL) expires or until the hard drive cache is full. (TTL is an indication of how long content should be cached.) Users can also clear their browser cache if desired. You may well have heard someone in IT or web design say, “I think you need to clear your cache.”
What does clearing a browser cache accomplish?
- Once a browser cache is cleared, every webpage that loads will load as if it is the first time the user has visited the page.
- If something loaded incorrectly the first time and was cached, clearing the cache can allow it to load correctly.
- However, clearing one’s browser cache can also temporarily slow page load times for that site as every page will be seen as new.
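The behaviour described above (keep files until their TTL expires or the cache is full, and let the user clear everything) can be sketched in a few lines of Python. This is a toy model, not how any real browser is implemented: the class name `BrowserCacheSketch`, the `max_entries` limit and the oldest-first eviction rule are all assumptions made for illustration.

```python
import time
from collections import OrderedDict


class BrowserCacheSketch:
    """Toy browser cache: entries expire after a TTL, and the oldest
    entry is dropped when the cache is full (an assumed policy)."""

    def __init__(self, max_entries=100, clock=time.monotonic):
        self.max_entries = max_entries
        self.clock = clock  # injectable so the sketch can be tested without sleeping
        self._entries = OrderedDict()  # url -> (expires_at, body)

    def put(self, url, body, ttl_seconds):
        self._entries.pop(url, None)
        if self._entries and len(self._entries) >= self.max_entries:
            self._entries.popitem(last=False)  # evict the oldest stored entry
        self._entries[url] = (self.clock() + ttl_seconds, body)

    def get(self, url):
        entry = self._entries.get(url)
        if entry is None:
            return None  # never cached, evicted, or cleared
        expires_at, body = entry
        if self.clock() >= expires_at:
            del self._entries[url]  # TTL has expired
            return None
        return body

    def clear(self):
        """Equivalent of the user clearing their browser cache."""
        self._entries.clear()
```

After `clear()`, every `get()` returns `None`, which is why the next visit to each page behaves like a first visit and is temporarily slower.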
What is Content Delivery Network (CDN) caching?
- A Content Delivery Network, or CDN, caches content (such as images, videos, or webpages) in proxy servers that are located closer to end users than origin servers.
- A proxy server is a server that receives requests from clients and passes them along to other servers.
- As a result, because these servers are closer to the user making the request, a CDN is able to deliver content more quickly.
CDN explained in the context of food
Think of a CDN as being like a chain of grocery stores. Instead of travelling hundreds of miles to the farms where food is grown, shoppers go to their local grocery store, which is only a few miles away. Because grocery stores stock food from faraway farms, grocery shopping takes minutes instead of days. Similarly, CDN caches ‘stock’ the content that appears on the Internet so that webpages load much more quickly.
When a user requests content from a website using a CDN, the CDN fetches that content from an origin server, and then saves a copy of the content for future requests. Cached content remains in the CDN cache as long as users continue to request it.
What is a cache miss? What is a CDN cache hit?
A cache hit is when a client device makes a request to the cache for content, and the cache has that content saved. A cache miss occurs when the cache does not have the requested content.
A cache hit means that the content will be able to load much more quickly, since the CDN can immediately deliver it to the end user.
In the case of a cache miss, a CDN server will pass the request along to the origin server, then cache the content once the origin server responds, so that subsequent requests will result in a cache hit.
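The hit and miss behaviour can be sketched as follows. This is a minimal illustration rather than any CDN's actual implementation: `EdgeCacheSketch` and `fake_origin` are hypothetical names, and the origin server is replaced by a simple function.

```python
class EdgeCacheSketch:
    """Toy CDN edge cache: serve from the local store on a hit,
    fall back to the origin server on a miss and keep a copy."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # hypothetical origin callable
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.store:
            self.hits += 1  # cache hit: answer immediately
            return self.store[url]
        self.misses += 1  # cache miss: ask the origin, then keep a copy
        body = self.fetch_from_origin(url)
        self.store[url] = body
        return body


def fake_origin(url):
    # Stand-in for a real origin server, used only for this example.
    return f"content of {url}"


edge = EdgeCacheSketch(fake_origin)
edge.get("/index.html")  # first request: miss, fetched from the origin
edge.get("/index.html")  # second request: hit, served from the cache
print(edge.hits, edge.misses)  # prints "1 1"
```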
How long does cached data remain in a CDN server?
- When websites respond to CDN servers with the requested content, they attach the content’s TTL as well, letting the servers know how long to store it.
- The TTL is stored in a part of the response called the HTTP (Hypertext Transfer Protocol) header, and it specifies for how many seconds, minutes, or hours content will be cached.
- When the TTL expires, the cache removes the content.
- Some CDNs will also purge files from the cache early if the content is not requested for a while, or if a CDN customer manually purges certain content.
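In practice the TTL commonly travels in the `Cache-Control` response header as a `max-age` value in seconds. The sketch below shows one way a cache might read it; the function name `max_age_seconds` is hypothetical, and real caches handle many more directives than this.

```python
def max_age_seconds(cache_control):
    """Return the max-age value (in seconds) from a Cache-Control
    header string, or None if it is absent or not a valid number."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age":
            try:
                seconds = int(value)
            except ValueError:
                return None
            return seconds if seconds >= 0 else None
    return None


print(max_age_seconds("public, max-age=3600"))  # prints 3600
```

A value of 3600 tells the cache it may keep the content for one hour before treating it as expired.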
How do other kinds of caching work?
DNS caching takes place on DNS servers. The servers store recent DNS lookups in their cache so that they do not have to query nameservers and can instantly reply with the IP address of a domain.
Search engines may cache webpages that frequently appear in search results in order to answer user queries even if the website they are attempting to access is temporarily down or unable to respond.
What is a DNS record?
DNS records (aka zone files) are instructions that live in authoritative DNS servers and provide information about a domain including what IP address is associated with that domain and how to handle requests for that domain. These records consist of a series of text files written in what is known as DNS syntax. DNS syntax is just a string of characters used as commands that tell the DNS server what to do. All DNS records also have a ‘TTL’, which stands for time-to-live, and indicates how often a DNS server will refresh that record.
You can think of a set of DNS records like a business listing in an online directory. That listing will give you a bunch of useful information about a business such as their location, hours, services offered, etc. All domains are required to have at least a few essential DNS records for a user to be able to access their website using a domain name, and there are several optional records that serve additional purposes.
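To make this concrete, a single DNS record can be modelled as a name, a TTL, a record type and a value, and written out as one line of text in the style commonly used in zone files. This is only a sketch: the `DnsRecord` class is hypothetical, and 192.0.2.1 is an address reserved for documentation, not a real server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnsRecord:
    name: str   # the domain the record belongs to
    ttl: int    # time-to-live in seconds
    rtype: str  # record type, e.g. "A" or "MX"
    value: str  # the record's data, e.g. an IP address

    def to_zone_line(self):
        # "IN" is the Internet class used by ordinary DNS records.
        return f"{self.name} {self.ttl} IN {self.rtype} {self.value}"


record = DnsRecord("example.com.", 3600, "A", "192.0.2.1")
print(record.to_zone_line())  # prints "example.com. 3600 IN A 192.0.2.1"
```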
What are the most common types of DNS record?
- A record – The record that holds the IP address of a domain. The “A” stands for “address”, and this is the most fundamental type of DNS record. For example, if you pull the DNS records of bdolphin.co.uk, the A record currently returns an IP address of 185.151.30.156. Note that A records only hold IPv4 addresses.
- AAAA record – The record that contains the IPv6 address for a domain (as opposed to A records, which list the IPv4 address).
- CNAME record – Forwards one domain or subdomain to another domain; it does NOT provide an IP address. The ‘canonical name’ (CNAME) record is used in lieu of an A record when a domain or subdomain is an alias of another domain. All CNAME records must point to a domain, never to an IP address. Imagine a treasure hunt where each clue points to another clue, and the final clue points to the treasure. A domain with a CNAME record is like a clue that can point you to another clue (another domain with a CNAME record) or to the treasure (a domain with an A record). For example, suppose news.example.com has a CNAME record with a value of ‘example.com’ (without the ‘news’). This means that when a DNS server looks up the DNS records for news.example.com, it triggers another DNS lookup to example.com, returning example.com’s IP address via its A record. In this case we would say that example.com is the canonical name (or true name) of news.example.com.
- MX record – Directs mail to an email server. The DNS ‘mail exchange’ (MX) record indicates how email messages should be routed in accordance with the Simple Mail Transfer Protocol (SMTP, the standard protocol for all email). Like CNAME records, an MX record must always point to another domain.
- TXT record – Lets a domain administrator store text notes in the Domain Name System (DNS). The TXT record was originally intended as a place for human-readable notes, but it is now also possible to put machine-readable data into TXT records, and these records are often used for email security. One domain can have many TXT records.
- NS record – Stores the name server for a DNS entry.
- SOA record – Stores admin information about a domain.
- SRV record – Specifies a port for specific services.
- PTR record – Provides a domain name in reverse-lookups.
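The CNAME ‘treasure hunt’ described in the list above can be sketched as a small lookup loop: follow CNAME records from clue to clue until an A record (the treasure) is found. The record table, the `resolve_a` function and the `max_hops` limit are all hypothetical, and 192.0.2.1 is a documentation-only address.

```python
RECORDS = {
    ("news.example.com", "CNAME"): "example.com",
    ("example.com", "A"): "192.0.2.1",
}


def resolve_a(name, records, max_hops=8):
    """Follow CNAME records until an A record is found.
    Returns the IP address, or None if the name cannot be resolved."""
    for _ in range(max_hops):
        address = records.get((name, "A"))
        if address is not None:
            return address  # the treasure: an A record
        alias = records.get((name, "CNAME"))
        if alias is None:
            return None  # no A record and no further clue
        name = alias  # follow the clue to the next domain
    raise RuntimeError("CNAME chain too long or looping")


print(resolve_a("news.example.com", RECORDS))  # prints "192.0.2.1"
```

The hop limit is a design choice for the sketch: it stops two CNAME records that point at each other from looping forever.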
What are some of the less commonly used DNS records?
- AFSDB record – This record is used for clients of the Andrew File System (AFS) developed by Carnegie Mellon. The AFSDB record functions to find other AFS cells.
- APL record – The ‘address prefix list’ is an experimental record that specifies lists of address ranges.
- CAA record – This is the ‘certification authority authorization’ record; it allows domain owners to state which certificate authorities can issue certificates for that domain. If no CAA record exists, then any certificate authority can issue a certificate for the domain. These records are also inherited by subdomains.
- DNSKEY record – The ‘DNS Key Record’ contains a public key used to verify Domain Name System Security Extension (DNSSEC) signatures.
- CDNSKEY record – This is a child copy of the DNSKEY record, meant to be transferred to a parent.
- CERT record – The ‘certificate record’ stores public key certificates.
- DHCID record – The ‘DHCP Identifier’ stores info for the Dynamic Host Configuration Protocol (DHCP), a standardized network protocol used on IP networks.
- DNAME record – The ‘delegation name’ record creates a domain alias, much like CNAME, but the alias applies to all subdomains of the name rather than to the name itself. For instance, if the owner of ‘example.com’ bought the domain ‘website.net’ and gave it a DNAME record that points to ‘example.com’, then that pointer would extend to ‘blog.website.net’ and any other subdomains.
- HIP record – This record uses the ‘Host Identity Protocol’, a way to separate the roles of an IP address; this record is used most often in mobile computing.
- IPSECKEY record – The ‘IPsec key’ record works with Internet Protocol Security (IPsec), an end-to-end security protocol framework and part of the Internet Protocol Suite (TCP/IP).
- LOC record – The ‘location’ record contains geographical information for a domain in the form of longitude and latitude coordinates.
- NAPTR record – The ‘name authority pointer’ record can be combined with an SRV record to dynamically create URIs based on a regular expression.
- NSEC record – The ‘next secure record’ is part of DNSSEC, and it’s used to prove that a requested DNS resource record does not exist.
- RRSIG record – The ‘resource record signature’ is a record to store digital signatures used to authenticate records in accordance with DNSSEC.
- RP record – This is the ‘responsible person’ record and it stores the email address of the person responsible for the domain.
- SSHFP record – This record stores the ‘SSH public key fingerprints’; SSH stands for Secure Shell, a cryptographic networking protocol for secure communication over an unsecured network.
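Of the records in the list above, DNAME is the one whose behaviour is easiest to show in code. The sketch below rewrites a name that sits beneath the DNAME owner onto the target domain, using the website.net example from the list. The function `apply_dname` is hypothetical, and leaving the owner name itself untouched reflects our understanding of the DNAME specification, so treat that detail as an assumption.

```python
def apply_dname(query_name, owner, target):
    """Rewrite query_name onto target if it is a subdomain of owner.
    Returns the rewritten name, or None if the DNAME does not apply."""
    query = query_name.lower().rstrip(".")
    suffix = "." + owner.lower().rstrip(".")
    if not query.endswith(suffix):
        return None  # not beneath the owner (the owner itself is not rewritten)
    prefix = query[: -len(suffix)]  # e.g. "blog"
    return f"{prefix}.{target.lower().rstrip('.')}"


print(apply_dname("blog.website.net", "website.net", "example.com"))
# prints "blog.example.com"
```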
Website Caching: A Simple Explanation
Imagine you’re at a library, and you want to read a popular book that many others also want to read. Instead of going to the shelves every time you want to read a page, you decide to keep the book on a nearby table while you read. This way, you can quickly flip to the pages you need without constantly walking back and forth to the shelves. This concept is similar to website caching.
In the digital world, websites are made up of various files like images, text, and other content. When you visit a website, your computer sends a request to a distant server, which is like asking the librarian for a book from the shelves. The server then sends the necessary files back to your computer for you to view the website.
Website caching is like putting a copy of the book on that nearby table in the library. When a server realizes that a particular website is frequently requested, it stores a copy of its files in a cache. This cache is like the table where you can quickly access the book. So, the next time you visit that website, the server can provide the files directly from the cache (the table) instead of retrieving them from the distant shelves (the original server).
This process speeds up website loading because the server doesn’t need to fetch all the files from scratch every time you visit. It can simply use the cached copy, saving time and making the website load faster for you. Just like having the book on the table allows you to read faster, website caching helps to make the internet experience smoother and more efficient.
Website Caching: A Technical Explanation
Viewed from a more technical angle, website caching involves strategically storing web content at intermediate points between the user and the origin server. This process is designed to optimise data retrieval, minimise latency, and enhance overall performance.
In essence, when a user requests a website, their browser sends a request to a server hosting the site’s data. This server, known as the origin server, processes the request and sends back the required content, such as HTML, images, scripts, and other resources. However, this round trip between the user’s browser and the origin server can introduce delays, especially over a network with variable latency.
Website caching addresses this challenge by introducing caches, which are temporary storage mechanisms, at strategic points along the data delivery path. When a user first requests a website, the cache nearest to them (a local cache or a content delivery network server) retrieves the content from the origin server and stores a copy locally. Subsequent requests for the same content can then be served directly from this local cache, bypassing the need to fetch it again from the origin server.
Furthermore, website caching employs various strategies to determine what content should be cached and for how long. It considers factors like the content’s popularity, expiration timestamps (controlled through HTTP headers), and caching directives from the origin server. Cache management techniques, such as cache invalidation and cache purging, are also crucial to ensure that users receive the most up-to-date content.
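Cache purging, mentioned above, can be pictured as deleting entries on request so that the next visitor gets a fresh copy. The sketch below is a hypothetical in-memory cache, not a real CDN API: the class `PurgeableCacheSketch` and its `purge` and `purge_prefix` methods are names invented for this example.

```python
class PurgeableCacheSketch:
    """Toy cache that supports purging one URL or a whole section."""

    def __init__(self):
        self.store = {}  # url -> cached body

    def put(self, url, body):
        self.store[url] = body

    def purge(self, url):
        """Remove one cached URL; return True if it was cached."""
        if url in self.store:
            del self.store[url]
            return True
        return False

    def purge_prefix(self, prefix):
        """Remove every cached URL starting with prefix; return the count."""
        doomed = [url for url in self.store if url.startswith(prefix)]
        for url in doomed:
            del self.store[url]
        return len(doomed)


cache = PurgeableCacheSketch()
cache.put("/blog/post-1", "old copy")
cache.put("/blog/post-2", "old copy")
cache.put("/about", "about page")
print(cache.purge_prefix("/blog/"))  # prints 2
```

After the purge, only `/about` remains cached, so both blog posts would be fetched again on their next request.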
Moreover, caching can be implemented at different levels within the system architecture, ranging from the browser cache on the client side to caching mechanisms in content delivery networks (CDNs) and even caching proxies on the server side. Each level has its own advantages and trade-offs, balancing resource utilisation, speed, and freshness of content.
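One way to picture these levels is as a list of caches checked nearest-first, with the origin server as the last resort. The sketch below uses plain dictionaries to stand in for a browser cache and a CDN cache; `layered_get` and its arguments are hypothetical, and real systems also track TTLs at every level.

```python
def layered_get(url, layers, fetch_from_origin):
    """Look url up in each cache layer, nearest first.
    Returns (body, where_it_was_found)."""
    for index, layer in enumerate(layers):
        if url in layer:
            body = layer[url]
            for nearer in layers[:index]:
                nearer[url] = body  # copy into the layers closer to the user
            return body, f"layer {index}"
    body = fetch_from_origin(url)  # every layer missed
    for layer in layers:
        layer[url] = body
    return body, "origin"


browser_cache = {}
cdn_cache = {"/logo.png": "PNG bytes"}
result = layered_get("/logo.png", [browser_cache, cdn_cache], lambda url: "from origin")
print(result)  # prints "('PNG bytes', 'layer 1')"
```

A second call for the same URL would be answered from layer 0, the browser cache, without touching the CDN or the origin.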
In summary, website caching is the strategic storage of web content at intermediate points. It significantly improves website performance by reducing the need for repeated requests to the origin server and minimising latency for users.
If you would like to know more about website caching, contact Andrew Goode MBA, MSc, FCIM.