Caching
This page summarizes several possible caching strategies and how they can be used with the Conversations API.
Introduction
Caching is the temporary storage of frequently needed data. A cache is an intermediary between a data consumer and the data origin. Each time a data consumer needs data, it checks the cache first. If the cache cannot provide the data, the data is fetched from origin and saved in the cache. A cache stores data until a predetermined criterion is met (time limit, memory usage, etc.), after which the data is evicted from the cache to make room for new data.
General benefits of caching
Caching increases performance and reliability. Performance is increased because the source of cached data can be closer to the consumer and the cache can process duplicated requests for data, saving unnecessary computations and network traffic. Reliability is increased because the data consumer is isolated from service interruptions from origin.
Benefits specific to Conversations API users
For the purposes of this documentation, your application is the data consumer and the Bazaarvoice platform is the data origin. In addition to the advantages mentioned above, caching Conversations API responses offers the following advantages:
- Mitigating the effects of rate limiting
The Conversations API imposes a rate limit to ensure stability and performance for all tenants. With caching, your application makes fewer API requests to origin, conserving your quota of requests and reducing the likelihood that your application will be rate limited. Learn more about the rate limit.
- Increased productivity of API requests
The Bazaarvoice platform includes a global content delivery network (CDN) to ensure high performance and availability. Our CDN already does some caching, so if your application doesn't cache, it is making unnecessary requests (requests that aren't served by origin). By caching with a time-to-live (TTL) equal to or greater than ours, you increase the likelihood that your requests will return data from origin. Our TTL may change due to usage and seasonal variations, but it will never be lower than 15 minutes or greater than 55 minutes.
- Consistency with the moderated nature of UGC and user expectations
All Conversations API submissions go through a moderation process that can take up to 72 hours. Authors are informed of this during the submission process, so there should be no expectation of receiving the very latest content. As a result, you can get the benefits of caching without a diminished user experience.
Strategies
In-app caching
Per application
In-app caching is when an application stores data it needs frequently. The storage medium might be in-memory, the file system, or software specifically built for caching. A fundamental property of this type of data storage is that it is temporary in nature. The cache automatically removes data based on predetermined criteria, such as a time period, or by removing the least used data to make room for newer data.
The following pseudocode example demonstrates the basic pattern for in-app caching:
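// Create the cache once, for example at application startup.
cache = new Cache(maxSize, expireAfter);
...
function get_data(request_url) {
    key = cache.createKey(request_url);
    data = cache.get(key);
    if (data) {
        // Cache hit: no request to origin is needed.
        return data;
    }
    else {
        // Cache miss: fetch from origin and store the response for next time.
        data = query_bv_api(request_url);
        cache.add(key, data);
        return data;
    }
}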
The example above starts by creating a cache object, passing in a maxSize argument telling the cache how much data to store and an expireAfter argument telling the cache how long to store data. Later, when using the cache object, a key is created from a URL. The key is used to identify the needed data in the cache with cache.get(key). If the cache returns relevant data, the function returns the data and ends. If the cache does not return relevant data, a request is made to the Bazaarvoice Conversations API. The response is then added to the cache along with an appropriate key for future lookup. Finally, the data is returned. If data associated with that URL is needed again before the expireAfter period ends, it will be returned from the cache.
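The pattern described above could look roughly like the following in Python. This is a minimal sketch, not a reference implementation: the `Cache` class is written here for illustration, `query_bv_api` is a hypothetical function you would supply to perform the HTTP request, and sorting the query parameters when building the key assumes that parameter order does not change the response.

```python
import time
from collections import OrderedDict
from urllib.parse import parse_qsl, urlencode, urlsplit


class Cache:
    """Small in-memory cache with a size limit and a per-entry time limit."""

    def __init__(self, max_size, expire_after, clock=time.monotonic):
        self.max_size = max_size          # maximum number of entries to keep
        self.expire_after = expire_after  # seconds an entry stays valid
        self.clock = clock                # injectable so tests can control time
        self._entries = OrderedDict()     # key -> (stored_at, data)

    @staticmethod
    def create_key(request_url):
        # Assumption: parameter order is irrelevant, so sort the parameters
        # to give equivalent URLs the same key.
        parts = urlsplit(request_url)
        query = urlencode(sorted(parse_qsl(parts.query)))
        return f"{parts.netloc}{parts.path}?{query}"

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, data = entry
        if self.clock() - stored_at >= self.expire_after:
            del self._entries[key]        # expired: evict and report a miss
            return None
        self._entries.move_to_end(key)    # mark as most recently used
        return data

    def add(self, key, data):
        self._entries[key] = (self.clock(), data)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # drop the least recently used


def get_data(cache, request_url, query_bv_api):
    """Return cached data when available, otherwise fetch and store it."""
    key = cache.create_key(request_url)
    data = cache.get(key)
    if data is not None:
        return data
    data = query_bv_api(request_url)  # hypothetical function that calls origin
    cache.add(key, data)
    return data
```

Here eviction is by least recent use once `max_size` entries are stored. The libraries listed below offer their own eviction policies and are likely a better starting point for production use.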
Examples
The following is a list of several existing solutions that exemplify this type of caching:
Bazaarvoice does not endorse or support any 3rd party or non-Bazaarvoice software.
- Java: Google Guava
- Python: Beaker
- JS/Node: node-cache
- PHP: Stash
- .NET: System.Runtime.Caching
Distributed
Distributed in-app caching is a more advanced form of in-app caching. As with in-app caching, the cache is accessed inline within the code, but the cache itself is stored on machines close to, but separate from, the machine running the code. This allows for sharding and a shared-nothing architecture across the servers, both of which contribute to the overall scalability of the solution.
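Sharding means that each cache key lives on exactly one cache machine, and every application server must agree on which one. The sketch below illustrates one simple way to do that by hashing the key. The node names are hypothetical, and `pick_node` is written for illustration; it is not part of any particular caching product.

```python
import hashlib

# Hypothetical cache nodes; replace with your own hosts.
NODES = [
    "cache-1.internal:11211",
    "cache-2.internal:11211",
    "cache-3.internal:11211",
]


def pick_node(key, nodes=NODES):
    """Map a cache key to one node, identically on every app server."""
    # Python's built-in hash() is randomized per process for strings,
    # so use a stable digest instead.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]
```

With simple modulo hashing like this, adding or removing a node moves most keys to a different node. Client libraries for distributed caches often use consistent hashing to limit that effect.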
Examples
The following is a list of several existing solutions that exemplify this type of caching:
Bazaarvoice does not endorse or support any 3rd party or non-Bazaarvoice software.
HTTP[s] reverse proxy caching
HTTP[s] reverse proxy caching involves relying on an intermediary application, accessed via HTTP[s], to respond with cached data or pass requests on to the origin data source. A caching proxy is an example of an HTTP[s] caching application.
API caching
In this strategy the cache exists between your application and the Bazaarvoice platform and can be summarized by the following steps:
- Your application will make a request to your cache.
- Your cache will forward the request to the Bazaarvoice platform.
- The Bazaarvoice platform will return a response to your cache.
- Your cache will save the response and return it to your application.
- If the same request is made again during a predetermined period of time, your cache will return the saved response instead of forwarding the request to the Bazaarvoice platform.
This process is depicted in the image below. Note that the count of arrows, which indicate network traffic, is lower between the cache and Bazaarvoice, reflecting fewer requests.
As a simplification, the graphic above depicts only one client application, but you can increase the benefits of caching by pointing more than one application to the same cache.
One way to perform API caching is by placing a reverse proxy between your application and Bazaarvoice.
Typically the requests made by your application will be identical to a standard Conversations API request, except they will use the domain of your caching proxy server.
https://your.proxy.com/data/reviews.json?passkey={passkey}&apiVersion=5.4&productId={id}
The proxy server is configured to forward those requests to the appropriate Bazaarvoice domain and return the responses to your application.
https://api.bazaarvoice.com/data/reviews.json?passkey={passkey}&apiVersion=5.4&productId={id}
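The forwarding step amounts to swapping the host while leaving the path and query string untouched. In practice this is usually a configuration setting in the proxy software, but the idea can be sketched in Python. The function name `to_origin_url` is hypothetical, and the origin host is taken from the example URL above.

```python
from urllib.parse import urlsplit, urlunsplit

# Host taken from the example URL above; adjust if your origin differs.
ORIGIN_HOST = "api.bazaarvoice.com"


def to_origin_url(proxy_url, origin_host=ORIGIN_HOST):
    """Rewrite a URL addressed to the proxy into the equivalent origin URL."""
    parts = urlsplit(proxy_url)
    # Keep the path and query string exactly as received; replace only the host.
    return urlunsplit(("https", origin_host, parts.path, parts.query, ""))
```

Because only the host changes, the original request URL (or a key derived from it) can also serve as the cache key inside the proxy.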
Examples
The following is a list of several existing solutions that can be used for reverse proxy caching:
Bazaarvoice does not endorse or support any 3rd party or non-Bazaarvoice software.
Display caching
In this strategy the cache exists between client applications (typically a user's computer) and your application. It can be summarized by the following steps:
- Users will make a request to your application cache for a webpage.
- Your application cache will forward the request to your application.
- As a part of building the response, your application will make a request to the Bazaarvoice platform.
- The Bazaarvoice platform will respond to your application, which will respond to your application cache, which will respond to the user.
- If the same request is made again during a predetermined period of time, your application cache will return the cached response instead of forwarding the request to your application.
This process is depicted in the image below. Note that the count of arrows, which indicate network traffic, is lower between your application cache and your application, reflecting fewer requests.
In this graphic there is only one application cache depicted because there is typically a one-to-one relationship between application cache and application.
Requests to your application cache will be identical to a request made directly to your application.
https://your.domain.com/your/product/page
Your application cache will be configured to forward uncached requests to your application. Client applications will never need to know that you use an application cache.
https://192.0.2.0/your/product/page
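How long the application cache keeps a page depends on how that cache is configured. One common mechanism, which many caching proxies honor, is the Cache-Control response header set by your application. The following is a minimal sketch assuming a WSGI application and a 15 minute lifetime; `product_page_app` and `PAGE_TTL_SECONDS` are illustrative names, and you should confirm how your own cache treats this header.

```python
# Assumption: 15 minutes, the lower bound of the TTL range given on this page.
PAGE_TTL_SECONDS = 15 * 60


def product_page_app(environ, start_response):
    """Minimal WSGI app whose response may be stored by a shared cache."""
    body = b"<html><body>Product page with reviews</body></html>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # s-maxage applies to shared caches such as a reverse proxy.
        ("Cache-Control", f"public, s-maxage={PAGE_TTL_SECONDS}"),
    ]
    start_response("200 OK", headers)
    return [body]
```

Pages that contain user-specific content should not be marked as publicly cacheable in this way.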
Examples
The following is a list of several existing solutions that can be used for reverse proxy caching:
Bazaarvoice does not endorse or support any 3rd party or non-Bazaarvoice software.
- All entries from HTTP[s] caching examples
- Varnish - By design, Varnish will only proxy to one IP address, making it unsuitable for API caching because the Bazaarvoice CDN is dynamic in nature. However, Varnish is ideal for display-style caching.
Recommendations
None of the strategies listed above are mutually exclusive, and in fact they can be complementary. You might start by implementing basic in-app caching, then add one or more of the other strategies as your traffic increases.
We recommend caching UGC data for between 15 and 55 minutes. We vary the expiration period of our CDN-based caching within this range, so if you cache for less than this you are making unnecessary requests that return the same data each time.
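One way to keep a configurable expiration period inside the recommended range is to clamp it. The sketch below is illustrative only; the constant and function names are hypothetical.

```python
MIN_TTL_SECONDS = 15 * 60  # 15 minutes, lower bound recommended above
MAX_TTL_SECONDS = 55 * 60  # 55 minutes, upper bound recommended above


def recommended_ttl(requested_seconds):
    """Clamp a requested expiration period to the recommended range."""
    return max(MIN_TTL_SECONDS, min(MAX_TTL_SECONDS, requested_seconds))
```

For example, a requested period of one minute becomes 900 seconds, and a requested period of two hours becomes 3300 seconds.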
If your application caches data for more than the recommended period, you should consider adding the ability to purge your cached data on a per-product basis. There are scenarios in which Bazaarvoice data is updated on a per-product basis (for example, syndicating reviews from product A to product B), and your users or other stakeholders may want to see the updates sooner than your expiration age allows. Without per-product purging you will need to invalidate your entire cache.
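Per-product purging requires the cache to know which entries belong to which product. A minimal sketch of that bookkeeping follows, assuming the caller knows the product ID for each cached response. `ProductAwareCache` is a hypothetical class written for illustration, and expiration is omitted to keep the example short.

```python
from collections import defaultdict


class ProductAwareCache:
    """Cache that remembers which keys belong to which product."""

    def __init__(self):
        self._data = {}                           # key -> cached response
        self._keys_by_product = defaultdict(set)  # product ID -> set of keys

    def add(self, key, product_id, data):
        self._data[key] = data
        self._keys_by_product[product_id].add(key)

    def get(self, key):
        return self._data.get(key)

    def purge_product(self, product_id):
        """Remove every entry for one product; return how many were removed."""
        keys = self._keys_by_product.pop(product_id, set())
        for key in keys:
            self._data.pop(key, None)
        return len(keys)
```

Purging product A then leaves entries for every other product in place, so only the affected pages trigger new requests to origin.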