Blog Picks: Declarative Caching in Node.js - InfoQ

Why

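After years of writing Java, and even after moving to Node, I still remember Spring's IoC and its Declarative XXX features vividly. So when a Node project needed caching, Declarative caching naturally came to mind, and I decided to build a knock-off of it.

The problem

I'll skip why you would want a cache, since everybody knows that, and go straight to the problem. Suppose we have a function that looks up customer information: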

/** Find customer from db by id.
 @param id [Integer|String] customer id
 @param cb [Function] callback (err, customer)
*/
function findCustomer(id, cb) {
 db.query("SELECT * FROM tbl_customers WHERE id = ?", [id], cb);
}

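The sample code in this article assumes the reader is familiar with Node and asynchronous programming; knowing Underscore.js and Async.js will help even more.
For simplicity, the sample code omits opening and releasing database / cache server connections, as well as some of the error handling.

To cache customers, we simply check the cache first: on a hit we return the cached value directly; otherwise we query the database, cache the result, and then return it. The most obvious implementation looks roughly like this: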

// The cached version of findCustomer.
function findCustomerWithCache(id, cb) {
 fromCache('customer:' + id, function(err, cust) {
   if (err) return cb(err);
 
   if (cust) {  // cache hit
     cb(null, cust);
   }
   else {  // cache missed
     loadAndCacheCustomer(id, cb);
   }
 });
}

Here are the helper functions, assuming cached is the cache service instance we use:

/** Try to retrieve item with key k from cache.
 @param k [String] key of the cached item
 @param cb [Function] callback (err, v) v is the cached item, false if NOT found
*/
var fromCache = cached.get.bind(cached);  // bind so `this` still refers to the cache client
 
/** Save value v to cache with key k.
 @param dur [Integer] seconds before cache expires
 @param k [String] key of the cached item
 @param v [Any] item to be cached
 @param cb [Function] callback (err, v)
*/
function toCache(dur, k, v, cb) {
 cached.set(k, v, dur, function(err) {
   cb(err, v);
 });
}
 
// find customer from db, and cache it before returning the result
function loadAndCacheCustomer(id, cb) {
 var _cacheCust, _loadAndCache;
 _cacheCust = function(cust, cb) {
   toCache(86400, 'customer:' + id, cust, cb);  // caching for 1 day
 };
 
 _loadAndCache = async.compose(_cacheCust, findCustomer);
 _loadAndCache(id, cb);
}
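
The helpers above assume a cached instance without showing one. To experiment without a cache server, a minimal in-memory stand-in could look like the sketch below. createMemoryCache and its injectable clock are hypothetical names, not part of any library. The sketch only mimics the get(k, cb) / set(k, v, dur, cb) shape assumed in this article, and it calls back synchronously, which a real cache client would not do.

```javascript
// Hypothetical in-memory stand-in for the `cached` service assumed above.
//   get(k, cb):         cb(err, v), where v === false on a miss or after expiry
//   set(k, v, dur, cb): dur is in seconds
function createMemoryCache(now) {
  now = now || Date.now;            // injectable clock, handy for testing expiry
  var store = {};                   // key -> { value, expiresAt }
  return {
    get: function (k, cb) {
      var entry = Object.prototype.hasOwnProperty.call(store, k) ? store[k] : null;
      if (!entry || entry.expiresAt <= now()) {
        delete store[k];            // drop expired entries lazily
        return cb(null, false);
      }
      cb(null, entry.value);
    },
    set: function (k, v, dur, cb) {
      store[k] = { value: v, expiresAt: now() + dur * 1000 };
      cb(null);
    }
  };
}

var cached = createMemoryCache();
```

With this in place, fromCache and toCache can be exercised as written, as long as the stand-in is created before they are defined.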

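Good. Calling findCustomerWithCache now takes advantage of the cache.

The problems now are:

Every findXXX function that needs caching requires a matching findXXXWithCache and loadAndCacheXXX to be written
Using or not using the cache means calling different functions, so switching between the two requires changing every call site

Reusing code with higher-order functions

Let's tackle the first problem through function reuse: if a generic function could handle the caching work on our behalf and call the actual query only on a cache miss, we would be done.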

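API design

The best way to design a function (API) is to first write down how you would like to use it. So for this caching higher-order function, tentatively named withCache, something like this might work well: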

// The cached version of findCustomer.
function findCustomerWithCache(id, cb) {
 var _getKey = function(id) { return 'customer:' + id; };
 withCache(86400, _getKey, findCustomer, id, cb);
}

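In other words, we just delegate the data query findCustomer to the higher-order function withCache and specify the cache duration and the key generation rule. The whole flow of checking and updating the cache is encapsulated in that function, which is what makes it reusable.

Encapsulating the caching flow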

/** The generic caching routine.
 @param dur [Integer] seconds before cache expires
 @param keyBuilder [Function] (args...) given the arguments, return a unique cache key
 @param fn [Function] the actual data loading routine
 @param args... [Any...] arguments applied to fn to load data
 @param cb [Function] callback (err, data) passed to fn to receive data
*/
function withCache(dur, keyBuilder, fn) {
 var args, cb, key;
 args = Array.prototype.slice.call(arguments, 3, -1);
 cb   = _.last(arguments);             // the callback function
 key  = keyBuilder.apply(null, args);  // compute cache key
 
 fromCache(key, function(err, data) {
   if (err) return cb(err);
 
   if (data) {  // cache hit
     cb(null, data);
   }
   else {  // cache missed
     loadAndCacheData.call(null, dur, key, fn, args, cb);
   }
 });
}
 
/** Load data using fn,
 then cache it under the given key before returning the result.
*/
function loadAndCacheData(dur, key, fn, args, cb) {
 var _cacheData = _.partial(toCache, dur, key);
 async.compose(_cacheData, fn).apply(null, args.concat(cb));
}
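
To watch the flow run end to end without a database, Underscore, or Async, here is a dependency-free sketch of the same idea. It is a standalone snippet, not a replacement for the code above: the Map-backed cache, fakeFindCustomer, and customerKey are hypothetical stand-ins, and the load-then-cache step is written out by hand instead of using async.compose.

```javascript
// Standalone sketch of the withCache flow. The Map-backed cache and
// fakeFindCustomer are hypothetical stand-ins for a cache server and a database.
var store = new Map();
function fromCache(k, cb) { cb(null, store.has(k) ? store.get(k) : false); }
function toCache(dur, k, v, cb) { store.set(k, v); cb(null, v); }  // dur is ignored here

function withCache(dur, keyBuilder, fn) {
  var args = Array.prototype.slice.call(arguments, 3, -1);  // arguments meant for fn
  var cb = arguments[arguments.length - 1];                 // the final callback
  var key = keyBuilder.apply(null, args);

  fromCache(key, function (err, data) {
    if (err) return cb(err);
    if (data) return cb(null, data);                        // cache hit
    fn.apply(null, args.concat(function (err, loaded) {     // cache miss: load ...
      if (err) return cb(err);
      toCache(dur, key, loaded, cb);                        // ... then cache and return
    }));
  });
}

var dbCalls = 0;
function fakeFindCustomer(id, cb) {
  dbCalls += 1;                                             // count "database" hits
  cb(null, { id: id, name: 'customer-' + id });
}
function customerKey(id) { return 'customer:' + id; }

withCache(86400, customerKey, fakeFindCustomer, 7, function (err, c1) {
  withCache(86400, customerKey, fakeFindCustomer, 7, function (err, c2) {
    console.log(c1.name, c2.name, 'db calls:', dbCalls);    // prints: customer-7 customer-7 db calls: 1
  });
});
```

One thing this makes visible: because a hit is detected with a truthiness check, a legitimately cached falsy value (0, an empty string, null) would be treated as a miss and reloaded every time.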

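Once the caching flow is encapsulated, reuse becomes easy: we can now apply caching to any data query function.

Encapsulating a procedure in a higher-order function is one of the most basic ways to reuse code in functional programming, just as each, map, and reduce encapsulate operations over collections.

Note the signature of withCache: the parameter order is not arbitrary. It reflects how widely each parameter applies (or how stable it is), which makes the function easier to reuse. It is now easy to write: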

var withOneDayCache = _.partial(withCache, 86400);
 
// the same as: withCache(86400, _getKey, findCustomer, id, cb);
var _getKey = function(id) { return 'customer:' + id; };
withOneDayCache(_getKey, findCustomer, id, cb);
 
// exactly the same as the findCustomerWithCache function above
var findCustomerWithOneDayCache = _.partial(withOneDayCache, _getKey, findCustomer);

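Function proxies

We are now just one step away from declarative caching, but before that we can go a bit further: we would like to obtain functions such as findCustomerWithOneDayCache in a more 'declarative' way.

Taking Spring Declarative annotation-based caching as a reference: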

@Cacheable("customers")
public Customer findCustomer(Integer id);

In Node, we would like to 'declare' that a function 'should use the cache' like this:

findCustomerWithCache = cacheable(findCustomer, { ns: 'customer', dur: 86400 });

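To make this easier to understand, think of cacheable as analogous to Spring's ProxyFactoryBean: it gives us a proxy that looks exactly like the original function (findCustomer in this example) but quietly handles the caching for us behind the scenes.

To implement this, we first need a generic way to generate cache keys: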

/** Generates cache key using function arguments.
 @param ns [String] namespace, prefix of the cache key
 @param args [Array] arguments for loading data
*/
function getKey(ns, args) { return ns + ':' + ... }

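The cache key is generated automatically from a prefix plus the argument list. When there are multiple arguments, a hash can be used (see Default Key Generation); I won't write it out here.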
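
Since the body of getKey is left out, here is one possible way to fill it in, purely as an illustration. This is not the author's implementation: the choice of SHA-1 over JSON.stringify(args) is an assumption made for this sketch, using Node's built-in crypto module.

```javascript
var crypto = require('crypto');

// Hypothetical key generator (the article elides the real one).
// A single string/number argument yields "ns:value", matching the
// 'customer:' + id keys used earlier; anything else yields "ns:<sha1 of JSON>".
function getKey(ns, args) {
  if (args.length === 1 && (typeof args[0] === 'string' || typeof args[0] === 'number')) {
    return ns + ':' + args[0];
  }
  var digest = crypto.createHash('sha1')
    .update(JSON.stringify(args))
    .digest('hex');
  return ns + ':' + digest;
}

console.log(getKey('customer', [42]));   // prints: customer:42
```

Keep in mind that JSON.stringify is sensitive to property order, so two object arguments with the same content but different key order would produce different cache keys under this scheme.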

The rest is simple. Building on what we have so far, it is easy to write a cacheable function of the form above:

/** Wrap the given function fn to use the caching service.
 @param fn [Function] (args..., cb) the actual data loading routine
 @param opts [Object] options
   ns:  [String] namespace, prefix of the cache key
   dur: [Integer] seconds before cache expires
*/
function cacheable(fn, opts) { return _.partial(withCache, opts, getKey, fn); }

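Of course, withCache needs a small change: it must accept an options object as its first parameter and, given getKey's signature, call the key builder with the namespace and the argument array.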
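
The modified withCache is not shown, so the following is only a sketch of what the options-based variant might look like. To keep the snippet runnable on its own, it uses a Map-backed cache, a simplistic getKey, and a fake loader (all hypothetical stand-ins), and it uses Function.prototype.bind where the article uses _.partial.

```javascript
// Sketch of the options-based variant. The Map-backed cache, the simple getKey,
// and fakeFindCustomer are hypothetical stand-ins so the snippet runs on its own.
var store = new Map();
function fromCache(k, cb) { cb(null, store.has(k) ? store.get(k) : false); }
function toCache(dur, k, v, cb) { store.set(k, v); cb(null, v); }  // dur is ignored here
function getKey(ns, args) { return ns + ':' + args.join(','); }

// opts: { ns: cache key prefix, dur: seconds before the entry expires }
function withCache(opts, keyBuilder, fn) {
  var args = Array.prototype.slice.call(arguments, 3, -1);
  var cb = arguments[arguments.length - 1];
  var key = keyBuilder(opts.ns, args);          // key builder now gets (ns, args)

  fromCache(key, function (err, data) {
    if (err) return cb(err);
    if (data) return cb(null, data);            // cache hit
    fn.apply(null, args.concat(function (err, loaded) {
      if (err) return cb(err);
      toCache(opts.dur, key, loaded, cb);       // cache miss: store, then return
    }));
  });
}

// Same shape as the article's cacheable, with bind in place of _.partial.
function cacheable(fn, opts) {
  return withCache.bind(null, opts, getKey, fn);
}

var dbCalls = 0;
function fakeFindCustomer(id, cb) { dbCalls += 1; cb(null, { id: id }); }

var findCustomerWithCache = cacheable(fakeFindCustomer, { ns: 'customer', dur: 86400 });
findCustomerWithCache(1, function () {
  findCustomerWithCache(1, function (err, customer) {
    console.log(customer.id, dbCalls);          // prints: 1 1
  });
});
```

The returned proxy keeps the (args..., cb) calling convention of the wrapped function, which is what lets it stand in for the original.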

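Declarative caching

Now for the last step: how can we set cache parameters (whether to cache, for how long, and so on) freely without touching the code?

Same approach as before: state what we want first. Suppose findCustomer is provided by the customers module and is used like this: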

var custs = require('customers');
custs.findCustomer(1, function(err, customer) {
 // do something with the customer
 ...
});

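We want caching to be transparent to users of findCustomer: whether or not it is cached, and for however long, the code above should not need to change.

To solve this, we can take advantage of Node's module caching. During application (process) initialization (commonly called the prelude script), we replace findCustomer in the customers module with the cacheable version. As long as the module cache is not cleared, every subsequent require of that module gets the patched version, so findCustomer is naturally the cacheable one.

To do that, let's first prepare a function that patches a given module: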

/** Replace the function fn in module m with a cacheable one.
 @see http://nodejs.org/api/modules.html#modules_caching
 @param m [Object] module object
 @param fn [String] name of the function
 @param opts [Object] cache options
   ns:  [String] namespace, prefix of the cache key
   dur: [Integer] seconds before cache expires
*/
function cache(m, fn, opts) { m[fn] = cacheable(m[fn], opts); }
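
The patching trick depends on require returning the same exports object each time. The sketch below illustrates that behavior in isolation: it writes a throwaway module into a temp directory (a hypothetical stand-in for customers), patches it once, and then requires it again. The wrapper here just tags the result instead of doing real caching.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Write a throwaway module to a temp directory (hypothetical stand-in for 'customers').
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
var file = path.join(dir, 'customers.js');
fs.writeFileSync(file,
  "exports.findCustomer = function (id, cb) { cb(null, 'db:' + id); };");

// "Prelude": require the module once and replace the function on its exports.
var custs = require(file);
var original = custs.findCustomer;
custs.findCustomer = function (id, cb) {
  original(id, function (err, v) { cb(err, 'patched:' + v); });
};

// Elsewhere in the app: a later require resolves to the same cached exports object.
var again = require(file);
var result;
again.findCustomer(1, function (err, v) { result = v; });
console.log(again === custs, result);   // prints: true patched:db:1
```

Note that code which captured the function before the patch ran (for example, by assigning require(...).findCustomer to a local variable at load time) keeps the original reference, which is one reason the declarations need to run early in the prelude.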

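Now we create caches.js. The file name doesn't matter; what matters is that it is included in the prelude script. Think of it as the Spring Beans definition file (usually XML): all cache-related 'declarations' go here: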

var custs  = require('customers'),
 whatever = require('whatever');
 
cache(custs,    'findCustomer',       { ns: 'customer', dur: 86400 });
cache(custs,    'findCustomerByName', { ns: 'customer', dur: 86400 });
cache(whatever, 'findWhatever',       { ns: 'whatever', dur: 0     });
...

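That's it. We can now 'declare' whether and how a particular function uses the cache, and changing those 'declarations' at any time has no impact on the code.

Finally

Note that the Spring Cache Abstraction mainly covers abstraction of the caching API, so that the caching service can be swapped freely without affecting the code. This article is more about features along the lines of Declarative annotation-based caching.

Compared with switching cache services, changes to cache configuration are clearly far more likely and deserve more careful management, so abstracting the caching API is not discussed here.

Also, the approach discussed in this article relies on Node's Module Caching (if a module resolves to the same file, you always get the same object), so it only works if:

The same module is never resolved to different files; see the Node documentation for details
The module cache is never cleared

So please evaluate these preconditions carefully before applying this approach.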

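Original post: Declarative Caching in Node.js

About the author

Wu Yingxin (吴颖昕): spent several years at Huawei as a Java programmer and now works at LightInTheBox, still doing Java development while also getting into Node.js. No illustrious career to speak of: self-taught, came to the field midway, a typical jack of all trades and master of none. Once firmly believed that object orientation was the best programming paradigm in the world, bar none, but is now convinced that diversity is just as indispensable in software development: choose according to need, and the best fit is the best choice. Lately fascinated by functional programming and hoping to broaden the mind by exploring multiple programming paradigms.
Email: yingxinwu.g@gmail.com
Blog: xinthink.com

Thanks to Tian Yongqiang (田永强) for reviewing this article.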




