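I recently read a post on the OVS discuss mailing list (see reference 1) in which Ben Pfaff mentions that Andy Zhou (an OVS developer at VMware) spent the past week investigating etcd 3.0 as a possible alternative database for OVN. It made me realize I had not followed OVN's progress for a while. With the choice of its core database still unsettled, the original plan to ship a stable release this year looks somewhat at risk.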

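The thread in reference 1 links to an earlier discussion about the choice of database for OVN. After extensive analysis, Ben Pfaff gathered feedback from various parties, and the main candidates came down to ZooKeeper and etcd. (When we evaluated replacements for Neutron, we also considered etcd, but etcd 3.0 had not been released yet and feedback inside our company suggested it still had stability problems, so we stayed with ZooKeeper.) Because OVN has strict consistency requirements, the team chose to investigate the etcd 3.0 beta.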

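OVSDB was ruled out for two reasons: adding HA to it is too complex, and it has scalability problems once thousands of clients connect. In our own experience, when a public cloud accesses ovsdb-server through ovs-vsctl under heavy concurrency with a large number of ports (for example, on nodes that co-locate networking and compute, or on network nodes hosting many routers), ovsdb-server's CPU usage becomes extremely high and responses become very slow. This is probably related to the inherent inefficiency of JSON-RPC as OVSDB's wire protocol, and to the large amount of data that ovs-vsctl has to subscribe to. To address this, we plan to talk to ovsdb-server over a native long-lived connection, which avoids the cost of building a fresh replica on every invocation.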
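The long-lived connection idea can be sketched roughly as follows. This is a minimal illustration, not our production client: `PersistentOvsdbClient` and `build_monitor_request` are hypothetical names, the socket path is an assumed default, and reading replies is left out. The request shape follows the `monitor` method of the OVSDB protocol (RFC 7047), restricted to one table and a few columns so that less data is subscribed to.

```python
import json
import socket

# Assumed default location of the ovsdb-server Unix socket; adjust as needed.
DB_SOCK = "/var/run/openvswitch/db.sock"


def build_monitor_request(request_id, db, table, columns):
    """Build an RFC 7047 "monitor" request for one table and selected columns."""
    return {
        "method": "monitor",
        # params: database name, an opaque monitor id (null here),
        # and a map from table name to the columns of interest.
        "params": [db, None, {table: {"columns": list(columns)}}],
        "id": request_id,
    }


class PersistentOvsdbClient:
    """Hypothetical helper that opens the socket once and reuses it."""

    def __init__(self, path=DB_SOCK):
        self.path = path
        self.sock = None
        self.next_id = 0

    def send(self, builder, *args):
        if self.sock is None:
            # Connect lazily; later calls reuse the same connection.
            self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            self.sock.connect(self.path)
        self.next_id += 1
        request = builder(self.next_id, *args)
        self.sock.sendall(json.dumps(request).encode("utf-8"))
        return request
```

A call such as `client.send(build_monitor_request, "Open_vSwitch", "Port", ["name"])` would subscribe only to port names. A real client would also have to parse the reply stream and answer the server's echo requests, which this sketch omits.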

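Judging from the test results in reference 1, etcd 3.0 looks fairly promising. The OVN developers are leaning toward organizing a mini-hackathon for the development work; anyone interested can keep following the thread.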

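I have long felt that using ovsdb-server for OVN's northbound and southbound databases might only be a stopgap. It now looks like the project is moving toward something more suitable for production deployment, and I look forward to it giving private clouds a more dependable option. As for public clouds, we will each have to keep building our own :)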

In addition, I am pasting Ben Pfaff's database comparison here directly, since it should save a good deal of analysis time:

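| Database       | txn | ACID | consist | trk  | HA  | OS  | C   | Py  | format |
|----------------|-----|------|---------|------|-----|-----|-----|-----|--------|
| ActorDB        | yes | ACID | strong  | NO   | yes | yes | yes | yes | sql    |
| Aerospike      | yes | ACID | strong  | NO   | yes | yes | yes | yes | db/KV  |
| Cassandra      | NO  | -C-D | tunable | NO   | yes | yes | NO  | yes | table  |
| Cockroach DB   | yes | ACID | strong  | NO   | yes | yes | ?   | ?   | sql    |
| Couchbase      | NO  | ???? | ????    | NO   | yes | NO? | yes | yes | JSON   |
| CrateIO        | NO  | ???? | EVNTUAL | NO   | yes | yes | NO  | yes | sql    |
| etcd           | NO  | ACID | strong  | yes? | yes | yes | yes | yes | KV     |
| Gigaspaces XAP | yes | ACID | strong  | yes  | yes | NO  | NO  | NO  | multi  |
| HBase          | NO  | ACID | strong  | NO   | yes | yes | NO  | yes | table  |
| Hyperdex       | yes | ACID | strong  | NO   | yes | NO  | yes | yes | KV     |
| Hypertable     | NO  | ???? | ????    | NO   | yes | yes | NO  | yes | table  |
| MongoDB        | NO  | ACID | strong  | ??   | yes | yes | yes | yes | JSON   |
| RAMCloud       | yes | ???? | strong  | NO   | yes | yes | NO  | yes | KV     |
| Redis          | yes | -C?D | ????    | NO   | yes | yes | yes | yes | KV     |
| Riak           | NO  | ---D | EVNTUAL | NO   | yes | yes | yes | yes | KV     |
| Scalaris       | yes | ACI- | strong  | NO   | yes | yes | NO  | yes | KV     |
| ScyllaDB       | NO  | -C-D | tunable | NO   | yes | yes | NO  | yes | table  |
| Voldemort      | NO  | ???? | EVNTUAL | NO   | yes | yes | NO  | yes | KV     |
| Zookeeper      | yes | AC-D | strong  | yes  | yes | yes | yes | yes | KV     |
| OVSDB          | yes | ACID | strong  | yes  | NO  | yes | yes | yes | table  |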
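To see why the shortlist came down to ZooKeeper and etcd, the table can be filtered mechanically. The sketch below is my own illustration: the criteria (strong consistency, change tracking, HA, open source) are an assumption about what mattered most, not something Ben Pfaff stated as a formal rule, and `parse` and `shortlist` are hypothetical helper names. The data is copied from the table as is.

```python
TABLE = """\
ActorDB yes ACID strong NO yes yes yes yes sql
Aerospike yes ACID strong NO yes yes yes yes db/KV
Cassandra NO -C-D tunable NO yes yes NO yes table
Cockroach DB yes ACID strong NO yes yes ? ? sql
Couchbase NO ???? ???? NO yes NO? yes yes JSON
CrateIO NO ???? EVNTUAL NO yes yes NO yes sql
etcd NO ACID strong yes? yes yes yes yes KV
Gigaspaces XAP yes ACID strong yes yes NO NO NO multi
HBase NO ACID strong NO yes yes NO yes table
Hyperdex yes ACID strong NO yes NO yes yes KV
Hypertable NO ???? ???? NO yes yes NO yes table
MongoDB NO ACID strong ?? yes yes yes yes JSON
RAMCloud yes ???? strong NO yes yes NO yes KV
Redis yes -C?D ???? NO yes yes yes yes KV
Riak NO ---D EVNTUAL NO yes yes yes yes KV
Scalaris yes ACI- strong NO yes yes NO yes KV
ScyllaDB NO -C-D tunable NO yes yes NO yes table
Voldemort NO ???? EVNTUAL NO yes yes NO yes KV
Zookeeper yes AC-D strong yes yes yes yes yes KV
OVSDB yes ACID strong yes NO yes yes yes table
"""

FIELDS = ["txn", "acid", "consist", "trk", "ha", "os", "c", "py", "format"]


def parse(text):
    """Split each row from the right so multi-word names stay intact."""
    rows = []
    for line in text.strip().splitlines():
        name, *values = line.rsplit(None, len(FIELDS))
        rows.append({"name": name, **dict(zip(FIELDS, values))})
    return rows


def shortlist(rows):
    """Assumed criteria: strong consistency, change tracking, HA, open source."""
    return [
        r["name"]
        for r in rows
        if r["consist"] == "strong"
        and r["trk"].startswith("yes")  # accepts the uncertain "yes?" entry
        and r["ha"] == "yes"
        and r["os"] == "yes"
    ]


ROWS = parse(TABLE)
print(shortlist(ROWS))
```

Under these assumed criteria only etcd and Zookeeper remain. OVSDB drops out on HA and Gigaspaces XAP on licensing, which is consistent with the discussion above.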

The table columns are described below:

  • Database: The database being evaluated.

  • txn: "yes" if the database supports transactions across arbitrary data, "NO" if its transactions are limited to a single data item, such as a single key-value pair, or perhaps even more limited.

  • ACID: The transactional properties that the database supports, within the transactions that the database supports. (Thus, a database whose transactions cover only a single data item can be listed as ACID, but this is only for those limited transactions.)

  • consist: The distributed consistency model that the database supports, one of "strong" for strong or linearizable consistency, "tunable" for consistency that can be tuned to be strong or linearizable or weaker, or "EVNTUAL" for eventual consistency.

  • trk: "yes" if the database can automatically report data changes to clients, "NO" if the database requires clients to poll for changes.

  • HA: "yes" if the database can be configured for high availability, so that loss of a single node does not stop database activity, "NO" otherwise.

  • OS: "yes" if the database is open source or free (libre) software, "NO" if it is proprietary. When a database has open source and proprietary editions, this is "yes" and only the features in the open source edition are credited in other columns.

  • C: "yes" if the database has a C (not C++) client library, "NO" otherwise.

  • Py: "yes" if the database has a Python client library, "NO" otherwise.

  • format: The database's data model. "sql", "db", "table", "multi" all indicate that OVN could directly use the data model, "KV" or "JSON" that OVN's data model would have to be overlaid on it.
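To make the "format" column concrete, here is a minimal sketch of what overlaying a table-shaped data model on a KV store could look like. It uses a plain Python dict in place of a real store; the `/ovn/<table>/<row>` key scheme and the helper names are assumptions for illustration only, not how OVN or etcd actually lay out data.

```python
import json


def row_key(table, row_id):
    """Hypothetical key scheme: one key per table row."""
    return f"/ovn/{table}/{row_id}"


def put_row(kv, table, row_id, columns):
    """Store a row's columns as one JSON value under its key."""
    kv[row_key(table, row_id)] = json.dumps(columns, sort_keys=True)


def get_table(kv, table):
    """Rebuild a table by scanning every key under the table's prefix."""
    prefix = f"/ovn/{table}/"
    return {
        key[len(prefix):]: json.loads(value)
        for key, value in kv.items()
        if key.startswith(prefix)
    }


kv = {}  # stand-in for a real KV store
put_row(kv, "Logical_Switch", "ls1", {"name": "sw0"})
put_row(kv, "Logical_Switch", "ls2", {"name": "sw1"})
print(get_table(kv, "Logical_Switch"))
```

With one key per row, a change that touches several rows spans several keys. For a store listed with txn "NO" in the table, atomicity would then hold only per key, so the overlay layer would have to handle multi-row consistency itself.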

References:

  1. http://openvswitch.org/pipermail/discuss/2016-June/021646.html
  2. http://openvswitch.org/pipermail/dev/2016-March/067479.html
