Tags: SDN

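"SDN: Software Defined Networks" was written by two distinguished engineers from Juniper. I used the Dragon Boat Festival holiday to skim the whole book and came away with quite a lot.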

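The main chapters of the book are:

  • Centralized and distributed control and data planes
  • OpenFlow
  • SDN controllers
  • Network programmability
  • Data center concepts and constructs
  • NFV
  • Network topology and topology abstraction
  • Building an SDN framework
  • Bandwidth scheduling, manipulation, and calendaring
  • Data center overlays, big data, and network function virtualization
  • Input traffic monitoring, classification, and triggered actions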

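SDN is a hot topic at the moment, and this book covers its background, origins, and current state, along with use cases for traffic scheduling, NFV, and traffic identification and intervention. Although it comes from Juniper, the book is even-handed throughout and has reasonable depth. It is worth recommending.

Below I record the parts I personally found worth sharing from this read, as a starting point for discussion with others who have read the book.

On OF-Config, the OpenFlow configuration protocol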

Because of-config uses NETCONF/Yang, the working group is establishing their own Yang data models for these entities (tunnels, OAM). From an SDO perspective, this may not be a good model going forward.

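This matches my own view. OF-Config wants a single standards organization to fix the service model, which in theory is unlikely to win vendor support, because a truly universal service model is hard to achieve in practice. Even the Open vSwitch configuration model is hard to map onto it, which is why Open vSwitch still does not support OF-Config, even though the ONF has been willing to fund the work.

VM Live Migration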

There is still some debate about how frequently live migration might occur between data centers as a DCI use case. The reasoning is that in order for any migration to take place, a file copy of the active VM to a new compute server must be performed, and while a VM is being copied, it cannot be running, or the file would change out from under the copy operation. Hence this operation is becoming less and less common in practice. Instead, moving to a three-tiered application architecture where the norm is to create and destroy machines is far simpler (and safer).

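Live migration of VMs between data centers depends on copying the state of the VM image. To keep that state from changing, the VM effectively cannot run while the copy is in progress, so live migration of this kind is of limited use and is being adopted less and less.

The three-tier architecture here refers to the application deployment model shown in the figure below. In this model horizontal scaling is fairly easy, so there is little need for VM live migration.

Figure: three-tier application deployment architecture

VM runtime state synchronization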

The runtime state can be updated in one of two ways:

  • Data-plane driven approach

    The VM sends some traffic to force the runtime state to be updated. For example, the VM can broadcast a gratuitous ARP to force all MAC tables in the tenant network to be updated with the new location of the VM’s MAC address.

  • Orchestrator or control-plane driven approach

    The orchestrator uses a control-plane signaling protocol to explicitly update the runtime state in all places where it needs to be updated.
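To make the data-plane driven approach concrete, here is a minimal sketch of what a gratuitous ARP frame looks like on the wire. It is only an illustration: the function name `build_gratuitous_arp`, the sample MAC, and the sample IP are my own assumptions, not something from the book, and the frame is only built, not sent.

```python
import socket
import struct

ETHERTYPE_ARP = 0x0806
BROADCAST_MAC = b"\xff" * 6


def build_gratuitous_arp(vm_mac: bytes, vm_ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request.

    Hypothetical helper for illustration; it does not send anything.
    """
    if len(vm_mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    ip = socket.inet_aton(vm_ip)  # 4-byte IPv4 address

    # Ethernet header: broadcast destination, VM source, ARP ethertype.
    ethernet = BROADCAST_MAC + vm_mac + struct.pack("!H", ETHERTYPE_ARP)

    # ARP fixed part: Ethernet (1), IPv4 (0x0800), lengths 6 and 4, request (1).
    arp_fixed = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)

    # Gratuitous ARP: sender IP and target IP are both the VM's own address,
    # so every switch and host that sees it relearns where the MAC lives.
    arp_addresses = vm_mac + ip + b"\x00" * 6 + ip

    return ethernet + arp_fixed + arp_addresses


# Sample values are assumptions chosen for the example.
frame = build_gratuitous_arp(bytes.fromhex("525400123456"), "10.0.0.5")
print(len(frame), frame.hex())
```

The control-plane driven alternative needs no such frame: the orchestrator pushes the new location to every place that holds the state.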

Estimating data center scale

Scale and performance of VM moves will vary based on the type of service offering (IaaS, PaaS, or SaaS) and the degree of tenancy. A simple IaaS offering at a typical service provider with the current generation of Intel/ARM processor, c. 2012, might present the following rough scale numbers:

  • Number of data centers

    Multiples of tens (depends entirely on the geography of the offering; the example given would be for a country the size of Japan)

  • Number of servers per center

    Tens of thousands

  • Number of servers per cluster/pod

    1,000

  • Number of tenants per server

    Approximately 20 (current generation of processor, expected to double with next generation)

  • Number of VMs per tenant

    Approximately five

  • VM change rate

    Highly variable

  • VM change latency

    This is a target that varies by provider and application

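The quote mentions that a 2012 server typically runs about 20 VMs and that this is expected to double with the next generation. By Moore's law, two years later in 2014 a server could be running 40 VMs (to be confirmed).

Common fallacies of distributed computing in the data center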

  • The network is reliable
  • Latency is zero
  • Bandwidth is infinite
  • The network is secure
  • Topology doesn’t change
  • There is one administrator
  • Transport cost is zero
  • The network is homogeneous

Controller deployment

The use of an SDN controller may seem to be an obvious conclusion in the context of service chaining in the data center, but less so in the Edge/Access domains (where the added cost of a co-located controller may be prohibitive but the potential interaction delay between agent/controller may be problematic).

Deploying a centralized controller for the edge or access domains can cause problems because of the potential interaction delay, so an embedded controller (or sub-controller) on those devices looks like the more suitable approach.

Topology information

One final downside to traditional approaches to topology was the format of the topology information itself. The information was, as one would expect, formatted such that a router could quickly gather and process the topology for the fastest routing computations, or if gathered using out-of-band methods, in yet another format suitable for a command-line interface, for example, but not for doing other calculations. Unfortunately, these formats were often suboptimal for other uses that these applications had, and thus required their further processing to make it useful—requiring further effort, expense, and kludges in order to make it work. Fortunately, recent and new approaches in the area of topological information, its discovery, retrieval, and processing, have been undertaken. It is in fact this new effort and approach to topology, how it is being made available to applications through the SDN controllers and frameworks, and what then can be done that we discuss in this book, and in detail throughout this chapter.

In SDN, topology information itself matters a great deal, but so does its format: a good format is what makes good applications possible.

On northbound interfaces

In the ever-growing world of SDN controllers, having a common API to program SDN applications to is not just theoretically important, but economically and operationally as well. It means a network operator can either buy or build a single application to accomplish a particular task, and then have it interact with all of the controllers deployed in his/her network.

With so many SDN controllers around, having a common northbound interface is very important. In this respect the OpenDaylight project addresses a real pain point and deserves further attention. An automatically generated API is also the more practical option; in fact, OpenDaylight's APIs are derived by modeling services in YANG.

Two controller models

The Juniper POC framework as well as the IETF frameworks that followed can be described as brokers. Conversely, many of the controller strategies position the concept of a Network Operating System (NOS) as a replacement for distributed routing protocols that oversees the data plane of the managed elements on behalf of applications that define network services. In the broker model, applications interact with the network via the broker so that they or the network can be more efficient, enforce target SLAs, or provide a more satisfactory end user experience. The obvious distinction between the models is in the type of application that the architecture is meant to service (the breadth of the solution).

Broker model or NOS model: the two approaches will probably coexist for a long time.

Bandwidth Calendaring use case

Upon closer inspection, however, if one weighs the bandwidth versus the cost per bit and then compares that against the actual amount needed, during any given time of the day, this model is quite wasteful based on the diurnal example discussed. For example, let’s assume the most bandwidth used is during the day, with peak demand requiring 85% of the network’s resources, but that nighttime data replication duties require only 40%, so paying to provide similar bandwidth during nighttime hours is rather wasteful. Assuming TomsMusicStreaming.com has access to flexible pricing of bandwidth, it makes sense to be able to adjust bandwidth on a time-based demand model. Even for fixed priced bandwidth, being able to shut down or idle virtual machine or network equipment resources could be a significant optimization of power, lowering heating and cooling bills. To these ends, calendaring—making a forward reservation of path and bandwidth—is one such way to optimize our use case

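Calendaring here means scheduling, that is, reserving bandwidth ahead of time. If bandwidth pricing is flexible (today's cloud providers, for example, bill by the second or by the minute), costs can be cut by adjusting bandwidth over time. Even when bandwidth cannot be adjusted, unneeded VMs can be shut down to save energy.

Orchestration and the SDN controller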
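A rough back-of-the-envelope sketch of the saving, using the 85% daytime and 40% nighttime figures from the quote. The 12-hour split between day and night and the "capacity-hours" unit are my own assumptions for illustration; real pricing will not be this linear.

```python
def capacity_hours(reservations):
    """Sum reserved capacity over time.

    `reservations` is a list of (hours, percent_of_network) pairs.
    """
    return sum(hours * percent for hours, percent in reservations)


# Flat model: provision for the 85% daytime peak around the clock.
flat = capacity_hours([(24, 85)])

# Calendared model: reserve 85% for an assumed 12 daytime hours
# and 40% for the remaining 12 nighttime hours.
calendared = capacity_hours([(12, 85), (12, 40)])

saving = 1 - calendared / flat
print(f"flat={flat} calendared={calendared} saving={saving:.1%}")
```

Under these assumptions the calendared reservation needs roughly a quarter less capacity than the flat one.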

In general, the SDN controller is only responsible for the network aspect of the data center. It performs the low-level network operations based on high-level instructions from the orchestrator. The orchestrator is responsible for the overall operation of the data center, not just the network but also compute, storage, and services.

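The difference between orchestration and the SDN controller is that the SDN controller manages only the network, while the orchestration system also manages storage, compute, and services.

Physical networks and overlay networks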

One important observation is that neither the orchestrator nor the SDN controllers touch the physical network; they only touch the servers. In the overlay model, adding a tenant or adding a virtual machine to a tenant does not involve any changes to the physical network. It is the responsibility of the Network Management System (NMS) to manage the physical network. The NMS needs to interact with the physical network when switches are added or when servers are added, but not when tenants are added or virtual machines are added. This is clearly an advantage of the overlay model. The physical network is very stable and as a result more reliable; all the dynamic changes related to tenants are dealt with in the virtualized network.

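Leave the physical network to the NMS as far as possible, and let the overlay network handle whatever changes with tenant state. The obvious benefit is that the physical network changes as little as possible and is therefore more stable.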

No tenant state in the physical switches. Specifically, the physical switches do not contain any MAC addresses of tenant virtual machines. In the absence of overlays, the core switches contain all MAC address of all VMs of all tenants.

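Seen from this angle, the drawback of a ToR switch learning VM MAC addresses is that it is limited by its MAC table capacity, which is clearly not enough in a large data center.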

Assuming some level of redundancy of the appliance/gateway and a worst-case VM distribution for the tenant, where every VM on the host is unique, a quick calculation of the number of potential tunnels would be in the low hundreds (about 160—8 tunnels per VM, 4 tunnels to other hosts in the group, 2 to redundant firewalls, and 2 to gateways, 20 VMs). The number of flows mapping onto those tunnels can be an additional but currently manageable scale multiplier (not always a 1:1 correspondence).

When estimating the number of overlay tunnels, remember to account for the tunnels to gateways and firewalls.
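The arithmetic behind the "about 160" figure in the quote can be written out as a small sketch. The class name `TunnelEstimate` and its field names are mine; the default values are the ones given in the quote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TunnelEstimate:
    """Worst-case tunnel count for one host (illustrative only)."""

    peer_hosts: int = 4           # tunnels to other hosts in the group
    redundant_firewalls: int = 2  # tunnels to redundant firewalls
    gateways: int = 2             # tunnels to gateways
    vms_per_host: int = 20        # every VM on the host is unique

    @property
    def tunnels_per_vm(self) -> int:
        return self.peer_hosts + self.redundant_firewalls + self.gateways

    @property
    def tunnels_per_host(self) -> int:
        return self.tunnels_per_vm * self.vms_per_host


estimate = TunnelEstimate()
print(estimate.tunnels_per_vm, estimate.tunnels_per_host)
```

Half of the per-VM tunnels in this example go to firewalls and gateways, which is why they cannot be left out of the estimate.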

DevOps

Applying this concept to networking, the resources would be interfaces, VLANs, and so on. If the operating system of a traditional network element supports a Puppet client/agent, interesting solutions can emerge. For example, if the scale of the data center operation was small enough to fit within the scope of VLAN separation (not requiring an overlay), then extensions to Puppet can be used to configure VLANs on ports and trunks appropriate to such an architecture.

In small networks, the DevOps approach and tools such as Puppet reduce operational cost.
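The idea behind such tools is declarative: you describe the desired state and the agent works out what to change. The sketch below is not Puppet and does not use any Puppet API; it is a plain Python illustration of that reconciliation step, with made-up port names and VLAN IDs.

```python
def plan_vlan_changes(current, desired):
    """Return the (port, old_vlan, new_vlan) changes needed to reach `desired`.

    Both arguments map a port name to its access VLAN ID. Ports missing
    from `current` are reported with an old VLAN of None.
    """
    changes = []
    for port in sorted(desired):
        old_vlan = current.get(port)
        if old_vlan != desired[port]:
            changes.append((port, old_vlan, desired[port]))
    return changes


# Hypothetical device state and desired configuration.
current = {"ge-0/0/1": 10, "ge-0/0/2": 20}
desired = {"ge-0/0/1": 10, "ge-0/0/2": 30, "ge-0/0/3": 30}

changes = plan_vlan_changes(current, desired)
print(changes)

# Applying the plan and planning again yields no further changes,
# which is the idempotence that configuration management relies on.
applied = {**current, **{port: new for port, _, new in changes}}
print(plan_vlan_changes(applied, desired))
```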

Feedback and Reoptimization

Both the firewall application on the controller and in line within the firewall will have a common optimization goal: minimize the traffic sent through the firewall. This is based on the assumption that the firewall resource introduces additional hardware or operational costs that are defrayed by managing the scale of the solution. Once a specific media flow, including the amorphous ports, has been identified by the application, a feedback mechanism that puts in place a specific flow rule should pipeline this traffic to the egress port.

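As with Arista's dynamic load balancing switch fabric discussed in the previous post, feedback and reoptimization is a good approach for virtual firewalls, and probably a good principle for building systems in general.

To finish, I will close this post with the book's own conclusion: SDN Is Really About Operations and Management.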
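A toy model of that feedback loop, assuming an exact-match flow table in front of the firewall. The names `FlowTable`, `install_bypass`, and `firewall_feedback`, and the sample flows, are hypothetical; this is not OpenFlow and not any particular controller's API.

```python
from dataclasses import dataclass, field


@dataclass
class FlowTable:
    """Exact-match bypass rules with a default of 'send to the firewall'."""

    bypass: dict = field(default_factory=dict)

    def lookup(self, flow):
        # Unknown flows take the default path through the firewall.
        return self.bypass.get(flow, "firewall")

    def install_bypass(self, flow, egress_port):
        self.bypass[flow] = egress_port


def firewall_feedback(table, flow, identified_as_media, egress_port):
    """Feedback step: once a flow is identified, stop sending it to the firewall."""
    if identified_as_media:
        table.install_bypass(flow, egress_port)


# Flows are (src_ip, dst_ip, protocol, src_port, dst_port) tuples.
media_flow = ("10.0.0.5", "10.0.1.9", "udp", 40000, 40002)
other_flow = ("10.0.0.5", "10.0.1.9", "tcp", 51000, 443)

table = FlowTable()
before = table.lookup(media_flow)
firewall_feedback(table, media_flow, identified_as_media=True, egress_port="egress-1")
after = table.lookup(media_flow)
print(before, after, table.lookup(other_flow))
```

Only the identified flow is pipelined to the egress port; everything else keeps going through the firewall, which is what keeps the firewall load down.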

