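On October 8, OpenDaylight released Lithium-SR2. The release seems to have drawn little discussion in the Chinese SDN community, but as someone who started studying OpenDaylight with Lithium, I noticed substantial changes in it, especially around HA. Judging by this, ODL is moving toward its HA/scalability goals, and I expect that after SR3/SR4 (expected on March 3, 2016) and then the Beryllium release, ODL will be closer to production-ready.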

EntityOwnershipService Analysis

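Before introducing the EntityOwnershipService feature in SR2, a word on how ODL publishes its SR versions. SR stands for Service Release, and these releases are produced under ODL's Simultaneous Release process (协同版本 in my own translation). The process exists because ODL's many subprojects depend on one another, so a new release has to account for those dependencies: the technical steering committee coordinates across subprojects, sets development and bugfix plans for them, and aligns the versions that ship together. After a major ODL release there are usually four SR versions, and the span from the major release to the final SR4 is eight or nine months, so an ODL release has a fairly long life cycle. Coordinating that many projects is also no small task for the committee.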

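The SR2 Release Notes show that the Controller section lists a large number of merged changes related to the name EntityOwner. These changes are tagged as related to BUG-4105. Even though BUG-4105 is marked Resolved/Fixed, BUG-4104, which depends on it, has not been fully fixed. Both bugs concern openflowplugin in multi-controller deployments, and multiple controllers are an important mechanism in the OpenFlow spec for controller HA. Problems here matter for features such as OpenStack integration, which has to manage a large number of vSwitches: if openflowplugin cannot support multiple controllers effectively, it is hard to believe that the OpenStack and ODL integration is commercially viable. So a reminder to everyone: ODL-based OpenStack networking is still at an early stage and should not be forced into production.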

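Now let us look at the EntityOwnershipService introduced in SR2. As the comments on the interface below show, its main role is to give applications running on a cluster a mechanism for determining ownership of an entity across the cluster members. Put simply, it targets clusters made up of multiple controllers. Suppose one OF switch is connected to three controllers, each running openflowplugin (note that openflowplugin is itself an application built on MD-SAL, ODL's core distributed datastore). EntityOwnershipService can then elect one owner among the three controllers for the entity in question, the OF node identified by its dpid. The resulting master and slave roles are conveyed to the switch through OpenFlow role request messages, and when the master controller fails, a new master is chosen dynamically from the slave controllers. This addresses the problem described in BUG-4104: without master/slave support, the apps on every ODL controller that subscribe to packet-in messages all try to add flow entries, so the writes to MD-SAL conflict and may corrupt data.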

/**
 * <p>
 * The EntityOwnershipService provides the means for a component/application to request ownership for a given
 * Entity on the current cluster member. Entity ownership is always tied to a process and two components on the same
 * process cannot register a candidate for a given Entity.
 * </p>
 * <p>
 * A component/application may also register interest in the ownership status of an Entity. The listener would be
 * notified whenever the ownership status changes.
 * </p>
 */
public interface EntityOwnershipService {

    /**
     * Registers a candidate for ownership of the given entity. Only one such request can be made per entity
     * per process. If multiple requests for registering a candidate for a given entity are received in the
     * current process a CandidateAlreadyRegisteredException will be thrown.
     * <p>
     * The registration is performed asynchronously and any registered {@link EntityOwnershipListener} is
     * notified of ownership status changes for the entity.
     *
     * @param entity the entity which the Candidate wants to own
     * @return a registration object that can be used to unregister the Candidate
     * @throws org.opendaylight.controller.md.sal.common.api.clustering.CandidateAlreadyRegisteredException
     */
    EntityOwnershipCandidateRegistration registerCandidate(@Nonnull Entity entity)
            throws CandidateAlreadyRegisteredException;

    /**
     * Registers a listener that is interested in ownership changes for entities of the given entity type. The
     * listener is notified whenever its process instance is granted ownership of the entity and also whenever
     * it loses ownership. On registration the listener will be notified of all entities its process instance
     * currently owns at the time of registration.
     *
     * @param entityType the type of entities whose ownership status the Listener is interested in
     * @param listener the listener that is interested in the entities
     * @return a registration object that can be used to unregister the Listener
     */
    EntityOwnershipListenerRegistration registerListener(@Nonnull String entityType, @Nonnull EntityOwnershipListener listener);

    /**
     * Gets the current ownership state information for an entity.
     *
     * @param forEntity the entity to query.
     * @return an Optional EntityOwnershipState whose instance is present if the entity is found
     */
    Optional<EntityOwnershipState> getOwnershipState(@Nonnull Entity forEntity);
}
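
To make the contract above more concrete, here is a minimal single-process sketch of the idea: candidates register for an entity, exactly one of them is the owner, and ownership moves on when the owner withdraws. Everything in it (ToyOwnershipRegistry, OwnershipListener, the member names, and the "earliest registered candidate wins" rule) is a hypothetical illustration and not ODL code; the real service makes this decision across cluster members rather than inside one JVM.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Toy single-process model of entity ownership. NOT the ODL implementation. */
class ToyOwnershipDemo {

    /** Hypothetical callback, loosely modelled on an ownership listener. */
    interface OwnershipListener {
        void ownershipChanged(String entity, String member, boolean wasOwner, boolean isOwner);
    }

    /** Assumption for this sketch: the earliest registered candidate is the owner. */
    static class ToyOwnershipRegistry {
        private final Map<String, List<String>> candidates = new HashMap<>();
        private final List<OwnershipListener> listeners = new ArrayList<>();

        void registerListener(OwnershipListener listener) {
            listeners.add(listener);
        }

        /** Registers a candidate; a duplicate registration is rejected. */
        void registerCandidate(String entity, String member) {
            List<String> members = candidates.computeIfAbsent(entity, key -> new ArrayList<>());
            if (members.contains(member)) {
                throw new IllegalStateException(member + " is already a candidate for " + entity);
            }
            members.add(member);
            if (members.size() == 1) {
                fire(entity, member, false, true); // first candidate becomes the owner
            }
        }

        /** Withdraws a candidate; if it was the owner, the next candidate takes over. */
        void unregisterCandidate(String entity, String member) {
            List<String> members = candidates.get(entity);
            if (members == null || !members.contains(member)) {
                return;
            }
            boolean wasOwner = members.get(0).equals(member);
            members.remove(member);
            if (wasOwner) {
                fire(entity, member, true, false);
                if (!members.isEmpty()) {
                    fire(entity, members.get(0), false, true);
                }
            }
        }

        /** Returns the current owner, or empty if the entity has no candidates. */
        Optional<String> ownerOf(String entity) {
            List<String> members = candidates.get(entity);
            if (members == null || members.isEmpty()) {
                return Optional.empty();
            }
            return Optional.of(members.get(0));
        }

        private void fire(String entity, String member, boolean wasOwner, boolean isOwner) {
            for (OwnershipListener listener : listeners) {
                listener.ownershipChanged(entity, member, wasOwner, isOwner);
            }
        }
    }

    public static void main(String[] args) {
        ToyOwnershipRegistry registry = new ToyOwnershipRegistry();
        registry.registerListener((entity, member, wasOwner, isOwner) ->
                System.out.println(entity + ": " + member + " wasOwner=" + wasOwner + " isOwner=" + isOwner));
        registry.registerCandidate("openflow:1", "member-1");
        registry.registerCandidate("openflow:1", "member-2");
        registry.registerCandidate("openflow:1", "member-3");
        registry.unregisterCandidate("openflow:1", "member-1"); // simulate the owner going away
        System.out.println("owner now: " + registry.ownerOf("openflow:1").orElse("none"));
    }
}
```

In this toy run, member-1 owns openflow:1 first and member-2 takes over once member-1 withdraws, which mirrors the failover behaviour the service is meant to provide.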

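The service is currently implemented in the controller by DistributedEntityOwnershipService. The mechanism is a dedicated Shard actor (named entity-ownership in the code below). It relies on the Raft consensus protocol that MD-SAL uses to elect shard leaders and followers: the leader and followers that Raft elects for this shard among the cluster members determine the owner of an Entity. The listeners registered by apps are then used to notify the apps on the local controller instance of the outcome so that they can react accordingly.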

public class DistributedEntityOwnershipService implements EntityOwnershipService, AutoCloseable {
    private static final Logger LOG = LoggerFactory.getLogger(DistributedEntityOwnershipService.class);
    static final String ENTITY_OWNERSHIP_SHARD_NAME = "entity-ownership";
    private static final Timeout MESSAGE_TIMEOUT = new Timeout(1, TimeUnit.MINUTES);

    private final DistributedDataStore datastore;
    private final ConcurrentMap<Entity, Entity> registeredEntities = new ConcurrentHashMap<>();
    private volatile ActorRef localEntityOwnershipShard;
    private volatile DataTree localEntityOwnershipShardDataTree;

    public DistributedEntityOwnershipService(DistributedDataStore datastore) {
        this.datastore = datastore;
    }

    public void start() {
        ActorRef shardManagerActor = datastore.getActorContext().getShardManager();

        Configuration configuration = datastore.getActorContext().getConfiguration();
        Collection<String> entityOwnersMemberNames = configuration.getUniqueMemberNamesForAllShards();
        CreateShard createShard = new CreateShard(new ModuleShardConfiguration(EntityOwners.QNAME.getNamespace(),
                "entity-owners", ENTITY_OWNERSHIP_SHARD_NAME, ModuleShardStrategy.NAME, entityOwnersMemberNames),
                        newShardPropsCreator(), null);

        Future<Object> createFuture = datastore.getActorContext().executeOperationAsync(shardManagerActor,
                createShard, MESSAGE_TIMEOUT);

        createFuture.onComplete(new OnComplete<Object>() {
            @Override
            public void onComplete(Throwable failure, Object response) {
                if(failure != null) {
                    LOG.error("Failed to create {} shard", ENTITY_OWNERSHIP_SHARD_NAME);
                } else {
                    LOG.info("Successfully created {} shard", ENTITY_OWNERSHIP_SHARD_NAME);
                }
            }
        }, datastore.getActorContext().getClientDispatcher());
    }

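The interface exposed by EntityOwnershipService is not complicated. In openflowplugin, the RoleContextImpl shown below is initialized by the RoleManagerImpl object when an OpenFlow device connects. It uses the NodeId as the entity and registers itself with EntityOwnershipService as a candidate owner, after which the MD-SAL layer elects an owner among the ODL controller instances. Because RoleContextImpl registers itself as a role-change listener through OpenflowOwnershipListener, onRoleChanged is invoked whenever the status of the local instance changes. On the controller instance elected as owner, the callback sends a ROLE REQUEST message to the OpenFlow switch through openflowplugin's SalRoleService to switch to master. This gives a clean controller master/slave scheme and prevents the flow-table confusion that could arise if several masters programmed the switch at the same time.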

public RoleContextImpl(DeviceContext deviceContext, RpcProviderRegistry rpcProviderRegistry,
                       EntityOwnershipService entityOwnershipService, OpenflowOwnershipListener openflowOwnershipListener) {
    this.entityOwnershipService = entityOwnershipService;
    this.rpcProviderRegistry = rpcProviderRegistry;
    this.deviceContext = deviceContext;
    entity = new Entity(RoleManager.ENTITY_TYPE, deviceContext.getPrimaryConnectionContext().getNodeId().getValue());

    this.openflowOwnershipListener =  openflowOwnershipListener;
    salRoleService = new SalRoleServiceImpl(this, deviceContext);

    //make a call to entity ownership service and listen for notifications from the service
    requestOpenflowEntityOwnership();
}

private void requestOpenflowEntityOwnership() {

    LOG.debug("requestOpenflowEntityOwnership for entity {}", entity);
    try {
        entityOwnershipCandidateRegistration = entityOwnershipService.registerCandidate(entity);

        // The role change listener must be registered after registering a candidate
        openflowOwnershipListener.registerRoleChangeListener(this);
        LOG.info("RoleContextImpl : Candidate registered with ownership service for device :{}", deviceContext.getPrimaryConnectionContext().getNodeId().getValue());
    } catch (CandidateAlreadyRegisteredException e) {
        // We can log and move on after this error: the listener is present and role changes will still be served.
        LOG.error("Candidate - Entity already registered with Openflow candidate {}", entity, e);
    }
}

@Override
public void onRoleChanged(final OfpRole oldRole, final OfpRole newRole) {

    // called on a notification thread from MD-SAL

    LOG.debug("Role change received from ownership listener from {} to {} for device:{}", oldRole, newRole,
            deviceContext.getPrimaryConnectionContext().getNodeId());

    final SetRoleInput setRoleInput = (new SetRoleInputBuilder())
            .setControllerRole(newRole)
            .setNode(new NodeRef(deviceContext.getDeviceState().getNodeInstanceIdentifier()))
            .build();

    Future<RpcResult<SetRoleOutput>> setRoleOutputFuture = salRoleService.setRole(setRoleInput);

    Futures.addCallback(JdkFutureAdapters.listenInPoolThread(setRoleOutputFuture), new FutureCallback<RpcResult<SetRoleOutput>>() {
        @Override
        public void onSuccess(RpcResult<SetRoleOutput> setRoleOutputRpcResult) {
            LOG.debug("Rolechange {} successful made on switch :{}", newRole,
                    deviceContext.getPrimaryConnectionContext().getNodeId());
            deviceContext.getDeviceState().setRole(newRole);
            if (roleChangeCallback != null) {
                roleChangeCallback.onSuccess(true);
            }
        }

        @Override
        public void onFailure(Throwable throwable) {
            LOG.error("Error in setRole {} for device {} ", newRole,
                    deviceContext.getPrimaryConnectionContext().getNodeId(), throwable);
            if (roleChangeCallback != null) {
                roleChangeCallback.onFailure(throwable);
            }
        }
    });
}
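
The reason a single ROLE REQUEST is enough to avoid several masters lies on the switch side: under OpenFlow's role semantics, granting MASTER to one controller demotes the previous master to slave. The sketch below models only that rule. ToySwitchRoles, its method names, and the controller names are hypothetical and not openflowplugin code, and as a simplification controllers start as SLAVE here, whereas the OpenFlow default role is EQUAL.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of switch-side role handling. A sketch, not openflowplugin code. */
class ToySwitchRoles {

    /** Simplified roles; OpenFlow also defines EQUAL, which is omitted here. */
    enum Role { MASTER, SLAVE }

    private final Map<String, Role> roles = new LinkedHashMap<>();

    /** Assumption of this sketch: a newly connected controller starts as SLAVE. */
    void connect(String controller) {
        roles.put(controller, Role.SLAVE);
    }

    /** Applies a role request; granting MASTER demotes any existing master to SLAVE. */
    void roleRequest(String controller, Role requested) {
        if (!roles.containsKey(controller)) {
            throw new IllegalArgumentException("unknown controller: " + controller);
        }
        if (requested == Role.MASTER) {
            roles.replaceAll((name, role) -> role == Role.MASTER ? Role.SLAVE : role);
        }
        roles.put(controller, requested);
    }

    /** Maps an ownership change to a role request: owner -> MASTER, otherwise SLAVE. */
    void onOwnershipChanged(String controller, boolean isOwner) {
        roleRequest(controller, isOwner ? Role.MASTER : Role.SLAVE);
    }

    Role roleOf(String controller) {
        return roles.get(controller);
    }

    long masterCount() {
        return roles.values().stream().filter(role -> role == Role.MASTER).count();
    }

    public static void main(String[] args) {
        ToySwitchRoles sw = new ToySwitchRoles();
        sw.connect("odl-1");
        sw.connect("odl-2");
        sw.connect("odl-3");
        sw.onOwnershipChanged("odl-1", true);  // odl-1 is elected owner
        sw.onOwnershipChanged("odl-2", true);  // ownership moves to odl-2
        System.out.println("odl-1=" + sw.roleOf("odl-1") + ", odl-2=" + sw.roleOf("odl-2")
                + ", masters=" + sw.masterCount());
    }
}
```

Even if the old owner never manages to send its own slave request, for example because it crashed, the switch in this model ends up with at most one master.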

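EntityOwnershipService is a general-purpose service interface. Other southbound interfaces, as well as user-developed apps that need HA handling, can register their own entities to get master/slave management, which is a notable improvement over earlier ODL versions. That said, this is its first implementation and it still needs refinement. Open questions that come to mind include how to make a switchover atomic and how to offer master/slave management for a small group of members inside a large cluster. We need to keep following the community's progress on these points; those who are able to can also try trial deployments and help the community optimize and improve the service, to better advance ODL.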

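In addition, the ODL community once attempted a policy-based two-node HA scheme, which was later abandoned; those interested can refer to the "2-Node Clustering" page on the wiki. Its abandonment signals that two-node HA is no longer supported by ODL. From the perspective of the Raft protocol, which needs a majority of nodes to make progress, three nodes is the minimum configuration for a cluster deployment.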
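
The three-node minimum follows from majority arithmetic: a Raft cluster of n nodes needs n/2 + 1 of them (integer division) to agree, so it survives only as many failures as are left over. The small sketch below just evaluates that formula; the class and method names are illustrative.

```java
/** Majority arithmetic for a Raft-style cluster; names are illustrative only. */
class RaftQuorum {

    /** Smallest number of nodes that forms a majority of n. */
    static int majority(int n) {
        return n / 2 + 1;
    }

    /** Number of nodes that can fail while a majority is still reachable. */
    static int toleratedFailures(int n) {
        return n - majority(n);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " nodes: majority=" + majority(n)
                    + ", tolerated failures=" + toleratedFailures(n));
        }
    }
}
```

With two nodes the majority is two, so losing either node stops the cluster; three nodes is the smallest size that tolerates one failure.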

openflowplugin: the Helium Design and the Lithium Design

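The analysis above described how openflowplugin uses EntityOwnershipService for master/slave management. Note, however, that starting with Lithium, karaf offers two similar features, odl-openflowplugin-nsf-services-li and odl-openflowplugin-nsf-services. The difference is that the feature with the -li suffix is the Lithium redesign of openflowplugin, written to address the problems found in the Helium version. EntityOwnershipService support, for example, is implemented only in the redesigned openflowplugin.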

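So to test the new openflowplugin, be sure to install the features with the -li suffix. A gripe about ODL's documentation while I am at it: I could not find any explanation of this change in the official SR2 documentation, and the knowledge seems to live mostly in the heads of a few lead developers at a handful of large North American companies :( In this respect, ODL still needs to improve its openness and transparency.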

Outlook for Clustering Readiness

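According to the controller-dev mailing list, engineers at Ericsson Bangalore have developed a generic clustering/HA test framework for ODL. They hope to run HA integration tests against mainstream plugins such as OpenFlow and OVSDB and eventually contribute the test cases to ODL. Judging from the feedback on the list, several important issues found in testing remain unfixed. The clustering features of ODL and openflowplugin have therefore not yet been thoroughly validated, and whether they can reach a production-ready level remains to be seen.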

Side Notes

From what I have seen so far on the controller and openflowplugin mailing lists, apart from Huawei, few Chinese companies or individuals take part in the discussions.

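My main work is OpenFlow device R&D, but I have long had a strong interest in networking-related open-source communities, and I believe white-box switches and SDN have a promising future. I have therefore used my spare time to do some preliminary study of the ODL and openflowplugin implementations. The accuracy may leave room for improvement and there are bound to be mistakes. This post is meant only as a starting point; corrections and updates are welcome, so that together we can raise the participation and activity of the Chinese ODL community.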

I am a long-time lurker in the SDN技术群 and Opendaylight SDNLAB研究群 chat groups, where my name card reads 苏州盛科-张东亚. Feel free to reach me on QQ with any questions.

