
Cisco Live Runs on FlexPod

NetApp and Cisco have a long and well-regarded partnership, with the joint FlexPod offering being the best known and most heavily marketed. The collaboration between the companies often extends in less well-advertised but no less interesting ways. One that has been a personal highlight for me is NetApp providing the storage for the infrastructure that runs the Network Operations Center (NOC) for five of the last Cisco Live events in the US and Europe. This includes acting as a member of the NOC team both prior to the show and during the event: NetApp personnel arrive with Cisco staff the week before the show begins to set up the environment and ensure that everything runs smoothly and non-disruptively for the attendees.

The core infrastructure – composed of FlexPods, as we leverage Cisco Nexus switches and UCS servers in conjunction with our NetApp FAS storage – has been relatively small: fewer than 20 servers and less than 50TB of provisioned storage. From a sheer numbers perspective, the majority of the equipment managed by the NOC team is at the edge: 500+ switches and 600-900 wireless access points. (Any and all numbers vary by year and by location. YMMV.) What is common to all of this infrastructure: it must be able to be stood up quickly once on site, it must perform well (as the large number of attendees do their best to test the limits of the environment – whether accidentally or deliberately), and, most importantly, it must be highly reliable and cannot go down.


When we started it was with classic 7-mode systems: a mid-range FAS3200 series HA pair with several shelves of SAS drives for production on-site at the event, and a secondary FAS2200 series HA pair for DR and co-located services. Both systems worked well supporting the virtual infrastructure powering the event.

 


In 2014 we upgraded the production hardware to a FAS8000 series running clustered Data ONTAP along with some new disk shelves. Flash Cache was also included to assist with things like VDI – that year the NOC provided virtual desktops for many of the labs that were being performed at the show. The system continued to work well, with zero downtime or performance issues, and provided significant storage efficiencies. We had so much extra space due to NetApp dedupe, thin provisioning, etc. that we even mirrored most data locally between the controllers to provide yet another level of redundancy (belts, suspenders, and safety pins).

CLUS2015_NOC_capacity

 


 

Now we’ve upgraded again: starting with this week’s Cisco Live Europe show in Berlin, the Cisco Live NOC runs on an AFF MetroCluster!

What’s AFF?  AFF stands for “All-Flash FAS” – this is the flash-only version of NetApp’s storage controllers that run clustered Data ONTAP: specifically optimized for low-latency flash performance. While sharing the same OS with our traditional FAS storage arrays enables customers to get all of the benefits of our rich family of integrated data management services, there are now software optimizations for flash that are only enabled in the AFF series, and those optimizations are already showing significant improvements across minor version releases (8.3.0 -> 8.3.1 -> 8.3.2).

Why AFF? … why not? During last year’s Cisco Live US we found that the IO load on the existing back-end disks was approaching the point at which contention and undesirable latency would start to be introduced. While the controllers themselves could deliver more performance, we would have needed to add more disk shelves in order to provide significantly more IOPS. Because we were not capacity bound, it made much more sense to instead replace the SAS drives with SSDs for the best performance possible and the most room for growth (in IO). We could have kept the existing FAS controllers to use with new SSDs – many of our FAS customers have been using hybrid or all-SSD configurations for years – but there was no good reason not to also take advantage of the performance improvements specific to the AFF line of controllers.

What’s MetroCluster? It’s an implementation of NetApp’s FAS (or AFF) storage controllers that provides high availability and disaster recovery across physical sites with zero data loss (zero RPO – recovery point objective) and minimal downtime (low to near-zero RTO – recovery time objective). In order to achieve zero data loss, of course, you must be performing synchronous writes to two different sets of physical media, and for disaster recovery those sets must be in different physical locations. Because the speed of light is a real limit, in order to perform synchronous writes those two locations need to be relatively near each other so that the round-trip latencies are acceptable (the controller can’t acknowledge a write operation back to the host until that write is committed at the remote site, not just the local site). With a maximum supported distance of 200km (for now) you get a cluster that can operate across a “metropolitan” area. Customers have been using MetroCluster to protect their most mission-critical data in this fashion for 10 years now.
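To put a rough number on that speed-of-light limit, here is a minimal back-of-the-envelope sketch. It assumes light in optical fiber travels at about two-thirds of its vacuum speed (a common rule of thumb, not a NetApp figure) and it ignores switching, protocol, and media commit overhead, so it is only a lower bound. The function name `min_round_trip_ms` is purely illustrative.

```python
# Rough lower bound on the latency that distance adds to a synchronous write.
# Assumption: light in optical fiber travels at roughly two-thirds of c
# (about 200,000 km/s); real links add switching and protocol overhead.

SPEED_OF_LIGHT_KM_PER_S = 299_792.458   # speed of light in a vacuum
FIBER_VELOCITY_FACTOR = 2 / 3           # assumed; depends on the fiber


def min_round_trip_ms(distance_km: float) -> float:
    """Return the minimum round-trip propagation delay in milliseconds."""
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    speed_in_fiber = SPEED_OF_LIGHT_KM_PER_S * FIBER_VELOCITY_FACTOR
    one_way_s = distance_km / speed_in_fiber
    return 2 * one_way_s * 1000.0


if __name__ == "__main__":
    # Compare a short inter-building run with the 200km supported maximum.
    for km in (1, 50, 200):
        print(f"{km:>4} km: at least {min_round_trip_ms(km):.3f} ms per round trip")
```

Under these assumptions the 200km maximum costs roughly 2 ms of propagation delay per round trip before any other overhead, while two buildings on the same site add only a few microseconds.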

So why MetroCluster? As I noted above, we had been replicating most of the Cisco Live data locally for an extra level of protection anyway, but, more importantly, for Cisco Live Europe a different need arose: active/active storage across two physical locations. At prior shows, the completely redundant FlexPod environments (as shown in the diagram above) had been located close to each other. For the 2016 show the goal was to take advantage of the building layouts at the new location (City Cube in Berlin) to provide even more redundancy by placing half of the infrastructure in each of two different buildings (one FlexPod per building). Very early in these planning stages it became obvious that using an AFF MetroCluster for Cisco Live was simply the right thing to do.


 

We’re now a few days into Cisco Live Europe 2016, and things are going well. On Friday we’ll be having the traditional NOC panel during the last session slot of the show where we’ll discuss the build-out, how the entire infrastructure (wired, wireless, WAN, datacenter, etc.) has performed, lessons learned, and any interesting statistics.  I’ll also post a follow-up blog about my experiences at the show.

For now, here’s a pic of one of the FlexPods (one half of the core datacenter infrastructure) as we were getting it plugged in on the first day. This was before it was powered on – hence the lack of blinkenlights.

NOC_FlexPod



Cisco Champions 2016: NetApp Honorees

On Friday, January 29th, Cisco welcomed this year’s honorees for the Cisco Champions 2016 program. While the complete list of award winners has not yet been published, I’m proud to be able to say I’ve been chosen as a Champion for the second year. And yes, even prouder to see other NetApp/Solidfire employees and “extended family” on the list:

  • Chris Reno (@thechrisreno), National Pre-Sales Engineer at ePlus, Inc
  • Dave Cain (@thedavecain), TME for Converged Infrastructures at NetApp
  • Henry Vail, Senior Architect for Converged Infrastructures at NetApp
  • Jarett Kulm (@JK47theweapon and jk-47.com), Principal Technologist at HA Storage Systems and NetApp A-Team member
  • Melissa Palmer (@vmiss33 and vmiss.net), TME for Converged Infrastructures at NetApp
  • Pete Ybarra (@CertiPete), Field Technical Consultant at Avnet and NetApp A-Team member
  • Shawn Lieu (@ShawnLieu), Solutions Architect at Veeam and NetApp A-Team member

If there’s anyone that I’ve missed in the above list, please let me know and I’ll be happy to update & make sure that you’re included.

While a much younger program than the VMware vExpert one, the team at Cisco have done a fantastic job of ramping up quickly and truly building a thriving and interactive community. All the success of the program is due to the hard work, passion, and openness of both the program’s current leaders, Lauren Friedman (@Lauren) and Brandon Prebynski (@Prebynski), and its former stewards, Amy Lewis (@CommsNinja – now Director of Marketing for Solidfire at NetApp) and Rachel Bakker (@RBakker).


Tech Smorgasbord #6

An on-going reference series for interesting technology or projects which deserve further investigation, or for technical documentation (of one media format or another) that looks to be especially good reference material.


There’s been so much good material coming out of late that I’m going to need to put together several of these smorgasbords just to catch up. Here’s the first batch of things I think you’ll find interesting:


Automatic for the People

If you’re into network automation, you might be following the work of Kirk Byers (@kirkbyers). Kirk has been focusing for a while now on various tools and methods for automating network devices, such as Ansible, Paramiko, and particularly Python. His Python for Network Engineers is a good reference, and he routinely teaches classes on that subject – including free-by-email classes, the next of which starts in April. He recently blogged about using NAPALM – Network Automation and Programmability Abstraction Layer with Multivendor support – in conjunction with Ansible to automate IOS:

NAPALM, Ansible, and Cisco IOS

Another automation project, also utilizing Python and Ansible but originating from VMware, is Chaperone. The new toolkit is targeted at VMware’s SDDC products including vSphere, vCenter, vRealize Automation, vRealize Orchestrator, vRealize Operations, NSX, etc.


Virtually Anything

DoubleCloud Inc., founded by Steve Jin (@sjin2008), has announced a new “Super vCenter” product called DoubleCloud vSearch that looks pretty interesting: Google-style search and big data analytics for VMware environments, delivered as a single OVA and leveraging a simple HTML5 web UI.

You may also recall his DoubleCloud Interactive Cloud Environment (ICE) product that was launched last year to provide a single console for both CLI & GUI management of vCenter/ESXi environments (and the guests that run in those environments). Both vSearch and ICE are available as 60-day demo downloads, and ICE has a permanently free edition as well.

Keith Tenzer (@keithtenzer) has a really good blog covering Red Hat’s virtualization-related technologies such as Red Hat Enterprise Virtualization and OpenStack. His most recent post is a nice write-up on Red Hat Enterprise Virtualization (RHEV) – Management Options.


NetApp News

Stefan Renner (@rennerstefan) has been publishing a number of interesting blog posts of late, with these two covering SnapMirror and Storage Virtual Machine (SVM) DR being of particular note.

How to create mirror-vault and version flexible SnapMirror relationship in CDOT 8.3

How to setup a SVM DR in CDOT 8.3.1 including all configuration and data

NetApp’s very own Andrew Sullivan (@andrew_ntap), co-host of the Tech ONTAP Podcast, has been very productive. He’s churned out a number of great scripting- or automation-focused blog posts (including the first two below and more in the Docker section), as well as co-writing this recent technical report on SDS from a NetApp/VMware perspective.

cDOT Environment Monitoring Using PowerShell

NetApp PowerShell Toolkit – Templates

TR-4308: Software-Defined Storage with NetApp and VMware

Ed Morgan (@mo6020) has written a handy little post on automating the NetApp simulator using Vagrant:

Using Vagrant to provision the Clustered Data ONTAP vSim


Docker Delights

Mr. Sullivan at work again – this time wearing his Containers Cap with a couple of excellent posts on running some NetApp tools inside of Docker:

Putting the NetApp Manageability SDK Into Docker Containers

Perfstat in a Docker Container

Another NetAppian, Jacint Juhaz (@jac1nt), has a nice compendium post around using Docker Swarm on AWS with Cloud ONTAP for persistent data.


Miscellanea

Microsoft acquires SwiftKey

SwiftKey has been a must-have on all of my Android devices for years now. It’ll be interesting to see what happens after this acquisition – trepidation abounds.

Udacity is now offering an Advanced level Deep Learning course developed by Google that’s free for anyone to take so long as they’re willing to put in some time: participants are expected to take approximately 3 months when working about 6 hrs/week. It’s part of Udacity’s Machine Learning Engineer Nanodegree program, which is not free overall but – at $199/month for an expected 10-12 months’ worth of work – is still pretty affordable, particularly since they promise a 50% refund if you complete & graduate within 12 months.



VMware vExpert 2016: NetApp Honorees

Last Friday VMware released the official list of the honorees for the VMware vExpert 2016 program. I’m proud to have been chosen for this award for the third year, and even prouder to see how many other NetApp employees, including our new Solidfire brethren, and “extended family” are on the list:

  • Chris Gebhardt (@chrisgeb), vTME and Dr. Desktop, Lord of EUC at NetApp
  • Henry Vail, Senior Architect for Converged Infrastructures at NetApp
  • Joel Kaufman (@thejoelk), TME Director for manageability at NetApp
  • Kyle Murley (@kylemurley), Systems Engineer for Solidfire at NetApp
  • Melissa Palmer (@vmiss33 and vmiss.net), TME for Converged Infrastructures at NetApp
  • Shawn Lieu (@ShawnLieu), Solutions Architect at Veeam and NetApp A-Team member

If there’s anyone that I’ve missed in the above list, please let me know and I’ll be happy to update & make sure that you’re included.


Tech Smorgasbord #5

An on-going reference series for interesting technology or projects which deserve further investigation, or for technical documentation (of one media format or another) that looks to be especially good reference material.


Free tech ebooks

Let’s start with something everybody loves – freebies! The New Stack has launched a new series of books on Docker and they’re giving them away. The first book is out now with four more books planned to be released over the next six months:

  1. Book 1: The Docker & Container Ecosystem
  2. Book 2: Applications & Microservices with Docker & Containers (coming in January)
  3. Book 3: Automation & Orchestration with Docker & Containers (coming in March)
  4. Book 4: Networking, Security & Storage with Docker & Containers (coming in May)
  5. Book 5: Monitoring & Management with Docker & Containers (coming in June)

http://thenewstack.io/ebookseries/


SDN under Ravello

Ravello Systems has some truly great tech enabling nested virtualization in the cloud, and many people have jumped on the bandwagon of running some – or in some cases all – of their home labs using Ravello rather than on their own equipment. It helps, of course, that Ravello have a very active presence in the VMware and OpenStack communities, provide free trials of their product, and even offer free accounts to VMware vExperts. Thanks to this, we’ve seen an explosion of blogs detailing how to run various software using Ravello’s Smart Labs – even software-defined networking (SDN) technology.

NSX

Thomas Beaumont (@tleej) has a great series on running VMware’s NSX under Ravello – which led to him being chosen as one of the three winners in Ravello’s recent blog writing contest.

http://nsx.world/nsx-on-aws-part-1/

http://nsx.world/nsx-on-aws-part-2/

http://nsx.world/nsx-on-aws-part-3/

Cumulus Networks

If you’d rather play with Cumulus Linux instead, Christian Elsen (@ChristianElsen) has you covered with a great post on getting it working with Ravello:

https://www.edge-cloud.net/2015/08/building-a-cumulus-networks-vx-cloud-lab-with-ravello-systems


Network automation

Speaking of networking, O’Reilly has just published an Early Release edition of the upcoming Network Programmability and Automation book by Jason Edelman (@jedelman8), Scott Lowe (@scott_lowe), and Matt Oswalt (@Mierdin). With this authorial lineup the book is practically guaranteed to be a must-read for those inclined towards either networking or automation.

In the meantime, you can check out a couple of recent blog posts by Jason on the same subject:

OpenConfig, Data Models, and APIs

Network Automation with Ansible – Dynamically Configuring Interface Descriptions


Clustering with Red Hat Enterprise Linux 7

UnixArena (@UnixArena) has a highly detailed 8-part (so far, at least) series covering clustering under RHEL7 with Pacemaker. Pacemaker is one of the critical software components providing cluster high availability for both RHEL and OpenStack.

  1. http://www.unixarena.com/2015/12/compare-redhat-cluster-releases-rhel-7-ha-vs-rhel-6-ha.html
  2. http://www.unixarena.com/2015/12/rhel-7-redhat-cluster-with-pacemaker-overview.html
  3. http://www.unixarena.com/2015/12/rhel-7-installing-redhat-cluster-software-corosync-pacemaker.html
  4. http://www.unixarena.com/2015/12/rhel-7-configuring-pacemaker-corosync-redhat-cluster-part-4.html
  5. http://www.unixarena.com/2015/12/rhel-7-pacemaker-cluster-resource-agents-overview.html
  6. http://www.unixarena.com/2015/12/rhel-7-pacemaker-cluster-resource-group-management.html
  7. http://www.unixarena.com/2015/12/rhel-7-pacemaker-configuring-ha-kvm-guest.html
  8. http://www.unixarena.com/2016/01/rhel-7-pacemaker-cluster-node-management.html

Mac OS X Hypervisor Framework

With the release of Mac OS X 10.10 (Yosemite), Apple added an intriguing new feature to the operating system with very little fanfare. The release notes only offered this brief paragraph:

Hypervisor (Hypervisor.framework). The Hypervisor framework allows virtualization vendors to build virtualization solutions on top of OS X without needing to deploy third-party kernel extensions (KEXTs). Included is a lightweight hypervisor that enables virtualization of the host CPUs.

Since then, there hasn’t been a lot of further discussion on the topic, either – except for the fine folks at pagetable.com. First there was a fascinating article in January of last year on using the framework to run a DOS emulator (hvdos), and then in June came the announcement of xhyve, a port of FreeBSD’s bhyve hypervisor.

(Interesting aside: bhyve was initially developed and open-sourced by NetApp back in 2011, and you can find more information, including numerous conference presentations and recordings on the FreeBSD site.)

And now Veertu Labs has launched their new virtualization product for the Mac based on Apple’s hypervisor framework. Maish Saidel-Keesing (@maishk) has a good write up here:

http://technodrone.blogspot.com/2016/01/native-mac-osx-virtualization-with.html

I haven’t played with it yet myself, but I’m looking forward to giving it a spin, while still keeping an eye on xhyve’s future.


All CLI all the time

If you’ve perused many of my prior posts, you’ll know that I enjoy using the CLI quite a bit – whether it’s for the operating system, an application, or an infrastructure device, textual interfaces just seem more fun and (usually) more efficient to me. Sadly, despite the UNIX power of Mac OS X, its rich CLI is often overlooked, so it was a nice surprise to stumble across Herb Bischoff’s Awesome OS X Command Line. It’s by no means exhaustive, but it captures quite a few little tips, tricks, and hints of which I wasn’t previously aware.

I also came across a nice study guide for PowerCLI put together by Christophe Calvet which includes a good conceptual introduction and links to a number of additional resources for both PowerCLI and PowerShell.


Attack Methods for Gaining Domain Admin Rights in Active Directory

Earlier in my IT career I spent a large amount of time on the job dealing with security issues: physical security systems, firewalls, operating system hardening, corporate security policies, etc. While it’s been a few years since I’ve had any real security responsibilities, infosec remains an area of significant interest to me. This article by Sean Metcalf (@PyroTek3) is a nicely detailed examination of some of the common vulnerabilities in Microsoft’s Active Directory today and how to mitigate them. Its many references and backing sources provide a treasure trove of related reading.

https://adsecurity.org/?p=2362