Sunday, December 9, 2012

[book review] Arista Warrior



I recently reviewed the book Arista Warrior. Naturally, the Python chapter interested me the most. Here is the output I got when I tried the book's example of modifying the 'show version' output with a few lines of Python.

***** Experiment on Arista CliPlugin *****


[user@switch CliPlugin]$ pwd
/usr/lib/python2.7/site-packages/CliPlugin
[user@switch CliPlugin]$ sudo vi VersionCli.py

def showVersion( mode, detail=None ):
   <skip>

   # Print commands (delete after)
   print "*" * 10
   print "I dont really like Pie"
   print "*" * 10

[user@switch ~]$ Cli -c "show version"
**********
I dont really like Pie
**********
Arista DCS-7504
Hardware version:    02.00
<skip>
Software image version: 4.10.3
Architecture:           i386
Internal build version: 4.10.3-937242.EOS4103
Internal build ID:      a229c9db-af32-4e62-a4f7-5711e977d968

Uptime:                 6 weeks, 1 day, 22 hours and 51 minutes
Total memory:           4100488 kB
Free memory:            1758148 kB

[user@switch ~]$

That is pretty cool. But how do I write more native-looking commands? Or how do I write an agent that can mount SysDB directly? I don't know yet, but here are the modules I plan to look into for the command part. Stay tuned:

import Tac, CliParser, BasicCli, os, Tracing, EosVersion, Ethernet


***** Review *****



Here is the review as it appeared on Amazon:

I have been working with Arista switches for a while now, and this is the manual that probably should have come with them. As the author mentions, the EOS command syntax is very similar to Cisco IOS. In fact, I have heard stories of engineers simply copying and pasting IOS configurations into EOS during a migration and having them work just fine. However, to tap into the capabilities that make Arista a game-changer, one has to get into the realms of SysDB, Python, Linux user space, etc. Anybody can type in commands, but the real challenge lies in understanding the impact and scope of what you are trying to do. This book does a good job of covering the practical stuff you can use day to day, as well as the concepts behind it.

Overall, I would recommend this as a solid investment of money and time for anybody looking into Arista switches. 

Pros:

- Real world examples. 
- Solid explanation of concepts. 
- Sense of humor for an otherwise dry subject.

Notes, suggestions, errata:

1. Maybe more coverage of the current fat-tree design with spine/leaf/core, etc. This is one area where Arista differs from its competitors, in the number of ECMP next hops, the TCAM division for host routes, etc.
2. Power draw is critically important in large-scale data centers, and Arista has some good innovation in this area with its PHY-less design.

*** Virtual Machines on Arista ***

1. If you don't have an Arista switch handy, or just want a safe environment to practice in, you can run vEOS in a VM: vEOS, https://eos.aristanetworks.com/2011/11/running-eos-in-a-vm/ by Andre Pech.
2. When you are in a pinch, you can also run another VM directly in EOS: Running Virtual Machines in EOS, https://eos.aristanetworks.com/2011/09/virtual-machines-in-eos/ by Mark Berly.

*** sFlow ***

The sFlow chapter probably warrants more coverage. sFlow is an important telemetry tool that offers a lot of information and is the right direction going forward, IMO. It pushes telemetry to a collector instead of waiting to be polled the way SNMP is, which scales better.

sFlow also matters for data center billing. If you are, say, Yahoo and run one of the biggest Hadoop clusters, you would want to know who your top talkers are so you can bill them for the network overhead accordingly. This is typically done with NetFlow exporting to a collector (more on that in a bit), but if you have a network of Arista switches and the traffic does not cross the core, sFlow counters are currently your best bet.
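To make the top-talker idea concrete, here is a rough Python sketch of what a collector might do with packet samples. The record layout (source IP, packet size, sampling rate) and the function name top_talkers are my own assumptions for illustration, not something from the book or from any particular collector. The sample data is made up.

```python
from collections import Counter


def top_talkers(samples, n=3):
    """Estimate bytes sent per source from packet samples.

    Each sample is (src_ip, packet_bytes, sampling_rate). A packet sampled
    at 1-in-N is assumed to stand in for roughly N packets of that size.
    """
    estimated = Counter()
    for src_ip, packet_bytes, sampling_rate in samples:
        estimated[src_ip] += packet_bytes * sampling_rate
    return estimated.most_common(n)


# Made-up samples, all taken at a hypothetical 1-in-1000 sampling rate.
samples = [
    ("10.0.0.1", 1500, 1000),
    ("10.0.0.2", 64, 1000),
    ("10.0.0.1", 1500, 1000),
    ("10.0.0.3", 900, 1000),
]

for ip, est_bytes in top_talkers(samples):
    print(ip, est_bytes)
```

The estimates are only as good as the number of samples behind them, so a real billing pipeline would want to aggregate over a long enough window.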

Because NetFlow aggregates flows in the onboard flow cache 'before' exporting to the collector, it often falls down under even moderate amounts of data center traffic. You are forced to back off the flow sampling rate, which increases the error delta. sFlow, on the other hand, just samples and pushes all the intelligence into the collector.
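The 'error delta' can be put in rough numbers. A commonly cited approximation from the sFlow packet-sampling material is that the 95% confidence error, in percent, is at most 196 * sqrt(1/c), where c is the number of samples seen for the traffic class you care about. The sketch below only evaluates that formula. The function names are mine, and it assumes the approximation holds (random sampling, and the class is a small fraction of total traffic).

```python
import math


def scaled_estimate(class_samples, sampling_rate):
    """Scale a sample count up to an estimated packet count."""
    return class_samples * sampling_rate


def percent_error_bound(class_samples):
    """Approximate 95% confidence error, in percent, for c samples."""
    if class_samples <= 0:
        raise ValueError("need at least one sample")
    return 196.0 * math.sqrt(1.0 / class_samples)


# More samples in the class means a tighter bound on the estimate.
for c in (100, 1000, 10000):
    print(c, "samples ->", round(percent_error_bound(c), 2), "% error bound")
```

The takeaway is that accuracy depends on how many samples you collect for a class, so sampling less aggressively (fewer samples) widens the error.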

The author hints at this, but here is an early peek at troubleshooting data plane traffic with sflowtool:

1. Run the open source sflowtool directly on Arista switches to troubleshoot data plane traffic that does not cross the CPU: https://eos.aristanetworks.com/wiki/index.php/EOSTroubleshooting:Local_sFlow_Collection_and_Analysis
arista#bash sudo /mnt/flash/sflowtool -t | tcpdump -r - -vv

*** Python ***

Python should have more coverage in the book, as that is what the Arista CLI is built on. Even some pointers for motivated network engineers on which modules to look into, and where the files live, would be helpful.

1. I have asked when Python 3 will be included in EOS; the best guess is when Fedora makes Python 3 the default.

*** Random Notes about the book ***

1. I think 'sh run all interface' was in 4.7.x, then for some reason went away in 4.8.x, then came back after 4.9.x.  
2. I wish the book covered more of the SysDB mount points that Mark Berly points out on EOS Central.
3. IPv6 chapter in the works with 4.10.x code? 
4. Nice tip about generating traffic with 'ping -s 15000 -c 10000 10.10.10.15 > /dev/null &'. I have done that before but couldn't see the traffic right away and killed it.
5. Why wouldn't cron work on Arista (chapter 23)?
6. Nice tips: I didn't know that tcpdump can be executed directly from EOS, that files outside a select few locations do not survive reloads, how to send email, etc.
7. ZTP chapter: typo, should be EOS 4.7 and later, not 3.7.
8. ZTP chapter: instead of identifying switches by MAC address, identify them via the DHCP relay agent or by their place in the topology (show lldp) with a script. MAC addresses change because of RMAs and are prone to typos. Also, manually mucking with the DHCP config file does not scale.
9. Event-Handler chapter: More event-handler triggers are indeed needed in EOS for the feature to be more useful.
10. Event-Handler chapter: Just like in a regular bash script, you can 'daemonize' and chain the commands with ';'.
22. Event-Handler chapter: There is at least one bug in event-handler in 4.8.3 where configuring 'on boot-up' triggers the event-handler right away. Be careful if the startup script includes anything that is production impacting.
23. I like the 'advanced usage of sqlite' part a lot; it gives me some ideas for using sqlite for other features as well. Maybe show the Python integration with sqlite for scripting purposes?
27. I like that the author pointed out the difference between the default flash: location and having to specify the full Unix path via file:. I wish I had known this; it would've saved me some time copying stuff from /var/log -> /mnt/flash -> transfer.
28. CloudVision: I wouldn't recommend the use of XMPP in production either. Use the upcoming JSON API instead.
29. Page 360, pretty sure that 'spline' is a typo for 'spine'. 
30. Here is a talk by Andy Bechtolsheim at NANOG 55 that helps explain Arista's vision: http://www.nanog.org/meetings/nanog55/abstracts.php?pt=MTk0MSZuYW5vZzU1&nm=nanog55
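On the sqlite note above: since Python ships with the sqlite3 module in its standard library, a script could keep its own small database. Here is a minimal sketch of what I have in mind. The table layout, the function names, and the counter values are all made up for illustration; this is not the schema the book or EOS uses.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) a small database for interface counter snapshots."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS counters ("
        " ts INTEGER, interface TEXT, in_octets INTEGER)"
    )
    return conn


def record(conn, ts, interface, in_octets):
    """Store one counter snapshot, using bound parameters."""
    conn.execute(
        "INSERT INTO counters VALUES (?, ?, ?)", (ts, interface, in_octets)
    )
    conn.commit()


def busiest(conn, limit=1):
    """Return the interfaces with the largest counter growth."""
    rows = conn.execute(
        "SELECT interface, MAX(in_octets) - MIN(in_octets) AS delta"
        " FROM counters GROUP BY interface ORDER BY delta DESC LIMIT ?",
        (limit,),
    )
    return rows.fetchall()


# Hypothetical snapshots taken 60 seconds apart.
conn = open_db()
record(conn, 0, "Ethernet1", 1000)
record(conn, 60, "Ethernet1", 5000)
record(conn, 0, "Ethernet2", 2000)
record(conn, 60, "Ethernet2", 2500)
print(busiest(conn))
```

Passing a file path instead of ":memory:" would persist the data, keeping in mind the earlier note that only a few locations survive a reload.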

*** Commands that I wish the book included ***

1. Favorite command: 'show interface counters rates | nz'
2. switch(s1)#sh logging last ?
  <1-9999>  Number of time units (sec|min|hr|day)


