Friday, 18 March 2016

VLAN calculation algorithm design

I was given the task of automating the calculation and configuration of VLANs on the switches of our regional network.

The task was naturally split into two phases:
  1. calculate VLAN numbers on the switches and trunks;
  2. synchronize switch configurations to the calculated result.
The first phase was not obvious and required a new algorithm. The second was done by another programmer.

I designed and implemented the algorithm. Key points:
  • The input data is the list of devices, the trunks, and the client port VLANs on each device. The output is the VLAN set for each trunk. The network has multiple loops for redundancy.
  • A VLAN should be included on a trunk if it is two-way on that trunk, that is, if it arrives at the trunk from both directions. So we can spread VLANs from their endpoints across the graph and then calculate the VLANs on each trunk as the intersection of the sets for the two directions.
  • We spread the VLAN set from each device as a bit vector. Each trunk has two associated VLAN sets, one for each direction.
  • If STP is used, the VLANs should be unified across each group of connected cycles (cycles that share a common trunk).
  • If ERPS is used, the VLANs should be unified starting from the outermost half-loop and moving towards the base loop.
  • VLAN propagation can be pruned when a trunk already contains all VLANs from the propagated set (an exception was necessary for multiply connected non-switch devices). This reduces the runtime significantly.
I used Perl and the Bit::Vector module. The first version did not prune VLAN propagation and took 1-2 minutes to complete on our network topology (more than 1600 devices). After I implemented pruning, it takes just 4 seconds (including database queries).
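The spreading, pruning and intersection steps can be illustrated with a minimal sketch. It is written in Python with plain integers as bit vectors, not in the Perl and Bit::Vector of the actual implementation, and all names in it (calculate_trunk_vlans, to_mask, from_mask, the sw1..sw4 devices) are hypothetical. It assumes every device forwards VLANs like a switch, and it leaves out the STP and ERPS unification steps and the exception for multiply connected non-switch devices.

```python
from collections import defaultdict


def to_mask(vlans):
    """Pack an iterable of VLAN IDs into an int used as a bit vector."""
    mask = 0
    for vlan in vlans:
        mask |= 1 << vlan
    return mask


def from_mask(mask):
    """Unpack a bit vector into a sorted list of VLAN IDs."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def calculate_trunk_vlans(trunks, client_vlans):
    """Return {(a, b): [vlan, ...]} for every trunk (a, b).

    trunks       -- list of (device_a, device_b) pairs
    client_vlans -- {device: iterable of VLAN IDs on its client ports}
    """
    neighbours = defaultdict(set)
    for a, b in trunks:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # One VLAN set per direction of each trunk: (src, dst) -> bit vector.
    directed = defaultdict(int)

    for device, vlans in client_vlans.items():
        spread = to_mask(vlans)
        pending = [(device, peer) for peer in neighbours[device]]
        while pending:
            src, dst = pending.pop()
            # Prune: this direction already carries the whole spread set.
            if (directed[(src, dst)] & spread) == spread:
                continue
            directed[(src, dst)] |= spread
            # Keep spreading, but not back over the trunk we arrived on.
            pending.extend((dst, peer) for peer in neighbours[dst] if peer != src)

    # A VLAN belongs on a trunk only if it is present in both directions.
    return {
        (a, b): from_mask(directed[(a, b)] & directed[(b, a)])
        for a, b in trunks
    }
```

For a small loop-free example with trunks sw1-sw2, sw2-sw3 and sw2-sw4, and client VLANs {10, 20} on sw1, {10} on sw3 and {20, 30} on sw4, the sketch puts VLANs 10 and 20 on sw1-sw2, VLAN 10 on sw2-sw3 and VLAN 20 on sw2-sw4. VLAN 30 has a single endpoint, so it is not placed on any trunk.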

The following article was invaluable for understanding multi-ring ERPS topologies:
D. Lee, K. Lee, S. Yoo and J. K. K. Rhee, "Efficient Ethernet Ring Mesh Network Design," in Journal of Lightwave Technology, vol. 29, no. 18, pp. 2677-2683, Sept. 15, 2011.

Tuesday, 15 March 2016

Reflections on the ancient ruins

While travelling in Greece I visited Delphi and Athens. Looking at the ruins in Delphi, I wondered why all the buildings had been destroyed, and how.


Then I suddenly noticed a bronze peg in one of the stones. What was that?
Other stones had pits in them, but no pegs. The stones of the columns were probably connected with bronze pegs, which could have been cast in place by pouring molten bronze through a small opening.

I noticed only a few such pegs, but many pits. I also noticed that many of the pegs had holes left by a wooden core. Bronze was not cheap at that time: armour and weapons were made of it. The wooden inserts could have been used to save bronze and reduce costs.

Next I thought: if bronze was costly, it was logical for looters to try to get it.

So it was probably looters who finished the destruction of the columns to get the bronze pegs. (Disclaimer: I'm not an archaeologist.)

Friday, 26 February 2016

upgrade-routeros script

I have developed the "upgrade-routeros" Perl script for a safe, client-friendly, graceful upgrade (or reboot) of RouterOS on MikroTik ASBR and BRAS routers.

It automatically solves some problems that previously required manual work.
  • BGP route updates don't propagate instantly. If we just upgraded the software by rebooting an ASBR, traffic would be blackholed or looped for several minutes. To solve this problem, the script disables the BGP peers and waits for the route changes to propagate. The router can then be rebooted safely, as traffic is no longer directed to it. The script waits 5 minutes to let the network converge.
  • PPPoE sessions should be terminated gracefully. Otherwise cheap CPEs may hang or stop reconnecting automatically. To solve this problem, the script gracefully disconnects PPPoE users before rebooting the BRAS. To prevent new PPPoE sessions, it sets max-sessions=1. To avoid disturbing the users too much, it waits until the sessions are at least 2 hours old. To avoid hammering the RADIUS server, the sessions are disconnected at a rate of one per second. The PPPoE servers are disabled after all users (but one) have been disconnected, to allow one more reboot for the firmware upgrade.
This allows a fairly graceful software upgrade, provided that the network has at least N+1 redundancy in ASBRs and BRASes.
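The PPPoE draining step can be illustrated with a minimal sketch. It is written in Python, not in the Perl of the actual script, and it does not talk to a router: get_sessions and disconnect are hypothetical callables that a real script would implement on top of the RouterOS API. The keep=1 default reflects one reading of "all users (but one)" and is an assumption.

```python
import time


def drain_pppoe_sessions(get_sessions, disconnect,
                         min_age_s=2 * 3600, interval_s=1.0,
                         keep=1, sleep=time.sleep):
    """Disconnect PPPoE sessions gently until at most `keep` remain.

    get_sessions -- hypothetical callable returning [(name, uptime_s), ...]
    disconnect   -- hypothetical callable taking a session name
    sleep        -- injectable so the loop can be tested without waiting
    """
    while True:
        sessions = get_sessions()
        if len(sessions) <= keep:
            return
        # Only touch sessions that are at least min_age_s old.
        old_enough = [name for name, uptime in sessions if uptime >= min_age_s]
        if not old_enough:
            sleep(interval_s)  # nothing is old enough yet, check again later
            continue
        # Disconnect one session per interval to spare the RADIUS server.
        for name in old_enough[:len(sessions) - keep]:
            disconnect(name)
            sleep(interval_s)
```

Passing sleep as a parameter is a design choice for testability: a test can substitute a fake clock instead of waiting for real sessions to age.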

After the upgrade, the script checks whether the upgrade was successful, upgrades the RouterBOARD firmware if needed, and then re-enables the BGP peers and/or PPPoE servers.

The script uses the MikroTik::API Perl module.