Management Process Network Traffic Scaling

Overview of Management Traffic Scalability
Management solutions and applications available on the market today tend to inject, or in fact flood, networks and network devices with many SNMP MIB object requests through SNMP Get-Next and Get-Bulk operations. Such flooding of management information has proven very costly: it has been a major cause of failures in network devices (core devices included) and of congestion on major network links.
Another important observation, largely overlooked by administrators and MIS managers, is that management traffic is the lowest-priority traffic processed within the network. Management traffic is therefore the first to be dropped whenever a link or a device is overloaded, possibly by the very traffic that is supposed to manage the network devices and resources! Amid all of this, the priority (derived from importance and value) of one kind of management traffic over another is never addressed. Taken a layer deeper, the question is which management traffic carries more weight for the administrator or the organization than other management traffic, and at which time or on which network resource, service, and/or set of events.
These are open questions, but they all converge on one issue: the scalability of network management traffic. It is very important to control the type and the amount of traffic used to manage network resources, services, and/or sets of events without exhausting or compromising those very resources and services, or mismanaging the events detected.
An example of such MIB objects is the RFC1213 ifAdminStatus object. Many management solutions and applications poll this object today; a typical report then displays the administrative status of ALL the interfaces polled, and there could be tens of them on each managed element.
Any NMS application available on the market today should test the value of ifAdminStatus to verify that it is in the ‘UP’ state before reporting any other data for the corresponding interface. It is logical to manage ONLY an interface that the human administrator has set to the ‘UP’ state (or at least interfaces NOT set to the ‘DOWN’ state) and so avoid wasting management infrastructure resources.
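The check described above can be sketched in Python. This is a minimal illustration, not SmartMIB code: the `Interface` record and the `interfaces_worth_polling` helper are hypothetical names, and the interface data is hard-coded rather than fetched over SNMP. The integer values follow the RFC1213 definition of ifAdminStatus (1 = up, 2 = down, 3 = testing).

```python
from dataclasses import dataclass

# ifAdminStatus values as defined in RFC1213.
ADMIN_UP, ADMIN_DOWN, ADMIN_TESTING = 1, 2, 3


@dataclass
class Interface:
    """Hypothetical in-memory record of one ifTable row."""
    if_index: int
    if_descr: str
    if_admin_status: int


def interfaces_worth_polling(interfaces, include_testing=False):
    """Return only the interfaces the administrator has not set to 'down'.

    With include_testing=False only 'up' interfaces are kept; with
    include_testing=True every interface NOT set to 'down' is kept.
    """
    allowed = {ADMIN_UP, ADMIN_TESTING} if include_testing else {ADMIN_UP}
    return [i for i in interfaces if i.if_admin_status in allowed]


# Hard-coded sample data standing in for a polled ifTable.
sample = [
    Interface(1, "eth0", ADMIN_UP),
    Interface(2, "eth1", ADMIN_DOWN),
    Interface(3, "eth2", ADMIN_TESTING),
]
to_poll = interfaces_worth_polling(sample)
print([i.if_index for i in to_poll])  # [1]
```

Only the interfaces returned by the filter would then be polled for counters, speed, and so on; the rest generate no further management traffic.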
The SISL Engine and its commands are exclusively network traffic-centric. This gives the administrator or the management script designer greater freedom and control over the amount and nature of the traffic injected into the network or directed at certain managed elements rather than others. It also allows the designer to poll MIB objects selectively and to estimate how much traffic each running management script imposes on the overall network and on the individual managed elements. This control yields finer granularity in process design and implementation when dealing with the traffic volumes generated by polling those MIB objects, and more control over the flow of the management process itself and how it is structured.
Management Traffic Scalability Case Analysis
When estimating device support for any NMS product on the market today, NMS vendors declare (in fine print) that their products have been tested on networks of 1,000 or 2,000 devices. Sometimes they say: "WELL... DEPENDING ON YOUR NETWORK CONDITIONS, OUR PRODUCT CAN SUPPORT UP TO 5,000 DEVICES." No vendor today can say for sure exactly how many devices a particular NMS product can support, because even if they know YOUR network very well, they will NEVER know the changes in conditions that a network goes through from day to day, or even from one hour to the next! In fact, they would need network management statistics to evaluate that in the first place before declaring the number of devices supported, and herein lies the irony!
Now, if we assume support for 10,000 devices, this is surely an uneducated guess on our part. But so are all claims from other vendors! The fundamental difference between SOSL and other NMS tools and applications lies in how SNMP Bulk-Requests and Get-Next requests are used. To give an example:
You have a network of 100 devices, each with 50 interfaces in total, of which 10 per device are of the "ethernetCsmacd" type. Your management policy is to calculate the utilization of each "ethernetCsmacd" interface. For a reasonably accurate calculation you need to collect, at a minimum, the ifInOctets and ifOutOctets counters, ifSpeed, and sysUpTime.
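One common way to turn those objects into a utilization figure is sketched below. The formula is the widely used half-duplex form (bits transferred in both directions divided by the bits the link could have carried in the interval); it is an assumption here and not necessarily the formula used in the Case Studies. `Sample`, `counter_delta`, and `utilization_percent` are hypothetical names. sysUpTime is expressed in hundredths of a second (TimeTicks), and the octet counters are treated as 32-bit counters that may wrap at most once between samples.

```python
from dataclasses import dataclass
from typing import Optional

COUNTER32_MODULUS = 2 ** 32  # ifInOctets/ifOutOctets are Counter32 in RFC1213


@dataclass
class Sample:
    """One poll of a single interface, bundled with sysUpTime."""
    sys_up_time: int    # TimeTicks: hundredths of a second
    if_in_octets: int
    if_out_octets: int


def counter_delta(old: int, new: int) -> int:
    """Difference between two Counter32 readings, allowing one wrap."""
    return (new - old) % COUNTER32_MODULUS


def utilization_percent(first: Sample, second: Sample, if_speed: int) -> Optional[float]:
    """Half-duplex utilization in percent, or None if it cannot be computed.

    None is returned when sysUpTime did not advance (the device has probably
    reloaded, so the interfaces should be learnt again) or when ifSpeed is zero.
    """
    ticks = second.sys_up_time - first.sys_up_time
    if ticks <= 0 or if_speed <= 0:
        return None
    seconds = ticks / 100.0
    octets = (counter_delta(first.if_in_octets, second.if_in_octets)
              + counter_delta(first.if_out_octets, second.if_out_octets))
    return (octets * 8 * 100.0) / (seconds * if_speed)


# Two samples taken 60 seconds apart on a 10 Mbit/s interface (made-up values).
a = Sample(sys_up_time=100_000, if_in_octets=1_000_000, if_out_octets=2_000_000)
b = Sample(sys_up_time=106_000, if_in_octets=16_000_000, if_out_octets=17_000_000)
print(utilization_percent(a, b, if_speed=10_000_000))
```

Taking the interval from the sysUpTime values returned with the counters, rather than from the clock of the polling station, keeps network and scheduling delays out of the denominator.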
A typical NMS application must either poll the whole ifTable, which includes all 22 standard columns defined in RFC1213, to learn the interfaces and determine at the NMS station which of them are of the "ethernetCsmacd" type, or else poll ifInOctets, ifOutOctets, ifSpeed, and ifType on all interfaces. The latter inserts:
100 * ((4 MIB objects * 50 interfaces) * 2) = 40,000 SNMP messages
This must be repeated to obtain the second set of counter values needed for the calculation, so the total for this calculation is 80,000 SNMP messages.
By contrast, a SOSL-based management script that first learns the interfaces, say by interface description (polling the ifDescr MIB object to find all interfaces that are Ethernet types), inserts:
100 * (50 interfaces * 2) = 10,000 SNMP messages.
A SISL-based management script can then be programmed to check which interface indices are of an Ethernet type. (We could perform another POLL operation on the ifType object, which costs another 10,000 messages, but this is unnecessary because the indices could equally be learnt from ifType directly.) Once the SOSL Engine knows which interfaces are of the "ethernetCsmacd" type, it can bundle all remaining objects (plus sysUpTime) for each device into a single POLL operation. This inserts:
100 * (10 interfaces * 2) = 2,000 SNMP messages.
For the second run we need ONLY repeat the 2,000 SNMP messages above, testing the sysUpTime value to check whether the device has reloaded before learning the interfaces again!
Hence the SOSL Engine consumes a total of only 14,000 SNMP messages.
Hence SISL uses 66,000 fewer messages than a typical NMS application to perform the same set of calculations: its 14,000 messages are 17.5% of the 80,000 total, a reduction of 82.5%! Consider, too, the savings in resources on the NMS host server and on the managed devices being polled, not to mention the accuracy gained by bundling sysUpTime into the same SNMP requests to calculate the time interval in the utilization formula (please check our Solution Center’s Case Studies).
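The arithmetic of this case can be checked with a few lines of Python. The function names are hypothetical, and the factor of 2 per polled instance is taken as given from the figures above (it is assumed here to count one request and one response).

```python
MESSAGES_PER_INSTANCE = 2  # factor taken from the text; assumed request + response


def typical_nms_messages(devices, interfaces, objects, runs=2):
    """Every object polled on every interface of every device, on each run."""
    per_run = devices * objects * interfaces * MESSAGES_PER_INSTANCE
    return per_run * runs


def filtered_script_messages(devices, interfaces, matching, runs=2):
    """One learning pass over all interfaces, then bundled polls of matches only."""
    learn = devices * interfaces * MESSAGES_PER_INSTANCE
    per_run = devices * matching * MESSAGES_PER_INSTANCE
    return learn + per_run * runs


typical = typical_nms_messages(devices=100, interfaces=50, objects=4)
filtered = filtered_script_messages(devices=100, interfaces=50, matching=10)

print(typical)                   # 80000
print(filtered)                  # 14000
print(typical - filtered)        # 66000
print(100 * filtered / typical)  # 17.5
```

Changing the `matching` argument shows how strongly the saving depends on how selective the learning pass is: the fewer interfaces meet the management goal, the less traffic each subsequent run costs.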
It is really you, the network administrator, who decides how much management traffic is inserted, what data is retrieved from which managed devices, at which times of the day, and in which order the devices are contacted. Hence, it is fair to say that SISL gives network administrators the opportunity to answer for themselves the following key management questions:
WHAT information do I need to manage?
WHEN exactly do I manage this information?
WHERE on the network do I need to manage this specific information?
To what EXTENT does this information need to be managed?
The control that the network administrator exercises over the above questions of WHAT, WHEN, WHERE, and EXTENT determines the bottom-line number of devices supported on a particular network. Please also read our About IMMS page to learn more about the framework on which we have based the SISL Engine.
Autonomous Management Zones (AMZs) & Traffic Scalability
Traffic scalability is influenced at the ‘Logical Distribution Level’ of network management as well. The SOSL Engine allows network administrators to break large networks logically into a number of Autonomously Managed Zones (AMZs), each containing any number of network devices, so that a SOSL management script can be executed to poll MIB objects on only those devices in the specified AMZ or set of AMZs. At the same time, administrators can grant access rights and controls to different SmartMIB users, with each user administering their own AMZ(s) and the process(es) running against them. This allows administrators to build a much more scalable, distributed, and consistent process architecture.
Adding devices to Zones is a simple GUI-based point-and-click procedure described in the SmartMIB User's Manual, under Managing Autonomous Zones.
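The zone idea can be pictured as a mapping from zone names to devices, plus a mapping from users to the zones they administer. The Python sketch below is purely illustrative: the zone names, addresses, users, and the `devices_for` helper are all made up and do not reflect how SmartMIB stores zones or access rights.

```python
# Hypothetical zone layout: zone name -> device addresses in that zone.
ZONES = {
    "core": ["10.0.0.1", "10.0.0.2"],
    "branch-east": ["10.1.0.1", "10.1.0.2", "10.1.0.3"],
    "branch-west": ["10.2.0.1"],
}

# Hypothetical access rights: user -> zones that user administers.
USER_ZONES = {
    "alice": {"core"},
    "bob": {"branch-east", "branch-west"},
}


def devices_for(user, requested_zones):
    """Devices a script run by `user` may poll within `requested_zones`.

    Raises PermissionError if the user asks for a zone they do not administer.
    """
    allowed = USER_ZONES.get(user, set())
    denied = [z for z in requested_zones if z not in allowed]
    if denied:
        raise PermissionError(f"{user} may not manage zones: {denied}")
    return [dev for zone in requested_zones for dev in ZONES[zone]]


print(devices_for("bob", ["branch-east"]))
```

A script scoped this way touches three devices instead of six, so its traffic grows with the size of the zone rather than with the size of the whole network.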
Traffic Scalability is refined by logically grouping those managed elements where a specific set of MIB Objects could be polled. The typical flow of such solutions is as follows:
< Learn all responding instance indexes >
POLL <(MIB_Objects to be used as conditions)>;
STORE-INDEX "< To store the instances that met the criteria >";
POLL <(ONLY those MIB_Objects that met the criteria above)>;
To give an example:
MODULE FilterTables ();
DESCRIPTION "Interface INVENTORY information";
    ALL-DEV BY DEV
        SET-INDEX ifTable ( ifDescr [ ifIndex, INT ] );
        WITH-INDEX ifTable
            POLL ( ifDescr;
                   ifOperStatus );
            DEFINE ifAdminStatus DB DISPL;
            DEFINE ifType        DB DISPL;
            DEFINE ifOperStatus  DB DISPL;
            DEFINE ifSpeed       DB DISPL;
            IF ( ifAdminStatus == "up"              AND
                 ifOperStatus  == "up"              AND
                 ifType        == "^ethernetCsmacd" AND
                 ifSpeed       >  20000000 )
                STORE-INDEX "ifHCEthernet";
                # "ifHCEthernet" index list now contains the list of High Speed
                # Ethernet csmacd type interfaces on each of the managed
                # elements
Here, the ifTable on each managed element is learnt first, and the interfaces within it that meet the following criteria are selected: Ethernet type, both administratively and operationally in the ‘UP’ state, and a nominal speed higher than 20 Mbit/s. Those interfaces, and only those interfaces, are stored by their unique indexes in a new index list, ‘ifHCEthernet’, and only this list is used by the managing entity to bundle requests to the managed devices.
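For readers who do not know the script language, the same learn, filter, store, and poll flow is sketched below in Python. It is an illustration only, not a translation of engine behaviour: the table snapshot is hard-coded, and `store_index` and `build_requests` are hypothetical helpers. The numeric values follow RFC1213, where ifType 6 is ethernet-csmacd and status value 1 is up.

```python
ETHERNET_CSMACD = 6      # RFC1213 ifType value for ethernet-csmacd
UP = 1                   # RFC1213 ifAdminStatus / ifOperStatus value for "up"
MIN_SPEED = 20_000_000   # bits per second, the threshold used in the script

# Hypothetical ifTable snapshot for one device, keyed by ifIndex.
IF_TABLE = {
    1: {"ifType": 6, "ifAdminStatus": 1, "ifOperStatus": 1, "ifSpeed": 100_000_000},
    2: {"ifType": 6, "ifAdminStatus": 1, "ifOperStatus": 2, "ifSpeed": 100_000_000},
    3: {"ifType": 6, "ifAdminStatus": 1, "ifOperStatus": 1, "ifSpeed": 10_000_000},
    4: {"ifType": 24, "ifAdminStatus": 1, "ifOperStatus": 1, "ifSpeed": 1_000_000_000},
    5: {"ifType": 6, "ifAdminStatus": 2, "ifOperStatus": 2, "ifSpeed": 1_000_000_000},
}


def store_index(if_table):
    """Test the condition objects and keep the ifIndex values that match."""
    return [
        idx for idx, row in sorted(if_table.items())
        if row["ifAdminStatus"] == UP
        and row["ifOperStatus"] == UP
        and row["ifType"] == ETHERNET_CSMACD
        and row["ifSpeed"] > MIN_SPEED
    ]


def build_requests(index_list, objects):
    """Name only the object instances that belong to the stored indexes."""
    return [f"{obj}.{idx}" for idx in index_list for obj in objects]


if_hc_ethernet = store_index(IF_TABLE)
print(if_hc_ethernet)  # [1]
print(build_requests(if_hc_ethernet, ["ifInOctets", "ifOutOctets"]))
# ['ifInOctets.1', 'ifOutOctets.1']
```

Of the five interfaces in the snapshot, only one survives the filter, so later polls name two object instances instead of ten.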
From the above discussion we can conclude that understanding and pre-defining the management goal(s) directly determines the scale and volume of the SNMP MIB data to be polled when deploying the overall management process(es). Controlling traffic volumes is critical to achieving efficient, engineered traffic scalability when monitoring network service statuses and events, by implementing appropriate measures to meet the management process objectives set by the network administrators.