Monday, August 17, 2009

ADC - Design guidelines for Virtual DCs

Different types of virtualization


As we are aware, there are several types of virtualization technologies. In this article, I will try to analyze which one to opt for in ADCs:

- CLI virtualization
- OS virtualization
- Network Virtualization

Many ADCs have already implemented CLI virtualization.

CLI virtualization is a quick way to address various customer needs in virtual data centers.
The main challenge is resource reservation per subscriber. Such virtualization can only distribute the number of virtual IPs among customers; it cannot restrict throughput on a per-subscriber basis, nor can it prevent one subscriber from using up all the route entries, Layer 2 forwarding database (FDB) tables, etc.
Most importantly, it cannot suit virtual DCs with overlapping networks and IPs.
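To make the resource-reservation gap concrete, here is a minimal sketch of the per-subscriber accounting that CLI virtualization typically lacks. All names here (Quota, the resource kinds, the limits) are illustrative assumptions, not any vendor's API.

```python
# Hypothetical per-subscriber resource quota: the kind of accounting a
# virtualization layer needs so one tenant cannot drain shared tables.

class Quota:
    def __init__(self, max_vips, max_routes, max_fdb_entries):
        self.limits = {"vip": max_vips, "route": max_routes, "fdb": max_fdb_entries}
        self.used = {"vip": 0, "route": 0, "fdb": 0}

    def allocate(self, kind, count=1):
        """Reserve resources; refuse once this subscriber's share is exhausted."""
        if self.used[kind] + count > self.limits[kind]:
            raise RuntimeError(f"{kind} quota exceeded for this subscriber")
        self.used[kind] += count

# Without such accounting, one subscriber's allocations can consume the
# shared route/FDB tables for everyone on the box.
tenant_a = Quota(max_vips=8, max_routes=1024, max_fdb_entries=4096)
tenant_a.allocate("route", 1000)     # within the subscriber's share
try:
    tenant_a.allocate("route", 100)  # would exceed 1024 route entries
except RuntimeError as e:
    print(e)
```

CLI virtualization stops at dividing up virtual IPs; it has no equivalent of the `route` and `fdb` buckets above, which is exactly the limitation described.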

OS virtualization suits a software-based ADC perfectly. Attempting it on hardware-based LBs is risky: it impacts performance and makes the system more complex to manage and to enhance later for scalability or version upgrades. The advantage is a single interface for provisioning the instances. This type of virtualization adds a lot of OS overhead to each instance, but in return provides memory and crash protection from other instances, which is worth it. Still, creating multiple VMs on high-end hardware has its own disadvantages. A general-purpose virtualization OS is not designed or tailor-made to run network switching applications; it carries a lot of OS overhead, and its CPU scheduling and workload sharing may not match ADC requirements. Running more VMs also degrades the performance of each ADC instance.
The right approach is specialized hardware with a tailor-made OS virtualization layer to suit the ADC environment; that is the best bet.

Network virtualization works by virtualizing the network stack. It is as if each customer has a different network stack from Layer 2 to Layer 7: the Layer 2 FDB, the L3 routing tables and the SLB tables are separate for each subscriber. The differentiating factor is the VLAN; each subscriber is allocated one VLAN. This helps in virtual DC environments with overlapping networks and IPs. Of course, this is not crash-proof from other instances, but it has none of the overhead of OS virtualization.
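The idea above can be sketched in a few lines: key every lookup table by VLAN first, and two subscribers can then use identical (overlapping) IP addresses without colliding. The class and function names are illustrative, not a real ADC API.

```python
# Minimal sketch of network virtualization: per-VLAN private L2-L7 state.

class SubscriberStack:
    """One subscriber's isolated tables, selected by VLAN ID."""
    def __init__(self):
        self.fdb = {}       # MAC address -> port        (Layer 2 FDB)
        self.routes = {}    # IP prefix   -> next hop    (Layer 3 routing)
        self.slb = {}       # VIP         -> server farm (Layer 4-7 SLB)

stacks = {}  # VLAN ID -> SubscriberStack

def stack_for(vlan):
    """Look up (or lazily create) the stack for a subscriber's VLAN."""
    return stacks.setdefault(vlan, SubscriberStack())

# Two tenants configure the *same* VIP; the VLAN key keeps them apart:
stack_for(100).slb["10.0.0.1"] = ["192.168.1.10", "192.168.1.11"]
stack_for(200).slb["10.0.0.1"] = ["172.16.5.20"]

print(stack_for(100).slb["10.0.0.1"])  # tenant on VLAN 100's farm
print(stack_for(200).slb["10.0.0.1"])  # different farm, same VIP
```

Because the VLAN tag arrives with every frame, the data path can select the right table set without any per-packet tenant lookup beyond the tag itself.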

Now the question: which is best?
Every virtualization technology has its own pros and cons. It does not matter which path a vendor takes, as long as it meets customer needs.
What should the market look for when choosing virtualized ADCs?
- Scalability: To suit virtual DCs, scalability is a must. Ideally, the ADC should not restrict connections per second, throughput, etc. License-based limits are acceptable, provided the system can scale to its maximum capability and remains reasonable to use in virtual DCs.

- Provision ready: A third-party solution should be able to provision the virtualized ADC, creating instances as well as LB-related configuration. These ADCs should support XML-based configuration. Hopefully, a standard emerges for configuring any type of ADC.

- Seamless updates: Dynamic configuration changes must be supported. Configuration changes or version upgrades on one instance should not impact the runtime behavior of other instances. When one instance shuts down or restarts, other instances should not be affected, even with respect to resources.

- Throughput protection: A virtualized ADC can have the ability to add more subscribers, but doing so should not affect the throughput promised to existing subscribers.

- Overlapped networks: Should support overlapping IPs, which are quite common in virtual DCs.

- Usage reports per subscriber: These reports help virtual DC administrators track each subscriber's usage of the ADC for accounting or debugging purposes.
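The throughput-protection point in the checklist above is usually implemented with per-subscriber rate limiting. Here is a hedged sketch using a classic token bucket; the rates, tenant names, and class are assumptions for illustration, not a specific product's mechanism.

```python
# Sketch of per-subscriber throughput protection via a token bucket:
# adding tenants never eats into the bandwidth promised to existing ones.
import time

class TokenBucket:
    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s     # promised sustained rate
        self.capacity = burst_bytes      # allowed burst size
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def allow(self, nbytes):
        """Refill tokens for elapsed time, then admit or reject the bytes."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False  # over the promised rate: queue or drop

# One bucket per subscriber; each tenant's guarantee is independent.
buckets = {"tenant-a": TokenBucket(10_000_000, 1_000_000)}  # ~10 MB/s, 1 MB burst
print(buckets["tenant-a"].allow(500_000))  # within burst: admitted
```

Because each subscriber owns its own bucket, a burst from one tenant is rejected by that tenant's bucket alone and cannot starve the others.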

Conclusion:
Every virtualization approach has its own pros and cons. Network and OS virtualization are better than CLI virtualization. OS virtualization on a hypervisor tailor-made for running ADCs is the better bet: it can get better support from the vendor and is crash-proof from other instances. ADCs running on general-purpose hypervisors like Xen, VMware, etc. will be completely crash-proof, but tuning hypervisor parameters for good performance is a challenge, and the problem becomes more visible as the number of VMs on the hypervisor grows. One should look at overall network performance with ADCs running as VMs, not just the performance of a single ADC instance.
Network virtualization wins on performance, since it has less OS overhead, but loses with respect to crash protection from other instances.

Sunday, August 16, 2009

Guide to select best GSLB solution

I was recently asked:
'OK, if an XX ADC supports client proximity that is not based on DNS, is it the best to go with? Do you suggest any other features to look for?'

While all current ADCs support most of the eye-catching features, there are some things to check before selecting a GSLB-equipped ADC.

Here is a check list
- Client proximity: Most LBs calculate Round Trip Time (RTT) via ICMP, TCP probes, etc.
But consider whether this works in real-world scenarios:
a) RTT calculation should work with firewalls sitting at the client network.
b) It should not add overhead to TCP communication with the client.
c) Most importantly, RTT should be calculated to the real client. Assume a proxy sits between the client and the ADC: an ICMP or TCP probe may never reach the real client, so the measured RTT is to the proxy, not the client.
d) VPN environments: if the client is behind a VPN gateway, the client actually sits in a private network that can be geographically anywhere, while its source IP is NATed to the public IP of the VPN gateway connected to the Internet. In this case, TCP-probe or ICMP-based RTT goes completely wrong. This is especially true in hub-and-spoke networks.
e) The solution should work for all protocols, including HTTP, SIP, RTSP, SMTP, etc.
f) The client proximity solution should honor Layer 7 features like cookie persistency.
g) The client proximity information obtained must be kept in sync with all other sites.

- Site health checks: Health checks should be content-aware and must be performed against the remote server farms.

- System properties: The GSLB solution must consider system health metrics such as CPU and session utilization across all sites.

- Persistency and availability: GSLB persistence must be maintained. If a client connects to site B because site A is down, subsequent requests should continue to go to site B even after site A comes back.
- Performance: Ask for throughput and connections-per-second numbers with GSLB configured on the ADCs.
- Remote forward: When local servers are down, the client request must be proxied to a remote site transparently; the client should not even know the local site is down.
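The persistency-and-availability rule in the checklist can be captured in a small sketch: once a client has been bound to a site, keep honoring that binding even after the originally preferred site recovers. The site names, preference order, and health set are assumptions for the example, not part of any GSLB product.

```python
# Illustrative GSLB persistence: a binding table keyed by client address.

persistence = {}                  # client IP -> site it was bound to
preference = ["site-A", "site-B"] # assumed static preference order

def pick_site(client_ip, healthy):
    """Return the site for this client, honoring any existing binding."""
    bound = persistence.get(client_ip)
    if bound in healthy:
        return bound  # stick with the bound site while it is up
    # No usable binding: take the first healthy site and remember it.
    for site in preference:
        if site in healthy:
            persistence[client_ip] = site
            return site
    return None       # no site available

# Site A is down, so the client lands on site B...
print(pick_site("198.51.100.7", healthy={"site-B"}))
# ...and keeps going to site B even after site A comes back.
print(pick_site("198.51.100.7", healthy={"site-A", "site-B"}))
```

In a DNS-based GSLB, the same binding would have to survive cached answers and be shared across sites, which is why the earlier checklist item insists that proximity and persistence state stay in sync between sites.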