The Emerging Revolution In Server Architecture

Kushagra Vaid, GM-Azure Cloud HW Engineering, Microsoft [NASDAQ: MSFT]

There has been significant progress in industry-standard server designs over the past two decades, starting with the introduction of the first rack-mounted server in 1993. Since then, technological advances in semiconductor manufacturing and microprocessor architecture have pushed the boundaries of server design, and the industry has driven consistent improvements in performance, power efficiency and cost over the years. Propelled by Moore’s law, this rate of progress has been one of the foundational pillars for the growth of Enterprise IT starting in the early 2000s and the rise of the public cloud in recent years.

Going forward, as the industry accelerates the shift from Enterprise to Cloud computing, new solution stacks are being architected using global scale public cloud services to deliver entirely new product experiences to customers. The Datacenter infrastructure to deliver such global scale services is also evolving at a rapid pace, with a corresponding level of innovation in hardware technologies unlike anything seen before in the industry. Driven by the rapid growth of the public cloud and scale-out workloads, we are now on the cusp of a revolution in computing architecture which will completely redefine the classical notion of a “Server”.

Emergence of new Hyperscale workloads

The public cloud has grown along three main dimensions: Infrastructure-as-a-Service (IaaS) for lift-and-shift of Enterprise workloads to the cloud, Platform-as-a-Service (PaaS) for building cloud-native applications designed for global scale and fault tolerance, and Software-as-a-Service (SaaS), which provides full turnkey solutions as a cloud offering. The initial demand for cloud computing was driven primarily by IaaS, but more recently the PaaS and SaaS offerings have emerged as focal points for disruptive innovation in how such services can be consumed across market verticals. Some examples are Cognitive services using Machine Learning algorithms for image, video and speech processing, chatbots supporting Conversations-as-a-Platform (CaaP), and Internet-of-Things (IoT) services for commercial scenarios such as jet engines and connected cars.


The newly emerging cloud services are quite distinct from traditional “legacy” IaaS applications in how they utilize the underlying hardware resources. These highly parallel workloads operate across hundreds (and sometimes thousands) of machines in the datacenter, requiring significant networking bandwidth and compute resources. Some of these workloads stream large amounts of real-time data that must be acted upon instantly before being stored. Most of these workloads can run entirely in the server’s Input/Output (I/O) complex with minimal microprocessor interaction, and in many cases the operations do not map well to the architecture of current microprocessor designs. Another property common to these datacenter workloads is the processing overhead incurred during large-scale server-to-server communication (also referred to as the Datacenter Tax) for intensive operations such as compression and encryption of all data-in-transit and data-at-rest.
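The Datacenter Tax described above can be made concrete with a minimal Python sketch of a hypothetical send/receive path, where every message is compressed and given an integrity digest before crossing the wire. The function names are illustrative (not from the article), and the SHA-256 digest merely stands in for the per-byte cryptographic work a real stack would do with TLS/AES:

```python
import hashlib
import zlib

def prepare_for_transit(payload: bytes) -> tuple[bytes, str]:
    """Compress a message and attach an integrity digest -- a stand-in for
    the per-message compression/encryption work (the 'Datacenter Tax')
    paid on every server-to-server hop."""
    compressed = zlib.compress(payload, level=6)
    digest = hashlib.sha256(compressed).hexdigest()
    return compressed, digest

def receive_from_transit(compressed: bytes, digest: str) -> bytes:
    """Verify and decompress on the receiving server."""
    if hashlib.sha256(compressed).hexdigest() != digest:
        raise ValueError("integrity check failed")
    return zlib.decompress(compressed)

message = b"telemetry record " * 1024          # highly redundant payload
wire, tag = prepare_for_transit(message)
assert receive_from_transit(wire, tag) == message
assert len(wire) < len(message)                # compression pays off on redundant data
```

Every byte of every message pays this CPU cost twice (once per endpoint), which is why offloading such fixed-function work from the microprocessor is attractive at Hyperscale.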

These next generation cloud workloads are driving a complete re-think on the computing architectures and hardware infrastructure required to efficiently host such distributed scale-out applications.

Moving beyond the classic Von Neumann bottleneck

Designing highly performant hardware to host such Hyperscale services requires reevaluating the basic principles of the underlying computing architecture. Current server systems are based on the Von Neumann architecture (which dates its origins to 1945), defined by a distinct separation between the compute, memory and Input/Output (I/O) devices attached to the server (see Figure 1).

In such machines, the Von Neumann bottleneck is the limitation on performance arising from the “chokepoint” between where computation happens and where data is stored.

The hardware industry has so far addressed this bottleneck using techniques such as larger microprocessor caches, multi-threading, multi-core designs and 3D packaging. But the bottleneck persists and leads to inefficient use of the transistor gains from Moore’s law. The net result is that current server designs cannot efficiently execute the full spectrum of Hyperscale cloud workloads using the “one-size-fits-all” approach that worked so well in the past. This has significant implications for the total cost of ownership (TCO) of Datacenter infrastructure, and it is imperative for the industry to evolve the computing paradigm and associated hardware designs beyond the limitations of today’s Von Neumann-based machines.
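One way to see why caches and more cores only partly help is the roofline-style notion of arithmetic intensity: operations performed per byte moved across the compute/memory chokepoint. The sketch below (the helper and numbers are illustrative, not from the article) computes this ratio for a simple dot product, a pattern common in the data-streaming workloads discussed above:

```python
def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """FLOPs per byte moved between memory and the processor; low values
    mean the Von Neumann 'chokepoint' (memory traffic), not compute,
    bounds performance."""
    return flops / bytes_moved

# Dot product of two length-n float64 vectors: 2n FLOPs (n multiplies,
# n adds) against 16n bytes read (two 8-byte operands per element).
n = 1_000_000
intensity = arithmetic_intensity(2 * n, 16 * n)
print(intensity)  # 0.125 FLOP/byte
```

At 0.125 FLOP/byte, performance is dictated almost entirely by memory traffic rather than by how many arithmetic units the processor has, which is why larger caches and extra cores deliver diminishing returns for such workloads.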

The Road Ahead

To meet this new set of computing challenges, the hardware industry is responding with a wide variety of architectural choices, each tuned for high-performance execution of a specific Hyperscale workload. Some examples are FPGAs for accelerated compute and network processing, GPUs and Dataflow engines for Machine Learning, Processing-in-Memory designs for high-throughput pattern processing, and Neuromorphic computation for Artificial Intelligence. While there is no single silver-bullet architecture that maps well to all workloads, the trend being observed is a creative blending of the classic Von Neumann architecture with such alternate computing architectures. This is leading the industry into a new era of disruptive innovation and massive experimentation. Startups are getting funded to explore breakthrough ideas for systems with customized chip designs, silicon providers are investing heavily to address these new workloads and augment their product roadmaps, and large Cloud Service Providers (CSPs) are deploying such blended computing architectures to deliver a whole new class of cloud services to customers while improving operational margins. This industry disruption is already resulting in mergers and acquisitions as existing players vie to maintain their lead and protect their market segments. As this trend progresses, the classical notion of the “Server” is being redefined in ways that were previously unimaginable.

Over the past two decades the server industry has progressed on a predictable cadence, with incremental improvements to hardware driven primarily by Moore’s law gains. With the growth of cloud computing and the vertical solution stacks built on emerging Hyperscale services, the years ahead will witness a radical departure from traditional server designs towards innovative new architectural paradigms optimized for large-scale computing. This trend is expected to further accelerate the adoption of public cloud services, since only the large CSPs will have the scale and capability to sustain the heavy R&D investment required to efficiently design and deliver this next generation of Datacenter infrastructure. We are on the verge of a revolution in computing architecture, and the “Server” as we know it today will no longer exist in this new world of Hyperscale services.
