
Windows Compute Cluster Server 2003 - Reviewers Guide


Reviewers Guide

    Published: May 2006

    For the latest information, please see http://www.microsoft.com/hpc

Abstract

    For customers solving complex computational problems, Microsoft Windows Compute Cluster Server 2003 accelerates time-to-insight by providing a high-performance computing (HPC) platform that is easy to deploy, operate, and integrate with an organization’s existing infrastructure and tools. In the past, setting up and configuring a cluster was a technically complex task that often required dedicated support staff. Windows Compute Cluster Server 2003 makes the task easy by offering prescriptive setup procedures that simplify network configuration, the ability to load nodes remotely using Remote Installation Services (RIS), automated node configuration, and tools and technologies that help organizations configure cluster security settings. The integrated Job Scheduler, which can be accessed through a command-line interface (CLI) or through several application programming interfaces (APIs), can be used to submit and manage cluster workloads. Active Directory® integration provides end-to-end identity management and security features, while the Microsoft Management Console (MMC) supports extensible snap-ins and integration with Microsoft Operations Manager (MOM). Organizations can run leading applications from key independent software vendors (ISVs) on their clusters to help meet their needs in a timely and cost-effective manner while maintaining high performance. They can run parallel jobs using the Message Passing Interface (MPI) support provided in Windows Compute Cluster Server 2003, a full implementation of the MPI Chameleon (MPICH) standard. Microsoft Visual Studio® 2005 provides developers an integrated development environment that includes parallel compiling and debugging capabilities. Windows Compute Cluster Server 2003 integrates seamlessly with the Microsoft Windows Server® 2003 operating systems, resulting in security, storage, and productivity gains for organizations.

Contents

    Overview of Windows Compute Cluster Server 2003
    Primer on Windows Compute Cluster Server 2003
        What is Windows Compute Cluster Server 2003?
        About MPI, MPICH, and MS-MPI
        System Requirements
            Hardware Requirements
            Software Requirements
            Network Requirements
    Getting Started
        Setting up a Cluster
            Create a head node
            Install the Compute Cluster Pack
            Configure the Cluster
    Technical Overview
        Architecture
        Network Topology
            Scenario One: Two NICs on the Head Node; One NIC on the Compute Nodes
            Scenario Two: Two NICs on Each Node
            Scenario Three: Three NICs on the Head Node; Two NICs on the Compute Nodes
            Scenario Four: Three NICs on Each Node
            Scenario Five: One NIC Per Node
        Features
            Compute Cluster Administrator
            Compute Cluster Job Scheduler
            Task Execution
            CLI
            The Compute Cluster Pack Application Programming Interface (CCPAPI)
            MS-MPI Features
        Cluster Security
            Cluster Administrators
            Cluster Users
            Security Considerations for Jobs and Tasks
    Summary
    Related Links


Overview of Windows Compute Cluster Server 2003

    Microsoft Windows Compute Cluster Server 2003 provides an integrated application platform for developing, deploying, running, and managing high-performance computing (HPC) applications. Using this platform, individuals and organizations can perform multi-node workload computing using commodity hardware in an environment that shortens their time to insight.

    HPC is increasingly being achieved with clusters of industry-standard servers that can range from a few nodes (individual computers) to hundreds of nodes. Wiring, provisioning, configuring, monitoring, and managing these nodes and providing appropriate, secured user access is a complex endeavor that often requires costly support and administrative resources. Because users typically spend more time on cluster administration and management tasks than on running jobs, organizations experience a loss in productivity as well. The goals of Windows Compute Cluster Server 2003 are to simplify management and reduce the total cost of ownership (TCO) of compute clusters, making them accessible to a broader audience. Based on these goals, Windows Compute Cluster Server 2003 has been designed to be intuitive to administer and manage. Its installation and system configuration processes are fully prescribed and largely automated. In addition, users will likely be familiar with the standard Windows features it includes for deploying and managing clusters remotely. For example, because Windows Compute Cluster Server 2003 is fully integrated with the Microsoft Windows Server System solution stack, users can also benefit from the advanced management technologies available in the Active Directory directory service and in Microsoft Operations Manager (MOM). Users familiar with the Windows Server® platform can become productive faster.

    Users whose work demands HPC solutions also require applications that execute complex computations and produce elaborate data output. Microsoft has worked with independent software vendors (ISVs) to port applications to Windows Compute Cluster Server 2003 that serve several markets, such as manufacturing, life sciences, geological sciences, and financial services. To help deliver on the promise of usability, a full-function Job Scheduler is provided, enabling comprehensive job management through the Job Manager user interface (UI) or through a command-line interface (CLI).

    Windows Compute Cluster Server 2003 supports the execution of parallel applications based on the Message Passing Interface (MPI) standard. Users can take advantage of the enhancements in Microsoft Visual Studio 2005 aimed at parallel computing, including support for the OpenMP standard and a parallel debugging capability that supports MPI.

    When a user submits a job to the cluster, the job is recorded in the head node database along with its properties, entered into the execution queue, and then run when the resources it requires become available. Because the cluster is in the user’s Active Directory domain, jobs execute using that user’s permissions. As a result, the complexity of using and synchronizing different credentials is eliminated, and the user does not have to use different methods of sharing data or compensate for permission differences among different operating systems. This means that Windows Compute Cluster Server 2003 offers transparent execution, access to data, and integrated security technologies.
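    The submit-queue-run lifecycle described above can be illustrated with a small simulation. This is not the actual Job Scheduler code, just a hypothetical Python sketch of FIFO scheduling with resource gating: jobs wait in the queue until enough processors are free, then start in submission order.

```python
from collections import deque

def run_schedule(jobs, total_cpus):
    """Toy FIFO scheduler with resource gating (illustration only).

    Each job is a (name, cpus, ticks) tuple. Jobs wait in the queue
    until enough CPUs are free -- mirroring how the head node queues a
    submitted job and starts it when the resources it requires become
    available. Returns [(start_tick, name), ...] in start order.
    """
    assert all(cpus <= total_cpus for _, cpus, _ in jobs)
    queue = deque(jobs)
    running = []              # mutable [name, cpus, ticks_remaining]
    free = total_cpus
    started, tick = [], 0
    while queue or running:
        tick += 1
        # Advance every running job by one tick; release finished jobs.
        for job in running:
            job[2] -= 1
        for job in [j for j in running if j[2] <= 0]:
            running.remove(job)
            free += job[1]
        # Start queued jobs strictly in FIFO order while CPUs allow.
        while queue and queue[0][1] <= free:
            name, cpus, ticks = queue.popleft()
            free -= cpus
            running.append([name, cpus, ticks])
            started.append((tick, name))
    return started
```

    For example, on a 4-CPU cluster a 4-CPU job submitted behind two 2-CPU jobs must wait until both finish, even if other capacity frees up earlier, because this sketch dispatches strictly first-in, first-out.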

    1 Windows Compute Cluster Server 2003 Reviewers Guide 1

Primer on Windows Compute Cluster Server 2003

    HPC has evolved considerably during the last 15 years. The “supercomputer” solutions of the

    early 1990s provided enormous parallel computing power, but they cost tens of millions of U.S. dollars and required a great deal of expertise to deploy, manage, and maintain. Consequently, customers for these solutions were generally limited to governments and large research institutions.

    The most advanced supercomputing solutions available today are still too expensive for many organizations, costing on the order of one hundred million dollars. However, thanks to the increased computing power of 64-bit processor architectures and the associated increase in memory address space, almost any organization can now harness the power of HPC by clustering inexpensive, readily available commodity hardware. The cost to create a small cluster for HPC is on the order of five to ten thousand U.S. dollars, roughly 10,000 times less than it cost 15 years ago to produce the same computing power.

    This decrease in the cost of clustered HPC solutions gives organizations more options. For example, they can submit large scale or high precision jobs to supercomputer systems while executing smaller, routine tasks using a local cluster.

    Unfortunately, most solutions are still difficult and costly to use, manage, and integrate into an overall computing infrastructure. In addition, they often do not support standard applications. With the release of Windows Compute Cluster Server 2003, Microsoft aims to remove the barriers that prevent individual engineers and scientists from independently leveraging the computing power available today. In addition, large enterprises with HPC needs can also benefit, because their resources are freed up to deploy local clusters that solve more problems more quickly.

    Windows Compute Cluster Server 2003 provides:

    • Faster time-to-insight: a superior out-of-the-box deployment experience; an integrated software stack that includes the operating system, Job Scheduler, and MPI layer; and the leading applications for each targeted vertical market.

    • Better integration with IT infrastructure: seamless integration with existing Windows infrastructures through features such as Active Directory, making it possible to leverage existing skills and technology.

    • A familiar development environment: an intuitive design that lets developers leverage their existing Windows-based skills and experience. In addition, Visual Studio is the most widely used integrated development environment (IDE) in the industry, and Visual Studio 2005 includes support for developing HPC applications, including parallel compiling and debugging capabilities. Microsoft partners provide additional compiler and math library options. Windows Compute Cluster Server 2003 also supports the MPI standard through the Microsoft MPI stack (called MS-MPI) or through non-Microsoft stacks.

    Windows Compute Cluster Server 2003 puts an HPC solution within reach of individuals and organizations that might not otherwise be able to afford its TCO. They can now leverage their investment in a Windows-based infrastructure and skilled resources to harness the power of HPC.

    What is Windows Compute Cluster Server 2003?

    Windows Compute Cluster Server 2003 runs on clustered commodity servers built with a combination of the Windows Server 2003, Compute Cluster Edition operating system and the Microsoft Compute Cluster Pack.


    Windows Server 2003, Compute Cluster Edition is a specialized, 64-bit operating system based on the 64-bit edition of Microsoft Windows Server 2003. Although it is a full version of the Windows Server 2003 64-bit operating system, it is not intended for use as a general-purpose server. Server roles in Windows Server 2003, Compute Cluster Edition are restricted to make Windows Compute Cluster Server 2003 available at a lower price. If users want to install Microsoft SQL Server® 2005 on a cluster server, for example, they must license and install Windows Server 2003, x64 Standard Edition or Windows Server 2003, x64 Enterprise Edition. Also, it is not possible to deploy Windows Compute Cluster Server 2003 on 32-bit hardware. For more information, see “System Requirements.”

    The Compute Cluster Pack contains the services, interfaces, and supporting software that users need to create and configure the cluster nodes, as well as the utilities and management infrastructure. The Compute Cluster Pack provides support for the Argonne National Labs open source MPI2 standard. For more information about MPI2, see “About MPI, MPICH, and MS-MPI.” The Compute Cluster Pack also contains an integrated Job Scheduler and cluster resource management tools.

    Windows Compute Cluster Server 2003 is made up of the components listed in Table 1 and is deployed from two CDs. CD1 contains Windows Server 2003, Compute Cluster Edition. CD2 contains the Compute Cluster Pack.

    Table 1. Components of Windows Compute Cluster Server 2003

    Active Directory directory service
        Each node of a cluster must be a member of an Active Directory domain, because Active Directory provides authorization and authentication services for Windows Compute Cluster Server 2003. The domain can be independent of the cluster, as when the cluster runs in a production Active Directory domain. Alternatively, the domain can run within the cluster, on the head node, in scenarios where the cluster is a production environment.

    Head Node
        Provides user interfaces (UIs) and management services to the cluster. The UIs include the Compute Cluster Administrator, the Compute Cluster Job Manager, and a command-line interface (CLI). Management services include job scheduling, job and resource management, and Remote Installation Services (RIS). The head node can also serve as a network address translation (NAT) gateway between the public network and the private network that make up the cluster.

    Compute Nodes
        Computers configured as part of a compute cluster to provide the computational resources users need to run jobs. Compute nodes can be created only on computers running a supported operating system, but nodes within the same cluster do not have to run the same operating system or use the same hardware configuration. However, a similar configuration simplifies deployment, administration, and (especially) resource management. Nodes with different hardware configurations limit the cluster’s capabilities, because parallel jobs that require nodes of different capabilities can run only at the speed of the slowest processor among the selected nodes.

    Job Scheduler
        A service that runs on the head node and manages the job queue, resource allocation, and job execution by communicating with the Node Manager service that runs on each compute node.

    MS-MPI Software
        The cluster’s key networking component, which can use any Ethernet connection supported by Windows Server 2003, as well as low-latency, high-bandwidth connections such as InfiniBand or Myrinet. Gigabit Ethernet provides a high-speed, cost-effective connection fabric, while InfiniBand is ideal for latency-sensitive and high-bandwidth applications. MS-MPI supports several networking scenarios.

    Management Infrastructure
        The Compute Cluster Pack offers a complete management infrastructure that enables the cluster administrator to deploy and manage compute nodes. This infrastructure consists of the cluster services running on the head node and all compute nodes, providing the administrative, user, and command-line interfaces used to administer the cluster, submit jobs, and manage the job queue.

    Compute Cluster Administrator and Job Manager
        Interfaces used by cluster administrators and users for cluster operations, job submission, and management. The Compute Cluster Administrator is used to configure the cluster, manage nodes, and monitor cluster activity and health. The Job Manager is used for job creation, submission, and monitoring.

    Command Line Interface (CLI)
        The Compute Cluster Pack offers a CLI for node and job management. These operations can also be scripted, so administrators can use the CLI to automate job, job queue, and node operations.

    Public and Private Networks
        Compute nodes are connected to each other through network interfaces. To manage and deploy nodes, administrators can configure compute clusters with a private network, which can also carry MPI traffic. MPI traffic can share the private network used for management, but the highest level of performance is achieved with a second, dedicated private network that carries only MPI traffic.

    Windows Compute Cluster Server 2003 allows organizations to deploy a compute cluster easily and quickly using standard Windows deployment technologies, and they can add compute nodes to a cluster automatically by plugging the nodes in and connecting them to the cluster. The MS-MPI implementation is fully compatible with the MPICH2 standard and implements end-to-end security on all jobs. Integration with Active Directory enables role-based security for administrators and users, and the use of the Microsoft Management Console (MMC) provides a familiar administrative and scheduling interface.

    About MPI, MPICH, and MS-MPI

    MPI is a standard API and specification for message passing, designed specifically for HPC scenarios executed on large computer systems or on clustered commodity computers. MS-MPI is a version of the Argonne National Labs open source MPICH2 implementation that is widely used by existing HPC clusters. MS-MPI is compatible with the MPICH2 reference implementation and includes a full-featured API with more than 160 function calls.

    The MS-MPI software in Windows Compute Cluster Server 2003 is built on the Windows Sockets networking API (WinSock), so MS-MPI networking traffic can use TCP/IP as normal or, for best performance and CPU efficiency, can use a WinSock Direct provider (driver) to bypass the TCP stack and go directly to the networking hardware’s native interface. MS-MPI can use any Ethernet interconnect that is supported on Windows Server 2003, as well as low-latency, high-bandwidth interconnects such as InfiniBand or Myrinet, through WinSock Direct drivers provided by hardware manufacturers. As a result, a single MPI stack supports many fabrics. This flexibility eases the burden of management for network administrators. More importantly, it means that applications do not have to be rebuilt for specific network hardware: ISVs have a much smaller test matrix for their products (speeding their time to market and lowering cost), and customers do not have to buy and maintain a separate version of an application for each type of network hardware they have. Gigabit Ethernet provides a high-speed, cost-effective interconnect fabric, while InfiniBand is ideal for latency-sensitive and high-bandwidth applications.

    MS-MPI includes support (bindings) for the C, Fortran77, and Fortran90 programming languages. Microsoft Visual Studio 2005 includes a parallel debugger that works with MS-MPI. Developers can launch their MPI applications on multiple compute nodes from within the Visual Studio environment, and Visual Studio automatically connects to the processes on each node, enabling developers to individually pause and examine program variables on each node.
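    To make the message-passing model concrete, the following illustrative Python sketch mimics a common MPI pattern: a root rank scatters chunks of work to worker ranks and gathers their partial results. This is a simulation only; thread-backed inboxes stand in for MPI_Send/MPI_Recv, and a real MS-MPI program would use the C or Fortran bindings mentioned above.

```python
import threading
import queue

# Illustrative simulation of MPI-style point-to-point messaging:
# one inbox per "rank"; send/recv stand in for MPI_Send/MPI_Recv.
NUM_RANKS = 4
inboxes = [queue.Queue() for _ in range(NUM_RANKS)]

def send(dest, payload):
    inboxes[dest].put(payload)

def recv(rank):
    return inboxes[rank].get()   # blocks until a message arrives

results = {}

def worker(rank):
    if rank == 0:
        # Rank 0 scatters one chunk of numbers to each worker rank...
        for dest in range(1, NUM_RANKS):
            send(dest, list(range(dest * 10, dest * 10 + 5)))
        # ...then gathers and combines the partial sums.
        results[0] = sum(recv(0) for _ in range(1, NUM_RANKS))
    else:
        chunk = recv(rank)       # receive work from rank 0
        send(0, sum(chunk))      # return the partial result

threads = [threading.Thread(target=worker, args=(r,)) for r in range(NUM_RANKS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results[0])                # → 330
```

    The same scatter/gather shape appears in real MPI codes as MPI_Scatter followed by MPI_Reduce; the point here is only that each rank runs the same program and branches on its rank number.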


System Requirements

    In Windows Compute Cluster Server 2003, any computer designated as either a head node or a compute node must meet minimum hardware and software requirements. If an organization’s plans include installing the administrative and user components on a remote computer, the remote computer must have an operating system that is compatible with the Compute Cluster Pack. Head nodes may also have additional software requirements, such as Remote Installation Services (RIS) or Internet Connection Sharing (ICS) Network Address Translation (NAT), depending on the networking environment in which the cluster is installed. Windows Compute Cluster Server 2003 requires the Microsoft .NET Framework version 2.0, which is included on the Compute Cluster Pack CD. The Compute Cluster Administrator requires Microsoft Management Console (MMC) version 3.0.

    Hardware Requirements

    The minimum system hardware requirements for Windows Compute Cluster Server 2003 are similar to the hardware requirements for Windows Server 2003, Standard x64 Edition.

    Table 2. System requirements of Windows Compute Cluster Server 2003

    CPU
        A 64-bit architecture computer: Intel Pentium or Xeon family with Intel Extended Memory 64 Technology (EM64T) processor architecture; AMD Opteron family or AMD Athlon family; or compatible processor(s).
        Note: 32-bit hardware can run the CCS client software (the command-line interface, job console, and administrator console). The CCS SDK also supports 32-bit hardware: developers can create both 32-bit and 64-bit applications that will run on CCS nodes.

    Minimum RAM
        512 MB

    Maximum RAM
        32 GB

    Multiprocessor support
        Up to 4 processors

    Disk space for setup
        4 GB

    Disk volumes
        For a head node, the Windows Compute Cluster Server 2003 installation process requires a minimum of two volumes (C:\ and D:\): one for the system partition and one that RIS can use. Compute nodes require only one volume. RAID 0, 1, or 5 may be used as appropriate, although it is not required.

    Network interface cards (NICs)
        All nodes require at least one NIC. If a private network is planned, and depending on the network topology selected, the head node requires a minimum of two NICs to create a public and a private network. Each node may require additional NICs for public network access or to support an MPI-based network.

Software Requirements

    This section describes the software requirements for Windows Compute Cluster Server 2003. The following are the required or compatible operating systems for Windows Compute Cluster Server 2003:

    Head Node and Compute Nodes

    The Compute Cluster Pack must be installed on a supported operating system. The supported operating systems are the same for both the head node and compute nodes and include:

    • Windows Server 2003, Compute Cluster Edition

    • Windows Server 2003, Standard x64 Edition

    • Windows Server 2003, Enterprise x64 Edition

    • Windows Server 2003 R2, Standard x64 Edition

    • Windows Server 2003 R2, Enterprise x64 Edition

    Remote Workstation Computer

    The Compute Cluster Administrator and the Cluster Job Manager are installed on the head node by default, but clusters can be managed and operated from a remote workstation. If the Compute Cluster Administrator or the Cluster Job Manager are installed on a remote computer, the computer must be running one of the following operating systems:

    • Windows XP Professional with Service Pack 2 (SP2)

    • Windows XP Professional x64 Edition

    • Windows Server 2003 with Service Pack 1 (SP1), Standard Edition

    • Windows Server 2003, Standard x64 Edition

    • Windows Server 2003 with Service Pack 1 (SP1), Enterprise Edition

    • Windows Server 2003, Enterprise x64 Edition

    • Windows Server 2003 R2, Standard Edition

    • Windows Server 2003 R2, Standard x64 Edition

    • Windows Server 2003 R2, Enterprise Edition

    • Windows Server 2003 R2, Enterprise x64 Edition

    RIS

    RIS can be used to automatically install compute nodes that are part of the cluster. Third party system imaging tools can also be used for deploying compute nodes.


Active Directory

    The Active Directory directory service is a central component of the Windows platform that provides the means to manage the identities and relationships that make up network environments. Active Directory stores information about objects on a network and makes this information available to users and network administrators. It gives network users access to permitted resources anywhere on the network using a single logon process. Membership in an Active Directory domain is required for Windows Compute Cluster Server 2003 head and compute nodes.

    All compute nodes must be in the same Active Directory domain as the head node. All computers in the compute cluster should be joined to an existing corporate Active Directory domain to leverage the existing directory service, authentication, and security infrastructure. If an Active Directory domain is not available, the head node can be made a domain controller by running DCpromo.exe. If a unique domain is used for the cluster, trust relationships must be created between the new cluster domain and existing domains, and additional administrative tasks may be required to permit access from one domain to the other.

    Software Development Kit (SDK)

    The Windows Compute Cluster Server 2003 Software Development Kit (SDK) and associated utilities are supported by the operating systems in the Remote Workstation Computer list.

Network Requirements

    Windows Compute Cluster Server 2003 supports five different network topologies, each with implications for performance and accessibility. Each topology involves one or more of the following network types: public, private, and MPI.

    A public network is an organizational network connected to one or more cluster nodes. This is often a preexisting Ethernet network that most users log on to in order to perform their work. If the cluster is not connected to a dedicated private network, all intra-cluster management and deployment traffic is carried on the public network.

    A private network is a dedicated cluster network that carries intra-cluster communication between nodes. This network carries management, deployment, and MPI traffic (if no MPI network exists) between nodes in the cluster.

    An MPI network is a dedicated, high-speed network that carries parallel application communication between compute nodes in a cluster. If no MPI network exists, then MPI communication is carried by a private network. If a private network does not exist, MPI traffic is carried by an external public network.
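    The fallback rules in the three paragraphs above (MPI traffic prefers a dedicated MPI network, then the private network, then the public network) can be summarized in a small Python helper. The function and its parameter names are illustrative only, not part of the product:

```python
def network_for(traffic, has_private, has_mpi_net):
    """Return which network carries a given traffic class.

    MPI traffic uses a dedicated MPI network if one exists, falling
    back to the private network, then the public network. Management
    and deployment traffic uses the private network if one exists,
    otherwise the public network. (Illustrative names, not a real API.)
    """
    if traffic == "mpi" and has_mpi_net:
        return "mpi"
    return "private" if has_private else "public"
```

    For example, in a topology with a private network but no dedicated MPI network, `network_for("mpi", True, False)` yields `"private"`, matching the second fallback rule above.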

    Using a separate network to carry intra-cluster and MPI traffic between compute nodes and the head node will improve cluster performance and offload that traffic from the public network. Access from compute nodes to the public network can still be achieved by using NAT services on the head node.

    Four of the five network scenarios supported by the Compute Cluster Pack offer varying degrees of performance improvements attained by using one or more private networks to support cluster communications. Each scenario also provides a different degree of accessibility to the compute nodes from the public network. In the fifth scenario, all nodes comprising the cluster are attached to the public network. This scenario offers the greatest access to each compute node but also creates the heaviest network traffic demands on the public network.

    For detailed information on network topology scenarios, see “Technical Overview.”


Getting Started

    Installing Windows Compute Cluster Server 2003 involves installing the Windows Server 2003, Compute Cluster Edition operating system (on CD1) and the Compute Cluster Pack (on CD2).

    Setting up a Cluster

    Setting up a computing cluster with Windows Compute Cluster Server 2003 begins with confirming that the computer selected to be the head node of the cluster meets the minimum hardware requirements. For more information, see “Hardware Requirements.” The next step is

    to install the operating system and other prerequisite software, and then install the Compute Cluster Pack software. Configuring the network and RIS (where appropriate), and then adding nodes and users are the final steps.

    Create a head node

    1. Install the operating system using the Windows Server 2003, Compute Cluster Edition disk.

    2. If RIS will be used to create compute nodes, create a second disk volume (for example, D:\) that RIS can use while installing the operating system, because RIS cannot be installed on a system partition.

    3. Join the computer to an existing Active Directory domain (recommended), or create a new domain for the cluster by running DCpromo.exe and establishing trust relationships between domains as necessary.

    4. Eject the Windows Server 2003, Compute Cluster Edition CD and insert the Compute Cluster Pack CD.

    5. Configure the head node as explained in the following procedures.

    Note: You can also create a head node on a computer running a supported operating system by using the Compute Cluster Pack CD alone. The Compute Cluster Pack will not install if the computer used for the head node is not a member of an Active Directory domain.

    Install the Compute Cluster Pack

    1. Insert CD2, the Compute Cluster Pack CD, into the CD-ROM drive of the computer; Setup.exe should run automatically.

    2. In the Microsoft Compute Cluster Pack Installation Wizard, select one or more of the three installation options, as shown in Figure 1. Then click Next.

