Survey of Distributed File Systems

By Samuel Turner, 2014-10-25 23:03

    1. Coda

    Table of Contents

    ; Introduction

    ; Publications


     The Coda distributed file system is a state-of-the-art experimental file system developed in the group of M. Satyanarayanan at Carnegie Mellon University. Numerous people contributed to Coda, which now incorporates many features not found in other systems:

    ; Mobile Computing

    o disconnected operation for mobile clients

    o reintegration of data from disconnected clients

    o bandwidth adaptation

    ; Failure Resilience

    o read/write replication servers

    o resolution of server/server conflicts

    o handling of network failures which partition the servers

    o handling of disconnection of clients

    ; Performance and scalability

    o client-side persistent caching of files, directories and attributes for high performance


    o write back caching

    ; Security

    o Kerberos-like authentication

    o access control lists (ACLs)

    ; Well defined semantics of sharing

    ; Freely available source code
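
    The disconnected operation and reintegration features listed above can be illustrated with a minimal sketch. It is not Coda's implementation: the Server and Client classes, the per-file version counter, and the conflict rule (a file counts as conflicting if its server version changed while the client was offline) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical server: file contents plus a version counter per path."""
    files: dict = field(default_factory=dict)      # path -> bytes
    versions: dict = field(default_factory=dict)   # path -> int

    def store(self, path, data):
        self.files[path] = data
        self.versions[path] = self.versions.get(path, 0) + 1


@dataclass
class Client:
    """Hypothetical client with a cache and a log of writes made while offline."""
    server: Server
    connected: bool = True
    cache: dict = field(default_factory=dict)      # path -> (data, version seen)
    log: dict = field(default_factory=dict)        # path -> (data, base version)

    def read(self, path):
        if self.connected:
            self.cache[path] = (self.server.files[path], self.server.versions[path])
        return self.cache[path][0]                 # served from the cache when offline

    def write(self, path, data):
        if self.connected:
            self.server.store(path, data)
            self.cache[path] = (data, self.server.versions[path])
        else:
            base = self.cache.get(path, (None, 0))[1]
            self.log[path] = (data, base)          # keep only the latest offline write
            self.cache[path] = (data, base)

    def reintegrate(self):
        """Replay logged writes; return paths whose server copy changed meanwhile."""
        self.connected = True
        conflicts = []
        for path, (data, base) in self.log.items():
            if self.server.versions.get(path, 0) != base:
                conflicts.append(path)             # left for conflict resolution
            else:
                self.server.store(path, data)
                self.cache[path] = (data, self.server.versions[path])
        self.log.clear()
        return conflicts
```

    In this sketch a client keeps reading and writing cached files after connected is set to False, and reintegrate() pushes the logged writes back, reporting any file that another client modified in the meantime instead of overwriting it.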

    Coda was originally implemented on Mach 2.6 and has recently been ported to Linux, NetBSD and FreeBSD. Michael Callahan ported a large portion of Coda to Windows 95, and we are studying Windows NT to understand the feasibility of porting Coda to NT. Currently, our efforts are on ports and on making the system more robust. A few new features are being implemented (write-back caching and cells, for example), and in several areas, components of Coda are being reorganized.


    Publications

    ; Braam, P. J. The Coda Distributed File System. Linux Journal, #50, June 1998.

    ; Satyanarayanan, M. Fundamental Challenges in Mobile Computing. Fifteenth ACM Symposium on Principles of Distributed Computing, May 1996, Philadelphia, PA.

    ; Satyanarayanan, M. Mobile Information Access. IEEE Personal Communications, Vol. 3, No. 1, February 1996.

    ; Noble, B., Satyanarayanan, M. A Research Status Report on Adaptation for Mobile Data Access. SIGMOD Record, Vol. 24, No. 4, December 1995.

    ; Satyanarayanan, M. Scalable, Secure, and Highly Available Distributed File Access. IEEE Computer, Vol. 23, No. 5, May 1990.

    ; Satyanarayanan, M. Coda: A Highly Available File System for a Distributed Workstation Environment. Proceedings of the Second IEEE Workshop on Workstation Operating Systems, Sep. 1989, Pacific Grove, CA.

    ; Satyanarayanan, M. Autonomy or Interdependence in Distributed Systems? Third ACM SIGOPS European Workshop, Sep. 1988, Cambridge, England.

    ; Satyanarayanan, M., Kistler, J.J., Siegel, E.H. Coda: A Resilient Distributed File System. IEEE Workshop on Workstation Operating Systems, Nov. 1987, Cambridge, MA.

2. Distributed File System (Microsoft)

    Table of Contents

    ; Introduction

    ; DFS Terminology


    DFS allows administrators to group shared folders located on different servers by transparently connecting them to one or more DFS namespaces. A DFS namespace is a virtual view of shared folders in an organization. Using the DFS tools, an administrator selects which shared folders to present in the namespace, designs the hierarchy in which those folders appear, and determines the names that the shared folders show in the namespace. When a user views the namespace, the folders appear to reside on a single, high-capacity hard disk. Users can navigate the namespace without needing to know the server names or shared folders hosting the data. DFS also provides other benefits, including the following:

    ; Simplified data migration

    DFS simplifies the process of moving data from one file server to another.

    ; Increased availability of file server data

    In the event of a server failure, DFS refers client computers to the next available server,

    so users can continue to access shared folders without interruption.

    ; Load sharing

    DFS provides a degree of load sharing by mapping a given logical name to shared folders

    on multiple file servers.

    ; Security integration

    Administrators do not need to configure additional security for DFS namespaces

    because file and folder security is enforced by the existing NTFS file system and shared

    folder permissions on each target.

    DFS Terminology

     The following terms are used to describe the basic components of DFS:

    ; DFS namespace

    A virtual view of shared folders on different servers as provided by DFS. A DFS

    namespace consists of a root and many links and targets. The namespace starts with a

    root that maps to one or more root targets. Below the root are links that map to their

    own targets.

    ; DFS link

    A component in a DFS path that lies below the root and maps to one or more link targets.


    ; DFS path

    Any Universal Naming Convention (UNC) path that starts with a DFS root.

    ; DFS root

    The starting point of the DFS namespace. The root is often used to refer to the

    namespace as a whole. A root maps to one or more root targets, each of which

    corresponds to a shared folder on a separate server. The DFS root must reside on an

    NTFS volume. A DFS root has one of the following formats: \\ServerName\RootName or


    ; domain-based DFS namespace

    A DFS namespace that has configuration information stored in Active Directory. The

    path to access the root or a link starts with the host domain name. A domain-based DFS

    root can have multiple root targets, which offers fault tolerance and load sharing.

    ; link referral

    A type of referral that contains a list of link targets for a particular link.

    ; link target

    The mapping destination of a link. A link target can be any UNC path. For example, a link

    target could be a shared folder or another DFS path.

    ; Referral

    A list of targets, transparent to the user, which a DFS client receives from DFS when the

    user is accessing a root or a link in the DFS namespace. The referral information is

    cached on the DFS client for a time period specified in the DFS configuration.

    ; root referral

    A type of referral that contains a list of root targets for a particular root.

    ; root target

    A physical server that hosts a DFS namespace. A domain-based DFS root can have

    multiple root targets, whereas a stand-alone DFS root can only have one root target.

    Root targets are also called root servers.

    ; stand-alone DFS namespace

    A DFS namespace whose configuration information is stored locally in the registry of the

    root server. The path to access the root or a link starts with the root server name. A

    stand-alone DFS root has only one root target. Stand-alone roots are not fault tolerant;

    when the root target is unavailable, the entire DFS namespace is inaccessible. You can

    make stand-alone DFS roots fault tolerant by creating them on server clusters.
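
    The relationship between a namespace, its links, link targets and referrals can be sketched in a few lines. This is an illustrative model only, not the Windows DFS API: the Namespace class, the resolve function, the is_available callback and the server names are all hypothetical, and the case-insensitive matching that real UNC paths need is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Namespace:
    """Hypothetical model of a DFS namespace: a root plus links mapped to targets."""
    root: str                                   # UNC path of the DFS root
    links: dict = field(default_factory=dict)   # link name -> list of UNC targets

    def add_link(self, name, targets):
        self.links[name] = list(targets)

    def referral(self, dfs_path):
        """Return (link targets, remaining path) for a DFS path below a link."""
        prefix = self.root + "\\"
        if not dfs_path.startswith(prefix):
            raise ValueError("not a path in this namespace")
        link, _, rest = dfs_path[len(prefix):].partition("\\")
        return self.links[link], rest


def resolve(namespace, dfs_path, is_available):
    """Pick the first available target from the referral, as a client might."""
    targets, rest = namespace.referral(dfs_path)
    for target in targets:
        if is_available(target):
            return target + ("\\" + rest if rest else "")
    raise ConnectionError("no target available for " + dfs_path)


ns = Namespace(r"\\example.com\Public")
ns.add_link("Software", [r"\\fs1\Apps", r"\\fs2\Apps"])
path = r"\\example.com\Public\Software\tools\setup.exe"
print(resolve(ns, path, lambda target: True))
```

    The user only ever sees the DFS path; if the first target is unavailable, the same referral yields the next one, which is the fault tolerance and load sharing behaviour described above.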

3. Fraunhofer Parallel File System (FhGFS)

    Table of Contents

    ; Introduction


     Fraunhofer Parallel File System (FhGFS) is the new parallel file system from the Fraunhofer Competence Center for High Performance Computing. FhGFS is written from scratch and incorporates results from our experience with existing systems. FhGFS is a fully POSIX-compliant, scalable file system with features such as:

    ; Distributed metadata:

    Although parallel file systems usually distribute the file contents over multiple storage

    nodes, the metadata is often bound to single nodes. This leads to performance

    bottlenecks and limited fault tolerance. FhGFS distributes the metadata across all the

    available storage nodes in a special way that keeps the lookup time at a minimum.

    ; Easy installation:

    FhGFS requires no kernel patches, is able to connect storage nodes and servers with

    zero-config and allows you to add more clients and storage nodes to the running system

    whenever you want.

    ; Support for high performance technologies:

    FhGFS is built on a scalable multithreaded architecture with native InfiniBand support.

    Storage nodes can serve InfiniBand and Ethernet clients at the same time and

    automatically switch to a redundant connection path in case any of them fails.
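
    The text above does not say how FhGFS places metadata, so the following is only a sketch of one common technique for spreading metadata over several nodes without a central lookup: hashing the parent directory path. The function name, the node names and the choice of SHA-256 are assumptions for illustration, not FhGFS behaviour.

```python
import hashlib


def metadata_node(path, nodes):
    """Pick the node holding metadata for `path` by hashing its parent directory.

    Illustration only: the actual FhGFS placement scheme is not described
    in the source text.
    """
    parent = path.rsplit("/", 1)[0] or "/"
    digest = hashlib.sha256(parent.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


nodes = ["meta01", "meta02", "meta03"]   # hypothetical node names
print(metadata_node("/home/alice/data.bin", nodes))
```

    With a scheme like this, any client can compute the responsible node locally, and entries of the same directory land on the same node. A plain modulo, however, reshuffles most placements when nodes are added, so a production system would need something more elaborate.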

4. Lustre

    Table of Contents

    ; Introduction

    ; Architecture

    ; Features and Benefits

    ; Publications


     Lustre is an object-based, distributed file system, generally used for large-scale cluster computing. The name Lustre is a blend of the words Linux and cluster. The project aims to provide a file system for clusters of tens of thousands of nodes with petabytes of storage capacity, without compromising speed or security. Lustre is available under the GNU GPL.

    Lustre file systems can support up to tens of thousands of client systems, petabytes (PBs) of storage and hundreds of gigabytes per second (GB/s) of I/O throughput. Businesses ranging from Internet service providers to large financial institutions deploy Lustre file systems in their data centers. Due to the high scalability of Lustre file systems, Lustre deployments are popular in the oil and gas, manufacturing, rich media and finance sectors.


    Architecture

     A Lustre file system has three major functional units:

    ; A single metadata target (MDT) per filesystem that stores metadata, such as filenames,

    directories, permissions, and file layout, on the metadata server (MDS)

    ; One or more object storage targets (OSTs) that store file data on one or more object

    storage servers (OSSes). Depending on the server’s hardware, an OSS typically serves

    between two and eight targets, each target a local disk filesystem up to 8 terabytes (TBs)

    in size. The capacity of a Lustre file system is the sum of the capacities provided by the targets.


    ; Client(s) that access and use the data. Lustre presents all clients with standard POSIX

    semantics and concurrent read and write access to the files in the filesystem.

    The MDT, OST, and client can be on the same node or on different nodes, but in typical installations, these functions are on separate nodes with two to four OSTs per OSS node communicating over a network. Lustre supports several network types, including InfiniBand, TCP/IP on Ethernet, Myrinet, Quadrics, and other proprietary technologies. Lustre can take advantage of remote direct memory access (RDMA) transfers, when available, to improve throughput and reduce CPU usage.

    The storage attached to the servers is partitioned, optionally organized with logical volume management (LVM) and/or RAID, and formatted as file systems. The Lustre OSS and MDS servers read, write, and modify data in the format imposed by these file systems.

    An OST is a dedicated filesystem that exports an interface to byte ranges of objects for read/write operations. An MDT is a dedicated filesystem that controls file access and tells clients which object(s) make up a file. MDTs and OSTs currently use a modified version of ext3 to store data. In the future, Sun's ZFS/DMU will also be used to store data.

    When a client accesses a file, it completes a filename lookup on the MDS. As a result, a file is created on behalf of the client or the layout of an existing file is returned to the client. For read or write operations, the client then passes the layout to a logical object volume (LOV), which maps the offset and size to one or more objects, each residing on a separate OST. The client then locks the file range being operated on and executes one or more parallel read or write operations directly to the OSTs. With this approach, bottlenecks for client-to-OST communications are eliminated, so the total bandwidth available for the clients to read and write data scales almost linearly with the number of OSTs in the filesystem.
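
    The mapping from a file offset and size to objects can be sketched as follows, assuming simple round-robin (RAID-0 style) striping. The function name and the stripe_size and stripe_count parameters are illustrative assumptions; the sketch is not taken from the Lustre source and ignores locking and the actual I/O.

```python
def map_to_objects(offset, size, stripe_size, stripe_count):
    """Split a file byte range into (object_index, object_offset, length) pieces.

    Assumes round-robin striping: stripe k of the file lives in object
    k % stripe_count, at object offset (k // stripe_count) * stripe_size.
    """
    pieces = []
    end = offset + size
    while offset < end:
        stripe = offset // stripe_size           # which stripe of the file
        within = offset % stripe_size            # position inside that stripe
        length = min(stripe_size - within, end - offset)
        object_index = stripe % stripe_count
        object_offset = (stripe // stripe_count) * stripe_size + within
        pieces.append((object_index, object_offset, length))
        offset += length
    return pieces


# A 3 MiB read starting at 1 MiB, with 1 MiB stripes over 4 objects.
MIB = 1024 * 1024
for piece in map_to_objects(1 * MIB, 3 * MIB, stripe_size=MIB, stripe_count=4):
    print(piece)
```

    Each piece targets a different object, so a client can issue the requests in parallel to the OSTs holding those objects, which is why aggregate bandwidth grows with the number of OSTs.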

    Clients do not directly modify the objects on the OST filesystems, but, instead, delegate this task to OSSes. This approach ensures scalability for large-scale clusters and supercomputers, as well as improved security and reliability. In contrast, shared block-based filesystems such as Global File System and OCFS must allow direct access to the underlying storage by all of the clients in the filesystem and risk filesystem corruption from misbehaving/defective clients.

    Features and Benefits

    Lustre's unprecedented scalability, bulletproof reliability, and proven performance help you meet the uptime requirements of your most demanding business and national-security applications.

    Key Benefits

    ; Unparalleled scalability