Swift RESAR Project Part 3

This continues the RESAR saga. Last time I explained the project design; now I’d like to focus on how RESAR was implemented in Swift.

Two Available Swift Approaches

During our research, we realized that there were essentially two approaches available. The first emphasized leveraging existing code from the Python community. Swift is implemented in Python, so the Swift RESAR project is implemented in Python as well. The primary goal of the first approach was to minimize the amount of new code and thereby get results quickly. The second approach, by contrast, emphasized performance: we were willing to write new code as long as it resulted in better RESAR performance. The second approach will be presented in a future blog post.

Swift RESAR Database

I will now describe the first approach to Swift RESAR. During this research, we quickly realized that the RESAR data (devices, disklets, reliability groups) needed to be stored in a database, which is a well-established way to store persistent data that must be updated and queried. For the first approach we chose MySQL because it is well established, free, easy to install, and has a Python interface.

The MySQL RESAR database consists of five tables: MetaData, Device, Disklet, ReliabilityGroup, and ReliabilityGroupsDisklets.

The MetaData Table is defined as:

    CreateTime TIMESTAMP NOT NULL
    DiskletSize INT NOT NULL

The MetaData Table contains a single row and is used to store the attributes:

  • CreateTime – the date and time that the database was created.
  • DiskletSize – the disklet size for the cluster.

The MetaData Table is created when the database is initialized and is not changed afterward.
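
As a rough illustration of the single-row MetaData Table, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a database server. The function names `init_metadata` and `get_disklet_size` and the disklet size of 1024 are hypothetical, chosen only for this example.

```python
import sqlite3


def init_metadata(conn, disklet_size):
    """Create the MetaData table and insert its single row."""
    conn.execute(
        "CREATE TABLE MetaData ("
        " CreateTime TIMESTAMP NOT NULL,"
        " DiskletSize INT NOT NULL)"
    )
    # sqlite3 uses "?" placeholders; MySQL drivers typically use "%s".
    conn.execute(
        "INSERT INTO MetaData (CreateTime, DiskletSize)"
        " VALUES (CURRENT_TIMESTAMP, ?)",
        (disklet_size,),
    )
    conn.commit()


def get_disklet_size(conn):
    """Read the cluster-wide disklet size from the single MetaData row."""
    row = conn.execute("SELECT DiskletSize FROM MetaData").fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
init_metadata(conn, disklet_size=1024)  # 1024 is an arbitrary example value
print(get_disklet_size(conn))
```

Because the row is written once at initialization, the rest of the code only ever needs to read it.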

The Device Table is defined as:

    ID INT NOT NULL AUTO_INCREMENT
    CreateTime TIMESTAMP NOT NULL
    HostName TEXT NOT NULL
    DeviceName TEXT NOT NULL
    DeviceStart INT NOT NULL
    DeviceSize INT NOT NULL
    InUse INT NOT NULL
    NumDisklets INT NOT NULL
    UNIQUE (ID), PRIMARY KEY (ID)
    INDEX (HostName(20))
    INDEX (DeviceName(20))

The Device Table contains an entry for each device in a cluster. Its attributes are:

  • ID – unique table identifier.
  • CreateTime – the date and time that the device was created.
  • HostName – the name of the host that contains this device.
  • DeviceName – the device name.
  • DeviceStart – starting position in the device.
  • DeviceSize – the usable size of the device.
  • InUse – is this device enabled?
  • NumDisklets – number of disklets in this device.

The ID attribute is required so that each Device Table row can be referenced from the Disklet and ReliabilityGroupsDisklets Tables. A device is uniquely identified by the tuple: HostName, DeviceName. To speed up device lookup, HostName and DeviceName are indexed. The InUse attribute allows a device to be taken out of service for maintenance purposes.
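
The Device Table operations described above (registering a device, finding it by HostName and DeviceName, and taking it out of service) might look roughly like the following sketch. It again uses sqlite3 as a stand-in for MySQL. The helper names, index names, host name "node1", device name "sdb", and the sizes are all hypothetical.

```python
import sqlite3

# SQLite stand-in for the MySQL Device table. In MySQL the ID column would
# use AUTO_INCREMENT and the indexes would be declared inline with a prefix
# length, e.g. INDEX (HostName(20)).
SCHEMA = """
CREATE TABLE Device (
    ID          INTEGER PRIMARY KEY AUTOINCREMENT,
    CreateTime  TIMESTAMP NOT NULL,
    HostName    TEXT NOT NULL,
    DeviceName  TEXT NOT NULL,
    DeviceStart INT NOT NULL,
    DeviceSize  INT NOT NULL,
    InUse       INT NOT NULL,
    NumDisklets INT NOT NULL
);
CREATE INDEX DeviceHostName ON Device (HostName);
CREATE INDEX DeviceDeviceName ON Device (DeviceName);
"""


def add_device(conn, host, name, start, size, num_disklets):
    """Insert an enabled device and return its generated ID."""
    cur = conn.execute(
        "INSERT INTO Device (CreateTime, HostName, DeviceName, DeviceStart,"
        " DeviceSize, InUse, NumDisklets)"
        " VALUES (CURRENT_TIMESTAMP, ?, ?, ?, ?, 1, ?)",
        (host, name, start, size, num_disklets),
    )
    return cur.lastrowid


def find_device_id(conn, host, name):
    """Look a device up by its identifying (HostName, DeviceName) tuple."""
    row = conn.execute(
        "SELECT ID FROM Device WHERE HostName = ? AND DeviceName = ?",
        (host, name),
    ).fetchone()
    return row[0] if row else None


def set_in_use(conn, device_id, in_use):
    """Enable a device, or take it out of service for maintenance."""
    conn.execute(
        "UPDATE Device SET InUse = ? WHERE ID = ?",
        (1 if in_use else 0, device_id),
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
dev_id = add_device(conn, "node1", "sdb", 0, 1000000, 100)
set_in_use(conn, dev_id, False)  # take the device out of service
```

Note that the table definition itself does not enforce uniqueness of (HostName, DeviceName); in this sketch that is left to the calling code.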

The Disklet Table is defined as:

    ID INT NOT NULL AUTO_INCREMENT
    DeviceID INT NOT NULL
    DeviceIndex INT NOT NULL
    Type TEXT NOT NULL
    UNIQUE (ID), PRIMARY KEY (ID)
    UNIQUE (DeviceID, DeviceIndex)
    INDEX (DeviceID)

The Disklet Table is used to keep track of disklet use in the cluster. Each device in the cluster thus results in multiple Disklet Table entries. The Disklet Table attributes are:

  • ID – unique table identifier.
  • DeviceID – Device Table id.
  • DeviceIndex – device offset.
  • Type – {none, parity, data}.

The ID attribute is required so that each Disklet Table row can be referenced from the ReliabilityGroupsDisklets Table. A disklet is uniquely identified by the tuple: DeviceID, DeviceIndex. To speed up lookup of a device's disklets, DeviceID is indexed. The Type attribute identifies whether a disklet is used for “parity” or “data”; it is never used for both. If a disklet is not used by a reliability group, then its type is irrelevant and the Type attribute is set to “none”.
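
To make the Disklet Table rules concrete, here is a small sketch, again with sqlite3 standing in for MySQL. It assumes that a device's disklets are numbered 0 to NumDisklets - 1 and start out with Type “none”; that numbering, the helper names, and the index name are assumptions made for this example, not a description of the actual implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Disklet (
    ID          INTEGER PRIMARY KEY AUTOINCREMENT,
    DeviceID    INT NOT NULL,
    DeviceIndex INT NOT NULL,
    Type        TEXT NOT NULL,
    UNIQUE (DeviceID, DeviceIndex)
);
CREATE INDEX DiskletDeviceID ON Disklet (DeviceID);
""")


def add_disklets(conn, device_id, num_disklets):
    """Create one unused ('none') disklet row per index on the device."""
    conn.executemany(
        "INSERT INTO Disklet (DeviceID, DeviceIndex, Type)"
        " VALUES (?, ?, 'none')",
        [(device_id, index) for index in range(num_disklets)],
    )


def free_disklets(conn, device_id):
    """Return the indexes of disklets not used by any reliability group."""
    rows = conn.execute(
        "SELECT DeviceIndex FROM Disklet"
        " WHERE DeviceID = ? AND Type = 'none' ORDER BY DeviceIndex",
        (device_id,),
    )
    return [row[0] for row in rows]


add_disklets(conn, device_id=1, num_disklets=4)
# Mark one disklet as holding parity for some reliability group.
conn.execute(
    "UPDATE Disklet SET Type = 'parity'"
    " WHERE DeviceID = 1 AND DeviceIndex = 2"
)

# The UNIQUE (DeviceID, DeviceIndex) constraint rejects a second row for
# the same position on the same device.
try:
    conn.execute(
        "INSERT INTO Disklet (DeviceID, DeviceIndex, Type)"
        " VALUES (1, 0, 'data')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```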

The ReliabilityGroup Table is defined as:

    ID INT NOT NULL AUTO_INCREMENT
    CreateTime TIMESTAMP NOT NULL
    NumDisklets INT NOT NULL
    InUse INT NOT NULL
    UNIQUE (ID), PRIMARY KEY (ID)

The ReliabilityGroup Table is used to manage reliability groups in the cluster. Its attributes are:

  • ID – unique table identifier.
  • CreateTime – the date and time that the reliability group was created.
  • NumDisklets – number of disklets in the reliability group.
  • InUse – is this reliability group enabled?

The ID attribute is required so that each ReliabilityGroup Table row can be referenced from the ReliabilityGroupsDisklets Table. The InUse attribute allows a reliability group to be taken out of service for maintenance purposes.

The ReliabilityGroupsDisklets Table is defined as:

    ID INT NOT NULL AUTO_INCREMENT
    ReliabilityGroupID INT NOT NULL
    DiskletID INT NOT NULL
    DiskletType TEXT NOT NULL
    DeviceID INT NOT NULL
    UNIQUE (ID), PRIMARY KEY (ID)
    UNIQUE (ReliabilityGroupID, DiskletID, DeviceID)
    INDEX (ReliabilityGroupID)
    INDEX (DiskletID)
    INDEX (DeviceID)

The ReliabilityGroupsDisklets Table is used to map disklets to reliability groups. Its attributes are:

  • ID – unique table identifier.
  • ReliabilityGroupID – ReliabilityGroup Table id.
  • DiskletID – Disklet Table id.
  • DiskletType – {parity, data}.
  • DeviceID – Device Table id.

The ID attribute is required so that each ReliabilityGroupsDisklets Table row can be externally referenced. The DiskletType attribute identifies whether a disklet is used for “parity” or “data”; it is never used for both. A ReliabilityGroupsDisklets Table entry is uniquely identified by the tuple: ReliabilityGroupID, DiskletID, DeviceID. To speed up table lookup, ReliabilityGroupID, DiskletID, and DeviceID are indexed.
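
The mapping role of the ReliabilityGroupsDisklets Table can be illustrated with a join that lists where each member of a reliability group lives. This is only a sketch: sqlite3 stands in for MySQL, the Device and Disklet Tables are trimmed to the columns the query needs, and the function name `group_members`, the index names, and the sample hosts and devices are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Trimmed stand-ins: only the columns needed by the query below.
CREATE TABLE Device (
    ID INTEGER PRIMARY KEY,
    HostName TEXT NOT NULL,
    DeviceName TEXT NOT NULL
);
CREATE TABLE Disklet (
    ID INTEGER PRIMARY KEY,
    DeviceID INT NOT NULL,
    DeviceIndex INT NOT NULL,
    Type TEXT NOT NULL,
    UNIQUE (DeviceID, DeviceIndex)
);
CREATE TABLE ReliabilityGroupsDisklets (
    ID INTEGER PRIMARY KEY AUTOINCREMENT,
    ReliabilityGroupID INT NOT NULL,
    DiskletID INT NOT NULL,
    DiskletType TEXT NOT NULL,
    DeviceID INT NOT NULL,
    UNIQUE (ReliabilityGroupID, DiskletID, DeviceID)
);
CREATE INDEX RgdGroup ON ReliabilityGroupsDisklets (ReliabilityGroupID);
CREATE INDEX RgdDisklet ON ReliabilityGroupsDisklets (DiskletID);
CREATE INDEX RgdDevice ON ReliabilityGroupsDisklets (DeviceID);

-- Made-up sample data: one group with two data disklets and one parity.
INSERT INTO Device VALUES (1, 'node1', 'sdb'), (2, 'node2', 'sdb'),
                          (3, 'node3', 'sdb');
INSERT INTO Disklet VALUES (1, 1, 0, 'data'), (2, 2, 0, 'data'),
                           (3, 3, 0, 'parity');
INSERT INTO ReliabilityGroupsDisklets
    (ReliabilityGroupID, DiskletID, DiskletType, DeviceID)
VALUES (1, 1, 'data', 1), (1, 2, 'data', 2), (1, 3, 'parity', 3);
""")


def group_members(conn, group_id):
    """List (host, device, index, type) for each disklet in a group."""
    rows = conn.execute(
        "SELECT d.HostName, d.DeviceName, k.DeviceIndex, m.DiskletType"
        " FROM ReliabilityGroupsDisklets AS m"
        " JOIN Disklet AS k ON k.ID = m.DiskletID"
        " JOIN Device AS d ON d.ID = m.DeviceID"
        " WHERE m.ReliabilityGroupID = ?"
        " ORDER BY d.HostName, k.DeviceIndex",
        (group_id,),
    )
    return rows.fetchall()


for member in group_members(conn, 1):
    print(member)
```

Storing DeviceID directly in the mapping table lets the query reach the Device Table without going through the Disklet Table first.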

Indexing many of the database attributes is essential. An index lets the database find matching rows without scanning the entire table, which speeds up table queries.
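
The effect of an index on a lookup can be observed by asking the database for its query plan. The sketch below does this with sqlite3 and its EXPLAIN QUERY PLAN statement, purely as a stand-in. MySQL has its own EXPLAIN with a different output format, and the exact plan text varies between SQLite versions. The index name `DeviceHostName` and the helper `query_plan` are hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Device ("
    " ID INTEGER PRIMARY KEY,"
    " HostName TEXT NOT NULL,"
    " DeviceName TEXT NOT NULL)"
)

LOOKUP = "SELECT ID, DeviceName FROM Device WHERE HostName = ?"

# Without an index on HostName, the whole table has to be scanned.
plan_before = query_plan(conn, LOOKUP, ("node1",))

conn.execute("CREATE INDEX DeviceHostName ON Device (HostName)")

# With the index, the planner can search it instead of scanning.
plan_after = query_plan(conn, LOOKUP, ("node1",))

print(plan_before)
print(plan_after)
```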

I have now described how RESAR was implemented in Swift using MySQL. My next blog post will present performance measurements.