Previously I presented some background on Swift. I’d like to continue by focusing on The RESAR Project. This project leverages previous work done by Ignacio Corderi on the subject of Cloud Device Management. He presented his research in a paper: “RESAR storage: a system for two-failure tolerant, self-adjusting million disk storage clusters”. This paper was co-authored by Dr. Darrell D. E. Long (a professor at the University of California, Santa Cruz), Dr. Thomas. M. Kroeger (of Sandia National Laboratories) and by Dr. Thomas Schwarz (a professor at Santa Clara University).
Previously I presented a new project that I am working on for my PhD: “Swift RESAR“. To better understand it, here’s some background on Swift. Swift is a Cloud Storage implementation, a free open source software released under the terms of the Apache License. This project is managed by the OpenStack Foundation, a non-profit corporation established in September 2012. It is gaining wide popularity in that over 150 companies are currently participating in this project.
I would like to describe the latest project that I am working on for my PhD thesis program. This project involves a number of participants, so I will first provide short biographies.
First off, we have my thesis adviser (at Santa Clara University) Dr. Ahmed Amer. Dr. Amer received his Doctorate in Computer Science from the University of California, Santa Cruz. He is now an Associate Professor at both UC Santa Cruz and Santa Clara University. Ignacio Corderi is one of Dr. Amer’s PhD students at UC Santa Cruz. Recently Ignacio was the lead author of a rather interesting paper: “RESAR Storage: a System for Two-Failure Tolerant, Self-Adjusting Million Disk Storage Clusters.” Additional authors of this paper are:
In my first Data Mining blog post, I defined the concept of the Data Mining Stack, that the Data Mining Stack consists of three layers. The top layer is comprised of Data Mining Algorithms. The middle layer consists of the OLAP HyperCube. And the bottom layer is defined by the Ingest DataBase.
In the second Data Mining post, I discussed how the Data Mining Stack fits into the Cloud. And we want to implement it in the Cloud in the first place.
In my last Data Mining post, I defined the concept of the Data Mining Stack and that it consists of three layers. The top layer is comprised of Data Mining Algorithms. The middle layer consists of the OLAP HyperCube, and the bottom layer is defined by the Ingest DataBase. So how does the Data Mining Stack fit into the Cloud? And why would we want to implement it in the Cloud in the first place?
Data Mining in the cloud is a hot topic nowadays. As such, I’d like to define the Data Mining Stack and how it fits in the Cloud.
Data Mining Stack & The Cloud
Firstly, I will define the Data Mining Stack to consist of three layers: Algorithms (Heuristics), Hyper Cube, and Ingest DataBase. The first layer (Algorithms) consists of mathematical methods that draw conclusions from data sets. Some of the more common methods are: Gaussian Elimination, Neural Network, and Rule Association. It is of the utmost importance to note that deriving inferences is based on the field of statistics. And statistics gets its power from large data sets. It does not make sense to base decisions on small data sets. Small data sets are random and thus insignificant.
Memory Allocation (and subsequent deallocation) is central to process management. It is a very powerful mechanism. But great power also comes with a cost. Memory Allocation is easily abused and the results can be catastrophic. For the C programming language, Memory Allocation is performed by malloc() and free().
Processes are allowed to call malloc() and free() how often and when ever they choose. However, there are some dangers to avoid.
After 20+ years in software engineering, the transition to the Cloud Storage is creating some incredible opportunities. For example, the concept of “Path to Cloud” as apposed to “Path to Tape” is revolutionizing business. Traditionally, tape drives have been used for long-term backups. Government laws, like the Sarbanes Oxley Act of 2002, require certain types of data to be saved for 7 years. Using disk drives to fulfill this requirement would require a large number of disk drives, and therefore are prohibitively expensive.
Let’s spend some time on FIFO queue construction. A FIFO is a First In First Out queue. There are two methods of manipulation. The first is general purpose, in that it will work with any number of readers (consumers) and writers (producers). It does require a lock (also called a mutex or semaphore) to prevent concurrent threads from corrupting global data.
The second FIFO queue method does not employ locking; therefore it can only be used in the case of a single reader and writer. The trick to this new implementation is that we allow one slot in the queue to be wasted.
Last time I was comparing operating systems, namely RedHat vs. Windows. So what about free Unix? Is it still available and how useful is it? The answer is Ubuntu . Essentually Ubuntu is the latest and greatest free incantation of Unix and in short, it rocks.
It has a large, active user community and public developers, so it is vibrant and strong. Installing software on Ubuntu is easy via the aptget utility. To install the latest and greatest python use: “aptget install python pep8 pylint.” Nice and easy.
Let’s take a little time for operating system comparisons, specifically RedHat Linux, Windows, and Ubuntu. Let’s narrow the focus and concentrate on ease of use.
I have both a RedHat Linux desktop server and a Microsoft Windows desktop server at home. My family uses the Windows server, and I use the Linux server for my personnal research – PhD and such. I have tried to use the Windows server to some extent. Its graphics are nice, but it seems that Windows should be rebooted about once a week to keep the system sane. Otherwise, the server gets very slow. Perhaps memory fragmentation and garbage collection are a problem.
In my opinion, there’s a lack of respect for databases in the development community. I will present two data points to prove my point, including a MySQL example.
Data Point One
I first went to grad school at Washington State University in the early 80’s. When it came time to pick a Master’s Thesis, I was told, and I quote: “Thy shalt not do your Master’s Thesis on database.” “Database is a dead, full proved, subject.” Tell that to Oracle. Funny how I now work for Oracle. And my PhD thesis is in the area of database.
Back in May, I had the pleasure of attending the SSRC Retreat. The SSRC is short for System Services Resource Center. Sponsored by the UC Santa Cruz Computer Engineering department, the SSRC group was formed to enable Computer Engineering graduate students to form working relationships with companies and entities in the computer industry, both private and government. I got involved with the SSRC since my PhD thesis advisor (Dr. Ahmed Amer) received his PhD from UC Santa Cruz and he is currently an advisor to the SSRC.
The purpose of the SSRC Retreat is to provide a venue for graduate students to present their current work. I was very impressed by all the students’ presentations as each were interesting. However, I’d like to spotlight a just few that related to my perspective and experience.
Openstack Swift Storage Account Rings
The Account Ring is used by the Proxy nodes to determine where each account is stored. There is one and only one Account Ring for a given Swift Cluster. Input to the Account Ring is the tuple: (account name). The account name is then hashed, and the hash is used as an index into the Account Ring. The result is the three Storage nodes that contain a copy of the account data. All three copies are identical; so it does not matter which copy (Storage node) is used by the Proxy node. In practice, a specific Proxy node, given repeated requests for a specific account, will round robin the Storage node requests to help balance the load.
I would like to take this opportunity to describe how Openstack Swift stores data. Swift storage basically consists of four components: Ring, Database, Zones, and File system.
Openstack Swift Accounts, Containers & Objects
Openstack Swift data consists of Accounts, Containers, and Objects. A Swift item then, can be an account, a container, or an object. A Swift Cluster can have multiple accounts, an account can have multiple containers, and a container can have multiple objects. Containers cannot be nested. That is, a container can only contain objects, it can not contain other containers. So Swift storage is considered to be a flat file system.
I’m wrapping up a nine-part series on Symmetric Multi Processors (SMP). Following up on #8 the Random Back-off Problem which focused on a solution to isolating locks so that a cache line can only hold a single lock, I’ll conclude with another solution that completely solves the Thundering Herd problem in an optimal manner: Queued Locks.
I’ve shown you how a 32 bit cpu commonly uses an 8-word cache line. And we only want a single lock (or word) on a cache line. We can use the rest of the cache line to implement a Queued Lock.
Last time I presented a solution to the Thundering Herd problem: the Random Backoff, but noted a problem with solution. It is quite possible that when a cpu releases a contentious lock, waiting cpus will not wake up immediately. That is, it could take up to a second before a waiting lock tries to obtain the now-free lock. This solution is thus not optimal.
As a follow-up I’d like to present a SMP refinement that I believe solves this problem: isolating and separating all locks so that there is a single lock per cache line. Why do we do this?
In Part 6 I presented Symmetric Multi-Processor’s (smp) Thundering Herd problem; it’s the result of a lack of fairness in the mutex algorithm. This problem can occur when multiple cpus are using a lock under high contention. Next I’d like to share one of the smp solutions we used at Pyramid to solve this problem.
SMP Solution 1: Random Backoff
My last four blogs posts have presented:
- Symmetric Multi Processor (SMP) scaling problem.
- Solution as presented by Pyramid Technology.
- User space SMP problem.
- Various lock primitives.
It’s logical to now explore how to debug an SMP system. But first I need to present the classic locking problem as defined by Dijkstra.
My last three blog posts have presented the SMP (Symmetric Multi Processor scaling problem, Performance Solution as presented by Pyramid Technology, and the user space SMP problem. Now, I’d like to delve into the locks themselves. What actually is a lock?
There are essentually two types of SMP locks: mutexes and semaphores.
SMP Mutex Lock
A mutex (short for mutual exclusion), is a lock that waits for a field to be released before proceeding. In its simplist incantation, a mutex constantly checks the field in question. This type of lock thus uses the following alorithm: