Campus Compute Cooperative (CCC)

Welcome to the Campus Compute Cooperative.

Universities and other research organizations struggle to provide both the quantity and diversity of compute resources that their researchers need, when they need them. Purchasing resources to meet peak demand for all resource types is cost prohibitive for all but a few institutions. Renting capacity on commercial clouds is often seen as an alternative to owning, but commercial providers must be paid at market rates. The Campus Compute Cooperative (CCC) offers an alternative to purchasing capacity from commercial providers, delivering increased value to member institutions at reduced cost. Member institutions trade their resources with one another, both to meet local peak demand and to gain access to resource types that are unavailable on the local campus but available elsewhere. Commercial clouds are considered only when CCC resources are completely exhausted and high-priority demand still exceeds supply. The CCC builds upon the XSEDE Execution Management Services (EMS), the Global Federated File System (GFFS), and the Campus Bridging use cases.

Participating institutions have dual roles: consumers of resources when their own researchers use CCC resources, and producers of resources when CCC users run on theirs. To avoid the tragedy of the commons, in which everyone wants only to consume, each use of a member-owned resource results in a credit to the resource owner and a charge to the consumer, based on the quality of service (high, medium, low) and the particulars of the resource provided (speed, interconnection network, memory, etc.).

Background:

The cyberinfrastructure needs of Universities and other research organizations in the United States are rapidly increasing. These cyberinfrastructure needs include:

  1. Diverse computational resources. Different applications perform best on different computational platforms. For example, machine-learning applications are currently fastest on GPUs; large-scale astronomy simulations (MHD) have been tuned for traditional large-scale distributed memory machines; applications such as Galaxy work best with at least 0.5 TB memory; and many applications require only generic HTC computing resources. All but the largest organizations are unable to maintain a diverse set of state-of-the-art resources.
  2. On demand, elastic compute and storage capacity. For most research groups, cyberinfrastructure usage is bursty; a steady, low level of activity punctuated by periods of intense activity. As these periods of intense activity can benefit from a large number of resources, organizations tend either to over-provision or under-provision. In the former case, these 'extra' resources sit mostly idle, while in the latter case, a lack of resources slows down research progress.
  3. Secure data sharing. Even within a single campus, today's research infrastructures are siloed. Researchers on projects that span institutions or that rely on community data sets must be able to share data easily and effectively. While there are tools for moving data among organizations (e.g., GridFTP, scp), these tools have several shortcomings. First, they require researchers to explicitly manage data movement and worry about different versions of the data. Second, they require collaborators to have accounts on one another's infrastructure (or to create a shared "DMZ" infrastructure). Finally, existing applications cannot work directly with remote data without modification.

The Campus Compute Cooperative addresses these needs via a shared cyberinfrastructure. The CCC: i) connects and federates campus physical infrastructures; ii) uses market-based mechanisms for resource allocation and quality of service to avoid the tragedy of the commons; iii) leverages InCommon and local identity management systems to provide integrated cross-institution authentication and access control of resources; iv) facilitates secure data and storage sharing between institutions and research labs; v) offers "cloud bursting" to member institutions; and vi) provides paid-for, differentiated quality of service.

CCC Objectives:

The goal of the CCC is to improve and accelerate the scientific and engineering enterprise. Specifically,

Through increased efficiency, an improved value/cost ratio, for both institutions and the funding agencies.
Through on-demand resource provisioning, reduced time to insight for scientists.
Through easy-to-use tools, improved science via secure, shared computational and data resources. Both within and among institutions, such tools reduce the barriers to collaboration for researchers.

By knitting together human and machine resources at participating institutions, the CCC provides an alternative to purchasing capacity from commercial providers. This federated use not only provides increased value to member institutions at reduced cost, but it also maximizes the efficiency of already-owned resources at all institutions. In the context of the CCC, compute cycles of different types, storage, and data are traded among members, not only in order to meet periods of local peak demand, but also to provide access to resource types not available on the local campus. All participating institutions will have dual roles as consumers of resources and producers/providers of resources. To allow the federation to function as a market, an accounting system is employed so that when a member-owned resource is used, the resource owner receives a credit and the consumer is charged based on the quality of service (high, medium, low) and the particulars of the resource provided (speed, interconnection network, memory, etc.).
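As a concrete illustration of the accounting idea, the sketch below shows one way the credit/charge mechanism could work. Note that the class, the per-tier rates, and the institution names are invented for this sketch; they are not the CCC's actual accounting rules.

```python
# Hypothetical sketch of CCC-style accounting: the rates and QoS
# multipliers below are illustrative, not actual CCC policy.

# Allocation units charged per core-hour at each quality-of-service tier.
QOS_RATE = {"high": 3.0, "medium": 2.0, "low": 1.0}

class Ledger:
    """Tracks allocation-unit balances for member institutions."""

    def __init__(self, members):
        self.balance = {m: 0.0 for m in members}

    def record_job(self, consumer, provider, core_hours, qos):
        """Charge the consumer and credit the provider for one job."""
        cost = core_hours * QOS_RATE[qos]
        self.balance[consumer] -= cost
        self.balance[provider] += cost
        return cost

ledger = Ledger(["UVA", "NDSU"])
ledger.record_job(consumer="UVA", provider="NDSU", core_hours=100, qos="high")
print(ledger.balance)  # UVA debited 300 units, NDSU credited 300 units
```

Because every charge to a consumer is matched by an equal credit to a provider, the sum of all balances stays at zero: the market redistributes capacity without creating or destroying allocation units.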

The CCC combines four basic ideas into a production compute environment:

  1. Resource providers charge for the use of their resources and resource consumers pay for the use of resources. Buying and selling does not necessarily involve the exchange of real money. Instead, allocation units are exchanged between participating institutions. Charging for resources is not novel in academia; creating a market for resources in which actors are both buyers and sellers is.
  2. Users specify and pay for the quality of service they require. Quality of service attributes include the urgency of the job, whether the job can be pre-empted, the amount of memory required, particular interconnections or compute types such as GPUs, and so on. Users select, and pay for, the quality of service they desire. By allowing users to express the value of the job in terms of what they are willing to pay, we ensure that the CCC executes the most important job next, increasing overall institutional value.
  3. Resource federation, i.e., combining the resources of multiple institutions into a single, larger compute environment. This is akin to using an M/M/K queue versus K independent M/M/1 queues. By increasing the resource pool size, we can provide better quality of service.
  4. Open source, open standards. The implementation is open source and based on open standards. Researchers may develop new, improved, implementations that have different quality attributes such as performance, “quality”, robustness, and security properties.
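The pooling argument in point 3 can be checked with standard queueing formulas. The sketch below (assuming Poisson arrivals and exponential service, i.e., the M/M/k model named above) compares the mean queueing delay of k independent M/M/1 queues against one pooled M/M/k queue carrying the same total load, using the Erlang C formula; the arrival and service rates are illustrative.

```python
from math import factorial

def mm1_wait(lam, mu):
    """Mean time waiting in queue (excluding service) for an M/M/1 queue."""
    assert lam < mu, "queue must be stable"
    rho = lam / mu
    return rho / (mu - lam)

def erlang_c(k, a):
    """Erlang C: probability an arrival must wait in an M/M/k queue,
    where a = lam/mu is the offered load (requires a < k)."""
    top = (a ** k / factorial(k)) * (k / (k - a))
    bottom = sum(a ** n / factorial(n) for n in range(k)) + top
    return top / bottom

def mmk_wait(lam, mu, k):
    """Mean time waiting in queue for an M/M/k queue with total arrival rate lam."""
    assert lam < k * mu, "queue must be stable"
    return erlang_c(k, lam / mu) / (k * mu - lam)

# Four campuses, each with one server (mu = 1 job/hour) and arrivals
# at lam = 0.8 jobs/hour, i.e., 80% utilization at every campus.
lam, mu, k = 0.8, 1.0, 4
separate = mm1_wait(lam, mu)       # each campus queues alone
pooled = mmk_wait(k * lam, mu, k)  # the federation pools all four servers
print(f"separate M/M/1 wait: {separate:.2f} h; pooled M/M/4 wait: {pooled:.2f} h")
```

At the same 80% utilization, the pooled system cuts the mean wait from 4 hours to under 45 minutes, because an idle server at any campus can absorb a burst arriving at any other.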

The idea of sharing resources between institutions is not new, and such efforts have been attempted previously. However, due to technological advances, we believe that the CCC has a unique chance for success. Specifically, it is now possible to decouple the federation of the resource layer from the policy/human resource layers.

The team. Dr. Grimshaw was the co-architect of the XSEDE architecture and was deeply involved in the architectural responses to the XSEDE use cases, including the Campus Bridging Use Cases (CBUC). Dr. Jha is a leader in the scientific distributed computing community and is co-PI of the NSF Molecular Science Institute. Dr. Skow is the Executive Director of the Center for Computationally Assisted Science and Technology at North Dakota State University and has been involved with a number of US national and international organizations, including TeraGrid, Open Science Grid, Global Grid Forum, Research Data Alliance, and the Coalition for Academic Scientific Computation. In addition, there are the initial members of the CCC, shown below in no particular order.

Want to join the CCC?

Any faculty or researcher at a participating institution is invited to share resources of various types (compute resources, data resources, and group and identity resources). See here for more details on what kind of resources can be shared.

Start using the CCC.

1) The first step toward using the CCC is to get an identity that allows you to use CCC resources. To get an identity, or to bind an existing identity (such as an XSEDE identity) to the CCC, send email to your institutional representative with your name, email address, and department.

Institution                      Primary Institutional Contact
University of Virginia           Andrew Grimshaw
George Mason University          Jayshree Sarma
University of Nevada-Reno        Scotty Strachan
North Dakota State University    Dane Skow
Stanford                         Ruth Marinshaw
Rutgers                          Shantenu Jha / James Barr von Oehsen
Utah                             Tom Cheatham
University of Texas - Dallas     Jerry Perez
Texas Tech University            Alan Sill

2) After receiving your identity, decide whether you want to install the client or server version of Genesis II (the client and server software that implements the GFFS).

Client Version
The client version allows you to access data and compute resources in the CCC. This includes the ability to define and execute jobs on the CCC job queues and share files via the GFFS. You may access remote resources even if you are behind a NAT or firewall, as long as you are able to make outgoing connections to the Internet.
Server Version
The server version allows you to share (export) compute and data resources that are located on your machine/server (without copying them to the cloud). However, you must be sure that your machine has a public IP address and, if there is a firewall, you must be able to open a port in the firewall. Additionally, you should be aware that when you export resources, any changes made to them via the GFFS are propagated to the original files in their original location.

3) Once you have decided which version you want, proceed to the downloads page, select the version of Genesis II you want (client or server), and download the appropriate installer for your platform (Windows, Linux, Mac).

4) Install Genesis II on your machine. Genesis II can be installed via the command line or via the GUI installer.

Installation considerations and instructions

Help Resources

Running Jobs using CCC
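Jobs submitted to CCC queues are described in JSDL (the Job Submission Description Language, an Open Grid Forum standard used by the XSEDE Execution Management Services). As a rough illustration, a minimal job description that runs /bin/echo and captures its output might look like the following; the job name and output file name are placeholders, and your site's documentation is the authority on the exact elements your queue accepts.

```xml
<jsdl:JobDefinition
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <jsdl:JobDescription>
    <jsdl:JobIdentification>
      <!-- Placeholder name for this example job -->
      <jsdl:JobName>hello-ccc</jsdl:JobName>
    </jsdl:JobIdentification>
    <jsdl:Application>
      <jsdl-posix:POSIXApplication>
        <jsdl-posix:Executable>/bin/echo</jsdl-posix:Executable>
        <jsdl-posix:Argument>hello</jsdl-posix:Argument>
        <!-- Standard output is captured to this file -->
        <jsdl-posix:Output>stdout.txt</jsdl-posix:Output>
      </jsdl-posix:POSIXApplication>
    </jsdl:Application>
  </jsdl:JobDescription>
</jsdl:JobDefinition>
```

The tutorials below walk through building and submitting job descriptions like this one with the Genesis II tools.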

Tutorials on YouTube

  1. Download and Install the GFFS Client (3:09)
  2. GUI Client Basics (8:28)
  3. Install a GFFS Container (21:37)
  4. Copy files in and out of the GFFS (28:37)
  5. Map the GFFS into your Linux file system using FUSE (9:39)
  6. Run a simple job with the GFFS (16:30)

Visit our YouTube Channel

Downloads

Campus Compute Cooperative (XSEDE) Installers

Citation

When using resources through CCC for your paper or presentation, please use the following citation as an acknowledgement. Thank you!

@inproceedings{grimshaw2016campus,
  title={Campus Compute Co-operative (CCC): A service oriented cloud federation},
  author={Grimshaw, Andrew and Prodhan, Md Anindya and Thomas, Alexander and Stewart, Craig and Knepper, Richard},
  booktitle={e-Science (e-Science), 2016 IEEE 12th International Conference on},
  pages={1--10},
  year={2016},
  organization={IEEE}
}