CS260r is Topics and Close Readings in Computer Systems, spring 2014. Our topic: Cloud Big Data Systems.
4/21 Easy freshness with Pequod cache joins. Bryan Kate, Eddie Kohler, Michael Kester, Yandong Mao, Neha Narula, and Robert Morris. Proc. NSDI’14.
4/14 Fabric: A platform for secure distributed computation and storage (alternate link). Jed Liu, Michael D. George, K. Vikram, Xin Qi, Lucas Waye, and Andrew C. Myers. Proc. SOSP’09. (Lucas presented)
4/14 Question: To improve Fabric’s resiliency to failure, one could implement storage nodes using Paxos or VR groups. New object versions would be stored in a replicated log. Contrast this hypothetical VR log with the distributed transaction logs that Fabric uses to commit transactions. Specifically, describe the contents of entries in the hypothetical VR log, and the contents of entries in the Fabric distributed transaction logs.
4/2 Transaction chains: achieving serializability with low latency in geo-distributed systems. Yang Zhang, Russell Power, Siyuan Zhou, Yair Sovran, Marcos K. Aguilera, and Jinyang Li. Proc. SOSP’13. (Scott presented)
4/2 Question: Transaction chains, like Spanner, can fall back on two-phase locking and two-phase commit for transaction execution. Explain how Spanner’s two-phase locking differs from that of transaction chains. For instance, when does each system use two-phase locking, and in each system what granularity of data is locked?
3/31 Spanner: Google’s globally distributed database. 26 authors. Proc. OSDI’12. (Eddie presented)
3/31 Question: Describe two different ways Spanner’s user-visible behavior would change if the TrueTime API’s ε value (the error bound) changed a lot.
3/26 In search of an understandable consensus algorithm. Diego Ongaro and John Ousterhout. Technical report. (Nate presented)
3/26 Question: Describe a scenario (e.g., number of servers, failure pattern, client requests) where Raft would do something substantively different from viewstamped replication (e.g., commit a different client request, recover a different server). Explain what Raft would do and what VR would do.
3/12 Transactional storage for geo-replicated systems. Yair Sovran, Russell Power, Marcos K. Aguilera, and Jinyang Li. Proc. ACM SOSP’11. (Marco presented)
3/12 Question: Describe an execution of two or more transactions, written for an application like the Walter ReTwis (using their description as a reference), that produces a result under Walter’s PSI that is impossible under normal snapshot isolation.
3/10 Sinfonia: A new paradigm for building scalable distributed systems. Marcos K. Aguilera, Arif Merchant, Mehul Shah, Alistair Veitch, and Christos Karamanolis. ACM Trans. Computer Systems 27(3), Nov. 2009. (Andrew presented)
2/26 Fast crash recovery in RAMCloud. Diego Ongaro, Stephen M. Rumble, Ryan Stutsman, John Ousterhout, and Mendel Rosenblum. Proc. ACM SOSP’11. (Stephen presented)
2/26 Question: Section 3.5.2 discusses how RAMCloud ensures that a recovery master can detect that a recovered log is complete (has at least one copy of every segment). The design uses a recovery digest, and a protocol that occurs every time a new segment replica is added: a new digest is inserted in the new replica and marked “active”; once this is persisted, the previous active digest is marked as inactive. Does this design require that RAMCloud block client requests while waiting for a disk write? If yes, then how often does this happen? If no, then why not?
2/24 The Chubby lock service for loosely-coupled distributed systems. Mike Burrows. Proc. OSDI’06.
2/24 Paxos made live: An engineering perspective. Tushar Chandra, Robert Griesemer, and Joshua Redstone. Proc. ACM PODC’07. (Mike presented)
2/24 There is more consensus in egalitarian parliaments. Iulian Moraru, David G. Andersen, and Michael Kaminsky. Proc. SOSP’13. (Brad presented)
Experience with ePaxos: Systems Research using Go, by Dave Andersen
2/12 Background: Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems. Amar Phanishayee, Elie Krevat, Vijay Vasudevan, David G. Andersen, Gregory R. Ganger, Garth A. Gibson, and Srinivasan Seshan. Proc. FAST’08 (USENIX Conference on File and Storage Technologies), Feb. 2008.
2/12 Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communication. Vijay Vasudevan, Amar Phanishayee, Hiral Shah, Elie Krevat, David G. Andersen, Gregory R. Ganger, Garth A. Gibson, and Brian Mueller. Proc. SIGCOMM’09, Aug. 2009. (Bob presented)
2/12 Data Center TCP (DCTCP). Mohammad Alizadeh, Albert Greenberg, David A. Maltz, Jitendra Padhye, Parveen Patel, Balaji Prabhakar, Sudipta Sengupta, and Murari Sridharan. Proc. SIGCOMM’10, Aug. 2010. (Lucas presented)
2/12 Question: The Vasudevan et al. paper (call it “fine-grained RTO”) and the Alizadeh et al. paper (“DCTCP”) both address the incast problem, but they address it at different scales. What are those scales? Does either paper dominate the other with respect to the incast problem? Does either paper dominate the other overall?
2/10 Background: Events Can Make Sense. Maxwell Krohn, Eddie Kohler, and M. Frans Kaashoek. Proc. 2007 USENIX Annual Technical Conference, June 2007, 87–100.
2/5 MapReduce and Parallel DBMSs: Friends or Foes? Michael Stonebraker, Daniel Abadi, David J. DeWitt, Sam Madden, Erik Paulson, Andrew Pavlo, and Alexander Rasin. Communications of the ACM 53(1), Jan. 2010, 64–71.
2/5 MapReduce: A Flexible Data Processing Tool. Jeffrey Dean and Sanjay Ghemawat. Communications of the ACM 53(1), Jan. 2010, 72–77.
2/5 MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. Proc. OSDI’04: 6th Symposium on Operating System Design and Implementation, Dec. 2004.
2/5 A Comparison of Approaches to Large-Scale Data Analysis. Andrew Pavlo, Erik Paulson, Alexander Rasin, Daniel J. Abadi, David J. DeWitt, Samuel Madden, and Michael Stonebraker. Proc. SIGMOD’09, June 2009.
2/5 MapReduce: A Major Step Backwards. David J. DeWitt and Michael Stonebraker. The Database Column (The Vertica Systems Blog), Jan. 17, 2008.
1/29 Paxos Made Simple. Leslie Lamport. ACM SIGACT News (Distributed Computing Column) 32(4), Dec. 2001, 51–58.
1/29 Viewstamped Replication Revisited. Barbara Liskov and James Cowling. MIT Technical Report MIT-CSAIL-TR-2012–021, July 2012.
Background:
Paxos Benchmarks:
Chubby Implementations: