High-scale interactive services demand high throughput with low latency and high availability, difficult goals to meet with the traditional stateless 3-tier architecture. The actor model makes it natural to build a stateful middle tier and achieve the required performance. However, the popular actor model platforms still pass many distributed systems problems to the developers.
Who What Where?
Big ups to Thomas Street for hosting! Show up at 6:30 for food, discussion of the paper starts at 7:00 on the dot.
Be an adult, don't be a jerk. You can find more details in the
We will be talking about Peter Bailis' 2015 paper "Feral Concurrency Control."
The rise of data-intensive “Web 2.0” Internet services has led to a range of popular new programming frameworks that collectively embody the latest incarnation of the vision of Object-Relational Mapping (ORM) systems, albeit at unprecedented scale. In this work, we empirically investigate modern ORM-backed applications’ use and disuse of database concurrency control mechanisms. Specifically, we focus our study on the common use of feral, or application-level, mechanisms for maintaining database integrity, which, across a range of ORM systems, often take the form of declarative correctness criteria, or invariants. We quantitatively analyze the use of these mechanisms in a range of open source applications written using the Ruby on Rails ORM and find that feral invariants are the most popular means of ensuring integrity (and, by usage, are over 37 times more popular than tr…
We've got another two-for-the-price-of-one paper-loving spectacular!
Brandon Sherman presents Why Should I Trust You? Explaining the Predictions of Any Classifier.
"Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust in a model. Trust is fundamental if one plans to take action based on a prediction, or when choosing whether or not to deploy a new model. Such understanding further provides insights into the model, which can be used to turn an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction."
Andrew Beyer presents <…
Scott Francis (@mechazoidal) will be guiding us through two papers on the Styx Architecture: The Styx Architecture for Distributed Systems and Styx-on-a-Brick, written by Rob Pike (co-creator of the Go language) and Dennis Ritchie (co-inventor of the C language).
To parse or not to parse: a two-for-one paper-loving showdown!
David Murray presents Projecting a Modular Future
"We describe two innovations in programming languages: modularity and projectional editing. Language modularity refers to the ability to combine independently developed languages without changing their respective definitions. A language is not anymore a fixed quantity, instead it can be extended with domain-specific constructs as needed. Projectional editing refers to a technique of building editors and IDEs that avoid the need for parsers. They support a wide range of tightly integrated notations including textual, symbolic, tabular and graphical. In addition, by avoiding parsers, the well-known limitations of grammar composition are avoided as well."
David Graunke presents Parsing with Derivatives…
The Main Event
George Reilly presents LKRhash.
LKRhash is a scalable hashtable. It scales to multiple processors and to millions of items. It was invented at Microsoft in the late 90s by Paul Larson, Murali Krishnan, and George Reilly. This talk is based on an unpublished paper that was submitted to Software: Practice & Experience.
Who What Where
Big ups to Comcast for hosting this month! There will be a person at the front door ushering folks up to the 11th Floor for the event.
Like all chapters of Papers We Love, we abide by and enforce the PWL Code of Conduct. Please give it a read, plan on conducting yourself accordingly, and contact one of the organizers if you need to report an incident.
Error rates across one of Facebook’s sites were spiking. The problem had first shown up through an automated alert triggered by an in-memory time-series database called Gorilla a few minutes after the problem started. One set of engineers mitigated the immediate issue. A second group set out to find the root cause. They fired up Facebook’s time series correlation engine built on top of Gorilla, and searched for metrics showing a correlation with the errors. This showed that copying a release binary to Facebook’s web servers (a routine event) caused an anomalous drop in memory used across the site.
Open source version: https://github.com/facebookincubator/beringei
Lazy people paper summary: https://www.google.com/amp/s/blog.acolyer.org/2016/05/03/goril…
Quicksort is one of the most important and ubiquitous algorithms of Computer Science. If you use Google Chrome, you use it. In this short presentation, Walé will summarize the paper and tease out some of its finer points. Please feel free to read the paper ahead of time - it weighs in as 6 pages, so it should be a relatively quick (gah, no pun intended) read.…
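If you've never written it down, here's a toy quicksort in Python (my sketch of the core idea, not the tuned implementation the paper analyzes):

```python
def quicksort(xs):
    """Partition around a pivot, then recurse on each side."""
    if len(xs) <= 1:
        return xs
    pivot = xs[len(xs) // 2]
    less = [x for x in xs if x < pivot]
    equal = [x for x in xs if x == pivot]
    greater = [x for x in xs if x > pivot]
    return quicksort(less) + equal + quicksort(greater)

print(quicksort([5, 3, 8, 1, 3]))  # [1, 3, 3, 5, 8]
```

Production versions sort in place and pick pivots more carefully, which is where much of the paper's subtlety lives.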
The Main Event
David Murray presents New Directions in Cryptography and A Method for Obtaining Digital Signatures and Public-Key Cryptosystems.
These two classic papers are big favorites of mine. New Directions in Cryptography introduced the idea of digital signatures and public key cryptosystems, and reduced both problems to the search for a trap-door one-way permutation. Although Diffie and Hellman were unable to come up with such a permutation (settling for "just" Diffie-Hellman-Merkle key exchange), they laid the theoretical framework and invited others to join the search. Rivest, Shamir, and Adleman read the paper, did a lot of hard thinking, and a year later published A Method for Obtaining Digital Signatures and Public-Key Cryptosystems - now known as the RSA algorithm.
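To see the trap-door idea concretely, here's the standard textbook toy RSA example in Python (tiny primes, wildly insecure, illustration only; the values are the common worked example, not from either paper):

```python
p, q = 61, 53             # toy primes; real keys use primes hundreds of digits long
n = p * q                 # public modulus
phi = (p - 1) * (q - 1)
e = 17                    # public exponent, chosen coprime to phi
d = pow(e, -1, phi)       # private exponent: e*d = 1 (mod phi); needs Python 3.8+
m = 65                    # the "message"
c = pow(m, e, n)          # encrypt: anyone with (n, e) can do this
assert pow(c, d, n) == m  # decrypt: only the holder of d can invert it
```

Factoring n recovers p and q and hence d, so the whole scheme leans on factoring being hard at scale.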
David works on Infrastructure Security for Salesforce.com.
It's a two-for-one Paper-loving holiday bonanza!
Brandon Sherman presents Boruta - A System for Feature Selection.
"Machine learning methods are often used to classify objects described by hundreds of attributes; in many applications of this kind a great fraction of attributes may be totally irrelevant to the classification problem. Even more, usually one cannot decide a priori which attributes are relevant. In this paper we present an improved version of the algorithm for identification of the full set of truly important variables in an information system."
Ryan Cox presents Maglev: A Fast and Reliable Software Network Load Balancer.
"Maglev is Google’s network load balancer. It is a large distributed software system that runs on commodity Linux servers. Unlike traditional hardware network load balancers, it does not require a spec…
David Murray presents Inferring and Debugging Path MTU Discovery Failures.
The Internet was built on rough consensus and running code, and the code does in fact mostly run. This paper from 2005 discusses one of the edge cases where it doesn't - an edge case that still occasionally bothers Papers We Love presenters named David in 2016.
The Main Event
Denis Rystsov presents Flexible Paxos: Quorum intersection revisited.
The paper explores how non-standard quorum configurations can improve latency without affecting correctness of a system. In this talk Denis will describe how Paxos works and how Flexible Paxos differs from the other algorithms of the Paxos family.
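The paper's key observation fits in a few lines. Classic Paxos uses majorities for both phases; Flexible Paxos only needs every phase-1 (leader election) quorum to intersect every phase-2 (replication) quorum. A sketch in Python (mine, not from the paper):

```python
def fpaxos_safe(n, q1, q2):
    """Flexible Paxos safety condition: any phase-1 quorum of size q1
    must intersect any phase-2 quorum of size q2 among n acceptors."""
    return q1 + q2 > n

# Classic majorities for n = 5:
assert fpaxos_safe(5, 3, 3)
# Flexible: a bigger leader-election quorum buys a smaller,
# faster steady-state replication quorum.
assert fpaxos_safe(5, 4, 2)
# Unsafe: quorums of 2 and 3 need not overlap when n = 5.
assert not fpaxos_safe(5, 2, 3)
```

Since leader election is rare and replication is every write, trading q1 up for a smaller q2 is where the latency win comes from.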
Who What Where?
Big ups to Comcast for hosting this month! There will be a person at the front door ushering folk…
Tristan Penman presents A Few Useful Things to Know about Machine Learning.
At just over eight pages, this paper by Pedro Domingos delivers an approachable summary of some of the challenges and misunderstandings faced by those new to the field of Machine Learning.
The Main Event
David Graunke presents A New Implementation Technique for Applicative Languages from 1979.
The paper describes a technique for eliminating bound variables from lambda calculus programs by compiling to a small set of combinators, and describes a machine for executing the resulting combinator programs.
This paper is a pretty gentle introduction to the challenges of implementing functional languages in a way that's simple and efficient. The compilation algorithm in the pap…
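To get a taste of the technique, here's a minimal bracket abstraction to S/K/I in Python (my sketch of the idea, not Turner's optimized algorithm; terms are tuples, combinators are strings):

```python
# Terms: ('var', name) | ('app', f, x) | 'S' | 'K' | 'I'
S = lambda f: lambda g: lambda x: f(x)(g(x))
K = lambda x: lambda y: x
I = lambda x: x
PRIMS = {'S': S, 'K': K, 'I': I}

def occurs(v, t):
    if t == ('var', v):
        return True
    return t[0] == 'app' and (occurs(v, t[1]) or occurs(v, t[2]))

def abstract(v, t):
    """Compile the lambda term λv.t into a variable-free combinator term."""
    if t == ('var', v):
        return 'I'                    # λv.v  =  I
    if not occurs(v, t):
        return ('app', 'K', t)        # λv.t  =  K t   (v unused in t)
    return ('app', ('app', 'S', abstract(v, t[1])), abstract(v, t[2]))

def evaluate(t, env):
    """Run a combinator term, looking up free variables in env."""
    if isinstance(t, str):
        return PRIMS[t]
    if t[0] == 'var':
        return env[t[1]]
    return evaluate(t[1], env)(evaluate(t[2], env))

# λx. f x  compiles to  S (K f) I; with f = succ it behaves like succ.
term = abstract('x', ('app', ('var', 'f'), ('var', 'x')))
g = evaluate(term, {'f': lambda n: n + 1})
assert g(3) == 4
```

No bound variables survive compilation, which is exactly what makes the graph-reduction machine in the paper so simple.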
The Main Event:
Derek Elkins presents Tango: Distributed Data Structures over a Shared Log.
Distributed systems are easier to build than ever with the emergence of new, data-centric abstractions for storing and computing over massive datasets. However, similar abstractions do not exist for storing and accessing metadata. To fill this gap, Tango provides developers with the abstraction of a replicated, in-memory data structure (such as a map or a tree) backed by a shared log. Tango objects are easy to build and use, replicating state via simple append and read operations on the shared log instead of complex distributed protocols; in the process, they obtain properties such as linearizability, persistence and high availability from the shared log. Tango also leverages the shared log to enable fast transactions across different objects, allowing applications to partition state across m…
Scott Francis presents The Art of the Propagator.
"We develop a programming model built on the idea that the basic computational elements are autonomous machines interconnected by shared cells through which they communicate. Each machine continuously examines the cells it is interested in, and adds information to some based on deductions it can make from information from the others. This model makes it easy to smoothly combine expression-oriented and constraint-based programming; it also easily accommodates implicit incremental distributed search in ordinary programs."
We will also be covering some extra material from the Revised Report on the Propagator Model , which contains the practical results from implementing the Propagator Model in MIT Scheme.
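For a flavor of the model, here's a drastically simplified propagator network in Python (my sketch, assuming cells hold a single value rather than the paper's partial information):

```python
class Cell:
    """Holds a value (or None) and notifies watching propagators on change."""
    def __init__(self):
        self.content = None
        self.watchers = []
    def add_content(self, value):
        if self.content is None:
            self.content = value
            for react in self.watchers:
                react()
        elif self.content != value:
            raise ValueError("contradiction")

def propagator(inputs, output, fn):
    """Autonomous machine: when all inputs are known, deduce the output."""
    def react():
        if all(c.content is not None for c in inputs):
            output.add_content(fn(*[c.content for c in inputs]))
    for c in inputs:
        c.watchers.append(react)
    react()

def adder(a, b, c):
    """Multidirectional constraint c = a + b: any missing cell is deduced."""
    propagator([a, b], c, lambda x, y: x + y)
    propagator([a, c], b, lambda x, z: z - x)
    propagator([b, c], a, lambda y, z: z - y)

a, b, c = Cell(), Cell(), Cell()
adder(a, b, c)
a.add_content(3)
c.add_content(10)
assert b.content == 7  # deduced, even though information flowed "backwards"
```

The constraint-based flavor comes from wiring three one-way propagators around the same cells, so information can enter from any side.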
Thanks to thePlatform for hosting!
[Hi friends! For June we're going to try something a little different - two presenters talking about two different papers in a shorter format. I'm excited to see how it works! As usual, if you have a paper you love and would like to present at a future meetup, give me a shout! -d]
PayWord and MicroMint: Two simple micropayment schemes
This paper from 2001, written by Ronald Rivest and Adi Shamir (the R and S of RSA), discusses how to design a micropayment scheme, back when building one was a popular idea. It uses relatively simple cryptography and some neat tricks to build systems that solve a complex problem.
Harley graduated from the University of California, Riverside and has been a developer for the past decade. They are interested in microservices, security, and free software.…
Many prior efforts have suggested that Internet video Quality of Experience (QoE) could be dramatically improved by using data-driven prediction of video quality for different choices (e.g., CDN or bitrate) to make optimal decisions. However, building such a prediction system is challenging on two fronts. First, the relationships between video quality and observed session features can be quite complex. Second, video quality changes dynamically. Thus, we need a prediction model that is (a) expressive enough to capture these complex relationships and (b) capable of updating quality predictions in near real-time. Unfortunately, several seemingly natural solutions (e.g., simple machine learning approaches and simple network models) fail on one or more fronts. Thus, the potential benefits promised by these prior efforts remain unrealized. We address these challenges and present the design and implementation of Critical Feature Analytics (CFA). The design of CFA is driven by domain-specif…
Y'all have called my bluff, we're talking about CRDTs! We'll use "A comprehensive study of Convergent and Commutative Replicated Data Types."
Eventual consistency aims to ensure that replicas of some mutable shared object converge without foreground synchronisation. Previous approaches to eventual consistency are ad-hoc and error-prone. We study a principled approach: to base the design of shared data types on some simple formal conditions that are sufficient to guarantee eventual consistency. We call these types Convergent or Commutative Replicated Data Types (CRDTs). This paper formalises asynchronous object replication, either state based or operation based, and provides a sufficient condition appropriate for each case. It describes several useful CRDTs, including container data types supp…
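As a warm-up, here's one of the simplest state-based CRDTs from the paper, a grow-only counter, sketched in Python (the class shape is mine):

```python
class GCounter:
    """State-based grow-only counter: each replica increments its own
    slot; merge is element-wise max, which is commutative, associative,
    and idempotent, so replicas converge in any merge order."""
    def __init__(self, replica_id, n_replicas):
        self.id = replica_id
        self.counts = [0] * n_replicas
    def increment(self):
        self.counts[self.id] += 1
    def value(self):
        return sum(self.counts)
    def merge(self, other):
        self.counts = [max(x, y) for x, y in zip(self.counts, other.counts)]

a, b = GCounter(0, 2), GCounter(1, 2)
a.increment(); a.increment()   # two increments at replica 0
b.increment()                  # one increment at replica 1
a.merge(b); b.merge(a)         # exchange state in either order
assert a.value() == b.value() == 3
```

That max-merge is the "simple formal condition" in miniature: the states form a join-semilattice, so no foreground synchronisation is needed.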
[Hi fellow paper lovers! One of *my* favorite paper-lovers, Pat Helland, is going to be in town later this month, and we talked him into doing a PWL while he's here! It's kind of cheating because he loves a paper that he wrote, but he got shoehorned into this by well-meaning fan-persons without much time to prepare so I'll still count it :)]
"There is an inexorable trend towards storing and sending immutable data. We need immutability to coordinate at a distance and we can afford immutability, as storage gets cheaper.
This paper is simply an amuse-bouche on the repeated patterns of computing that leverage immutability. Climbing up and down the compute stack really does yield a sense of déjà vu all over again."
Who is this Pat person?
"Pat Helland has been implementing transaction systems,…
Fuzzing is a technique to find software defects by providing random(ish) input to a system, and watching to see which patterns of input cause bad behavior.
The paper is "Fuzzing: The State of the Art", available here:
The talk will go over the history of fuzzing, different methods of fuzzing, and how fuzzing can be applicable to your everyday life.…
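The dumbest possible fuzzer fits in a few lines of Python (a toy sketch; `parse` is a made-up buggy target, and real tools add coverage feedback, corpus mutation, and crash triage):

```python
import random

def fuzz(target, trials=5000, max_len=20, seed=0):
    """Throw random byte strings at target, collect inputs that raise."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(trials):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(max_len)))
        try:
            target(data)
        except Exception:
            crashes.append(data)
    return crashes

def parse(data):
    """Toy target with a planted bug: chokes on a leading null byte."""
    if data[:1] == b"\x00":
        raise ValueError("unexpected null byte")

found = fuzz(parse)
assert any(c[:1] == b"\x00" for c in found)  # the fuzzer finds the bug
```

Even this "random(ish)" approach reliably trips the planted bug; the paper surveys how far the state of the art has moved beyond it.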
Tristan Penman will present Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications.
"A fundamental problem that confronts peer-to-peer applications is to efficiently locate the node that stores a particular data item. This paper presents Chord, a distributed lookup protocol that addresses this problem. Chord provides support for just one operation: given a key, it maps the key onto a node. Data location can be easily implemented on top of Chord by associating a key with each data item, and storing the key/data item pair at the node to which the key maps. Chord adapts efficiently as nodes join and leave the system, and can answer queries even if the system is continuously changing. Results from theoretical analysis, simulations, and ex- periments show that Chord is scalable, with communication cost and the state maintained by each node scaling logarithmically with the number of Chord nodes."
[Hi fellow paper lovers! We're starting to build up a decent backlog of topics, I've heard from a couple interested people who have trouble making the usual Thursday timeslot, so I figured it might be fun to try a second event in January on a different night of the week to spread the love around. For the moment I'm only committing to one, but if it's a hit maybe we'll turn it into a regular thing. Let me know your feelings on the subject, if you have them!]
Denis Rystsov will present "Scalable Atomic Visibility with RAMP Transactions" by Peter Bailis, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein and Ion Stoica.
It's a fresh paper from last year (2014), in which the authors propose a new isolation model, Read Atomic (RA) isolation, that helps achieve incredible scalability in multi-partition distributed databases.
Read Atomic isolation is similar to Read Committed isolation, and provides atomic visibility: either all or none of each transaction’s updates are …
Ankur Chauhan will present the latest research on stream processing systems, recently presented at VLDB 2015 by the team at Google. The paper is titled "The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing". It presents the state of the art in stream processing systems at scale and describes the technology at the heart of Google Cloud Dataflow.
Unbounded, unordered, global-scale datasets are increasingly common in day-to-day business (e.g. Web logs, mobile usage statistics, and sensor networks). At the same time, consumers of these datasets have evolved sophisticated requirements, such as event-time ordering and windowing by features of the data themselves, in addition to an insatiable hunger for faster answers. Meanwhile, practicality dictates that one can ne…
Trevor Lalish-Menagh will present "A Critique of the CAP Theorem." Abstract:
The CAP Theorem is a frequently cited impossibility result in distributed systems, especially among NoSQL distributed databases. In this paper we survey some of the confusion about the meaning of CAP, including inconsistencies and ambiguities in its definitions, and we highlight some problems in its formalization. CAP is often interpreted as proof that eventually consistent databases have better availability properties than strongly consistent databases; although there is some truth in this, we show that more careful reasoning is required. These problems cast doubt on the utility of CAP as a tool for reasoning about trade-offs in practical systems. As an alternative to CAP, we propose a "delay-sensitivity" framework, which analyzes the sensitivity of operation latency to network delay, and which may help practitioners reason about the trade-offs between consistency guarantees and tolerance of network fa…
Ryan Cox will cover "Survivable Key Compromise in Software Update Systems". What happens when your signing keys are compromised or checked into GitHub? He will demo Notary, Docker's implementation of The Update Framework (TUF), which is described in the paper. TUF is a system that grew out of Tor and is capable of surviving key compromises as well as several other issues in current update managers.
Samuel, Justin, et al. "Survivable key compromise in software update systems." Proceedings of the 17th ACM conference on Computer and communications security. ACM, 2010.…
This time Jose Contreras will be presenting a KDD paper on "Predicting Voice Elicited Emotions".
We present the research, and product development and deployment, of Voice Analyzer™ by Jobaline Inc. This is a patent pending technology that analyzes voice data and predicts human emotions elicited by the paralinguistic elements of a voice.
Human voice characteristics, such as tone, complement the verbal communication. In several contexts of communication, “how” things are said is just as important as “what” is being said. This paper provides an overview of our deployed system, the raw data, the data processing steps, and the prediction algorithms we experimented with. A case study is included where, given a voice clip, our model predicts the degree in which a listener will find the voice “engaging”. Our prediction results were verified through indep…
David Murray will present Ideal Hash Trees:
"Hash Trees with nearly ideal characteristics are described. These Hash Trees require no initial root hash table yet are faster and use significantly less space than chained or double hash trees. Insert, search and delete times are small and constant, independent of key set size, operations are O(1). Small worst-case times for insert, search and removal operations can be guaranteed and misses cost less than successful searches. Array Mapped Tries(AMT), first described in Fast and Space Efficient Trie Searches, Bagwell , form the underlying data structure. The concept is then applied to external disk or distributed storage to obtain an algorithm that achieves single access searches, close to single access inserts and greater than 80 percent disk block load factors. Comparisons are made with Linear Hashing, Litwin, Neimat, and Schneider  and B-Trees, R.Bayer an…
We present Mesos, a platform for sharing commodity clusters between multiple diverse cluster computing frameworks, such as Hadoop and MPI. Sharing improves cluster utilization and avoids per-framework data replication. Mesos shares resources in a fine-grained manner, allowing frameworks to achieve data locality by taking turns reading data stored on each machine. To support the sophisticated schedulers of today's frameworks, Mesos introduces a distributed two-level scheduling mechanism called resource offers. Mesos decides how many resources to offer each framework, while frameworks decide which resources to accept and which computations to run on them. Our results show that Mesos can achieve near-optimal data locality when sharing the cluster among diverse frameworks, can scale to 50,000 (emulated) nodes, and is resilient to failures.
Derek Elkins will present on "Dedalus: Datalog in Time and Space" touching also on some follow-on work.
Recent research has explored using Datalog-based languages to express a distributed system as a set of logical invariants. Two properties of distributed systems proved difficult to model in Datalog. First, the state of any such system evolves with its execution. Second, deductions in these systems may be arbitrarily delayed, dropped, or reordered by the unreliable network links they must traverse. Previous efforts addressed the former by extending Datalog to include updates, key constraints, persistence and events, and the latter by assuming ordered and reliable delivery while ignoring delay. These details have a semantics outside Datalog, which increases the complexity of the language and its interpretation, and forces programmers to think operationally. We argue that the missing component from these previous languages is a notion of time.
David Murray will present chain replication: a high-throughput alternative to quorum-based replication protocols like Paxos and Raft.
Chain replication is a new approach to coordinating clusters of fail-stop storage servers. The approach is intended for supporting large-scale storage services that exhibit high throughput and availability without sacrificing strong consistency guarantees. Besides outlining the chain replication protocols themselves, simulation experiments explore the performance characteristics of a prototype implementation. Throughput, availability, and several object placement strategies (including schemes based on distributed hash table routing) are discussed.
Link to the paper: http://www.cs.cornell.edu/home/rvr/papers/osdi04.pdf
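The pattern itself is pleasantly small: writes enter at the head and flow down the chain, and the tail acknowledges and serves reads, so a read can never observe a write that isn't replicated everywhere. A toy sketch in Python (class and names are mine, and this ignores the failure handling that is the paper's real substance):

```python
class Node:
    """One link in a replication chain."""
    def __init__(self):
        self.store = {}
        self.next = None
    def write(self, key, value):
        self.store[key] = value
        if self.next:                # forward the update down the chain
            return self.next.write(key, value)
        return "ack"                 # tail: update is now on every replica
    def read(self, key):
        return self.store.get(key)

head, mid, tail = Node(), Node(), Node()
head.next, mid.next = mid, tail
assert head.write("k", 1) == "ack"   # clients write at the head
assert tail.read("k") == 1           # clients read at the tail
```

Strong consistency falls out of the topology rather than from quorum arithmetic, which is why the steady-state throughput can beat quorum protocols.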
This time Ankur Chauhan will present the paper: The LCA Problem Revisited by Michael A. Bender and Martin Farach-Colton. The lowest common ancestor problem was first stated in 1973 and it took 11 years before an optimal solution was discovered, and another 16 before an understandable and implementable solution with the same bounds was presented. This deceptively simple problem comes together in the end and uses techniques that are powerful in plenty of other places.
Link to the paper: http://www.ics.uci.edu/~eppstein/261/BenFar-LCA-00.pdf
Some great papers embody insights, others package up those insights into digestible bites. "Programming with Algebraic Effects and Handlers" is the latter sort of great paper. After two decades of fundamental research into the nature of computation, a lot of mysterious ideas in computer science such as continuations and exception handling finally made sense to a number of mathematically inclined geniuses. Bauer and Pretnar's Eff programming language cuts right through the heart of the theory in a way that makes sense to anybody who has ever written a functional program. This paper uses the Eff language to explore a number of simple com…
We will be covering the paper that launched many other NoSQL databases over the years: Apache Cassandra, Voldemort, and Riak, to name a few.
ABSTRACT Reliability at massive scale is one of the biggest challenges we face at Amazon.com, one of the largest e-commerce operations in the world; even the slightest outage has significant financial consequences and impacts customer trust. The Amazon.com platform, which provides services for many web sites worldwide, is implemented on top of an infrastructure of tens of thousands of servers and network components located in many datacenters around the world. At this scale, small and large components fail continuously and the way persistent state is managed in the face of these failures drives the reliability and scalability of the software systems. This paper presents the design and implementation of Dynamo, a highly available key-value storage system that some of Amazon’s core services use to provide an “always-on” experience…
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this paper we describe the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable.
Arunabha Ghosh, director of Engineering at Moz, will be presenting this paper. Before Moz, Arunabha spend…
After a long reign as the dominant on-disk data structure for databases and filesystems, B-trees are slowly being replaced by write-optimized data structures, to handle ever-growing volumes of data. Some write optimization techniques, like LSM-trees, give up some of the query performance of B-trees in order to achieve this.
This time I will cover the B-Tree, LSM-Tree, and Fractal Tree papers and provide some real world use cases (and data) along with a discussion of the respective papers.…
Immutable vectors are a convenient data structure for functional programming and part of the standard library of modern languages like Clojure and Scala. The common implementation is based on wide trees with a fixed number of children per node, which allows fast indexed lookup and update operations. In this paper, the authors extend the vector data type with a new underlying data structure, Relaxed Radix Balanced Trees (RRB-Trees), and show how this structure allows immutable vector concatenation, insert-at and splits in O(logN) time while maintaining the index, update and iteration speeds of the original vector data structure.
Chris Bilson will be presenting this paper followed by a Q&A session.
Link: RRB-Trees: Efficient Immutable Vectors…
The idea is to bridge the gap between theory and practice, and the first step is to disseminate the knowledge we have and explore new horizons.
For the first meetup we will be discussing Paxos - The part time parliament by Leslie Lamport. This is one of the most widely known papers in the distributed systems community but also known to be notoriously complicated and rarely well understood. We shall endeavour to change that!
Reading list for Paxos:
• [The Part-Time Parliament](http://research.microsoft.com/en-us/um/people/lamport/pubs/lamport-paxos.pdf)