Chapter Meetups

PWL Seattle: A Fast File System for UNIX

Date/Time: 2022-10-06 06:30pm

Details

• What we'll do: Max will be presenting on [A Fast File System for Unix](https://people.eecs.berkeley.edu/~brewer/cs262/FFS.pdf)

Abstract

A reimplementation of the UNIX file system is described. The reimplementation provides substantially higher throughput rates by using more flexible allocation policies that allow better locality of reference and can be adapted to a wide range of peripheral and processor characteristics. The new file system clusters data that is sequentially accessed and provides two block sizes to allow fast access to large files while not wasting large amounts of space for small files. File access rates of up to ten times faster than the traditional UNIX file system are experienced. Long needed enhancements to the programmers’ interface are discussed. These include a mechanism to place advisory locks on files, extensions of the name space across fil…


Merkle Search Trees: Efficient State-Based CRDTs in Open Networks

Date/Time: 2022-08-04 06:30pm

Details

• What we'll do: Aaron will be presenting on [Merkle Search Trees: Efficient State-Based CRDTs in Open Networks](https://hal.inria.fr/hal-02303490/document) by Alex Auvolat and François Taïani

Abstract

Most recent CRDT techniques rely on a causal broadcast primitive to provide guarantees on the delivery of operation deltas. Such a primitive is unfortunately hard to implement efficiently in large open networks, whose membership is often difficult to track. As an alternative, we argue in this paper that pure state-based CRDTs can be efficiently implemented by encoding states as specialized Merkle trees, and that this approach is well suited to open networks where many nodes may join and leave. At the core of our contribution lies a new kind of Merkle tree, called Merkle Search Tree (MST), that implements a balanced search tree while maintaining key ordering. This latter property makes it …


A brief history of data visualization

Date/Time: 2022-07-07 06:30pm

Details

• What we'll do: Max will be presenting on [A brief history of data visualization](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.69.4737&rep=rep1&type=pdf)

Abstract

It is common to think of statistical graphics and data visualization as relatively modern developments in statistics. In fact, the graphic representation of quantitative information has deep roots. These roots reach into the histories of the earliest map-making and visual depiction, and later into thematic cartography, statistics and statistical graphics, medicine, and other fields. Along the way, developments in technologies (printing, reproduction), mathematical theory and practice, and empirical observation and recording enabled the wider use of graphics and new advances in form and content. This chapter provides an overview of the intellectual histo…


Grammars for Free: Toward Grammar Inference for Ad Hoc Parsers

Date/Time: 2022-06-02 06:30pm

Details

• What we'll do: Trevor will be presenting on [Grammars for Free: Toward Grammar Inference for Ad Hoc Parsers](https://arxiv.org/pdf/2202.01021.pdf)

Abstract

Ad hoc parsers are everywhere: they appear any time a string is split, looped over, interpreted, transformed, or otherwise processed. Every ad hoc parser gives rise to a language: the possibly infinite set of input strings that the program accepts without going wrong. Any language can be described by a formal grammar: a finite set of rules that can generate all strings of that language. But programmers do not write grammars for ad hoc parsers—even though they would be eminently useful. Grammars can serve as documentation, aid program comprehension, generate test inputs, and allow reasoning about language-theoretic security. We propose an automatic grammar inference system for ad hoc parsers that would enable all of these use cases,…


A Compiler for 3D Machine Knitting

Date/Time: 2022-05-05 06:30pm

Details

• What we'll do: Scott will be presenting on [A Compiler for 3D Machine Knitting!](https://la.disneyresearch.com/publication/machine-knitting-compiler/)

Abstract

Industrial knitting machines can produce finely detailed, seamless, 3D surfaces quickly and without human intervention. However, the tools used to program them require detailed manipulation and understanding of low-level knitting operations. We present a compiler that can automatically turn assemblies of high-level shape primitives (tubes, sheets) into low-level machine instructions. These high-level shape primitives allow knit objects to be scheduled, scaled, and otherwise shaped in ways that require thousands of edits to low-level instructions. At the core of our compiler is a heuristic transfer planning algorithm for knit cycles, which we prove is both sound and complete. This algorithm enable…


The Swirlds Hashgraph Consensus Algorithm: Fair, Fast, Byzantine Fault Tolerance

Date/Time: 2022-04-07 06:30pm

Details

• What we'll do: Aaron will be leading a discussion on Leemon Baird's Swirlds paper!

Abstract

A new system, the Swirlds hashgraph consensus algorithm, is proposed for replicated state machines with guaranteed Byzantine fault tolerance. It achieves fairness, in the sense that it is difficult for an attacker to manipulate which of two transactions will be chosen to be first in the consensus order. It has complete asynchrony, no leaders, no round robin, no proof-of-work, eventual consensus with probability one, and high speed in the absence of faults. It is based on a gossip protocol, in which the participants don’t just gossip about transactions. They gossip about gossip. They jointly build a hashgraph reflecting all of the gossip events. This allows Byzantine agreement to be achieved through virtual voting. Alice does not send Bob a vote over the Internet. Instead, Bob calculates what vote Alice would have sent, based on his knowledge of what Alice knows.…


A discussion on Differential Privacy

Date/Time: 2022-03-03 06:30pm

Details

• What we'll do: Max Payton will be leading a discussion on Cynthia Dwork's Differential Privacy paper!

Abstract

In 1977 Dalenius articulated a desideratum for statistical databases: nothing about an individual should be learnable from the database that cannot be learned without access to the database. We give a general impossibility result showing that a formalization of Dalenius’ goal along the lines of semantic security cannot be achieved. Contrary to intuition, a variant of the result threatens the privacy even of someone not in the database. This state of affairs suggests a new measure, differential privacy, which, intuitively, captures the increased risk to one’s privacy incurred by participating in a database.
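For anyone who wants something concrete to poke at before the discussion, here is a minimal sketch of the kind of guarantee the abstract describes. It uses randomized response, a classic mechanism that is not taken from the abstract and is swapped in here purely for illustration. The function names and the choice of epsilon = 1.0 are assumptions made for this example.

```python
import math
import random


def truth_probability(epsilon):
    # Probability of reporting the true bit, chosen so that
    # p / (1 - p) == e^epsilon (illustrative helper, not from the paper).
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(true_bit, epsilon, rng):
    # Report the true bit with probability p, otherwise report the flipped bit.
    if rng.random() < truth_probability(epsilon):
        return true_bit
    return 1 - true_bit


def likelihood_ratio(epsilon):
    # How much more likely a reported 1 is when the true bit is 1 than when
    # it is 0. Epsilon-differential privacy bounds this ratio by e^epsilon.
    p = truth_probability(epsilon)
    return p / (1.0 - p)


epsilon = 1.0
rng = random.Random(0)  # seeded only so the sketch is repeatable
reports = [randomized_response(1, epsilon, rng) for _ in range(5)]
ratio = likelihood_ratio(epsilon)
```

The point of the sketch is the ratio: whichever answer a participant really holds, any given report is at most e^epsilon times more likely than it would be under the other answer, which is the "increased risk" the abstract refers to.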

• Important to know

As a chapter of Papers We Love we abide by and enforce the PWL Code of Conduct (https://github.com…


Papers We Love @ Seattle General Chat

Date/Time: 2022-02-03 06:30pm

Details

• What we'll do: We will chat about future paper discussions we want to have and plan out what we are thinking about for the upcoming year! Please join us and help us get organized!

• Important to know

As a chapter of Papers We Love we abide by and enforce the PWL Code of Conduct (https://github.com/papers-we-love/seattle/blob/master/code-of-conduct.md) at our events. Please give it a read, plan on acting like an adult, and involve one of the organizers if you need help.

Stop slacking and join us in the #seattle channel at https://papersweloveslack.herokuapp.com!

If you have a paper you'd like to present, or even just a mini, please hit up one of the organizers :) We're always looking for more presenters.

…


PWL #54: TBD

Date/Time: 2022-01-06 06:30pm Location: Glympse - Seattle 98122

Details
• What we'll do
Main Event: TBD
• Important to know

Big ups to Glympse for hosting this month!

As a chapter of Papers We Love we abide by and enforce the PWL Code of Conduct (https://github.com/papers-we-love/seattle/blob/master/code-of-conduct.md) at our events. Please give it a read, plan on acting like an adult, and involve one of the organizers if you need help.

Stop slacking and join us in the #seattle channel at https://papersweloveslack.herokuapp.com!

If you have a paper you'd like to present, or even just a mini, please hit up one of the organizers :) We're always looking for more presenters.

…


PWL #54: TBD

Date/Time: 2021-12-02 06:30pm Location: Glympse - Seattle 98122

Details
• What we'll do
Main Event: TBD
• Important to know

Big ups to Glympse for hosting this month!

As a chapter of Papers We Love we abide by and enforce the PWL Code of Conduct (https://github.com/papers-we-love/seattle/blob/master/code-of-conduct.md) at our events. Please give it a read, plan on acting like an adult, and involve one of the organizers if you need help.

Stop slacking and join us in the #seattle channel at https://papersweloveslack.herokuapp.com!

If you have a paper you'd like to present, or even just a mini, please hit up one of the organizers :) We're always looking for more presenters.

…


PWL #70: Running BGP in Data Centers at Scale

Date/Time: 2021-11-04 06:30pm

Details
• What we'll do
Main Event: David Murray will be speaking on Running BGP in Data Centers at Scale - Facebook Research (https://research.fb.com/publications/running-bgp-in-data-centers-at-scale/)

In this paper, we present Facebook’s BGP-based data center routing design and how it marries data center’s stringent requirements with BGP’s functionality. We present the design’s significant artifacts, including the BGP Autonomous System Number (ASN) allocation, route summarization, and our sophisticated BGP policy set.

• Important to know

As a chapter of Papers We Love we abide by and enforce the PWL Code of Conduct (https://github.com/papers-we-love/seattle/blob/master/code-of-conduct.md) at our events. Please give it a read, plan on acting lik…


PWL #54: TBD

Date/Time: 2021-10-07 06:30pm Location: Glympse - Seattle 98122

Details
• What we'll do
Main Event: TBD
• Important to know

Big ups to Glympse for hosting this month!

As a chapter of Papers We Love we abide by and enforce the PWL Code of Conduct (https://github.com/papers-we-love/seattle/blob/master/code-of-conduct.md) at our events. Please give it a read, plan on acting like an adult, and involve one of the organizers if you need help.

Stop slacking and join us in the #seattle channel at https://papersweloveslack.herokuapp.com!

If you have a paper you'd like to present, or even just a mini, please hit up one of the organizers :) We're always looking for more presenters.

…


PWL #69: Metastable Failures in Distributed Systems

Date/Time: 2021-07-01 06:30pm

Details
• What we'll do

Main Event: Metastable Failures in Distributed Systems with David Murray! (https://sigops.org/s/conferences/hotos/2021/papers/hotos21-s11-bronson.pdf)

We describe metastable failures—a failure pattern in distributed systems. Currently, metastable failures manifest themselves as black swan events; they are outliers because nothing in the past points to their possibility, have a severe impact, and are much easier to explain in hindsight than to predict. Although instances of metastable failures can look different at the surface, deeper analysis shows that they can be understood within the same framework.

We introduce a framework for thinking about metastable failures, apply it to examples observed during years of operating distributed systems at scale, and survey ad-hoc techniques developed post-factum for making systems resilient …


PWL #??: CLEVER: Combining Code Metrics with Clone Detection

Date/Time: 2021-06-03 06:30pm

Details
• What we'll do
Main Event: Arick Grootveld presents CLEVER: Combining Code Metrics with Clone Detection for Just-In-Time Fault Prevention and Resolution in Large Industrial Projects

Automatic prevention and resolution of faults is an important research topic in the field of software maintenance and evolution. Existing approaches leverage code and process metrics to build metric-based models that can effectively prevent defect insertion in a software project. Metrics, however, may vary from one project to another, hindering the reuse of these models. Moreover, they tend to generate high false positive rates by classifying healthy commits as risky. Finally, they do not provide sufficient insights to developers on how to fix the detected risky commits. In this paper, we propose an approach, called CLEVER (Combining Levels of Bug Prevention and Resolution techniques), which relies on a two-phase process for intercepting risky …


PWL #67: Kademlia: A Peer-to-Peer Information System Based on the XOR Metric

Date/Time: 2021-03-04 06:30pm

Details
• What we'll do

PWL Mini: Krishnamoorthy Venkatraman will be leading a discussion about Here We Go Again: Why Is It Difficult for Developers to Learn Another Programming Language? (https://www.microsoft.com/en-us/research/uploads/prod/2020/05/herewegoagain_icse2020.pdf)

Once a programmer knows one language, they can leverage concepts and knowledge already learned, and easily pick up another programming language. But is that always the case? To understand if programmers have difficulty learning additional programming languages, we conducted an empirical study of Stack Overflow questions across 18 different programming languages. We hypothesized that previous knowledge could potentially interfere with learning a new programming language. From our inspection of 450 Stack Overflow questions, we found 276 instances of interference that occ…


PWL #COVID - Virtual Consensus in Delos

Date/Time: 2021-02-04 06:30pm

Details
• What we'll do
Main Event: Krishna Venkat will present Virtual Consensus in Delos
Consensus-based replicated systems are complex, monolithic, and difficult to upgrade once deployed. As a result, deployed systems do not benefit from innovative research, and new consensus protocols rarely reach production. We propose virtualizing consensus by virtualizing the shared log API, allowing services to change consensus protocols without downtime. Virtualization splits the logic of consensus into the VirtualLog, a generic and reusable reconfiguration layer; and pluggable ordering protocols called Loglets. Loglets are simple, since they do not need to support reconfiguration or leader election; diverse, consisting of different protocols, codebases, and even deployment modes; and composable, via RAID-like stacking and striping. We describe a production database called Delos which leverages virtual consensus for rapid, incremental development and deployment. Delos reached pr…


PWL #COV: STAR-Vote: A Secure, Transparent, Auditable, and Reliable Voting System

Date/Time: 2020-12-03 06:30pm

Details
• What we'll do
Main Event: Max will present STAR-Vote: A Secure, Transparent, Auditable, and Reliable Voting System (https://www.usenix.org/system/files/conference/evtwote13/jets-0101-bell.pdf)

STAR-Vote is a collaboration between a number of academics and the Travis County (Austin), Texas elections office, which currently uses a DRE voting system and previously used an optical scan voting system. STAR-Vote represents a rare opportunity for a variety of sophisticated technologies, such as end-to-end cryptography and risk limiting audits, to be designed into a new voting system, from scratch, with a variety of real world constraints, such as election-day vote centers that must support thousands of ballot styles and run all day in the event of a power failure. This paper describes the current design of STAR-Vote which is now largely settled and whose deve…


PWL #64: Lamport Time Clocks

Date/Time: 2020-10-01 06:30pm

Details
• What we'll do
Main Event: David Murray will present Time, Clocks, and the Ordering of Events in a Distributed System (https://lamport.azurewebsites.net/pubs/time-clocks.pdf)

The concept of one event happening before another in a distributed system is examined, and is shown to define a partial ordering of the events. A distributed algorithm is given for synchronizing a system of logical clocks which can be used to totally order the events. The use of the total ordering is illustrated with a method for solving synchronization problems. The algorithm is then specialized for synchronizing physical clocks, and a bound is derived on how far out of synchrony the clocks can become.
https://meet.jit.si/paperswelove
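If you'd like a feel for the logical clocks in the abstract above before the talk, here is a minimal sketch. The class and function names are made up for this example, and the receive rule shown (take the larger of the two counters, then add one) is one common way to satisfy the paper's requirement that a receive is timestamped later than its send.

```python
class LamportClock:
    """A logical clock: a counter that only ever moves forward."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Any local event advances the counter.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the timestamp travels with the message.
        return self.tick()

    def receive(self, message_time):
        # Move past both our own time and the sender's timestamp,
        # so the receive is ordered after the send.
        self.time = max(self.time, message_time) + 1
        return self.time


def total_order_key(timestamp, process_id):
    # Equal timestamps are broken by process id to get a total order.
    return (timestamp, process_id)


a, b = LamportClock(), LamportClock()
a.tick()                     # a is now at 1
sent = a.send()              # a is now at 2; the message carries 2
b.tick()                     # b is now at 1
received = b.receive(sent)   # b jumps to max(1, 2) + 1 = 3
```

The counters say nothing about wall-clock time; they only guarantee that if one event could have influenced another, it gets the smaller timestamp.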
• Important to know

As a chapter of Papers We Love we abide by and enforce the PWL Code of Conduct (


PWL #54: TBD

Date/Time: 2020-07-02 06:30pm

Details
• What we'll do
Main Event: Shashwat will present Conflict-free Replicated Data Types (https://hal.inria.fr/hal-00932836/file/CRDTs_SSS-2011.pdf)

Replicating data under Eventual Consistency (EC) allows any replica to accept updates without remote synchronisation. This ensures performance and scalability in large-scale distributed systems (e.g., clouds). However, published EC approaches are ad-hoc and error-prone. Under a formal Strong Eventual Consistency (SEC) model, we study sufficient conditions for convergence. A data type that satisfies these conditions is called a Conflict-free Replicated Data Type (CRDT). Replicas of any CRDT are guaranteed to converge in a self-stabilising manner, despite any number of failures. This paper formalises two popular approaches (state- and operation-based) and their relevant sufficient conditions. We study a number of useful CRDTs,…
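As a warm-up for the state-based approach mentioned in the abstract, here is a minimal sketch of a grow-only counter, a standard textbook example of a state-based CRDT. The class name, method names, and replica ids are assumptions made for this illustration and are not taken from the paper.

```python
class GCounter:
    """Grow-only counter: each replica only increments its own slot."""

    def __init__(self, replica_id, counts=None):
        self.replica_id = replica_id
        self.counts = dict(counts or {})

    def increment(self, amount=1):
        # Local update: no coordination with other replicas is needed.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        # The counter's value is the sum over all replica slots.
        return sum(self.counts.values())

    def merge(self, other):
        # Per-slot maximum: commutative, associative, and idempotent,
        # so replicas converge whatever order states are exchanged in.
        merged = dict(self.counts)
        for replica_id, count in other.counts.items():
            merged[replica_id] = max(merged.get(replica_id, 0), count)
        return GCounter(self.replica_id, merged)


a, b = GCounter("a"), GCounter("b")
a.increment()
a.increment()
b.increment()
ab = a.merge(b)   # a's view after hearing from b
ba = b.merge(a)   # b's view after hearing from a
```

Both merged views hold the same counts, and merging a state with itself changes nothing, which is the convergence property the abstract is getting at.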


PWL #62: Racial Equity in Algorithmic Criminal Justice

Date/Time: 2020-06-04 06:30pm Location: Glympse - Seattle 98122

Details
• What we'll do
Mini Paper: Shashwat will present The Tail at Scale https://cseweb.ucsd.edu/~gmporter/classes/fa17/cse124/post/schedule/p74-dean.pdf
Systems that respond to user actions quickly (within 100ms) feel more fluid and natural to users than those that take longer. Improvements in Internet connectivity and the rise of warehouse-scale computing systems have enabled Web services that provide fluid responsiveness while consulting multi-terabyte datasets spanning thousands of servers; for example, the Google search system updates query results interactively as the user types, predicting the most likely query based on the prefix typed so far, performing the search and showing the results within a few tens of milliseconds. Emerging augmented-reality devices (such as the Google Glass prototype) will need associated Web services with even greater resp…


PWL #54: TBD

Date/Time: 2020-05-07 06:30pm Location: Glympse - Seattle 98122

Details
• What we'll do
Main Event: TBD
• Important to know

Big ups to Glympse for hosting this month!

As a chapter of Papers We Love we abide by and enforce the PWL Code of Conduct (https://github.com/papers-we-love/seattle/blob/master/code-of-conduct.md) at our events. Please give it a read, plan on acting like an adult, and involve one of the organizers if you need help.

Stop slacking and join us in the #seattle channel at https://papersweloveslack.herokuapp.com!

If you have a paper you'd like to present, or even just a mini, please hit up one of the organizers :) We're always looking for more presenters.

…


PWL #62: Fixing Faults: Abbreviated vs. Full-word Identifier Names

Date/Time: 2020-04-02 06:30pm

Details
• What we'll do
Main Event: Max Payton will be presenting Fixing Faults in C and Java Source Code: Abbreviated vs. Full-word Identifier Names (http://www2.unibas.it/gscanniello/Giuseppe_Scanniello%40unibas/Home_files/TOSEM.pdf)
"We carried out a family of controlled experiments to investigate whether the use of abbreviated identifier names, with respect to full-word identifier names, affects fault fixing in C and Java source code. This family consists of an original (or baseline) controlled experiment and three replications. We involved 100 participants with different backgrounds and experiences in total. Overall results suggested that there is no difference in terms of effort, effectiveness, and efficiency to fix faults, when source code contains either only abbreviated or only full-word identifier names. We also conducted a qualitative study to u…


PWL #61: Backpropagation applied to handwritten zip code recognition

Date/Time: 2020-02-06 06:30pm Location: 999 3rd Ave - Seattle 98104

Details
• What we'll do
Main Event: Zachary New will present Backpropagation applied to handwritten zip code recognition (http://yann.lecun.com/exdb/publis/pdf/lecun-89e.pdf)

The ability of learning networks to generalize can be greatly enhanced by providing constraints from the task domain. This paper demonstrates how such constraints can be integrated into a backpropagation network through the architecture of the network. This approach has been successfully applied to the recognition of handwritten zip code digits provided by the U.S. Postal Service. A single network learns the entire recognition operation, going from the normalized image of the character to the final classification.

• Important to know

Big ups to Create33 and Vanessa for hosting this month!

As a chapter of Papers We Love we abide by and enforce the PWL Code of Conduct (


PWL #60: The Properties and Promises of UTF-8

Date/Time: 2020-01-02 06:30pm Location: Comcast Technology Solutions - Seattle 98104

Details
"The Properties and Promises of UTF-8", by Martin J. Dürst (1997).

Presented by Marvin Humphrey.

OVERVIEW:

Created by Ken Thompson one night in September 1992 on a placemat in a New Jersey diner, UTF-8 is a spectacular quintuple bank shot of design: optimal across so many criteria that it's hard to believe anybody could pull it off.

Over the years, UTF-8 has increasingly eclipsed all of the alternatives. UTF-1 and UTF-7 are obsolete and only of historical interest; UTF-32 has limited practical utility; UCS-2 can't express all of Unicode... the only serious competitor remaining is UTF-16, but UTF-8 has continued to gain market share.

In 1997, Martin J. Dürst gave a talk at the 11th International Unicode Conference in San Jose on "The Properties and Promises of UTF-8". Two decades later at Papers We Love San Diego, we will review those marvelous properties and contemplate all of the promises that UTF-8 has fulfilled.

DOWNLOAD:

…


PWL #59: Lightweight Asynchronous Snapshots for Distributed Dataflows

Date/Time: 2019-12-05 06:30pm Location: Comcast Technology Solutions - Seattle

Details
• What we'll do
Lightning Talk: A very special guest, David Murray, will be presenting on the Convoy Phenomenon (https://jimgray.azurewebsites.net/papers/Convoy%20Phenomenon%20RJ%202516.pdf)

A congestion phenomenon on high-traffic locks is described and a non-FIFO strategy to eliminate such congestion is presented.

Main Event: Max Payton will be presenting Lightweight Asynchronous Snapshots for Distributed Dataflows (https://arxiv.org/pdf/1506.08603.pdf)

Distributed stateful stream processing enables the deployment and execution of large scale continuous computations in the cloud, targeting both low latency and high throughput. One of the most fundamental challenges of this paradigm is providing processing guarantees under potential failures. Existing approaches rely on …


PWL #58: Attention Is All You Need

Date/Time: 2019-11-07 06:30pm Location: Glympse - Seattle 98122

Details
• What we'll do
Lightning Talk: Marvin Humphrey, visiting from the San Diego chapter of Papers We Love, will present on the evolution of error handling. http://joeduffyblog.com/2016/02/07/the-error-model/

Main Event: Alec Morgan will be presenting "Attention Is All You Need" (https://arxiv.org/abs/1706.03762)
"The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more paralleli…


PWL #57: Full-Stack Teams, not Engineers

Date/Time: 2019-10-03 06:30pm Location: Glympse - Seattle 98122

Details
• What we'll do
Main Event: Trevor Lalish-Menagh will be presenting "Full-Stack Teams, not Engineers" (https://itrevolution.com/forum-paper-downloads/)
"Randy is a full-stack engineer. He knows the company’s software better than anyone else. He has experience with all aspects of the company’s production services: literally “from chips to CSS.” He writes server code and front-end code, makes user-interface decisions, administers the database, configures the CI/CD system, and handles outages. For all this he is paid handsomely.
There are two things that Randy doesn’t know. First, he’s about to burn out. Nobody can sustain his level of stress for long. His health is starting to suffer, and he can’t figure out why working more hours doesn’t enable him to stay ahead of all the projects that need to be completed. He hasn’t had a vacation in years.


PWL #56: Special Relativity

Date/Time: 2019-09-05 06:30pm Location: Glympse - Seattle 98122

Details
• What we'll do
Main Event: Shuheng presents on Einstein's Special Relativity (https://www.academia.edu/375613/Einsteins_Original_Paper_on_General_Relativity, https://web.stanford.edu/~oas/SI/SRGR/notes/srHarris.pdf)
Special relativity was introduced by Einstein in 1905. Before Einstein's special relativity, we had a concept called Newtonian/Galilean relativity. The idea is that if you sit in an inertial reference frame, like a uniformly moving train, you won't be able to tell that it is moving, and the laws of physics are unchanged. An observer outside the moving train, given the train's velocity, could rewrite all coordinates local to the train in coordinates local to the observer. This can be done with a simple transformation. If you transform Newt…


PWL #55: "New directions in cryptography" Diffie-Hellman and RSA

Date/Time: 2019-08-01 06:30pm Location: Glympse - Seattle 98122

Details
• What we'll do
Main Event: Eric Hopper will present "New directions in cryptography" (https://ee.stanford.edu/~hellman/publications/24.pdf) and "A method for obtaining digital signatures and public-key cryptosystems" (https://people.csail.mit.edu/rivest/Rsapaper.pdf), a classic paper month covering RSA and Diffie-Hellman key exchange
• Important to know
Big ups to Glympse for hosting this month!

As a chapter of Papers We Love we abide by and enforce the PWL Code of Conduct (https://github.com/papers-we-love/seattle/blob/master/code-of-conduct.md) at our events. Please give it a read, plan on acting like an adult, and involve one of the organizers if you need help.

Stop slacking an…


PWL #54: Prime Factorization on Quantum Computers

Date/Time: 2019-06-06 06:30pm Location: Glympse - Seattle 98122

Details
• What we'll do
Main Event: Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer - Daniel Muldrew (https://arxiv.org/pdf/quant-ph/9508027.pdf)

"A digital computer is generally believed to be an efficient universal computing device; that is, it is believed able to simulate any physical computing device with an increase in computation time by at most a polynomial factor. This may not be true when quantum mechanics is taken into consideration. This paper considers factoring integers and finding discrete logarithms, two problems which are generally thought to be hard on a classical computer and which have been used as the basis of several proposed cryptosystems. Efficient randomized algorithms are given for these two problems on a hypothetical quantum computer. These algorithms take a number of steps polynomial in the input size, e.g., the…


PWL #53: Toxic Workers

Date/Time: 2019-05-02 06:30pm Location: Glympse - Seattle 98122

• What we'll do
Main Event: Toxic Workers, presented by Max Payton

http://www.hbs.edu/faculty/Publication%20Files/16-057_d45c0b4f-fa19-49de-8f1b-4b12fe054fea.pdf

Abstract:
"While there has been a strong focus in past research on discovering and developing top performers in the workplace, less attention has been paid to the question of how to manage those workers on the opposite side of the spectrum: those who are harmful to organizational performance. In extreme cases, aside from hurting performance, such workers can generate enormous regulatory and legal fees and liabilities for the firm. We explore a large novel dataset of over 50,000 workers across 11 firms to document a variety of aspects of worker characteristics and circumstances that lead them to engage in what we call "toxic" behavior.
We also explore the relationship between toxicity …


PWL #52: GAN Dissection

Date/Time: 2019-04-04 06:30pm Location: 83 S King St - Seattle 98104

• What we'll do
Main Event:
Ankur Kalra: "GAN Dissection":

https://arxiv.org/abs/1811.10597

This is a really interesting paper with neat results, and it takes a rigorous and principled approach to interpretability in ML (at least, in the vision domain). It's also one of my favorite papers from 2018.

I'll cover some of the motivations, previous/related work, and some of the key ideas in the paper. This particular conversation will be mostly focused on providing an intuitive understanding instead of a rigorous mathematical understanding (at least during the talk itself). No prior machine learning experience will be necessary to follow along.

Dinner will be provided! We usually eat between 6:30 and 7:00, with papers starting at 7.

• Important to know

Big ups to Lyft for hosting this month!

Unfortunately, in order to host at Lyft, we need to email you an NDA form, so…


PWL #52: The Ultimate Display

Date/Time: 2019-02-07 06:30pm Location: 83 S King St - Seattle 98104

• What we'll do
PWL mini:
Max Payton will be presenting "Reflections on Trusting Trust"
https://www.archive.ece.cmu.edu/~ganger/712.fall02/papers/p761-thompson.pdf
"To what extent should one trust a statement that a program is free of Trojan horses? Perhaps it is more important to trust the people who wrote the software."

Main Event:
Scott Francis will be presenting "The Ultimate Display" (1965)
http://worrydream.com/refs/Sutherland%20-%20The%20Ultimate%20Display.pdf
"A display connected to a digital computer gives us a chance to gain familiarity with concepts not realizable in the physical world. It is a looking glass into a mathematical wonderland." Scott will also be talking about how the prediction in this paper turned out.

• Im…


PWL #51: Kademlia - A peer to peer Information system based on XOR Metric

Date/Time: 2019-01-03 06:30pm Location: Microsoft Reactor Westlake - Seattle 98109

• What we'll do
PWL mini:
Maie - A Unified approach to interpreting model predictions
(http://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions)

Main Event:
Eric Wolak - Kademlia: A Peer-to-peer Information System Based on the XOR Metric
(https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf)

Dinner will be provided! We usually eat between 6:30 and 7:00, with papers starting at 7.

• Important to know

Big ups to Microsoft Reactor for hosting this month!

As a chapter of Papers We Love we abide by and enforce the PWL Code of Conduct (https://github.com/papers-we-love/seattle/…

Read more about this Meetup

PWL #50: FaRM: a distributed ACID compliant in-memory store

Map Date/Time: 2018-12-06 06:30pm Location: Microsoft Reactor Westlake - Seattle 98109

• What we'll do

Main Event:
John Shuheng will be speaking on FaRM (https://pdos.csail.mit.edu/6.824/papers/farm-2015.pdf).
Abstract:
FaRM is a distributed in-memory storage system with a single virtual address space and support for strictly serializable transactions at very low latency. The system achieves much lower latency than previously possible by skipping CPU involvement in most transactions through heavily optimizing the transaction protocol using one-sided RDMA reads and writes.

Supplemental paper:
https://www.usenix.org/system/files/conference/nsdi14/nsdi14-paper-dragojevic.pdf

Dinner will be provided! We usually eat between 6:30 and 7:00, with papers starting at 7.

• Important to know

Big ups to Microsoft Reactor for hosting this mo…

Read more about this Meetup

PWL #49: The Tangle

Map Date/Time: 2018-11-01 06:30pm Location: Microsoft Reactor Westlake - Seattle 98109

• What we'll do
PWL mini:
Maybe you!!!

Main Event:
Joshulyne Park will be leading a discussion on "The Tangle." This is the paper behind IOTA, a cryptocurrency for the Internet of Things (IoT).

Abstract:
In this paper we analyze the mathematical foundations of IOTA, a cryptocurrency for the Internet-of-Things (IoT) industry. The main feature of this novel cryptocurrency is the tangle, a directed acyclic graph (DAG) for storing transactions. The tangle naturally succeeds the blockchain as its next evolutionary step, and offers features that are required to establish a machine-to-machine micropayment system.

An essential contribution of this paper is a family of Markov Chain Monte Carlo (MCMC) algorithms. These algorithms select attachment sites on the tangle for a transaction that has just arrived.

Paper Link:

Read more about this Meetup

PWL #48: One VM to Rule Them All

Map Date/Time: 2018-10-04 06:30pm Location: Microsoft Reactor Westlake - Seattle 98109

• What we'll do
PWL mini:
Maybe you!!!

Main Event:
One VM to Rule Them All, presented by Brandon Bloom

https://the.gregor.institute/papers/onward2013-wuerthinger-truffle.pdf

Download and try Graal (https://www.graalvm.org/docs/getting-started/); you can call back and forth between Java, R, Python, Ruby, etc!

Abstract

Building high-performance virtual machines is a complex and expensive undertaking; many popular languages still have low-performance implementations. We describe a new approach to virtual machine (VM) construction that amortizes much of the effort in initial construction by allowing new languages to be implemented with modest additional effort. The approach relies on abstract syntax tree (AST) interpretation where a node can …

Read more about this Meetup

PWL #47: Keys Under Doormats

Map Date/Time: 2018-09-06 06:30pm Location: Microsoft Reactor Westlake - Seattle 98109

PWL Mini:

Maybe you! (10-15 minutes to talk about a paper YOU love)

The Main Event:

The paper that caught my attention is Keys Under Doormats, a paper published in November of 2015 and co-authored by Susan Landau, Peter G. Neumann, Whitfield Diffie, et al.

This paper is the most recent in a series of scholarly papers that discuss the risks to the general public created when the U.S. Government requests or mandates technology changes to provide "greater security".

These papers, published going back to at least 1994, discuss the trade-offs between the desired result, the privacy of individuals, and the overall reliability of the targeted technology (such as cell phones, internet traffic, etc.). I found the cycle of mandates and responses to be quite complex and compelling. There are many actions and consequences that are non-obvious at first glance, and these papers do a good job of explaining these relationships.

Oth…

Read more about this Meetup

PWL #46: TritanDB

Map Date/Time: 2018-08-02 06:30pm Location: Microsoft Reactor Westlake - Seattle 98109

• What we'll do
PWL mini:
Jared Anderson will be discussing a CAPM paper.

Main Event:
Ankur Chauhan will be speaking on TritanDB (http://www.tritandb.com/).

Dinner will be provided! We usually eat between 6:30 and 7:00, with papers starting at 7.

• Important to know

Big ups to Microsoft Reactor for hosting this month!

As a chapter of Papers We Love we abide by and enforce the PWL Code of Conduct (https://github.com/papers-we-love/seattle/blob/master/code-of-conduct.md) at our events. Please give it a read, plan on acting like an adult, and involve one of the organizers if you need help.

Stop slacking and join us in the #seattle channel at https://papersweloveslack.herokuapp.com!

If you have a …

Read more about this Meetup

PWL #45: Boss Competence and Worker Well-Being

Map Date/Time: 2018-07-05 06:30pm Location: Microsoft Reactor Westlake - Seattle 98109

• What we'll do
PWL mini:
Finalizing

Main Event:
Trevor Lalish-Menagh presents Boss Competence and Worker Well-Being
(https://www.researchgate.net/publication/268491675_Boss_Competence_and_Worker_Well-Being)
"Nearly all workers have a supervisor or ‘boss’. Yet there is almost no published research by economists into how bosses affect the quality of employees’ lives. This study offers some of the first formal evidence. First, it is shown that a boss’s technical competence is the single strongest predictor of a worker’s well-being. Second, we examine equivalent instrumental-variable results. Third, we demonstrate longitudinally that even if a worker stays in the same job and workplace then a newly competent supervisor greatly improves the worker’s well-being. Finally, we discuss analytical possibilities, and consider necessary future rese…

Read more about this Meetup

PWL #44: Dynamic Split Points in a Decision Tree

Map Date/Time: 2018-06-07 06:30pm Location: Microsoft Reactor Westlake - Seattle 98109

• What we'll do
PWL mini:
Brandon Sherman presents Tidy Data
(http://vita.had.co.nz/papers/tidy-data.pdf)
A huge amount of effort is spent cleaning data to get it ready for analysis, but there has been little research on how to make data cleaning as easy and effective as possible. This paper tackles a small, but important, component of data cleaning: data tidying. Tidy datasets are easy to manipulate, model and visualize, and have a specific structure: each variable is a column, each observation is a row, and each type of observational unit is a table. This framework makes it easy to tidy messy datasets because only a small set of tools are needed to deal with a wide range of un-tidy datasets. This structure also makes it easier to develop tidy tools for data analysis, tools that both input and output tidy datasets. The advantages of a consistent data structure and matching tools are demon…

Read more about this Meetup

PWL #43: Paxos from the ground up

Map Date/Time: 2018-05-03 06:30pm Location: Microsoft Reactor Westlake - Seattle 98109

• What we'll do
Main Event
Immad Naseer presents "Paxos from the ground up"

Consensus protocols are at the heart of most distributed systems, and Paxos is one of the more widely used such protocols. Rather than introducing Paxos with all its subtle details from the get-go, we'll derive the protocol step by step, starting with the simplest algorithm that comes to mind and then repeatedly refining it together until we reach the actual protocol. This will allow you to better appreciate its simplicity and beauty and to use this important building block with more confidence and understanding in your next distributed systems project.

Dinner will be provided! If you have any dietary restrictions, please send a message to the organizers ASAP.

• Important to know

Big ups to Microsoft Reactor for hosting this month!

As a chapter of Papers We Love we abide by and enforce the PWL Code of Conduct (

Read more about this Meetup

PWL #42: Blossom V and NL2Bash

Map Date/Time: 2018-04-05 06:30pm Location: Surf Incubator - Seattle

• What we'll do
PWL mini:
David Murray presents NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System
(https://arxiv.org/pdf/1802.08979.pdf)
"We present new data and semantic parsing methods for the problem of mapping English sentences to Bash commands (NL2Bash). Our long-term goal is to enable any user to perform operations such as file manipulation, search, and application-specific scripting by simply stating their goals in English. We take a first step in this domain, by providing a new dataset of challenging but commonly used Bash commands and expert-written English descriptions, along with baseline methods to establish performance levels on this task."

Main Event
Max Payton presents Blossom V: A new implementation of a minimum cost perfect matching algorithm
(htt…

Read more about this Meetup

PWL #41: Spectre and Meltdown

Map Date/Time: 2018-03-01 06:30pm Location: Microsoft Reactor Westlake - Seattle 98109

• What we'll do

George Reilly presents Meltdown and Spectre

https://meltdownattack.com/meltdown.pdf
https://spectreattack.com/spectre.pdf
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html

On January 3rd, we learned about Meltdown and Spectre, serious flaws in computer hardware that lead to side-channel attacks against privileged memory through flaws in speculative execution.

“Meltdown exploits side effects of out-of-order execution on modern processors to read arbitrary kernel-memory locations including personal data and passwords. Out-of-order execution is an indispensable performance feature and present in a wide range of modern processors. The attack…

Read more about this Meetup

PWL #40: FLP and Bitcoin

Map Date/Time: 2018-02-01 06:30pm Location: Comcast Technology Solutions - Seattle

PWL Mini:
Vaidhy presents the FLP impossibility result
(https://homes.cs.washington.edu/~arvind/cs425/doc/fischer.pdf)

“The consensus problem involves an asynchronous system of processes, some of which may be unreliable. The problem is for the reliable processes to agree on a binary value. In this paper, it is shown that every protocol for this problem has the possibility of nontermination, even with only one faulty process. By way of contrast, solutions are known for the synchronous case, the “Byzantine Generals” problem.”

The Main Event:
John presents Bitcoin: A Peer-to-Peer Electronic Cash System
(https://bitcoin.org/bitcoin.pdf)

"A purely peer-to-peer version of electronic cash would allow online payments to be sent directly from one party to another without going through a financial ins…

Read more about this Meetup

PWL #40: FLP and Bitcoin

Map Date/Time: 2018-01-13 06:30pm Location: Comcast Technology Solutions - Seattle

PWL Mini:
Vaidhy presents the FLP impossibility result
(https://homes.cs.washington.edu/~arvind/cs425/doc/fischer.pdf)

“The consensus problem involves an asynchronous system of processes, some of which may be unreliable. The problem is for the reliable processes to agree on a binary value. In this paper, it is shown that every protocol for this problem has the possibility of nontermination, even with only one faulty process. By way of contrast, solutions are known for the synchronous case, the “Byzantine Generals” problem.”

The Main Event:
John presents Bitcoin: A Peer-to-Peer Electronic Cash System
(https://bitcoin.org/bitcoin.pdf)

"A purely peer-to-peer version of electronic cash would allow online payments to be sent directly from one party to another without going through a financial institutio…

Read more about this Meetup

PWL #39: Grover's Algorithm

Map Date/Time: 2018-01-04 07:00pm Location: Comcast Technology Solutions - Seattle

• What we'll do
PWL Mini: FLP impossibility result: https://homes.cs.washington.edu/~arvind/cs425/doc/fischer.pdf. Presented by Vaidhy!

The main event: Casey presents "A fast quantum mechanical algorithm for database search."
(https://arxiv.org/pdf/quant-ph/9605043.pdf)

"Imagine a phone directory containing N names arranged in completely random order. In order to find someone's phone number with a probability of 1/2, any classical algorithm (whether deterministic or probabilistic) will need to look at a minimum of N/2 names. Quantum mechanical systems can be in a superposition of states and simultaneously examine multiple names. By properly adjusting the phases of various operations, successful computations reinforce each other while others interfere randomly. As a result, the desired phone numb…

Read more about this Meetup

PWL #38: LittleTable / Robust Composition

Map Date/Time: 2017-12-07 06:30pm Location: Comcast Technology Solutions - Seattle

PWL Mini:
Ankur Chauhan presents "LittleTable: A Time-Series Database and Its Uses"
(https://dl.acm.org/citation.cfm?id=3056102)

"We present LittleTable, a relational database that Cisco Meraki has used since 2008 to store usage statistics, event logs, and other time-series data from our customers’ devices. LittleTable optimizes for time-series data by clustering tables in two dimensions. By partitioning rows by timestamp, it allows quick retrieval of recent measurements without imposing any penalty for retaining older history."

The Main Event:
Clive Boulton presents "Robust Composition: Towards a Unified Approach to Access Control and Concurrency Control"
(http://www.erights.org/talks/thesis/markm-thesis.pdf)

"When separately written programs are composed so that they may cooperate, they …

Read more about this Meetup

PWL #37: Argus

Map Date/Time: 2017-11-02 06:30pm Location: Thomas Street - Seattle

Happy (slightly-more-than) three year anniversary, PWL @ Seattle!

The Main Event

Caitie McCaffrey presents Distributed Programming in Argus

"Argus -- a programming language and system -- was developed to support the implementation and execution of distributed programs. Distribution gives rise to some problems that do not exist in a centralized system, or that exist in a less complex form. For example, a centralized system is either running or crashed, but a distributed system may be partly running and partly crashed. The goal of Argus is to provide mechanisms that make it easier for programmers to cope with these problems."

Caitie McCaffrey is a Backend Brat and Distributed Systems Diva at Microsoft Research. Prior to that she was the Tech Lead for the Observability team at Twitter, and also built large scale services an…

Read more about this Meetup

PWL #36: Deep Speech

Map Date/Time: 2017-10-05 06:30pm Location: Comcast Technology Solutions - Seattle

PWL Mini

Want to talk about a paper you love (or think is kind of interesting), but not for very long? A PWL Mini is a 10-15 minute opening act, and we've got time for one this month. If you're interested, talk to David and/or Trevor!

The Main Event

Morgan Gellert presents Deep Speech and Deep Speech 2

I will be presenting a brief overview of modern speech recognition. We will discuss the challenges that are latent in the problem, how classical methods addressed the problem, and how modern systems are changing this model. Deep Speech (Hannun et al.) and Deep Speech 2 (Amodei et al.) present a remarkably simpler architecture that achieves massive improvements over previous works. Their work exemplifies how deep learning is taking over traditional methods across the board on recognition tasks.

Who What Where?

Big ups to

Read more about this Meetup

PWL #35: Orleans

Map Date/Time: 2017-09-07 06:30pm Location: Thomas Street - Seattle

Victor Hurdugaci presents "Orleans: Distributed Virtual Actors for Programmability and Scalability"

High-scale interactive services demand high throughput with low latency and high availability, difficult goals to meet with the traditional stateless 3-tier architecture. The actor model makes it natural to build a stateful middle tier and achieve the required performance. However, the popular actor model platforms still pass many distributed systems problems to the developers.

Who What Where?

Big ups to Thomas Street for hosting! Show up at 6:30 for food; discussion of the paper starts at 7:00 on the dot.

Please Remember

Be an adult, don't be a jerk. You can find more details in the

Read more about this Meetup

PWL #34: Feral Concurrency Control

Map Date/Time: 2017-08-02 06:30pm Location: Comcast Technology Solutions - Seattle

We will be talking about Peter Bailis' 2015 paper "Feral Concurrency Control."

Abstract

The rise of data-intensive “Web 2.0” Internet services has led to a range of popular new programming frameworks that collectively embody the latest incarnation of the vision of Object-Relational Mapping (ORM) systems, albeit at unprecedented scale. In this work, we empirically investigate modern ORM-backed applications’ use and disuse of database concurrency control mechanisms. Specifically, we focus our study on the common use of feral, or application-level, mechanisms for maintaining database integrity, which, across a range of ORM systems, often take the form of declarative correctness criteria, or invariants. We quantitatively analyze the use of these mechanisms in a range of open source applications written using the Ruby on Rails ORM and find that feral invariants are the most popular means of ensuring integrity (and, by usage, are over 37 times more popular than transac…

Read more about this Meetup

PWL #33: LIME && Sqlcache

Map Date/Time: 2017-07-06 06:30pm Location: Thomas Street - Seattle

We've got another two-for-the-price-of-one paper-loving spectacular!

Round One

Brandon Sherman presents Why Should I Trust You? Explaining the Predictions of Any Classifier.

"Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust in a model. Trust is fundamental if one plans to take action based on a prediction, or when choosing whether or not to deploy a new model. Such understanding further provides insights into the model, which can be used to turn an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction."

Round Two

Andrew Beyer presents

Read more about this Meetup

PWL #32: The Styx Architecture for Distributed Systems

Map Date/Time: 2017-05-31 06:30pm Location: Comcast Technology Solutions - Seattle

Scott Francis (@mechazoidal) will be guiding us through two papers on the Styx architecture: The Styx Architecture for Distributed Systems and Styx-on-a-Brick. Styx was designed by Rob Pike (a co-creator of the Go language) and Dennis Ritchie (creator of the C language).

The protocol is effectively 9P (specifically 9P2000). Pike and Ritchie renamed the protocol after the papers were written to unify with Plan 9.

…

Read more about this Meetup

PWL #31: Projectional Editors and Parsing with Derivatives

Map Date/Time: 2017-05-04 06:30pm Location: Thomas Street - Seattle

To parse or not to parse: a two-for-one paper-loving showdown!

David Murray presents Projecting a Modular Future

"We describe two innovations in programming languages: modularity and projectional editing. Language modularity refers to the ability to combine independently developed languages without changing their respective definitions. A language is not anymore a fixed quantity, instead it can be extended with domain-specific constructs as needed. Projectional editing refers to a technique of building editors and IDEs that avoid the need for parsers. They support a wide range of tightly integrated notations including textual, symbolic, tabular and graphical. In addition, by avoiding parsers, the well-known limitations of grammar composition are avoided as well."

David Graunke presents Parsing with Derivatives…

Read more about this Meetup

PWL #30: LKRhash

Map Date/Time: 2017-04-06 06:30pm Location: Comcast Technology Solutions - Seattle

The Main Event
George Reilly presents LKRhash.

LKRhash is a scalable hashtable. It scales to multiple processors and to millions of items. It was invented at Microsoft in the late 90s by Paul Larson, Murali Krishnan, and George Reilly. This talk is based on an unpublished paper that was submitted to Software: Practice & Experience.

Who What Where
Big ups to Comcast for hosting this month! There will be a person at the front door ushering folks up to the 11th Floor for the event.

Please Remember
Like all chapters of Papers We Love, we abide by and enforce the PWL Code of Conduct. Please give it a read, plan on conducting yourself accordingly, and contact one of the organizers if you need to report an incident.

…

Read more about this Meetup

PWL #29: Gorilla: A fast, scalable, in-memory time series database

Map Date/Time: 2017-03-02 06:30pm Location: Thomas Street - Seattle

Error rates across one of Facebook’s sites were spiking. The problem had first shown up through an automated alert triggered by an in-memory time-series database called Gorilla a few minutes after the problem started. One set of engineers mitigated the immediate issue. A second group set out to find the root cause. They fired up Facebook’s time series correlation engine built on top of Gorilla, and searched for metrics showing a correlation with the errors. This showed that copying a release binary to Facebook’s web servers (a routine event) caused an anomalous drop in memory used across the site.

Open source version: https://github.com/facebookincubator/beringei

Lazy people paper summary: https://www.google.com/amp/s/blog.acolyer.org/2016/05/03/goril…

Read more about this Meetup

PWL #28: Quicksort and Haskore

Map Date/Time: 2017-02-02 06:30pm Location: Comcast Technology Solutions - Seattle

We will have Wale Ogundipe presenting on Quicksort (paper link), and Chris Kolodin presenting on Haskore (paper link).

Quicksort is one of the most important and ubiquitous algorithms of Computer Science. If you use Google Chrome, you use it. In this short presentation, Walé will summarize the paper and tease out some of its finer points. Please feel free to read the paper ahead of time - it weighs in at 6 pages, so it should be a relatively quick (gah, no pun intended) read.

…

Read more about this Meetup

PWL #27: New Directions in Cryptography

Map Date/Time: 2017-01-05 06:30pm Location: Salesforce @ WeWork Westlake Tower - Seattle

The Main Event
David Murray presents New Directions in Cryptography and A Method for Obtaining Digital Signatures and Public-Key Cryptosystems.

These two classic papers are big favorites of mine. New Directions in Cryptography introduced the idea of digital signatures and public key cryptosystems, and reduced both problems to the search for a trap-door one-way permutation. Although Diffie and Hellman were unable to come up with such a permutation (settling for "just" Diffie-Hellman-Merkle key exchange), they laid the theoretical framework and invited others to join the search. Rivest, Shamir, and Adleman read the paper, did a lot of hard thinking, and a year later published A Method for Obtaining Digital Signatures and Public-Key Cryptosystems - now known as the RSA algorithm.
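For anyone who has not seen it, the key exchange mentioned above fits in a few lines. This is a toy sketch only: the prime `p`, the generator `g`, the private exponents, and the function names are illustrative assumptions, and the numbers are far too small for real use.

```python
# Toy Diffie-Hellman-Merkle key exchange.
# All parameters below are assumed for illustration and are NOT secure sizes.
p = 23  # small public prime modulus
g = 5   # public generator modulo p


def public_value(private_exponent: int) -> int:
    """Value a party publishes: g^private mod p."""
    return pow(g, private_exponent, p)


def shared_secret(their_public: int, my_private: int) -> int:
    """Secret a party derives from the other side's published value."""
    return pow(their_public, my_private, p)


alice_private, bob_private = 6, 15  # hypothetical private exponents

alice_public = public_value(alice_private)
bob_public = public_value(bob_private)

# Each side combines its own private exponent with the other's public value.
alice_secret = shared_secret(bob_public, alice_private)
bob_secret = shared_secret(alice_public, bob_private)
```

Both sides end up with the same value without ever sending a private exponent over the wire, which is the property the paper builds on.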

David works on Infrastructure Security for Salesforce.com.

Read more about this Meetup

PWL #26: Boruta && Maglev

Map Date/Time: 2016-12-01 06:30pm Location: Thomas Street - Seattle

It's a two-for-one Paper-loving holiday bonanza!

Act I
Brandon Sherman presents Boruta - A System for Feature Selection.

"Machine learning methods are often used to classify objects described by hundreds of attributes; in many applications of this kind a great fraction of attributes may be totally irrelevant to the classification problem. Even more, usually one cannot decide a priori which attributes are relevant. In this paper we present an improved version of the algorithm for identification of the full set of truly important variables in an information system."

Act II
Ryan Cox presents Maglev: A Fast and Reliable Software Network Load Balancer.

"Maglev is Google’s network load balancer. It is a large distributed software system that runs on commodity Linux servers. Unlike traditional hardware network load balancers, it does not require a specializ…

Read more about this Meetup

PWL #25: Flexible Paxos

Map Date/Time: 2016-11-03 06:30pm Location: thePlatform - Seattle

PWL Mini
David Murray presents Inferring and Debugging Path MTU Discovery Failures.

The Internet was built on rough consensus and running code, and the code does in fact mostly run. This paper from 2005 discusses one of the edge cases where it doesn't - an edge case that still occasionally bothers Papers We Love presenters named David in 2016.

The Main Event
Denis Rystsov presents Flexible Paxos: Quorum intersection revisited.

The paper explores how non-standard quorum configurations can improve latency without affecting correctness of a system. In this talk Denis will describe how Paxos works and how Flexible Paxos differs from the other algorithms of the Paxos family.
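The quorum idea can be illustrated with simple counting quorums. This is a minimal sketch, assuming quorums are defined purely by size; the function names and the six-acceptor cluster are hypothetical choices for illustration, not anything taken from the talk.

```python
from itertools import combinations


def sizes_guarantee_overlap(n: int, q1: int, q2: int) -> bool:
    """Counting rule: any q1 acceptors and any q2 acceptors out of n share a member."""
    return q1 + q2 > n


def overlap_by_enumeration(n: int, q1: int, q2: int) -> bool:
    """Brute-force check of the same property; only practical for small n."""
    acceptors = range(n)
    return all(
        set(a) & set(b)
        for a in combinations(acceptors, q1)
        for b in combinations(acceptors, q2)
    )


n = 6                      # hypothetical cluster size
majority = n // 2 + 1      # 4 of 6

classic_ok = sizes_guarantee_overlap(n, majority, majority)  # majorities in both phases
flexible_ok = sizes_guarantee_overlap(n, 5, 2)    # large phase-1 quorum, small phase-2 quorum
too_small_ok = sizes_guarantee_overlap(n, 3, 3)   # two disjoint quorums are possible
```

With 5-of-6 for phase 1 and 2-of-6 for phase 2, steady-state replication waits on fewer acceptors, at the cost of needing a larger quorum when a new leader takes over.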

Who What Where?
Big ups to Comcast for hosting this month! There will be a person at the front door ushering folk…

Read more about this Meetup

PWL #23: A New Implementation Technique for Applicative Languages

Map Date/Time: 2016-09-08 06:30pm Location: Whitepages - Seattle 98101

PWL Mini
Tristan Penman presents A Few Useful Things to Know about Machine Learning.

At just over eight pages, this paper by Pedro Domingos delivers an approachable summary of some of the challenges and misunderstandings faced by those new to the field of Machine Learning.

The Main Event
David Graunke presents A New Implementation Technique for Applicative Languages from 1979.

The paper describes a technique for eliminating bound variables from lambda calculus programs by compiling to a small set of combinators, and describes a machine for executing the resulting combinator programs.
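The compilation step described above can be sketched in a few lines. This is a minimal sketch of the textbook S/K/I bracket-abstraction rules, not the paper's optimised algorithm; the tuple encoding of terms and all function names are assumptions made for illustration.

```python
# Terms (assumed encoding): a string is a variable or combinator name,
# ("app", f, x) is application, ("lam", v, body) is a lambda abstraction.

def app(f, *args):
    """Build a left-associated application: app(f, a, b) means (f a) b."""
    for a in args:
        f = ("app", f, a)
    return f


def abstract(v, t):
    """Bracket abstraction [v]t for a lambda-free term t."""
    if t == v:
        return "I"                      # [v]v = I
    if isinstance(t, str):
        return app("K", t)              # [v]c = K c
    _, f, x = t
    return app("S", abstract(v, f), abstract(v, x))  # [v](f x) = S [v]f [v]x


def compile_term(t):
    """Remove every lambda, innermost first, leaving only S, K, I and free names."""
    if isinstance(t, str):
        return t
    if t[0] == "app":
        return app(compile_term(t[1]), compile_term(t[2]))
    return abstract(t[1], compile_term(t[2]))


def step(t):
    """Perform one leftmost reduction step; return (term, changed)."""
    if isinstance(t, str):
        return t, False
    _, f, x = t
    if f == "I":                                            # I x -> x
        return x, True
    if not isinstance(f, str) and f[1] == "K":              # K a x -> a
        return f[2], True
    if not isinstance(f, str) and not isinstance(f[1], str) and f[1][1] == "S":
        a, b = f[1][2], f[2]                                # S a b x -> a x (b x)
        return app(a, x, app(b, x)), True
    f2, changed = step(f)
    if changed:
        return ("app", f2, x), True
    x2, changed = step(x)
    return ("app", f, x2), changed


def normalize(t):
    """Reduce until no rule applies (loops forever on terms with no normal form)."""
    changed = True
    while changed:
        t, changed = step(t)
    return t


first = compile_term(("lam", "x", ("lam", "y", "x")))   # \x.\y.x
second = compile_term(("lam", "x", ("lam", "y", "y")))  # \x.\y.y
picked_first = normalize(app(first, "a", "b"))
picked_second = normalize(app(second, "a", "b"))
```

The compiled programs contain no bound variables at all, so the reducer only ever rewrites applications of S, K, and I.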

This paper is a pretty gentle introduction to the challenges of implementing functional languages in a way that's simple and efficient. The compilation algorithm in the paper is base…

Read more about this Meetup

PWL #22: Tango: Distributed Data Structures over a Shared Log

Map Date/Time: 2016-08-11 06:30pm Location: Whitepages - Seattle 98101

The Main Event:
Derek Elkins presents Tango: Distributed Data Structures over a Shared Log.

Distributed systems are easier to build than ever with the emergence of new, data-centric abstractions for storing and computing over massive datasets. However, similar abstractions do not exist for storing and accessing metadata. To fill this gap, Tango provides developers with the abstraction of a replicated, in-memory data structure (such as a map or a tree) backed by a shared log. Tango objects are easy to build and use, replicating state via simple append and read operations on the shared log instead of complex distributed protocols; in the process, they obtain properties such as linearizability, persistence and high availability from the shared log. Tango also leverages the shared log to enable fast transactions across different objects, allowing applications to partition state across m…

Read more about this Meetup

PWL #21: The Art of the Propagator

Map Date/Time: 2016-07-14 06:30pm Location: thePlatform - Seattle

The Paper
Scott Francis presents The Art of the Propagator.

"We develop a programming model built on the idea that the basic computational elements are autonomous machines interconnected by shared cells through which they communicate. Each machine continuously examines the cells it is interested in, and adds information to some based on deductions it can make from information from the others. This model makes it easy to smoothly combine expression-oriented and constraint-based programming; it also easily accommodates implicit incremental distributed search in ordinary programs."

We will also be covering some extra material from the Revised Report on the Propagator Model, which contains the practical results from implementing the Propagator Model in MIT Scheme.

Boring Details
Thanks to thePlatform for hosting!

We …

Read more about this Meetup

PWL #20: Liquid Haskell / PayWord and Micromint

Map Date/Time: 2016-06-09 06:30pm Location: Whitepages - Seattle 98101

[Hi friends! For June we're going to try something a little different - two presenters talking about two different papers in a shorter format. I'm excited to see how it works! As usual, if you have a paper you love and would like to present at a future meetup, give me a shout! -d]

PayWord and MicroMint: Two simple micropayment schemes

This paper from 2001, written by Ronald Rivest and Adi Shamir (the R and S of RSA), discusses how to design a micropayment scheme, from a time when building one was a popular idea. It uses relatively simple cryptography and some neat tricks to build systems that solve a complex problem.

http://people.csail.mit.edu/rivest/RivestShamir-mpay.pdf

Harley graduated from the University of California, Riverside and has been a developer for the past decade. They are interested in microservices, security, and free software.

…

Read more about this Meetup

PWL #19: CFA: A Practical Prediction System for Video QoE Optimization

Map Date/Time: 2016-05-12 06:30pm Location: Whitepages - Seattle 98101

Many prior efforts have suggested that Internet video Quality of Experience (QoE) could be dramatically improved by using data-driven prediction of video quality for different choices (e.g., CDN or bitrate) to make optimal decisions. However, building such a prediction system is challenging on two fronts. First, the relationships between video quality and observed session features can be quite complex. Second, video quality changes dynamically. Thus, we need a prediction model that is (a) expressive enough to capture these complex relationships and (b) capable of updating quality predictions in near real-time. Unfortunately, several seemingly natural solutions (e.g., simple machine learning approaches and simple network models) fail on one or more fronts. Thus, the potential benefits promised by these prior efforts remain unrealized. We address these challenges and present the design and implementation of Critical Feature Analytics (CFA). The design of CFA is driven by domain-specif…

Read more about this Meetup

PWL #18: Conflict-Free Replicated Data Types

Map Date/Time: 2016-04-14 06:30pm Location: Whitepages - Seattle 98101

Y'all have called my bluff, we're talking about CRDTs! We'll use "A comprehensive study of Convergent and Commutative Replicated Data Types."

http://hal.upmc.fr/inria-00555588/document

Eventual consistency aims to ensure that replicas of some mutable shared object converge without foreground synchronisation. Previous approaches to eventual consistency are ad-hoc and error-prone. We study a principled approach: to base the design of shared data types on some simple formal conditions that are sufficient to guarantee eventual consistency. We call these types Convergent or Commutative Replicated Data Types (CRDTs). This paper formalises asynchronous object replication, either state based or operation based, and provides a sufficient condition appropriate for each case. It describes several useful CRDTs, including container data types supportin…

Read more about this Meetup

PWL #17.5: Immutability Changes Everything

Date/Time: 2016-03-29 06:00pm Location: Salesforce.com - Seattle

[Hi fellow paper lovers! One of *my* favorite paper-lovers, Pat Helland, is going to be in town later this month, and we talked him into doing a PWL while he's here! It's kind of cheating because he loves a paper that he wrote, but he got shoehorned into this by well-meaning fan-persons without much time to prepare so I'll still count it :)]

http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf

Abstract:

"There is an inexorable trend towards storing and sending immutable data. We need immutability to coordinate at a distance and we can afford immutability, as storage gets cheaper.

This paper is simply an amuse-bouche on the repeated patterns of computing that leverage immutability. Climbing up and down the compute stack really does yield a sense of déjà vu all over again."

Who is this Pat person?

"Pat Helland has been implementing transaction systems,…

Read more about this Meetup

PWL #17: Fuzzing: The State of the Art

Date/Time: 2016-03-10 06:30pm Location: Whitepages - Seattle 98101

Fuzzing is a technique to find software defects by providing random(ish) input to a system, and watching to see which patterns of input cause bad behavior.
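As a rough illustration of that loop, here is a minimal sketch of a purely random fuzzer. It is not taken from the paper: `parse_length_prefixed` is a hypothetical target with a planted bug, and the choice to treat `ValueError` as a graceful rejection (and anything else as a crash) is an assumption made for the example.

```python
import random


def parse_length_prefixed(data: bytes) -> int:
    """Hypothetical target: the first byte declares the payload length.

    Returns the last payload byte. Planted bug: the declared length is
    trusted without checking it against the bytes actually present.
    """
    if not data:
        raise ValueError("empty input")  # graceful rejection, not a defect
    length = data[0]
    payload = data[1:1 + length]
    return payload[length - 1]  # IndexError when the payload is too short


def fuzz(target, trials=1000, max_len=8, seed=0):
    """Feed random byte strings to `target` and collect the inputs that crash it."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    crashes = []
    for _ in range(trials):
        size = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            continue  # the target rejected the input cleanly
        except Exception as exc:  # anything else counts as bad behavior
            crashes.append((data, type(exc).__name__))
    return crashes


if __name__ == "__main__":
    found = fuzz(parse_length_prefixed)
    print(f"{len(found)} crashing inputs, for example {found[0]}")
```

Real fuzzers usually go further than this, for example by mutating known-good inputs or by using coverage feedback to guide generation, but the generate, run, and record cycle is the same.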

The paper is "Fuzzing: The State of the Art", available here:
http://www.dtic.mil/dtic/tr/fulltext/u2/a558209.pdf

The talk will go over the history of fuzzing, different methods of fuzzing, and how fuzzing can be applicable to your everyday life.

…

Read more about this Meetup

PWL #16: Chord

Date/Time: 2016-02-11 06:30pm Location: Whitepages - Seattle 98101

Tristan Penman will present Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications.

"A fundamental problem that confronts peer-to-peer applications is to efficiently locate the node that stores a particular data item. This paper presents Chord, a distributed lookup protocol that addresses this problem. Chord provides support for just one operation: given a key, it maps the key onto a node. Data location can be easily implemented on top of Chord by associating a key with each data item, and storing the key/data item pair at the node to which the key maps. Chord adapts efficiently as nodes join and leave the system, and can answer queries even if the system is continuously changing. Results from theoretical analysis, simulations, and experiments show that Chord is scalable, with communication cost and the state maintained by each node scaling logarithmically with the number of Chord nodes."
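The single operation described in the abstract, mapping a key onto a node, can be sketched as a successor lookup on an identifier ring. This is a simplified illustration, not the protocol itself: it assumes one process can see every node identifier, whereas Chord resolves the same mapping with a logarithmic number of messages between nodes. The ring size `M`, the modulo truncation, and the helper names are assumptions made for the example.

```python
import hashlib
from bisect import bisect_left
from typing import Sequence

M = 16  # identifier bits; a deliberately small ring for illustration


def chord_id(name: str, m: int = M) -> int:
    """Hash a node name or key onto the ring of 2**m identifiers."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(node_ids: Sequence[int], key_id: int) -> int:
    """Return the first node id at or after key_id, wrapping around the ring.

    `node_ids` must be sorted in ascending order.
    """
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]


if __name__ == "__main__":
    nodes = sorted(chord_id(f"node-{n}") for n in range(5))
    key = chord_id("some-data-item")
    print(f"key {key} is stored at node {successor(nodes, key)}")
```

Because a key always belongs to the next node clockwise, a node joining or leaving only changes ownership of the keys between it and its predecessor.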

Read more about this Meetup

PWL #15.5: Scalable Atomic Visibility with RAMP Transactions

Date/Time: 2016-01-27 06:30pm Location: Thomas Street - Seattle

[Hi fellow paper lovers! We're starting to build up a decent backlog of topics, and I've heard from a couple of interested people who have trouble making the usual Thursday timeslot, so I figured it might be fun to try a second event in January on a different night of the week to spread the love around. For the moment I'm only committing to one, but if it's a hit maybe we'll turn it into a regular thing. Let me know your feelings on the subject, if you have them!]

Denis Rystsov will present "Scalable Atomic Visibility with RAMP Transactions" by Peter Bailis, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein and Ion Stoica.

It's a recent paper (2014) in which the authors propose a new isolation model, Read Atomic (RA) isolation, that helps to achieve incredible scalability in multi-partition distributed databases.

Read Atomic isolation is similar to Read Committed isolation, and provides atomic visibility: either all or none of each transaction’s updates are …

Read more about this Meetup

PWL #15: The Dataflow Model and Millwheel: Fault tolerant stream processing

Date/Time: 2016-01-14 06:30pm Location: Whitepages - Seattle 98101

Ankur Chauhan will cover the latest research on stream processing systems, presented at VLDB 2015 by the team at Google. The paper is titled "The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing". It presents the state of the art in stream processing at scale and describes the technologies that lie at the heart of Google Cloud Dataflow.

Abstract:

Unbounded, unordered, global-scale datasets are increasingly common in day-to-day business (e.g. Web logs, mobile usage statistics, and sensor networks). At the same time, consumers of these datasets have evolved sophisticated requirements, such as event-time ordering and windowing by features of the data themselves, in addition to an insatiable hunger for faster answers. Meanwhile, practicality dictates that one can ne…

Read more about this Meetup

PWL #14: A Critique of the CAP Theorem

Date/Time: 2015-12-10 06:30pm Location: Whitepages - Seattle 98101

Trevor Lalish-Menagh will present "A Critique of the CAP Theorem." Abstract:

The CAP Theorem is a frequently cited impossibility result in distributed systems, especially among NoSQL distributed databases. In this paper we survey some of the confusion about the meaning of CAP, including inconsistencies and ambiguities in its definitions, and we highlight some problems in its formalization. CAP is often interpreted as proof that eventually consistent databases have better availability properties than strongly consistent databases; although there is some truth in this, we show that more careful reasoning is required. These problems cast doubt on the utility of CAP as a tool for reasoning about trade-offs in practical systems. As alternative to CAP, we propose a "delay-sensitivity" framework, which analyzes the sensitivity of operation latency to network delay, and which may help practitioners reason about the trade-offs between consistency guarantees and tolerance of network fa…

Read more about this Meetup

PWL #13: Survivable Key Compromise in Software Update Systems

Date/Time: 2015-11-12 06:30pm Location: Whitepages - Seattle 98101

Ryan Cox will cover "Survivable Key Compromise in Software Update Systems". What happens when your signing keys are compromised or checked into GitHub? He will demo Notary, Docker's implementation of The Update Framework (TUF), which is described in the paper. TUF is a system that grew out of Tor and is capable of surviving key compromises as well as several other issues in current update managers.

Paper: https://isis.poly.edu/~jcappos/papers/samuel_tuf_ccs_2010.pdf

Samuel, Justin, et al. "Survivable key compromise in software update systems." Proceedings of the 17th ACM conference on Computer and communications security. ACM, 2010.

…

Read more about this Meetup

PWL#12 Predicting Voice Elicited Emotions

Date/Time: 2015-10-08 06:30pm Location: Whitepages - Seattle 98101

https://goo.gl/uJ4Mvk

This time Jose Contreras will be presenting a KDD paper on "Predicting Voice Elicited Emotions".

Abstract:

We present the research, and product development and deployment, of Voice Analyzer™ by Jobaline Inc. This is a patent pending technology that analyzes voice data and predicts human emotions elicited by the paralinguistic elements of a voice.

Human voice characteristics, such as tone, complement the verbal communication. In several contexts of communication, “how” things are said is just as important as “what” is being said. This paper provides an overview of our deployed system, the raw data, the data processing steps, and the prediction algorithms we experimented with. A case study is included where, given a voice clip, our model predicts the degree in which a listener will find the voice “engaging”. Our prediction results were verified through independ…

Read more about this Meetup

PWL#11 Ideal Hash Trees

Date/Time: 2015-09-10 07:00pm Location: Whitepages - Seattle 98101

David Murray will present Ideal Hash Trees:

"Hash Trees with nearly ideal characteristics are described. These Hash Trees require no initial root hash table yet are faster and use significantly less space than chained or double hash trees. Insert, search and delete times are small and constant, independent of key set size, operations are O(1). Small worst-case times for insert, search and removal operations can be guaranteed and misses cost less than successful searches. Array Mapped Tries(AMT), first described in Fast and Space Efficient Trie Searches, Bagwell [2000], form the underlying data structure. The concept is then applied to external disk or distributed storage to obtain an algorithm that achieves single access searches, close to single access inserts and greater than 80 percent disk block load factors. Comparisons are made with Linear Hashing, Litwin, Neimat, and Schneider [1993] and B-Trees, R.Bayer an…

Read more about this Meetup

Mesos - A Platform for Fine-Grained Resource Sharing in the Data Center

Date/Time: 2015-08-13 07:00pm Location: Whitepages - Seattle 98101

Abstract

We present Mesos, a platform for sharing commodity clusters between multiple diverse cluster computing frameworks, such as Hadoop and MPI. Sharing improves cluster utilization and avoids per-framework data replication. Mesos shares resources in a fine-grained manner, allowing frameworks to achieve data locality by taking turns reading data stored on each machine. To support the sophisticated schedulers of today's frameworks, Mesos introduces a distributed two-level scheduling mechanism called resource offers. Mesos decides how many resources to offer each framework, while frameworks decide which resources to accept and which computations to run on them. Our results show that Mesos can achieve near-optimal data locality when sharing the cluster among diverse frameworks, can scale to 50,000 (emulated) nodes, and is resilient to failures.

Presenter:

Ankur Chauhan

Links:

Read more about this Meetup

PWL#9: Dedalus: Datalog in Time and Space

Date/Time: 2015-07-09 06:30pm Location: Whitepages - Seattle 98101

Derek Elkins will present on "Dedalus: Datalog in Time and Space" touching also on some follow-on work.

Abstract.

Recent research has explored using Datalog-based languages to express a distributed system as a set of logical invariants. Two properties of distributed systems proved difficult to model in Datalog. First, the state of any such system evolves with its execution. Second, deductions in these systems may be arbitrarily delayed, dropped, or reordered by the unreliable network links they must traverse. Previous efforts addressed the former by extending Datalog to include updates, key constraints, persistence and events, and the latter by assuming ordered and reliable delivery while ignoring delay. These details have a semantics outside Datalog, which increases the complexity of the language and its interpretation, and forces programmers to think operationally. We argue that the missing component from these previous languages is a notion of time.

I…

Read more about this Meetup

PWL#8: Chain Replication for Supporting High Throughput and Availability

Date/Time: 2015-06-11 06:30pm Location: Whitepages - Seattle 98101

David Murray will present chain replication: a high-throughput alternative to quorum-based replication protocols like Paxos and Raft.

Abstract

Chain replication is a new approach to coordinating clusters of fail-stop storage servers. The approach is intended for supporting large-scale storage services that exhibit high throughput and availability without sacrificing strong consistency guarantees. Besides outlining the chain replication protocols themselves, simulation experiments explore the performance characteristics of a prototype implementation. Throughput, availability, and several object-placement strategies (including schemes based on distributed hash table routing) are discussed.

Link to the paper: http://www.cs.cornell.edu/home/rvr/papers/osdi04.pdf

…

Read more about this Meetup

PWL#7: The LCA Problem Revisited

Date/Time: 2015-05-14 06:30pm Location: Whitepages - Seattle 98101

This time Ankur Chauhan will present the paper "The LCA Problem Revisited" by Michael A. Bender and Martin Farach-Colton. The lowest common ancestor problem was first stated in 1973, and it took 11 years before an optimal solution was discovered, and another 16 before an understandable and implementable solution with the same bounds was presented. The solution to this deceptively simple problem comes together in the end and uses techniques that are powerful in plenty of other places.

Link to the paper: http://www.ics.uci.edu/~eppstein/261/BenFar-LCA-00.pdf

…

Read more about this Meetup

PWL#6: Brandon Bloom on Programming with Algebraic Effects and Handlers

Date/Time: 2015-04-16 06:30pm Location: Whitepages - Seattle 98101

We're excited to have Brandon Bloom presenting the paper "Programming with Algebraic Effects and Handlers" by Andrej Bauer and Matija Pretnar.

Intro

Some great papers embody insights, others package up those insights into digestible bites. "Programming with Algebraic Effects and Handlers" is the latter sort of great paper. After two decades of fundamental research into the nature of computation, a lot of mysterious ideas in computer science such as continuations and exception handling finally made sense to a number of mathematically inclined geniuses. Bauer and Pretnar's Eff programming language cuts right through the heart of the theory in a way that makes sense to anybody who has ever written a functional program. This paper uses the Eff language to explore a number of simple com…

Read more about this Meetup

PWL#5: Dynamo: Amazon’s Highly Available Key-value Store

Date/Time: 2015-03-10 06:30pm Location: The Project Room - Seattle

We will be covering the paper that inspired many NoSQL databases over the years, including Apache Cassandra, Voldemort, and Riak, to name a few.

ABSTRACT Reliability at massive scale is one of the biggest challenges we face at Amazon.com, one of the largest e-commerce operations in the world; even the slightest outage has significant financial consequences and impacts customer trust. The Amazon.com platform, which provides services for many web sites worldwide, is implemented on top of an infrastructure of tens of thousands of servers and network components located in many datacenters around the world. At this scale, small and large components fail continuously and the way persistent state is managed in the face of these failures drives the reliability and scalability of the software systems. This paper presents the design and implementation of Dynamo, a highly available key-value storage system that some of Amazon’s core services use to provide an “always-on” experience…

Read more about this Meetup

PWL#4: Bigtable: A Distributed Storage System for Structured Data

Date/Time: 2015-02-10 06:30pm Location: Moz - Seattle

Abstract

Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this paper we describe the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable.

Arunabha Ghosh, Director of Engineering at Moz, will be presenting this paper. Before Moz, Arunabha spent around 7 years at…

Read more about this Meetup

PWL#3: LSM Trees, BTrees and Cache oblivious B-Trees

Date/Time: 2015-01-15 06:30pm Location: The Project Room - Seattle

After a long reign as the dominant on-disk data structure for databases and filesystems, B-trees are slowly being replaced by write-optimized data structures, to handle ever-growing volumes of data. Some write optimization techniques, like LSM-trees, give up some of the query performance of B-trees in order to achieve this.

This time I will cover the B-Trees, LSM Trees and Fractal Trees papers and provide some real-world use cases (and data), along with a discussion of the respective papers.

…

Read more about this Meetup

PWL#2: RRB-Trees: Efficient Immutable Vectors

Date/Time: 2014-10-23 06:30pm Location: Madrona Venture Group - Seattle

Abstract

Immutable vectors are a convenient data structure for functional programming and part of the standard library of modern languages like Clojure and Scala. The common implementation is based on wide trees with a fixed number of children per node, which allows fast indexed lookup and update operations. In this paper, the authors extend the vector data type with a new underlying data structure, Relaxed Radix Balanced Trees (RRB-Trees), and show how this structure allows immutable vector concatenation, insert-at and splits in O(logN) time while maintaining the index, update and iteration speeds of the original vector data structure.

Chris Bilson will be presenting this paper followed by a Q&A session.

Link: RRB-Trees: Efficient Immutable Vectors

…

Read more about this Meetup

Inaugural meetup

Date/Time: 2014-09-23 07:00pm Location: Cloudant - Seattle 98108

The idea is to bridge the gap between theory and practice. The first step is to disseminate the knowledge we have and explore new horizons.

For the first meetup we will be discussing Paxos: The Part-Time Parliament by Leslie Lamport. This is one of the most widely known papers in the distributed systems community, but it is also notoriously complicated and rarely well understood. We shall endeavour to change that!

Reading list for Paxos:

• [The Part-Time Parliament](http://research.microsoft.com/en-us/um/people/lamport/pubs/lamport-paxos.pdf)

• [Paxos Made simple](

Read more about this Meetup