Please welcome Tom Hall presenting: "Erasure Code with Shingled Local Parity Groups for Efficient Recovery from Multiple Disk Failures" by Takeshi Miyamae, Takanori Nakao, and Kensuke Shiozawa, Fujitsu Laboratories Ltd. The paper can be downloaded here.
Previously at PwL, I have given a talk on RAID systems and their clever use of some interesting mathematics to recover from data loss while using space more efficiencty than a simplistic 'replicate the data N times' approach.
While the approach taken with RAID6 (so-called Reed-Soloman encodings) to protect from 2 failures can be generalised (and has been in RAIDZ2 in the ZFS file system), moving to higher redunancy requires more work to be be done and is difficult to implement efficiently.
There is lots of …
Simon Peyton Jones presents: "Getting from A to B: fast route-finding using slow computers", a talk inspired by "Computing the shortest path: A* search meets graph theory" paper (Andrew V Goldberg and Chris Harrelson).
[WARNING] Unfortunately, because of building and safety regulations, we are forced to a max number of 100 people in the room. Please be aware that we will be forced to leave people out if we reach that number. I'm really sorry for this drastic decision but there is no way to predict how many people will show up.
We all now take it for granted that online maps will rapidly display a fast route from A to B. But these maps are gigantic: a map of Europe has 20 million intersections and 40 million road segments. How can a computer find t…
When a paper gets really large it transforms into a book! Tomas Petricek (@tomaspetricek) is presenting "Where Mathematics Comes From" by G. Lakoff and R. E. Núñez. A paper related to this book is "Lakoff, Nunez  The Metaphorical Structure of Mathematics" available at this link.
• 6.30pm: networking, pizza and drinks.
• 7:00pm: presentation starts
This book by cognitive linguist George Lakoff and a psychologist Rafael Nunez seeks to found a cognitive science of mathematics. The key idea is that mathematics is born from our interactions with the world (it is embodied) through cognitive metaphors (that construct more complex mathematical structures as metaphors of simpler ones). The book uses the framework …
Chris Ford (@ctford) is presenting the paper "Analysis by Compression" by David Meredith. Please have a look at the paper here before the meeting.
• 6.30pm: pizza and drinks
• 7:00pm: presentation starts
In his paper 'Analysis by Compression', David Meredith wonders how are we supposed to decide which is “better” if we are given two analyses of the same musical work. His fascinating answer is to turn to complexity theory. Kolmogorov complexity analysis suggests that we can measure how well we understand a piece of music by the concision of a program that produces it. We can then compare programmatic 'explanations' of the same piece of music via their length.Meredith also sketches a way to apply his insight to psychological coding theory of perceptual organisation. If you accept his argument, it shows that complexity theory is n…
Thomas Depierre (@DianaOlympos) will be presenting the paper "Programming with Abstract Data Types" by B. Liskov and S. Zilles 1974", available here for reading before the meetup:
Along the course of the past 50 years, programming centred mainly around abstraction, being for control flow, operations to execute or data objects and structure. In this foundational paper, B. Liskov and S. Zilles described a concept that is now so deeply embedded into our programming ecosystem that we forget why it was even invented. Come join us to explore what Abstract Data Types still have to teach us.
Coming from a background of maths and EE, Thomas ended up as a Maker at Asce…
What happened in the basements of the AI lab at MIT in the late '50 is truly inspiring and this paper is telling us that story. Herbert Stoyan spent many years researching those early days and even McCarthy is pointing at his work in his papers!
Lisp is one of the oldest languages out there. When I started learning about functional programming, I got interested in why Lisp languages are what they are today, what's the relationship with AI, what lambda calculus has to do with this, or why I kept hearing things like "Lisp has failed".
Surprisingly, many concepts that we give for granted in functional programming today were there almost 60 years ago!
The paper is quite accessible and a good start for the new season of PWL. Some Lisp sources will be presented but there is no need to have a deep knowledge of the language. An html version of the paper is available at
As always, please sign up at Skills Matter and thanks to them for hosting us!
Tom Hall on Plank's "A Tutorial on Reed-Solomon Coding for
Fault-Tolerance in RAID-like Systems"
In what will be my last Papers We Love before taking a long overdue sabbatical and leaving the UK for a while I want to explain how RAID works, why we need it and why existing strategies may become less useful at protecting us from data loss.
The paper  is for systems programmers, is quite accessible (modulo an error corrected in a follow-up 7 years later ) and explains a strategy to survive any number of disk failures by maintaining checksums using Galois Fields (which it also explains)
You may find it useful to also look at 'The mathematics of RAID-6'  explaining what the linux kernels RAID6 implementation does.
Please sign up at Skills Matter also
Certain papers change your life. McCarthy’s ‘Recursive Functions of Symbolic Expressions and their Computation by Machine (Part I)’ (1960) changed mine, and so did Landin’s ‘The Next 700 Programming Languages’ (1966). And I remember the moment, halfway through my graduate career, when Guy Steele handed me Reynolds’s ‘Definitional Interpreters for Higher-Order Programming Languages’ (1972).
It is now common to explicate the structure of a programming language by presenting an interpreter for that language. If the language interpreted is the same as the language doing the interpreting, the interpreter is called meta-circular.
Interpreters may be written at differing levels of detail, to explicate different implementation strategies. For instance, the interpreter may be written in a continuation-passing style; or some of the higher-order …
As always, please sign up on Skills Matter too.
Sorry for the short notice. As Tom mentioned at the end of our meetup on Tuesday, we've got a bonus speaker for May!
At this meetup, Gareth Morgan will presenting on The Rendering Equation. This paper is one of the most important 3D graphics papers in the subject's history. The rendering equation put forth by James Kajiya is the foundation of many of the rendering techniques used in modern graphics. Anytime a computer generates physically accurate image it is actually attempting to solve the rendering equation.
Gareth has been involved in games and 3D graphics since 1999, starting at Silicon Graphics followed by several games companies. He spent eight years at Imagination Technologies researching sop…
As ever, please sign up at Skills Matter too.
In "End-to-End Arguments In System Design", Saltzer, Reed, and Clark argue that building a robust and scalable system comes from pushing work to the edges of a system. or if you like: why bittorrent worked and multicast continues to fail.
tef will also be arguing the same thing, but saying things out loud rather than writing things down.…
We have EuroSys in town so a few peeps attending that are doing a CRDT themed social meet in Hoop and Toy near South Kensington.
"Are you curious about CRDT research? Are you looking into this and related technologyto tackle scalability and latency reduction in a new system?
If you are interested on an informal exchange of ideas about current trends on research and industry uses of Conflict-Free Replicated Data-Types (CRDTs), we are organizing a Get-together in central London on April 19th, starting around 17h00. Several of the initial researchers and early developers will attend, and will be open for chatting on new ideas or just help explain why a semi-lattice is cool.…
Fouad Mardini on "Worst-case analysis of a new heuristic for the traveling salesman problem" by Nicos Christofides
The Traveling Salesman Problem might very well be the most studied problem in combinatorial optimization. Much of this interest comes from the fact that the TSP is an NP hard problem, yet the TSP is studied on its own merits as it has applications in many fields.
We will use the paper of Christofides as a starting point to discuss the NP complexity class and approximation algorithms to intractable problems. Christofides algorithm is simple and elegant, yet gives the best known approximation ratio for the symmetric TSP.
Along the way we will detour into graph matchings and other interesting tidbits such as Aurora's Euclidean TSP approximation scheme and the Held-Karp LP relaxation.
If you are coming, please sign up at Skills Matter also
Phil Potter on Dolstra's "The Purely Functional Software Deployment Model"
Don't forget to RSVP to Skillsmatter too: https://skillsmatter.com/meetups/7857-dolstra-s-the-purely-functional-software-deployment-model-p
This PhD thesis is about software deployment. Deployment is not often seen as an academic topic, but the reason I like this thesis is because it takes an academic approach of defining the problem carefully and choosing the correct abstractions, and in doing so enables the application of knowledge from other areas such as memory management and memoization to the seemingly unrelated field of deployment.
The thesis collects the theoretical ideas together into a practical tool called nix, a package manager which is st…
Remember to sign up on Skills Matter too, so they know you're coming!
Tom Crayford is a database wrangler.
Chamberlin, Astrahan, Blagsen, Gray, King, Lindsay, Lorie, Mehl, Price, Putzolo, Selinger, Schkolnick, Slutz, Traiger, Wade, Yost
Relational Database technology is one of the largest successes of computology over the past 46 years. Most useful non-blob data written to disk is probably in an RDBMS somewhere. But in 1970, all relational meant was "somebody had written a weird paper". System R was the research project that took that paper and made it of huge practical importance. On the way, the team discovered dozens of new research areas, techniques for database implementation, and ultimately made relat…
This is at Skills Matter so you will need to sign up there too https://skillsmatter.com/meetups/7466-camilla-montonen-support-vector-machines-and-kernels-for-computational-biology Thanks!
BenHur A, Ong CS, Sonnenburg S, Schölkopf B, Rätsch G (2008) Support Vector Machines and Kernels for Computational Biology.
The widespread adoption of highthroughput sequencing machinery has produced an unprecedented amount of genomic data for biologists to analyse. To fully leverage the potential patterns hidden in the petabytes of DNA and RNA sequence information requires the use of machine learning algorithms and specialised kernels, which ca…
Emily was until recently a Backend Engineer in SoundCloud's Data Team. Before that she was a Haskell developer for a Swedish e-signing startup, and before that she was a developer for a variety of financial institutions. She lives in London, and likes gardening and hanging out with her cat Albertina.
Leslie Lamport. 1998. The Part Time Parliament
The Paxos algorithm is a consensus algorithm that’s considered one of the most important in distributed computing. Cassandra uses it for transactions,
The paper, by Melissa O’Neill, is available here.
I like it because I first read it not knowing Haskell very well and I read through the paper a few times and had a wonderful a-ha moment with a sentence or two that made the idea click and was able to implement it in Python. I have since re-done it in Clojure so you will see it done 3 ways (and hopeful I will share an a-ha moment with you)
"""A much beloved and widely used example showing the elegance and simplicity of lazy functional programming represents itself as "The Sieve of Eratosthenes." The paper shows that this example is not the sieve and presents an implementation that actually is."""
Starting with the classic one-liner sieve
sieve (p : xs) = p : sieve [x | x <− xs, x ‘mod‘ p > 0]
primes = sieve [2..]
O'Neill proceeds to show why this standard rendition of the Sieve of Eratosthenes does not in…
Hagit Attiya and Jennifer L Welch: “Sequential Consistency versus Linearizability,” ACM Transactions on Computer Systems (TOCS), volume 12, number 2, pages 91–122, May 1994.
An often-cited constraint on distributed database design is the CAP theorem, an impossibility result in distributed systems. It states that in a linearizable database, if the network is interrupted, some nodes cannot respond to requests. Although being able to tolerate network faults is important, the performance and response times of a database are often even more important, and CAP says nothing about th…
Oliver Charles presents Conor McBride's "Kleisli Arrows of Outrageous Fortune".
For a long time, it wasn't clear how to get purely functional programs to actually "do" anything. Finally, a breakthrough came by sharing ideas from category theory to bring monads into functional programming, as a way to model and manage side effects. However, little has been done to ensure that entire interaction sequences make sense. For example, many functional programming languages will allow you to open a file handle, close it, and then try and read from it - clearly a nonsensical sequence of operations! In "Kleisli Arrows of Outrageous Fortune", Conor McBride shows us how we can borrow more ideas from the literature, introducing a new formulation of indexed monads - a structure with close connections to slice categories. Given these new tools, he demonstrates that these monads can correspond to enf…
Andy Bennett on "Scalable Atomic Visibility with RAMP Transactions"
Bailis et al formalise "Read Atomic" isolation which is a database isolation level sufficient for many applications which make use of large,distributed databases in real world applications. They then show how to design a database system which provides this isolation level whilst allowing read, write and read/write transactions on data across multiple partitions: partitions which do not need to communicate with each other.This method is both performant and fairly portable to existing architectures and allows database authors to implement much stronger guarantees than are currently available from systems in use at companies such as Facebook, LinkedIn, Yahoo or Amazon.
In this talk we will explore the contributions of this paper and how they can be used solve some of the anomolies encountered in today's multi-node, distributed, database systems and accompanying applications.
In ‘the real world’, nothing is simple and failure is unavoidable. “How Complex Systems Fail” was discovered by Adam when he followed a reference in a database book. It expanded his mind on the similarities between hospitals, nuclear weapons, and databases, and now he’d like to share these with you.
The paper is available at </a><a href="http://web.mit.edu/2.75/resources/random/How%20Complex%20Systems%20Fail.pdf">http://web.mit.edu/2.75/resources/random/How%20Complex%20Systems%20Fail.pdf . Its subtitle, "Being a Short Treatise on the Nature of Failure;
How Failure is Evaluated; How Failure is Attributed to Proximate
Cause; and the Resulting New Understanding of Patient Safety", serves as a brief abstract.
Having started programming at the age…
Relational programming, or logic programming, is a programming paradigm that exhibits remarkable and powerful properties, to the extent that its implementation seems frightfully daunting to the layman. µKanren is a minimal relational language that seeks to strip the paradigm down to its core, leaving us with a succinct, elegant and above all simple set of primitives on top of which we can rebuild even the most powerful relational constructs.
In this talk, we will explore the µKanren language by implementing it from first principles in a simple functional programming language, going on to demonstrate how you can assemble these simple building blocks into a semblance of its richer parent, miniKanren, and maybe solve a logic puzzle or two to make sure it’s working as advertised.
The µKanren paper, and the original µKanren implementation, were authored by Jason Hemann and Daniel P. Friedman. The paper is available at
Ricardo Monti from Imperial College London will presenting Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers by Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato and Jonathan Eckstein.
Doors open at 6:30 pm; the presentation will begin at 7:00 pm.
We hope that you'll take a look at the paper before the Meetup (and if you don't, no worries).
Many modern statistics and machine learning problems can be recast as optimization problems where we look to minimize a convex objective function.
The resulting objective function enforces a set of desirable properties which we wish our answer to possess: a popular example would be sparsity (i.e., we wish to obtain a sparse answer for reasons of interpretability or for other reasons).
However, it is often the case that minimizing such objective functions is difficult. While there …
Ryan Kennedy from Yammer Engineering will presenting Dapper, a Large-Scale Distributed Systems Tracing Infrastructure paper by Benjamin H. Sigelman, Luiz André Barroso, Mike Burrows, Pat Stephenson, Manoj Plakal, Donald Beaver, Saul Jaspan and Chandan Shanbhag.
Doors open at 6:30 pm; the presentation will begin at 7:00 pm
We hope that you'll read the paper before the Meetup (and if you don't, no worries).
Ryan Kennedy is an Infrastructure Engineer at Yammer in way over his head. When he's not busy cobbling together databases from the odds and ends he finds in Maven Central he can often be found scouring the world for inedible nachos.
Equal Rights for Functional Objects, or, The More Things Change, the More they Stay the Same
There are many ways of defining equality, particularly in a language which has both object-oriented and functional aspects. There's reference equality, numerical equality, shallow list equality, deep list equality, string equality, and even "fuzzy" floating-point equality. Which is the right to choose in any given situation? Which should i use for my equals() method on my custom object?
This paper answers all of these questions, by taking a seemingly complex problem and revealing an underlying simplicity. It does this by defining one equality operation that works in any situation.
The paper makes use of lisp syntax, but the ideas are applicable to most modern languages.
Can anyone sponsor pizza and drinks? Makes it much easier for me to host more things at uSwitch if someone else covers the food.…