Martin Monperrus presents: "Creating an Artificial Software Developer" based on his ongoing research in the field at the KTH Royal Institute of Technology (https://www.monperrus.net/martin/publications).
This talk describes research in AI assisted software development, and the vision of bots who help software engineers with for all sorts of tasks. In particular, they are doing automated bug fixing. It's a talk based on recent software engineering research, w…
Claudio Martella (Google Sciengineer, https://twitter.com/claudiomartella) presents the talk:
"Massive-scale graph partitioning with Giraph"
Based on his paper "Spinner: Scalable Graph Partitioning in the Cloud" (by Claudio Martella, Dionysios Logothetis, Andreas Loukas, Georgos Siganos) downloadable from https://arxiv.org/abs/1404.3861
We present Spinner, a scalable and adaptive graph partitioning algorithm based on label propagation designed on top of the Pregel model. Spinner scales to massive graphs, produces partitions with locality and balance comparable to the state-of-the-art and efficiently adapts the partitioning upon changes. We describe our algorithm and its implementation in the Pregel programming model that makes it possible to partition billion-vertex graphs. We evaluate Spinner with a variety of synth…
"ACL's Don't" by Tyler Close.
Access control lists, in the form of RBAC and related technology are everywhere. They are fundamental to how we design authorisation mechanisms in systems today. However they don’t actually work, as this lovely, short and beautifully written paper by Tyler Close from 2009 explains, and major security issues such as the recent Kubernetes vulnerability continue to happen because of this.
Justin Cormack is a security engineer at Docker, ba…
Derek Jones (http://www.knosof.co.uk) presents the paper:
"A conversation around the analysis of the SiP effort estimation dataset"
-> Download Paper+link to data: http://arxiv.org/abs/1901.01621
An analysis of over ten years of commercial development using Agile
(10,100 unique task estimates made by 22 developers, under 20 project
codes). Factors found to influence task implementation effort estimation accuracy included the person making the estimate, the project involved, and the propensity to use round numbers.
Derek used to write compilers that translated what people wrote, and then moved on to analyzing code to try to work out what they intended to write. These days he is analyzing all the publicly available software engineering data:
Dimos Raptis (https://twitter.com/dimosr7) presents the original "Bitcoin: A Peer-to-Peer Electronic Cash System" paper by Satoshi Nakamoto.
Download the paper here: https://bitcoin.org/bitcoin.pdf
It's been now 10 years since the genesis of Bitcoin, which has given birth to many more protocols and technologies, while also helping establish blockchain as a technological concept. In this talk, we'll celebrate its birthday, by going through the initial paper by Satoshi Nakamoto. We will explain the main principles behind the idea, explain the basic design decisions and implementation details and how it deals with several problems. We will conclude the session by looking back at history and retrospecting on the implications of some of these design decisions.
Dimos has an MEng in Eletrical and Computer Enginee…
Steffen Zschaler (www.steffen-zschaler.de , https://twitter.com/szschaler) presents a talk titled
"Language workbenches: Better domain-specific languages more quickly"
Inspired by "Language workbenches" by Martin Fowler (https://www.martinfowler.com/articles/languageWorkbench.html) and "Evaluating and comparing language workbenches: Existing results and benchmarks for the future" by Erdweg and others (available at https://ir.cwi.nl/pub/23990/23990B.pdf).
Domain-specific or "little" languages have a long history. We all use them, but rarely think about designing our own, even though there are clear and well-demonstrated benefits. The main reason is that building good domain-specific languages and, in particular, good tools an…
Dimos Raptis (https://twitter.com/dimosr7) presents the paper "Dynamo: Amazon’s Highly Available Key-value Store" by Amazon (many authors).
Download the paper from:
During the last decade, Internet companies have achieved unprecedented growth. As a result of this, their systems needed to cope with workloads they had never experienced before. The existing data storage technologies were not capable of supporting these workloads, because of inherent limitations, such as single point of failures, vertical scaling etc.
In an effort to solve these problems, companies started re-evaluating the architecture of existing systems and started building a new generation of systems, focused on a more distributed, scalable and highly…
You can download the paper at https://www.cc.gatech.edu/~orso/papers/term.orso.harrold.ASE04.pdf
In order to identify the changed program entities between two versions of a program (original and modified) and to classify them as added, deleted, or modified, differencing algorithms are used. Differencing algorithms are the core part of many software engineering tasks. For example, change impact analysis is used to identify the program parts that are affected by changes. For this purpose, the impact analysis uses the informatio…
Tom Harding (http://www.tomharding.me , https://twitter.com/am_i_tom) presents a talk on " Comonads all the way down" inspired by the paper "Declarative UIs are the Future — And the Future is Comonadic!" by Phil Freeman.
Download the paper at:
Anyone who has been involved with front-end development over the last
few years will have seen drastic changes in the way we approach UIs.
As we move from jQuery to React, the trend has been clear: we want our
UIs to be declarative functions from our state to our HTML. So...
Phil's conference paper introduces comonads as the abstraction to
represent the properties we want in our UIs, and explores the powerful
Bruno Bonacci (https://twitter.com/BrunoBonacci http://blog.brunobonacci.com) presents a talk on "Viewstamped Replication" on the papers "Viewstamped Replication" by Brian M. Oki
Barbara Liskov and "Viewstamped Replication Revisited" by Barbara Liskov and James Cowling.
Viewstamped Replication is a protocol to replicate state and converge to a consistent view in face of non Byzantine failures. The original paper was published in 1988 and then revisited in 2012.
It is regarded as one of the first papers on distributed consensus; as comparison Lamport's…
Ron Pressler (https://twitter.com/pressron, https://pron.github.io) presents a talk with title "On the Nature of Abstraction: An introduction to the mathematical analysis of programs" inspired by the papers "Binary Relations for Abstraction and Refinement" by David A Schmidt and "The Existence of Refinement Mappings" by Martín Abadi and Leslie Lamport.
There are two main approaches to analysing programs mathematically: representing the program as a formal object, a…
Bohdan Orlov (https://twitter.com/bohdan_orlov, https://www.moonpig.com/uk/217ab/) presents a talk with title "Do MVC like it’s 1979" inspired by the paper "The original MVC reports" by Trygve Reenskaug. The paper is available here: http://folk.uio.no/trygver/2007/MVC_Originals.pdf
Modern developers often think:
* I know MVC very well
* MVC is not good enough for my projects
Consequently, they may look for more “mature” architecture and end up writing a lot of boilerplate code just to make sure the architecture is scalable. In reality, the “mature” architecture could be a premature optimisation for cases where “real” MVC would be sufficient.
Often, the developers are only familiar with one interpretation of the MVC, the one which…
Nicola Mometto (https://twitter.com/Bronsa_) presents a talk on "The Kanren family of languages, a deep dive in μKanren" inspired by the paper "μKanren: A Minimal Functional Core for Relational Programming" by J. Hemann and D. P. Friedman. The paper is available here: http://webyrd.net/scheme-2013/papers/HemannMuKanren2013.pdf
Kanren is a family of logic and relational programming language, the most known and used dialect is miniKanren, but several others exist.
In this talk we'll go over the history of the Kanren family and we'll dive into the elegant and beautiful implementation of μKanren, an intentionally minimal and compact kernel for Kanren dialects, implementing complete relational semantics in just over 50 lines of code.
Nicola Mometto is a software developer with interests…
Bodil Stokke (https://bodil.lol and https://twitter.com/bodil) presents the paper "Basic Polymorphic Type Checking" by Luca Cardelli. The paper is available here: http://lucacardelli.name/Papers/BasicTypechecking.pdf
Cardelli's 1987 paper "Basic Polymorphic Type Checking" was, according to the author, the first attempt to explain the Hindley-Milner-Damas type system in a way that would be accessible to people who weren't either Hindley, Milner or Damas. It served as my first gentle introduction to the mysteries of type checking, back when I was obsessed with the conceit that the world needed another Lisp with a type system and I was going to have to build it (it would be called BODOL, because obviously).
It was my first significant journey into the mysteries of t…
Chas Emerick (http://cemerick.com & http://twitter.com/cemerick) presents a survey of metric and spatial data structures, guided by the book 'Foundations of Multidimensional and Metric Data Structures' by Hanan Samet, with particular details on two data structures and their papers:
* 'Corner Stitching: A Data-Structuring Technique for VLSI Layout Tools' by Ousterhout, (https://www2.eecs.berkeley.edu/Pubs/TechRpts/1983/CSD-83-114.pdf)
* 'R-trees: A Dynamic Index Structure for Spatial Searching' by Guttman,(https://klevas.mif.vu.lt/~algis/DSA/guttman.pdf)
Most of us are familiar with the notion of indexes thanks to databases, but that exposure belies the rich kale…
Neil Brown (https://academiccomputing.wordpress.com) presents a talk about the paper "An Empirical Investigation into Programming Language Syntax" by Stefik and Siebert (https://dl.acm.org/citation.cfm?id=2534973&CFID=849175684&CFTOKEN=92613695).
Program code has a dual aspect: it is understandable both by computers and by humans. Ever since computing moved on from directly entering numerical opcodes, we have had increasingly free rein to design for the humans more than the machines. However, this has been done in an evidence-free manner: designers choose keywords and notations without scientifically testing any aspects of their usability. This paper, "An Empirical Investigation into Programming Language Syntax" by Stefik and Siebert, aims to address this issue by scie…
Gareth Morgan (https://twitter.com/axumgraphics) presents a talk about the paper "Physically-Based Shading at Disney Animation" (downloadable at https://disney-animation.s3.amazonaws.com/library/s2012_pbs_disney_brdf_notes_v2.pdf) by Brent Burley at Walt Disney Animation Studios.
Brent's paper is an overview of how Disney Animation Studios pioneered physically based shading. This talk will summarize the paper and serve as an introduction to the field of physically based shading and BRDF theory generally. If you are interested in learning more about physically based rendering and the science of computer animation join us for this technical discussion.
PapersWeLove (http://paperswelove.org) London proudl…
Simon Peyton Jones presents: "Getting from A to B: fast route-finding using slow computers", a talk inspired by "Computing the shortest path: A* search meets graph theory" paper (Andrew V Goldberg and Chris Harrelson).
[WARNING] Unfortunately, because of building and safety regulations, we are forced to a max number of 100 people in the room. Please be aware that we will be forced to leave people out if we reach that number. I'm really sorry for this drastic decision but there is no way to predict how many people will show up.
We all now take it for granted that online maps will rapidly display a fast route from A to B. But these maps are gigantic: a map of Europe has 20 million intersections and 40 million road segments. How can a computer find t…
When a paper gets really large it transforms into a book! Tomas Petricek (@tomaspetricek) is presenting "Where Mathematics Comes From" by G. Lakoff and R. E. Núñez. A paper related to this book is "Lakoff, Nunez  The Metaphorical Structure of Mathematics" available at this link.
• 6.30pm: networking, pizza and drinks.
• 7:00pm: presentation starts
This book by cognitive linguist George Lakoff and a psychologist Rafael Nunez seeks to found a cognitive science of mathematics. The key idea is that mathematics is born from our interactions with the world (it is embodied) through cognitive metaphors (that construct more complex mathematical structures as metaphors of simpler ones). The book uses the framework …
Chris Ford (@ctford) is presenting the paper "Analysis by Compression" by David Meredith. Please have a look at the paper here before the meeting.
• 6.30pm: pizza and drinks
• 7:00pm: presentation starts
In his paper 'Analysis by Compression', David Meredith wonders how are we supposed to decide which is “better” if we are given two analyses of the same musical work. His fascinating answer is to turn to complexity theory. Kolmogorov complexity analysis suggests that we can measure how well we understand a piece of music by the concision of a program that produces it. We can then compare programmatic 'explanations' of the same piece of music via their length.Meredith also sketches a way to apply his insight to psychological coding theory of perceptual organisation. If you accept his argument, it shows that complexity theory is n…
Thomas Depierre (@DianaOlympos) will be presenting the paper "Programming with Abstract Data Types" by B. Liskov and S. Zilles 1974", available here for reading before the meetup:
Along the course of the past 50 years, programming centred mainly around abstraction, being for control flow, operations to execute or data objects and structure. In this foundational paper, B. Liskov and S. Zilles described a concept that is now so deeply embedded into our programming ecosystem that we forget why it was even invented. Come join us to explore what Abstract Data Types still have to teach us.
Coming from a background of maths and EE, Thomas ended up as a Maker at Asce…
What happened in the basements of the AI lab at MIT in the late '50 is truly inspiring and this paper is telling us that story. Herbert Stoyan spent many years researching those early days and even McCarthy is pointing at his work in his papers!
Lisp is one of the oldest languages out there. When I started learning about functional programming, I got interested in why Lisp languages are what they are today, what's the relationship with AI, what lambda calculus has to do with this, or why I kept hearing things like "Lisp has failed".
Surprisingly, many concepts that we give for granted in functional programming today were there almost 60 years ago!
The paper is quite accessible and a good start for the new season of PWL. Some Lisp sources will be presented but there is no need to have a deep knowledge of the language. An html version of the paper is available at
As always, please sign up at Skills Matter and thanks to them for hosting us!
Tom Hall on Plank's "A Tutorial on Reed-Solomon Coding for
Fault-Tolerance in RAID-like Systems"
In what will be my last Papers We Love before taking a long overdue sabbatical and leaving the UK for a while I want to explain how RAID works, why we need it and why existing strategies may become less useful at protecting us from data loss.
The paper  is for systems programmers, is quite accessible (modulo an error corrected in a follow-up 7 years later ) and explains a strategy to survive any number of disk failures by maintaining checksums using Galois Fields (which it also explains)
You may find it useful to also look at 'The mathematics of RAID-6'  explaining what the linux kernels RAID6 implementation does.
Please sign up at Skills Matter also
Certain papers change your life. McCarthy’s ‘Recursive Functions of Symbolic Expressions and their Computation by Machine (Part I)’ (1960) changed mine, and so did Landin’s ‘The Next 700 Programming Languages’ (1966). And I remember the moment, halfway through my graduate career, when Guy Steele handed me Reynolds’s ‘Definitional Interpreters for Higher-Order Programming Languages’ (1972).
It is now common to explicate the structure of a programming language by presenting an interpreter for that language. If the language interpreted is the same as the language doing the interpreting, the interpreter is called meta-circular.
Interpreters may be written at differing levels of detail, to explicate different implementation strategies. For instance, the interpreter may be written in a continuation-passing style; or some of the higher-order fun…
As always, please sign up on Skills Matter too.
Sorry for the short notice. As Tom mentioned at the end of our meetup on Tuesday, we've got a bonus speaker for May!
At this meetup, Gareth Morgan will presenting on The Rendering Equation. This paper is one of the most important 3D graphics papers in the subject's history. The rendering equation put forth by James Kajiya is the foundation of many of the rendering techniques used in modern graphics. Anytime a computer generates physically accurate image it is actually attempting to solve the rendering equation.
Gareth has been involved in games and 3D graphics since 1999, starting at Silicon Graphics followed by several games companies. He spent eight years at Imagination Technologies researching sophisti…
As ever, please sign up at Skills Matter too.
In "End-to-End Arguments In System Design", Saltzer, Reed, and Clark argue that building a robust and scalable system comes from pushing work to the edges of a system. or if you like: why bittorrent worked and multicast continues to fail.
tef will also be arguing the same thing, but saying things out loud rather than writing things down.…
We have EuroSys in town so a few peeps attending that are doing a CRDT themed social meet in Hoop and Toy near South Kensington.
"Are you curious about CRDT research? Are you looking into this and related technologyto tackle scalability and latency reduction in a new system?
If you are interested on an informal exchange of ideas about current trends on research and industry uses of Conflict-Free Replicated Data-Types (CRDTs), we are organizing a Get-together in central London on April 19th, starting around 17h00. Several of the initial researchers and early developers will attend, and will be open for chatting on new ideas or just help explain why a semi-lattice is cool.…
Fouad Mardini on "Worst-case analysis of a new heuristic for the traveling salesman problem" by Nicos Christofides
The Traveling Salesman Problem might very well be the most studied problem in combinatorial optimization. Much of this interest comes from the fact that the TSP is an NP hard problem, yet the TSP is studied on its own merits as it has applications in many fields.
We will use the paper of Christofides as a starting point to discuss the NP complexity class and approximation algorithms to intractable problems. Christofides algorithm is simple and elegant, yet gives the best known approximation ratio for the symmetric TSP.
Along the way we will detour into graph matchings and other interesting tidbits such as Aurora's Euclidean TSP approximation scheme and the Held-Karp LP relaxation.
If you are coming, please sign up at Skills Matter also
Phil Potter on Dolstra's "The Purely Functional Software Deployment Model"
Don't forget to RSVP to Skillsmatter too: https://skillsmatter.com/meetups/7857-dolstra-s-the-purely-functional-software-deployment-model-p
This PhD thesis is about software deployment. Deployment is not often seen as an academic topic, but the reason I like this thesis is because it takes an academic approach of defining the problem carefully and choosing the correct abstractions, and in doing so enables the application of knowledge from other areas such as memory management and memoization to the seemingly unrelated field of deployment.
The thesis collects the theoretical ideas together into a practical tool called nix, a package manager which is st…
Remember to sign up on Skills Matter too, so they know you're coming!
Tom Crayford is a database wrangler.
Chamberlin, Astrahan, Blagsen, Gray, King, Lindsay, Lorie, Mehl, Price, Putzolo, Selinger, Schkolnick, Slutz, Traiger, Wade, Yost
Relational Database technology is one of the largest successes of computology over the past 46 years. Most useful non-blob data written to disk is probably in an RDBMS somewhere. But in 1970, all relational meant was "somebody had written a weird paper". System R was the research project that took that paper and made it of huge practical importance. On the way, the team discovered dozens of new research areas, techniques for database implementation, and ultimately made relational…
This is at Skills Matter so you will need to sign up there too https://skillsmatter.com/meetups/7466-camilla-montonen-support-vector-machines-and-kernels-for-computational-biology Thanks!
BenHur A, Ong CS, Sonnenburg S, Schölkopf B, Rätsch G (2008) Support Vector Machines and Kernels for Computational Biology.
The widespread adoption of highthroughput sequencing machinery has produced an unprecedented amount of genomic data for biologists to analyse. To fully leverage the potential patterns hidden in the petabytes of DNA and RNA sequence information requires the use of machine learning algorithms and specialised kernels, which ca…
Emily was until recently a Backend Engineer in SoundCloud's Data Team. Before that she was a Haskell developer for a Swedish e-signing startup, and before that she was a developer for a variety of financial institutions. She lives in London, and likes gardening and hanging out with her cat Albertina.
Leslie Lamport. 1998. The Part Time Parliament
The Paxos algorithm is a consensus algorithm that’s considered one of the most important in distributed computing. Cassandra uses it for transactions,
The paper, by Melissa O’Neill, is available here.
I like it because I first read it not knowing Haskell very well and I read through the paper a few times and had a wonderful a-ha moment with a sentence or two that made the idea click and was able to implement it in Python. I have since re-done it in Clojure so you will see it done 3 ways (and hopeful I will share an a-ha moment with you)
"""A much beloved and widely used example showing the elegance and simplicity of lazy functional programming represents itself as "The Sieve of Eratosthenes." The paper shows that this example is not the sieve and presents an implementation that actually is."""
Starting with the classic one-liner sieve
sieve (p : xs) = p : sieve [x | x <− xs, x ‘mod‘ p > 0]
primes = sieve [2..]
O'Neill proceeds to show why this standard rendition of the Sieve of Eratosthenes does not in…
Hagit Attiya and Jennifer L Welch: “Sequential Consistency versus Linearizability,” ACM Transactions on Computer Systems (TOCS), volume 12, number 2, pages 91–122, May 1994.
An often-cited constraint on distributed database design is the CAP theorem, an impossibility result in distributed systems. It states that in a linearizable database, if the network is interrupted, some nodes cannot respond to requests. Although being able to tolerate network faults is important, the performance and response times of a database are often even more important, and CAP says nothing about those. …
Oliver Charles presents Conor McBride's "Kleisli Arrows of Outrageous Fortune".
For a long time, it wasn't clear how to get purely functional programs to actually "do" anything. Finally, a breakthrough came by sharing ideas from category theory to bring monads into functional programming, as a way to model and manage side effects. However, little has been done to ensure that entire interaction sequences make sense. For example, many functional programming languages will allow you to open a file handle, close it, and then try and read from it - clearly a nonsensical sequence of operations! In "Kleisli Arrows of Outrageous Fortune", Conor McBride shows us how we can borrow more ideas from the literature, introducing a new formulation of indexed monads - a structure with close connections to slice categories. Given these new tools, he demonstrates that these monads can correspond to enforcin…
Andy Bennett on "Scalable Atomic Visibility with RAMP Transactions"
Bailis et al formalise "Read Atomic" isolation which is a database isolation level sufficient for many applications which make use of large,distributed databases in real world applications. They then show how to design a database system which provides this isolation level whilst allowing read, write and read/write transactions on data across multiple partitions: partitions which do not need to communicate with each other.This method is both performant and fairly portable to existing architectures and allows database authors to implement much stronger guarantees than are currently available from systems in use at companies such as Facebook, LinkedIn, Yahoo or Amazon.
In this talk we will explore the contributions of this paper and how they can be used solve some of the anomolies encountered in today's multi-node, distributed, database systems and accompanying applications.
In ‘the real world’, nothing is simple and failure is unavoidable. “How Complex Systems Fail” was discovered by Adam when he followed a reference in a database book. It expanded his mind on the similarities between hospitals, nuclear weapons, and databases, and now he’d like to share these with you.
The paper is available at </a><a href="http://web.mit.edu/2.75/resources/random/How%20Complex%20Systems%20Fail.pdf">http://web.mit.edu/2.75/resources/random/How%20Complex%20Systems%20Fail.pdf . Its subtitle, "Being a Short Treatise on the Nature of Failure;
How Failure is Evaluated; How Failure is Attributed to Proximate
Cause; and the Resulting New Understanding of Patient Safety", serves as a brief abstract.
Having started programming at the age of 8, Ada…
Relational programming, or logic programming, is a programming paradigm that exhibits remarkable and powerful properties, to the extent that its implementation seems frightfully daunting to the layman. µKanren is a minimal relational language that seeks to strip the paradigm down to its core, leaving us with a succinct, elegant and above all simple set of primitives on top of which we can rebuild even the most powerful relational constructs.
In this talk, we will explore the µKanren language by implementing it from first principles in a simple functional programming language, going on to demonstrate how you can assemble these simple building blocks into a semblance of its richer parent, miniKanren, and maybe solve a logic puzzle or two to make sure it’s working as advertised.
The µKanren paper, and the original µKanren implementation, were authored by Jason Hemann and Daniel P. Friedman. The paper is available at
Ricardo Monti from Imperial College London will presenting Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers by Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato and Jonathan Eckstein.
Doors open at 6:30 pm; the presentation will begin at 7:00 pm.
We hope that you'll take a look at the paper before the Meetup (and if you don't, no worries).
Many modern statistics and machine learning problems can be recast as optimization problems where we look to minimize a convex objective function.
The resulting objective function enforces a set of desirable properties which we wish our answer to possess: a popular example would be sparsity (i.e., we wish to obtain a sparse answer for reasons of interpretability or for other reasons).
However, it is often the case that minimizing such objective functions is difficult. While there have been…
Ryan Kennedy from Yammer Engineering will presenting Dapper, a Large-Scale Distributed Systems Tracing Infrastructure paper by Benjamin H. Sigelman, Luiz André Barroso, Mike Burrows, Pat Stephenson, Manoj Plakal, Donald Beaver, Saul Jaspan and Chandan Shanbhag.
Doors open at 6:30 pm; the presentation will begin at 7:00 pm
We hope that you'll read the paper before the Meetup (and if you don't, no worries).
Ryan Kennedy is an Infrastructure Engineer at Yammer in way over his head. When he's not busy cobbling together databases from the odds and ends he finds in Maven Central he can often be found scouring the world for inedible nachos.
Equal Rights for Functional Objects, or, The More Things Change, the More they Stay the Same
There are many ways of defining equality, particularly in a language which has both object-oriented and functional aspects. There's reference equality, numerical equality, shallow list equality, deep list equality, string equality, and even "fuzzy" floating-point equality. Which is the right to choose in any given situation? Which should i use for my equals() method on my custom object?
This paper answers all of these questions, by taking a seemingly complex problem and revealing an underlying simplicity. It does this by defining one equality operation that works in any situation.
The paper makes use of lisp syntax, but the ideas are applicable to most modern languages.
Can anyone sponsor pizza and drinks? Makes it much easier for me to host more things at uSwitch if someone else covers the food.…