The Madrid chapter of Papers We Love
Papers We Love has a Code of Conduct. Please contact one of the Meetup's organizers if anyone is not following it. Be good to each other and to the PWL community!
Sign-up: Please RSVP for meetings via Meetup.com
Organizers: Miguel Pastor
Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price. [Access path selection in a relational database management system. SIGMOD, 1979.](https://t.co/KsfkS4evlM)
Databases are one of the backbones of our applications. We use them daily, but we have probably never stopped to analyze how things work under the hood. We take for granted that, given a declarative query (e.g. in SQL), the RDBMS will try to be as efficient as possible. This is what enables data-independent query processing. Back in the day, with systems like [IMS](https://t.co/jm3sxobbup), queries were written against low-level details, so knowledge of the underlying data structures was needed to query efficiently. The presented paper is the foundation of the query optimization field; it decomposes the problem into three distinct subproblems:
Causality is an essential component of how we make sense of the physical world, and of our relations to other humans. If I put a cup on the table, and look back at it, I expect it to be there. I also expect to get a reply to my postcards, after I send them, and not before.
These days hardly any service can claim not to have some form of distributed algorithm at its core. In a distributed scenario, if we are not careful, it is very easy to break the causal sense of things. In a key-value store my writes can be directed to one replica and my subsequent reads served from an outdated one: my cup might not be there when I look back. Message dissemination middleware might not always provide the ordering I expect: I might receive some replies before the questions that prompted them.
Luckily, most of these problems were already there 30 years ago, although on a much smaller scale, and lots of techniques have been developed to keep track of causality and make sense of the compl…
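One widely used way to keep track of causality is a vector clock. The description above does not name specific techniques, so take the following as a minimal sketch of that one idea and not as the content of the talk. The node names (`alice`, `bob`, `carol`) and the function names are made up for illustration.

```python
from typing import Dict

# A clock maps each (hypothetical) node name to the number of events seen from it.
Clock = Dict[str, int]


def tick(clock: Clock, node: str) -> Clock:
    """Return a copy of `clock` with `node`'s own counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def receive(local: Clock, incoming: Clock, node: str) -> Clock:
    """Merge an incoming clock (element-wise max), then count the receive event."""
    merged = {k: max(local.get(k, 0), incoming.get(k, 0))
              for k in set(local) | set(incoming)}
    return tick(merged, node)


def happened_before(a: Clock, b: Clock) -> bool:
    """True if `a` causally precedes `b`: a <= b everywhere and a < b somewhere."""
    keys = set(a) | set(b)
    no_greater = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    strictly_less = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_greater and strictly_less


def concurrent(a: Clock, b: Clock) -> bool:
    """True if the clocks differ and neither causally precedes the other."""
    keys = set(a) | set(b)
    differ = any(a.get(k, 0) != b.get(k, 0) for k in keys)
    return differ and not happened_before(a, b) and not happened_before(b, a)


if __name__ == "__main__":
    question = tick({}, "alice")                # alice sends a question
    bob_state = receive({}, question, "bob")    # bob receives it
    reply = tick(bob_state, "bob")              # bob sends a reply
    unrelated = tick({}, "carol")               # carol acts independently
    print(happened_before(question, reply))     # the question precedes its reply
    print(concurrent(question, unrelated))      # no causal link either way
```

With clocks like these, a receiver can tell that a reply depends on a question it has not seen yet and hold the reply back until the question arrives.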
We're really glad to have Christopher Meiklejohn (https://twitter.com/cmeik) with us.
Chris will be talking about HyParView: a membership protocol for reliable gossip-based broadcast
Gossip, or epidemic, protocols have emerged as a powerful strategy to implement highly scalable and resilient reliable broadcast primitives. Due to scalability reasons, each participant in a gossip protocol maintains a partial view of the system. The reliability of the gossip protocol depends upon some critical properties of these views, such as degree distribution and clustering coefficient. Several algorithms have been proposed to maintain partial views for gossip protocols. In this paper, we show that under a high number of faults, these algorithms take a long time to restore the desirable view properties. To address this problem, we present HyParView, a new membership proto…
While I was working on the presentation for Plumtree I realized that the gossip introduction is itself quite long, so I'm going to split the talk in two. In this first one I will introduce Gossip protocols and I will go over these papers:
• A. Demers, D. Greene, C. Hauser, W. Irish, J. Larson, S. Shenker, H. Sturgis, D. Swinehart, and D. Terry. “Epidemic Algorithms for Replicated Database Maintenance.” In Proc. Sixth Symp. on Principles of Distributed Computing, pp. 1–12, Aug. 1987. ACM.
• Ken Birman. “The Promise, and Limitations, of Gossip Protocols.” SIGOPS Oper. Syst. Rev., 41(5):8–13, October 2007.
• Márk Jelasity. “Gossip-based Protocols for Large-scale Distributed Systems.” 2013.
• M. Jelasity, R. Guerraoui, A.-M. Kermarrec, and M. van Steen. “The Peer Sampling Service: Experimental Evaluation of Unstructured Gossip-based Implementations.” In Middleware 2004, H.-A. Jacobsen, Ed. Lecture Notes in Computer Science, vol. 3231. Springer-Verlag, pp. 79–98, 2004.
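To give a feel for the epidemic idea before the talk, here is a toy push-gossip simulation. It is a sketch under simplifying assumptions (static random partial views, synchronous rounds, no failures), not HyParView, Plumtree, or any specific algorithm from the papers above; the function names and parameters are invented for the example.

```python
import random
from typing import Dict, List


def build_views(n: int, view_size: int, rng: random.Random) -> Dict[int, List[int]]:
    """Give every node a random partial view: a few peers, never itself."""
    views = {}
    for node in range(n):
        peers = [p for p in range(n) if p != node]
        views[node] = rng.sample(peers, min(view_size, len(peers)))
    return views


def push_gossip(views: Dict[int, List[int]], source: int, fanout: int,
                rng: random.Random, max_rounds: int = 100) -> List[int]:
    """Spread one message from `source`; return the informed count per round."""
    informed = {source}
    history = [len(informed)]
    for _ in range(max_rounds):
        newly_informed = set()
        # Every informed node pushes the message to `fanout` peers from its view.
        for node in sorted(informed):
            view = views[node]
            for peer in rng.sample(view, min(fanout, len(view))):
                newly_informed.add(peer)
        informed |= newly_informed
        history.append(len(informed))
        if len(informed) == len(views):
            break  # everyone has the message
    return history


if __name__ == "__main__":
    rng = random.Random(42)
    views = build_views(n=50, view_size=5, rng=rng)
    print(push_gossip(views, source=0, fanout=2, rng=rng))
```

Because the views are random and fixed, some nodes may never be reached; keeping views well connected, especially after failures, is the job of a membership protocol.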
The main problem when automatically analysing source code is finding a meaningful representation that captures the specific properties of programming languages. While there are widely adopted techniques for building “word embeddings” for natural language (like Word2vec), these are not applicable to source code, whose syntax and vocabulary are simpler.
This paper presents a model for learning “code embeddings”, relying on the syntactic structure of the code, expressed as Abstract Syntax Trees, and how to use those embeddings for code classification tasks.
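To make "syntactic structure expressed as Abstract Syntax Trees" concrete, the sketch below uses Python's standard `ast` module to list the node types in a snippet and the parent/child pairs that connect them. It is not the paper's model and learns nothing; it only shows the kind of tree data such a model could start from. The choice of Python as the analysed language and the two function names are assumptions made for the example.

```python
import ast
from collections import Counter
from typing import List, Tuple


def node_type_counts(source: str) -> Counter:
    """Count how often each AST node type appears in `source`."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def parent_child_pairs(source: str) -> List[Tuple[str, str]]:
    """List (parent type, child type) pairs for every edge of the AST."""
    tree = ast.parse(source)
    pairs = []
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            pairs.append((type(parent).__name__, type(child).__name__))
    return pairs


if __name__ == "__main__":
    snippet = "x = 1 + 2"
    print(node_type_counts(snippet))
    for parent, child in parent_child_pairs(snippet):
        print(parent, "->", child)
```

The vocabulary here is the small, fixed set of node types, not arbitrary words, which is one reason tree-based approaches treat code differently from natural language text.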
Paper is available here: http://arxiv.org/abs/1409.3358
The talk will be in English…
We will talk about the paper "Paths, Trees, and Flowers" by Jack Edmonds.
This paper is seminal in graph theory. It describes the blossom algorithm for maximum matching, which finds the largest possible set of edges in a graph such that no two of them share a vertex.
It is also the basis of the well-known minimum-weight maximum matching algorithm, which is used in many applications, from pairing players in chess tournaments to assigning resources or workers within companies.…
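To pin down what "maximum matching" means, here is a brute-force search that tries every subset of edges. To be clear, this is not Edmonds' blossom algorithm: it takes exponential time and is only usable on tiny graphs, whereas the point of the paper is a polynomial-time method. The function name and the example graph are invented for illustration.

```python
from typing import FrozenSet, List, Tuple

Edge = Tuple[int, int]


def maximum_matching(edges: List[Edge]) -> List[Edge]:
    """Exhaustively find a largest set of edges that share no vertices."""

    def search(i: int, used: FrozenSet[int]) -> List[Edge]:
        if i == len(edges):
            return []
        # Option 1: leave edge i out of the matching.
        best = search(i + 1, used)
        u, v = edges[i]
        # Option 2: take edge i, if both of its endpoints are still free.
        if u != v and u not in used and v not in used:
            candidate = [edges[i]] + search(i + 1, used | {u, v})
            if len(candidate) > len(best):
                best = candidate
        return best

    return search(0, frozenset())


if __name__ == "__main__":
    # An odd cycle 0-1-2-3-4-0 with an extra edge 4-5 hanging off it.
    graph = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (4, 5)]
    print(maximum_matching(graph))
```

Odd cycles like the one in this example are what make general graphs harder than bipartite ones, and they are the "flowers" and "blossoms" the paper's title refers to.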
This time we are lucky to have Alberto Cortés, who will talk about his experience with Diff/Blame and how they are building their own version in Go.
Diff is a venerable algorithm, developed in the early 1970s for Unix; it is basically a Levenshtein distance, but oriented to lines instead of characters. Git-blame uses diff in a rather interesting way to detect which commit was the last one to modify each line of a file.
During the talk we will go over how both algorithms, diff and blame, work in an intuitive way, and Alberto will tell us about the papers and resources he has used in his development.
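As a warm-up, the two ideas can be sketched in a few lines. The code below is in Python for brevity (Alberto's implementation is in Go) and is a simplification: `line_diff` uses a textbook longest-common-subsequence table, not the optimized algorithms real diff tools use, and `blame` walks history forwards from the first revision, whereas git's blame works backwards from the revision you ask about. All names and the sample revisions are made up.

```python
from typing import List, Tuple


def line_diff(old: List[str], new: List[str]) -> List[Tuple[str, str]]:
    """Return (op, line) pairs: ' ' kept, '-' removed from old, '+' added in new."""
    n, m = len(old), len(new)
    # lcs[i][j] = length of the longest common subsequence of old[i:] and new[j:].
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old[i] == new[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])
    ops = []
    i = j = 0
    while i < n and j < m:
        if old[i] == new[j]:
            ops.append((" ", old[i]))
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            ops.append(("-", old[i]))
            i += 1
        else:
            ops.append(("+", new[j]))
            j += 1
    ops.extend(("-", line) for line in old[i:])
    ops.extend(("+", line) for line in new[j:])
    return ops


def blame(revisions: List[Tuple[str, List[str]]]) -> List[Tuple[str, str]]:
    """For the last revision, return (commit, line): who last touched each line."""
    attributed: List[Tuple[str, str]] = []
    for commit, lines in revisions:
        previous = [line for _, line in attributed]
        updated = []
        k = 0  # position in the previous revision
        for op, line in line_diff(previous, lines):
            if op == " ":
                updated.append(attributed[k])  # unchanged: keep the older commit
                k += 1
            elif op == "-":
                k += 1                         # removed: drop its attribution
            else:
                updated.append((commit, line)) # added: this commit owns the line
        attributed = updated
    return attributed


if __name__ == "__main__":
    history = [("c1", ["a", "b", "c"]), ("c2", ["a", "x", "c"])]
    for commit, line in blame(history):
        print(commit, line)
```

The number of `-` and `+` entries in a diff is the line-level edit distance when only insertions and deletions are allowed, which is the sense in which diff resembles a Levenshtein distance.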
References for some of the papers used:
We have the great pleasure of having Álvaro Videla as the first speaker of this first edition of Papers We Love Madrid.
As soon as Álvaro sends me a short summary, I will update this section.