Dumpster Dive your Erlang Data!

Tom Szilagyi
Erlang hacker, Klarna engineer

Today's information processing and database systems are more capable than ever. Nowhere is this more apparent than in the world of NoSQL databases, now mainstream in enterprise software architecture. With such a dynamic store of data, business exploration and organic growth happen naturally and in production, thanks to the absence of an always-enforced, rigid schema.

But hang on... do you know the data in your system?

Large software systems are constantly evolving (and we all love hot code updates). However, data once produced might stay there forever. Not surprisingly, accessing data written by an older software version may lead to unpleasant discoveries. You might want to check your assumptions before pushing new code to production... or you might want to validate the data, cleaning up artifacts of an old bug that was fixed long ago. Maybe you are just curious and would like to learn more about your data.

In this talk, I will introduce and showcase dumpsterl, an open-source tool that addresses these challenges. Dumpsterl scrutinizes the contents of an entire table, a single column, or an arbitrary stream of Erlang terms. It goes through the values (or a random subset of them) and builds comprehensive metadata, essentially a specification of the data encountered. While doing this, it collects representative samples, optionally annotated with key and timestamp, to support further probing. Dumpsterl is flexible and easily extensible, so you can feed it data from virtually any source.

Talk objectives:

  • Highlight certain challenges faced by the databases of real-life, complex, constantly evolving production systems.

  • Introduce dumpsterl, a data inspection tool useful in such circumstances.

Target audience:

  • Programmers from the trenches; DevOps people; data geeks and dumpster divers.

PROJECT WEBPAGE: http://tomszilagyi.github.io/dumpsterl

VIDEO: We cannot publish the video of Tom's talk for technical reasons, for which we apologise.

Tom started working with Erlang professionally in 2006, shortly after earning his M.Sc. in Electrical Engineering. Shuffling back and forth between industry, R&D and consulting jobs for the last ten years, he has applied the lessons of functional programming in a variety of scenarios. He likes to "do stuff" and come up with practical solutions to real problems.


GitHub: tomszilagyi
