Pero Subasic
Chief Architect of R&D at AOL

Pero works on research and development of new technologies at AOL Advertising R&D in Palo Alto. For the past 5 years he has been the Chief Architect of the R&D distributed ecosystem, which comprises more than a thousand nodes in multiple data centers. He has led large-scale contextual analysis, segmentation and machine learning efforts at AOL, Yahoo and Cadence Design Systems, and has published patents and research papers in these areas. His recent interests are data-parallel frameworks and cloud computing for large-scale data analytics.

Pero Subasic is Giving the Following Talks
Building data-parallel pipelines in Erlang

Data-parallel processing frameworks are being introduced at a fast pace, and Erlang seems particularly well suited to soft real-time applications that combine a high level of data parallelism and processing concurrency with a need for reliability. With the larger memory and multi-core capabilities of modern processors and the introduction of high-performance persistent storage, Erlang covers an increasing portion of the response time versus data volume chart and complements existing large-scale analytics platforms such as Hadoop very well.

This talk makes the case for building soft real-time, scalable data-parallel processing pipelines in Erlang.

We present an architecture and a simple specification language for building data-parallel flows in Erlang, and share use cases covering data-parallel methods such as map-reduce and iterative graph algorithms to illustrate the flexibility of the proposed approach. We also discuss other important elements of the architecture: capacity planning for typical use cases, the relationship with other ecosystem components, instrumentation and monitoring, scheduling, replication and failover.

Talk objectives: Introduce the general area of data-parallel processing. Make the case for using Erlang for data-parallel processing. Share the proposed architecture for building data-parallel pipelines in Erlang. Share use cases and lessons learned, and solicit feedback on the proposed architecture.

Target audience: System and infrastructure architects, large-scale data architects and scientists, CTOs, CIOs.