Futurakka Volume II

In the previous post, we reviewed the features of Futures in Scala and the most important methods in the API. In this post, we’ll review another interesting API for handling asynchrony in Scala alongside Futures. Promises provide an alternative way to create Future instances. A Promise has to be fulfilled […]

Futurakka, Volume I

The objective of this post is to explain the nuts and bolts of dealing with Futures in Scala and Akka. Futures allow you to perform many operations in parallel in an efficient and non-blocking way, but dealing with multiple operations can be a real headache. In this post, we’ll review the main APIs and we’ll see […]

ELASTICSEARCH UNIVERSE: FLYING OVER THE DATA-I

Hi again, all! This is the third tech paper in this series dedicated to optimizing an Elasticsearch cluster. Here you can find out how to configure your searches to make them faster and more efficient. Avoid big HTTP requests: in search tasks, Elasticsearch needs to fetch the ID of all the documents, this […]

GOOGLE DATAFLOW AND APACHE BEAM (II)

What is Apache Beam? Apache Beam is an open source, unified programming model that defines and executes data processing pipelines. These pipelines can be either batch or streaming. It is exposed via several SDKs that let you execute a pipeline on different processing engines, known as runners. The supported runners so far are: Apache Spark, Apache […]

Datio Advent Calendar 2017

Over the last twelve months, we have been trying to enrich the knowledge of big data developer communities through our blog posts. That is why, to celebrate the end of 2017, we offer our own Christmas tree, built like an advent calendar made up of 24 articles and useful multimedia resources devoted to corporate and […]

Google Dataflow And Apache Beam (I)

A bit of context first. As some of you may know, in 2004 Google released the MapReduce paper, which became the cornerstone of a whole new set of open source technologies that make up the big data ecosystem as we know it (Hadoop, Pig, Hive, Spark, Kafka, etc.). Meanwhile, Google followed its own path by developing other […]