ELASTICSEARCH UNIVERSE: FLYING OVER THE DATA-I

Hi all again! This is the third tech paper in this series dedicated to optimizing an Elasticsearch cluster. Here you can find out how to configure your searches to make them faster and more efficient. Avoid big HTTP requests: in search tasks, Elasticsearch needs to fetch the ID of all the documents, this […]

GOOGLE DATAFLOW AND APACHE BEAM (II)

What is Apache Beam? Apache Beam is an open source, unified programming model that defines and executes data processing pipelines. These pipelines can be either batch or streaming. It is exposed via several SDKs that let you execute a pipeline on different processing engines, a.k.a. runners. The supported runners so far are: Apache Spark, Apache […]

Datio Advent Calendar 2017

Over the last twelve months, through our blog posts, we have been trying to enrich the knowledge of big data developer communities. This is why, to celebrate the end of 2017, we offer our own Christmas tree, built like an advent calendar made up of 24 articles and useful multimedia resources devoted to corporate and […]

Google Dataflow And Apache Beam (I)

A bit of context first. As some of you may know, in 2004 Google released the MapReduce paper that became the cornerstone of a whole new set of open source technologies composing the big data ecosystem as we know it (Hadoop, Pig, Hive, Spark, Kafka, etc.). Meanwhile, Google followed its own path by developing other […]

Security in the upside down

Almost every day we check our bank accounts from our smartphones, buy products on Amazon, send private messages to our family and colleagues, and so on. We do this because we know that it’s safe, that no one can steal our credit card number or read our messages. Information security is the discipline […]