Building real time data pipelines with Kafka Streams I

Many interesting projects have been developed in the stream processing field in recent years. Most of us could name open source projects (Apache Spark, Apache Storm, Apache Flink) or proprietary services (Google Dataflow or AWS Lambda) that are well-known solutions for real-time scenarios. However, the objective of this post is to present Kafka […]

GOOGLE DATAFLOW AND APACHE BEAM (II)

What is Apache Beam? Apache Beam is an open source, unified programming model that defines and executes data processing pipelines. These pipelines can be either batch or streaming. It is exposed through several SDKs that let you execute a pipeline on different processing engines, known as runners. The supported runners so far are: Apache Spark, Apache […]

Datio Advent Calendar 2017

Over the last twelve months, we have been trying to enrich the knowledge of big data developer communities through our blog posts. This is why, to celebrate the end of 2017, we offer our own Christmas tree, built like an advent calendar made up of 24 articles and useful multimedia resources devoted to corporate and […]

Improving VM performance in OpenStack: NUMA and CPU Pinning

Today we are going to see how to improve the performance of a VM running in OpenStack. Memory has a large impact on the performance of a workload. This is especially true when the workload runs on a VM, so it is necessary to be careful with memory and NUMA if the machine supports it. But wait! What is NUMA? […]

Most popular Datio’s posts

It’s been eight months since we started this blog and we have loved every minute of it. It’s been a privilege to publish 24 posts by our colleagues on a range of topics, both corporate and technical. Here are the top 5 most viewed blog posts. Women at Datio – Raquel Asenjo. Docker in Your Production […]

Chaos Engineering and Mesos

Modern applications are distributed by default. They are composed of a set of services that have to communicate with each other (a back-end and a database, for example). Distributed computing is hard, and when these applications are deployed in the cloud it’s even harder. In fact, the number of variables that can produce […]

Building a Docker Container Orchestrator with Akka

Nowadays, Docker containers have become the core of service-oriented architectures (microservices). This approach creates a need for distributed applications that can run these containers at scale, such as Kubernetes or Docker Swarm. Maybe one day you wonder how difficult it would be to build such a system, or even how it can […]

Monitoring Mesos Resource Offers and Tasks

WHAT IS DRF AND HOW DOES IT WORK? As we said in the previous post, Mesos provides two-level resource scheduling: the first level happens at the Mesos master, which is responsible for deciding what resources are offered to each framework and when; the second level happens at the framework’s scheduler, which is responsible for […]