Multipathing lets you configure multiple paths from servers to storage arrays. It provides I/O failover and load balancing. Linux uses the device mapper kernel framework to support multipathing. In this post I will explain the steps taken to troubleshoot a multipath…
One of the common tasks in data processing is calculating the number of days between two given dates. You can easily do this with the Hive DATEDIFF function. You can also get the weekday number by using this more obscure…
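To make the date arithmetic concrete, here is a minimal Python sketch of the same calculation. The function name `days_between` is hypothetical, and the end-date-first argument order is chosen to mirror how Hive's DATEDIFF is normally called; treat this as an illustration of the idea, not as Hive's implementation.

```python
from datetime import date


def days_between(end_date: str, start_date: str) -> int:
    """Return the number of whole days from start_date to end_date.

    Both arguments are ISO 'YYYY-MM-DD' strings. The result is negative
    when end_date falls before start_date.
    """
    end = date.fromisoformat(end_date)
    start = date.fromisoformat(start_date)
    return (end - start).days


if __name__ == "__main__":
    # February 2013 has 28 days, so this prints 28.
    print(days_between("2013-03-01", "2013-02-01"))
```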
It’s been 4 years since I last spoke at the Postgres Brazilian community event (the last time was at PGCon 2009 São Paulo, on the PyReplica project), and it seems that the community is still growing and vibrant. The…
Introduction: Disk I/O is frequently the performance bottleneck with relational databases. With AWS recently releasing 4,000 PIOPS EBS volumes, I wanted to do some benchmarking with pgbench and PostgreSQL 9.2. Prior to this release the maximum available I/O capacity was…
What is Ansible? Ansible is a configuration management and deployment system, like Puppet, Capistrano, Fabric, and Chef. Its aim is to be radically simple and let you use your existing scripts to help with cluster configuration and software deployment whenever possible….
When I heard about this project a year ago, I was really excited about it. Many cluster-wide projects based on Postgres were developed very slowly and built on older (e.g. Postgres-R https://www.postgres-r.org/) or proprietary (e.g. Greenplum) versions. The features that this…
This essential tool for Postgres architectures is continually improving, and new releases are now available. Both are bugfix versions. For those unfamiliar with the tool, it is middleware that can act as a load balancer, pooler* and/or replication…
We, like other Postgres DBAs worldwide, have been waiting for the 9.2 release for some time, specifically for the index-only scan feature, which will help reduce I/O by preventing unnecessary access to heap data if you only need data from…
Imagine the following scenario: you have a table with a small set of records, with columns of the tsvector and text data types. But the text fields hold almost 20 megabytes of text or more. So, what happens? The Postgres planner checks the…
“The pg_trgm module provides functions and operators for determining the similarity of ASCII alphanumeric text based on trigram matching, as well as index operator classes that support fast searching for similar strings.” This is the introduction to the official documentation…
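As a rough illustration of what "trigram matching" means, here is a small Python sketch. It assumes the convention the pg_trgm documentation describes (text is lowercased, split into alphanumeric words, and each word is padded with two leading spaces and one trailing space before its three-character substrings are taken), and it scores similarity as shared trigrams divided by total distinct trigrams. The names `trigrams` and `similarity` are used here for illustration only; this is not the module's actual code.

```python
import re


def trigrams(text: str) -> set[str]:
    """Collect the set of trigrams for every alphanumeric word in text."""
    grams: set[str] = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # Assumed padding convention: two spaces before, one after.
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    # 'word' has 5 trigrams, 'two words' has 10, and they share 4,
    # so the score is 4 / 11.
    print(round(similarity("word", "two words"), 4))
```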