At Pythian, the DevOps, Big Data, and Data Science teams use Slack as our IM system. We’re a diverse group drawn to a wide range of technologies, so there’s some interesting and valuable chatter about what folks are reading. Here are some subjects that came across our channels over the past couple of weeks:
DevOps
Pythian engineers support Solr and Elasticsearch for a number of clients. Here’s a great summary of the important tunables in Elasticsearch:
https://tech.scrunch.com/blog/lessons-learned-from-a-year-of-running-elasticsearch-in-production/
We are often called into companies to help manage deployment infrastructure, and it is quite common to encounter an unwieldy monolithic application that has been cobbled together over the years. Our head of DevOps pointed out this book as a great resource for strategies to decompose such monoliths into microservices:
https://www.amazon.com/Microservices-Patterns-Applications-Designing-fine-grained/dp/069242427X
Big Data
As Data Lake architectures mature, we’re seeing more comprehensive offerings from vendors. This blog post covers Microsoft’s offerings. I like the integration of Active Directory for strong security and the U-SQL approach to pulling data from data lakes. While I’m not a huge fan of C#, the concept of having a library of extractors and outputters is a nice nuts-and-bolts approach:
https://tomkerkhove.ghost.io/2015/10/22/exploring-azures-data-lake/
Another thread from the Data Lakes discussion highlights the critical importance of Data Governance. Waterline’s Data Inventory tool is a strong player for metadata and governance automation:
https://blog.waterlinedata.com/blog/the-d-artagnan-of-hadoop-spoiler-alert-data-governance-for-hadoop
Data Science
Facebook released MemNN, its implementation of memory networks (a type of deep learning neural net), last year. The Data Science team has been spending some time with it as they evaluate and build AI tools:
https://github.com/facebook/MemNN
The team has also been using some great Java tools from Stanford for natural language processing:
https://stanfordnlp.github.io/CoreNLP/index.html