[1/23] Foreword: A Cost-Effective Unified Data Platform for Small Organizations

By Eric Burt on November 10, 2020 in Data Engineering, Data Warehouse

Starting this week, I will be releasing a 23-part series on creating a cost-effective, modern, unified data platform, based on a whitepaper that I recently wrote.

There is plenty of literature on enterprise-level data architectures. However, I have found little material on building a modern data stack, including high-performance ETL pipelines and a data warehouse cluster, for small and medium businesses on tight budgets, especially nonprofits.

The architecture I propose is cost-efficient and will fit into even the tightest of budgets. At the same time, it is scalable, highly available, durable, and performant.

This is done by utilizing open-source technologies such as the Hadoop ecosystem (HDFS, Hive), Spark, Docker, Kubernetes, and Apache Airflow (for ETL). It is built upon Amazon Web Services, leveraging EC2, EKS, Redshift, S3, RDS, Lambda, CloudFormation, CodePipeline, CodeBuild, Elasticsearch, ECR, and VPC.
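To make that stack a little more concrete before the series starts, here is a minimal sketch of the kind of Airflow DAG the later posts build toward: a daily job that runs a Spark transform and then loads the output from S3 into Redshift. The DAG id, script path, bucket name, table name, and the REDSHIFT_URL / REDSHIFT_IAM_ROLE environment variables are placeholders for illustration, not the pipeline from the whitepaper.

```python
# A minimal sketch of a daily Airflow ETL DAG (Airflow 1.10-style imports).
# All names below (DAG id, paths, bucket, table, env vars) are illustrative.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    "owner": "data-platform",
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="example_daily_etl",
    default_args=default_args,
    start_date=datetime(2020, 11, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Run the Spark job that transforms the raw data for the execution date.
    transform = BashOperator(
        task_id="spark_transform",
        bash_command="spark-submit --master yarn jobs/transform.py --date {{ ds }}",
    )

    # Load the transformed Parquet output from S3 into Redshift via COPY.
    load = BashOperator(
        task_id="redshift_copy",
        bash_command=(
            'psql "$REDSHIFT_URL" -c '
            "\"COPY analytics.events FROM 's3://example-bucket/events/{{ ds }}/' "
            "IAM_ROLE '$REDSHIFT_IAM_ROLE' FORMAT AS PARQUET;\""
        ),
    )

    # Run the transform first, then the warehouse load.
    transform >> load
```

Later posts in the series cover how the pieces around this (Spark on Kubernetes, Redshift, and the AWS deployment tooling) fit together.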

______________________________________________________________________________

This post is part of a 23-part mini-series about implementing a cost-effective, modern data infrastructure for a small organization. It is a small part of a whitepaper that will be released at the end of the series.
