Data Architecture & Engineering

Getting the data right is an investment in your business.

Overview


Data Architecture and Data Engineering together form the second stage of the data management process, and they are like two sides of the same coin. This stage sits in the middle of the data ecosystem, between Data Strategy and Technology Product Development. Both are needed to get your data ecosystem set up and flowing seamlessly. If you do this right, you get better data products for internal users or for your customers. Do it wrong and your data products won't perform the way you want them to.

Details


These are some of the most common deliverables of the middle stage of the data ecosystem in an organization that takes data seriously. Many of these take advantage of Automation and are run on a schedule to provide Integration of systems.

Data Model

Data modeling moves from business requirements to technical implementation through three levels: Conceptual, Logical and Physical. Good database design relies upon this.
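As a rough illustration of moving from the Logical level to the Physical level, the sketch below describes an entity as a Python dataclass and derives a table definition from it. The `Customer` entity, the `SQL_TYPES` mapping and the `create_table_sql` helper are hypothetical names chosen for this example, and SQLite column types are assumed as the physical target.

```python
from dataclasses import dataclass, fields

# Logical level: an entity and its attributes, independent of any database engine.
# (Hypothetical example entity.)
@dataclass
class Customer:
    customer_id: int
    name: str
    email: str

# Physical level: map logical types to concrete column types (SQLite assumed here).
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}

def create_table_sql(entity, primary_key):
    """Build a CREATE TABLE statement from a dataclass definition."""
    columns = []
    for f in fields(entity):
        column = f"{f.name} {SQL_TYPES[f.type]}"
        if f.name == primary_key:
            column += " PRIMARY KEY"
        columns.append(column)
    return f"CREATE TABLE {entity.__name__.lower()} ({', '.join(columns)})"

ddl = create_table_sql(Customer, primary_key="customer_id")
print(ddl)
```

The Conceptual level would sit above this, as a plain statement such as "a customer has a name and an email address", agreed with the business before any types are chosen.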

Database Design

Databases are built to power a given data product. Performance and scalability are addressed with techniques like normalization, indexing, views and the correct data types for the use case.
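The techniques named above can be sketched with Python's built-in `sqlite3` module. The table, index and view names below are made up for illustration, and an in-memory SQLite database stands in for whatever engine a real product would use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalization: customer details live in one table, orders in another.
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id     INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customer(customer_id),
    order_date   TEXT NOT NULL,     -- ISO 8601 date string
    amount_cents INTEGER NOT NULL   -- integer cents avoid float rounding
);
-- Indexing: speeds up lookups of orders by customer.
CREATE INDEX idx_order_customer ON customer_order(customer_id);
-- View: a reusable query that data products can read like a table.
CREATE VIEW customer_totals AS
    SELECT c.customer_id, c.name, SUM(o.amount_cents) AS total_cents
    FROM customer c
    JOIN customer_order o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name;
""")

conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
conn.executemany(
    "INSERT INTO customer_order VALUES (?, ?, ?, ?)",
    [(1, 1, "2024-01-05", 1250), (2, 1, "2024-02-10", 750)],
)
totals = conn.execute("SELECT name, total_cents FROM customer_totals").fetchall()
print(totals)
```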

Data Pipeline

Data pipelines are automated programs that take data from a myriad of sources and deliver it to a destination, processing it along the way using a process called ETL (Extract, Transform, Load).
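A minimal sketch of the three ETL steps, using only the Python standard library. The sample CSV text, the `product_price` table and the function names are assumptions made for this example. A real pipeline would read from actual source systems and run on a schedule.

```python
import csv
import io
import sqlite3

# Made-up sample source data: one row has padding, one has no price.
RAW_CSV = "sku,price\nA1, 9.99 \nB2,\nC3,4.50\n"

def extract(text):
    """Extract: read rows from the source (CSV text here)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim whitespace, skip rows without a price, convert to cents."""
    cleaned = []
    for row in rows:
        price = row["price"].strip()
        if not price:
            continue
        cleaned.append((row["sku"].strip(), round(float(price) * 100)))
    return cleaned

def load(rows, conn):
    """Load: write the processed rows to the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS product_price (sku TEXT PRIMARY KEY, price_cents INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO product_price VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute("SELECT sku, price_cents FROM product_price ORDER BY sku").fetchall()
print(loaded)
```

Using `INSERT OR REPLACE` keeps the load step safe to re-run, which matters when a pipeline runs repeatedly on a schedule.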

API

An API, quite often used in conjunction with Data Pipelines, allows integration between systems by sharing data with a third-party service or vendor once properly authenticated.
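As a hedged illustration of an authenticated API call, the sketch below builds a request with a bearer token using Python's `urllib.request`. The base URL, the `/customers` path, the token and the `build_request` helper are all placeholders. A real vendor's documentation defines the actual endpoints and authentication scheme. The request is only constructed here, not sent.

```python
import json
import urllib.request

def build_request(base_url, token, payload):
    """Build an authenticated JSON POST request (hypothetical endpoint)."""
    return urllib.request.Request(
        url=f"{base_url}/customers",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",   # assumed bearer-token scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_request("https://api.example.com/v1", "YOUR_TOKEN", {"name": "Acme"})

# Sending it would look like this (left commented out because the URL is a placeholder):
# with urllib.request.urlopen(request, timeout=10) as response:
#     result = json.load(response)
```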

Data Cleaning

Data cleaning ensures you meet quality standards, deals with missing values and makes sure you don't have 28 variations of the same item, all meaning the same thing.
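One simple way to collapse variations of the same item and handle missing values is a normalize-then-lookup step, sketched below. The `CANONICAL` mapping, the sample values and the `"Unknown"` default are assumptions for illustration. Real cleaning rules come from your own quality standards.

```python
import re

# Hypothetical mapping from normalized spellings to one canonical value.
CANONICAL = {
    "ibm": "IBM",
    "international business machines": "IBM",
}

def normalize(value):
    """Lowercase, drop punctuation and collapse spaces; return None if nothing is left."""
    if value is None:
        return None
    key = re.sub(r"[^a-z0-9 ]", "", value.lower())
    key = re.sub(r"\s+", " ", key).strip()
    return key or None

def clean(values, default="Unknown"):
    """Replace known variations with the canonical value and fill missing values."""
    result = []
    for value in values:
        key = normalize(value)
        if key is None:
            result.append(default)
        else:
            result.append(CANONICAL.get(key, value.strip()))
    return result

raw = ["IBM", "I.B.M.", " ibm ", "International Business Machines", None, ""]
cleaned = clean(raw)
print(cleaned)
```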

Data Enrichment

Data enrichment takes your existing data and appends data from a third-party source to give it more detail and increase its value.
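In code, enrichment often amounts to a lookup and merge on a shared key. The sketch below assumes the third-party data has already been fetched into a dictionary keyed by postal code. The records, field names and `enrich` helper are illustrative only.

```python
# Existing internal records (sample data for illustration).
customers = [
    {"customer_id": 1, "postal_code": "10001"},
    {"customer_id": 2, "postal_code": "99999"},
]

# Stand-in for data supplied by a third-party source, keyed by postal code.
third_party = {
    "10001": {"city": "New York", "state": "NY"},
}

def enrich(records, lookup, key):
    """Append third-party attributes to each record when a match exists."""
    enriched = []
    for record in records:
        extra = lookup.get(record[key], {})
        enriched.append({**record, **extra})
    return enriched

enriched = enrich(customers, third_party, key="postal_code")
print(enriched)
```

Records with no match are passed through unchanged, so enrichment never drops existing data.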

Master Data

Master Data deals with the 4 most common types of data in your organization: Customers, Team Members, Products and Locations. It manages that data in a centralized way because it is used in a variety of products.
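A very small sketch of centralized management: one store holds a single record per entity, and updates from different systems are merged into it. The `MasterDataStore` class, the domain names and the sample values are hypothetical. Real master data tooling adds matching, validation and governance on top.

```python
from dataclasses import dataclass, field

@dataclass
class MasterDataStore:
    """Holds one centralized record per (domain, key) pair."""
    records: dict = field(default_factory=dict)

    def upsert(self, domain, key, attributes):
        """Merge non-missing attributes into the single record for this entity."""
        record = self.records.setdefault((domain, key), {})
        record.update({k: v for k, v in attributes.items() if v is not None})
        return record

store = MasterDataStore()
# Two systems contribute to the same customer record.
store.upsert("customer", "C-001", {"name": "Acme", "phone": None})
store.upsert("customer", "C-001", {"phone": "555-0100"})
store.upsert("location", "L-001", {"city": "Springfield"})

golden = store.records[("customer", "C-001")]
print(golden)
```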

Hub

A hub is a data distribution method, typically for Master Data, that sits in the middle and has spokes going out from the center. Technology products sit at the ends of the spokes and receive the data to use in the product.
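The hub-and-spoke idea can be sketched as a small publish step that delivers each record to every connected product. The `Hub` class and the two in-memory "inboxes" standing in for products are assumptions for this example. In practice the spokes would be queues, API calls or database loads.

```python
class Hub:
    """Central distribution point; each connected spoke is a delivery function."""

    def __init__(self):
        self.spokes = []

    def connect(self, deliver):
        """Attach a product at the end of a new spoke."""
        self.spokes.append(deliver)

    def publish(self, record):
        """Send a copy of the record down every spoke."""
        for deliver in self.spokes:
            deliver(dict(record))  # each product gets its own copy

# Two hypothetical products receiving master data.
crm_inbox, billing_inbox = [], []

hub = Hub()
hub.connect(crm_inbox.append)
hub.connect(billing_inbox.append)
hub.publish({"customer_id": "C-001", "name": "Acme"})

print(crm_inbox, billing_inbox)
```

Because every product receives its data from the hub rather than from each other, adding a new product means adding one spoke instead of a connection to every existing system.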