on July 26, 2023
Twenty years ago, database administrators (DBAs) were the primary career path when it came to specializing in data management.
Much has changed: development patterns transformed from Waterfall to Agile, DevOps drives automation and shared ownership of code, and cloud services have made many more kinds of PAAS databases, data lakes, and data lakehouses available to organizations of all sizes.
These changes have introduced new and varied career paths for data folks which have different emphases on skill sets. In this post, I talk through the commonalities and differences between DBAs, Database Reliability Engineers (DBREs), and Data Engineers (DEs). Whether you’re a hiring manager or data professional, it’s worth knowing about these roles.
Why does it matter?
We all know that job titles can be meaningless – I once was a DBA with the job title of “Software Development Engineer, Testing,” because of an HR categorization issue. (Did I develop software? No. Did I write tests? Also no.)
However, I think these concepts are useful to think about for a few reasons:
- Defining yourself and planning your career using a common language can be useful in broadening your horizons.
- When writing job descriptions, these terms are useful keywords to get the job in front of the best candidates.
- For job candidates, understanding the breadth of these disciplines is useful when reviewing opportunities. It’s a red flag when any position requires an individual who covers all of the roles that I describe here with significant breadth. While it’s certainly possible to dip your toe into multiple areas, covering all three of these specialties deeply is unmanageable.
The trash can explanation
I asked folks on social media what they think about these three roles. One of my favorite responses came from Erik Darling:
Maybe it’s just that my first DBA job came with the team motto of ‘Data Janitors’, but this resonated with me. It can all be described by trash.
The business explanation
Although trash cans are easy to relate to, they don’t necessarily help you much in a job interview. Here’s a more specific version of how I’ve seen these roles compare in job postings, and in organizations I’ve worked with as an FTE, external speaker, or consultant.
This image is too complex for alt text, but the content is explained in the paragraphs below the image.
Commonalities between DBAs, DBREs, and DEs
There are a few core specialties between all three of these roles, although they are used in different contexts. All of these roles are concerned with how data is:
- Stored
- Secured
- Shared
- Backed up
- Queried
- Archived
For an example of “contexts”, I mean that a DBA may be thinking about how this applies to data in a relational database. A DBRE may be thinking about how these tasks are automated across both on-prem and cloud environments, with multiple types of data stores. Meanwhile a DE may be thinking about how this applies in a Lakehouse or Event Streaming scenario.
In other words, everyone cares about compliance, data loss, security, and availability in one way or another.
DBAs, DBREs, and DEs all generally know and use SQL on a regular basis.
Database Administrator key strengths
Database Administrators are often highly knowledgeable regarding:
- Relational databases (mostly OLTP)
- Bespoke query tuning and performance troubleshooting
- Scaling monoliths
- Patching and upgrades (often semi-automated)
- Environment standardization
These are the folks who are most likely to learn 1,001 configuration settings for a specific database technology, along with the pros and cons of using each setting, alone or in conjunction with one another.
While DBAs often implement automation, often writing bespoke scripts is a small part of their job and much of the focus is on adopting off-the-shelf or open source automation tools written by others. Establishing reliable processes is an emphasis and “protecting” data in source databases is an emphasis.
Database Reliability Engineer key strengths
The DBRE role focuses on automation as well as empowering developers to have shared ownership of code. While DBAs are generally “protectors” of data, DBREs have a mission of helping lay down safe and flexible guardrails for data that allows development to move faster and more flexibly.
DBREs key strengths are:
- Data related infrastructure as code (IAAC)
- Automated cost monitoring, availability monitoring, performance monitoring at scale
- Turning “pets” into “cattle”
- Enabling developers to own and support their own data code
- Data code versioning, testing, and automated deployments
In addition to SQL, DBREs use various command line and scripting languages for automation.
Data Engineer key strengths
The Data Engineer role often interfaces with Enterprise Reporting and Data Scientist teams. Data Engineers generally load, wrangle, and help manage vast amounts of data from a large variety of sources within an Enterprise.
You might want to be a Data Engineer if the following excite you:
- Data lakes, lakehouses, warehouses: architecture and implementation
- Data pipelines, transformations, data quality checks
- Streaming architectures
- Governance, metadata, lineage
Data Engineers often use Python quite a lot, sometimes more than they use SQL. Like DBREs, they have a focus on using version control and automating testing and deployments, although this is more related to notebooks, data pipelines, and transformation (as compared to Terraform or Bicep deployments and automating monitoring).
When interviewing, ask what these terms mean to the employer
I recently saw a conversation online between Data Engineers discussing that the Data Engineer role is defined quite differently at Meta than at Google, which can make for an awkward interview for candidates who assume that expectations for the role are similar.
This is true for any role, really. It’s always worth asking in a job interview: what are the primary pain points you want this person to solve? What are the key strengths that a candidate will need to succeed in this role?
What’s your take?
What have I missed? How are your definitions different? Tell me in the comments, or write your own post and share a link.