As a follow-up to my last post about the things I look for when hiring a junior DevOps engineer, these are the things I look for in a senior engineer. It all boils down to what I consider the core of DevOps - helping engineering teams ship their apps quickly and safely to production.
At a high level, this means being able to provide:
- a CI workflow
- knowledge of release workflows
- an automated delivery workflow of multiple services to multiple test environments
- production monitoring & alerting
A senior engineer should know how to provide these basic resources to the engineering team in a way that allows for quick release cycles. They might have a specialty in one of these areas, but they should know enough to provide a decent foundation across all of them. Let's see what this means in practice.
Set up and support a CI system/workflow for many services spanning multiple engineering teams
This is basic need number one for any developer or dev team. These questions should have decent answers:
- How/Where should I run my browser tests?
- How/Where do I test my configuration/infrastructure-as-code?
- How do I get more nodes to run more tests?
- Can your teams build out their pipelines as they want?
- Are they able to deploy to production on their own?
- How do I deploy?
For example - a team of 20 engineers is building an Android app - maybe you set up AWS Device Farm, CodeBuild, and CodePipeline and provide Terraform/CloudFormation modules and best practices to get your teams onboarded quickly. They should be able to view logs, retry builds, and see screenshots of their tests, to name a few. If you're a small team with just a few services, TravisCI/CircleCI might be good enough. If you're a bigger team, you might opt to use GitLab. If you have very specific testing requirements, you might need to lean on Jenkins.
Some practices scale better than others, and a lot of it has to do with your organization's culture. I've seen dozens of teams each managing their own Jenkins instance, as well as a dozen teams depending on a centrally managed Jenkins instance. In one context a setup works great, while in another it might be a nightmare in practice.
Have seen multiple release workflows
As the number of sites/services that your team creates increases, you need a method to the madness to get things shipped out quickly to customers while keeping all your internal stakeholders updated as you ship out changes. I'd hope a senior engineer has gotten to see a good many release workflows and, most importantly, I'd love for them to be able to provide great feedback about a poor release workflow they've seen.
Some teams create interesting workflows to support this. Things like a release train for organizing releases across teams, or a continuous delivery environment where everything gets shipped after it goes through the pipeline. A senior engineer would be able to recommend a release process and implement a proof-of-concept of it with other engineering teams. Will you be using an ITIL workflow?
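To make the release train idea a bit more concrete, here's a minimal sketch. It assumes trains depart on a fixed cadence and that a change boards the first departure at or after the time it was merged. The function name, the two-week cadence and the dates are all made up for illustration, and a real train would also involve cutoffs, approvals and so on.

```python
from datetime import datetime, timedelta


def next_departure(merged_at, first_departure, cadence):
    """Return the departure time of the first train a change can board.

    Assumption for this sketch: trains leave at first_departure,
    first_departure + cadence, first_departure + 2 * cadence, and so on.
    """
    if merged_at <= first_departure:
        return first_departure
    elapsed = merged_at - first_departure
    # Ceiling division on timedeltas: how many cadences until the next train.
    trains_ahead = -(-elapsed // cadence)
    return first_departure + trains_ahead * cadence


if __name__ == "__main__":
    first = datetime(2024, 1, 1, 9, 0)   # hypothetical first train
    every_two_weeks = timedelta(weeks=2)  # hypothetical cadence
    merged = datetime(2024, 1, 3, 12, 0)
    print(next_departure(merged, first, every_two_weeks))  # 2024-01-15 09:00:00
```

The nice property of a train is that the answer to "when does my change ship?" is a calculation, not a negotiation.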
Set up automated deployments to different runtimes and environments
At the beginning, your teams might be focused on JS, but there will come a time when there's a solution that can only be built using a different language. Or a different framework. As your teams grow, your configuration management solution should scale to the needs of your teams. Are they able to deploy Java apps as quickly and easily as Node.js apps? What if you need to introduce Python/Go into the mix - how much or how little of your pipeline needs to change?
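One way to think about "how much of the pipeline needs to change" is to keep the per-runtime differences in one small table and leave the rest of the pipeline shared. The sketch below is only an illustration of that idea: the `RUNTIME_STEPS` registry and `pipeline_steps` function are hypothetical names, and the commands are common defaults for each toolchain, not a recommendation for your projects.

```python
# Hypothetical registry: each runtime maps to the commands a shared pipeline runs.
RUNTIME_STEPS = {
    "nodejs": {"build": ["npm", "ci"], "test": ["npm", "test"]},
    "java": {
        "build": ["mvn", "-B", "package", "-DskipTests"],
        "test": ["mvn", "-B", "test"],
    },
    "python": {
        "build": ["pip", "install", "-r", "requirements.txt"],
        "test": ["pytest"],
    },
    "go": {"build": ["go", "build", "./..."], "test": ["go", "test", "./..."]},
}


def pipeline_steps(runtime):
    """Return the ordered (stage, command) pairs for a service's runtime."""
    try:
        steps = RUNTIME_STEPS[runtime]
    except KeyError:
        raise ValueError(f"unsupported runtime: {runtime}") from None
    return [("build", steps["build"]), ("test", steps["test"])]


if __name__ == "__main__":
    for stage, command in pipeline_steps("go"):
        print(stage, " ".join(command))
```

If adding a new language means adding one entry like this, the pipeline is scaling with your teams. If it means a rewrite, that's worth knowing early.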
You might set up a Docker build/deploy workflow where teams can build whatever they want into their container and deploy it to a managed Kubernetes cluster, OR you might have a Chef/Ansible workflow where deployments are centrally executed from Chef Server or Ansible Tower. Maybe using scp to deploy to a remote server is really all you need.
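For the simplest case, a deploy script really can be a thin wrapper around scp. Here's a rough sketch, assuming scp is installed and SSH key access to the server is already set up. The function names, user, host and paths are placeholders I made up.

```python
import subprocess


def build_scp_command(artifact, user, host, remote_dir):
    """Build the scp argument list that copies a local artifact to a server."""
    return ["scp", artifact, f"{user}@{host}:{remote_dir}"]


def deploy(artifact, user, host, remote_dir):
    """Copy the artifact to the remote server, failing loudly if scp fails."""
    command = build_scp_command(artifact, user, host, remote_dir)
    # check=True raises CalledProcessError on a non-zero exit code.
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder values; print the command instead of running it.
    print(" ".join(build_scp_command("app.tar.gz", "deploy", "app1.example.com", "/srv/app/")))
```

Keeping the command construction separate from the execution makes it easy to dry-run and to test without touching a real server.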
Configure and scale a distributed monitoring platform for applications and infrastructure
When you deploy your application, you need to see whether or not it works as intended. Your teams should have all the tools they need at their disposal - they should be able to view things like:
- CPU/memory metrics plotted on a graph alongside other hosts
- centralized logs across all hosts
As you get more mature and your needs grow, you'll want to add custom application-level metrics using tools like New Relic APM and/or Datadog.
To start with, a SaaS tool like Papertrail or logz.io might be good enough. As your services grow and needs change, it might make sense to use a bigger provider like Splunk or Elastic. Or if your senior engineers have a specialty in this area, they can deploy an Elasticsearch + Kibana + Logstash/Fluentd architecture and link it automatically to all of your services.
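As a rough idea of what a custom metric looks like on the wire, here's a sketch that formats a metric in the StatsD line format (`name:value|type`) and sends it over UDP. The `|#key:value` tag suffix is the DogStatsD extension used by the Datadog agent. The metric names and tags are made up, and the host and port assume a local agent listening on the usual StatsD port 8125. In practice you'd more likely use your vendor's client library.

```python
import socket


def format_metric(name, value, metric_type="c", tags=None):
    """Format one metric as a StatsD line, with optional DogStatsD-style tags."""
    line = f"{name}:{value}|{metric_type}"
    if tags:
        # Sort the tags so the output is stable and easy to test.
        line += "|#" + ",".join(f"{key}:{val}" for key, val in sorted(tags.items()))
    return line


def send_metric(line, host="127.0.0.1", port=8125):
    """Send one metric line over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric: a timer ("ms") for checkout latency.
    print(format_metric("checkout.latency", 320, "ms", {"env": "prod", "service": "checkout"}))
```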
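Whichever log platform you pick, centralized logs are much easier to search when every service emits structured lines. Here's a minimal sketch using Python's standard logging module to write one JSON object per log line. The field names are my own choice for illustration and not a schema that any of these tools requires.

```python
import json
import logging
import socket


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "host": socket.gethostname(),
            "message": record.getMessage(),
        }
        return json.dumps(payload)


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("checkout")  # hypothetical service name
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.info("order %s placed", "42")
```

A shipper like Logstash or Fluentd can then parse these lines without custom patterns for each service.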
Conclusion
There are other things that I pay attention to with senior engineers (especially with regard to things like security practices), but their experiences with and solutions to these common basic problems are what I'm interested in the most. What I'm looking for is experience seeing different types of applications get deployed across a wide variety of teams with differing requirements and priorities. After you've seen a few, you're able to tell the processes that work from the ones that don't.
How you release to production is a particular interest of mine, and there's always something to learn in what other companies do. What's the best/worst release process you've seen? Let me know in the comments.