Cloud Native
Control Tower: Then vs Now
Control Tower today is not the same Control Tower that you may have been introduced to in the past.
Over the past few years at Trek10, I’ve worked with dozens of companies from startups to the Fortune 100, helping business and technical decision-makers build teams and applications that thrive in the cloud.
Though these companies sometimes believe their problems and questions must be unique, in reality teams often encounter the same challenges over and over again.
As I wrap up my tenure at Trek10, I want to leave you with ten general rules that have helped guide my cloud decision-making.
I’ve divided them into two “tablets”: one set for business decision-makers (executives), one set for technical decision-makers (engineers and managers in the technology organization).
An agile or DevOps transformation is widely understood to involve some level of top-down organizational change. It involves new responsibilities, new ways of working, new forms of communication between teams. You would not try to buy a “box of DevOps” (though there are people who will try to sell it to you!) and expect success.
Your cloud adoption is no different. Don’t assume that cloud is just an alternate vehicle for delivering the same old technology and ways of working. A “lift-and-shift” mentality will ultimately lead to an expensive, bloated cloud adoption failure.
Development, operations, security, even finance — all take on new configurations in a successful cloud-first business. Take the time to understand and embrace the depth of change needed.
You cannot present a top-down organizational mandate to adopt the cloud, walk away, and expect anything good to happen.
Cloud is a huge shift in terms of technical skill set as well as team responsibilities. People who know the ins and outs of cloud are not readily available in the marketplace, either. Partners can help, but ultimately you need a plan to build that expertise from within.
Establishing a cloud center of excellence is a good start in larger enterprises, but I’ve seen these efforts turn into yet another silo. You can’t just create a “cloud team”; you must foster a wider culture, and that means patiently expanding cloud competency across the entire organization over a period of months and years.
Yes, training is expensive. But the whole point of cloud is that you’re supposed to be able to build more value with the same number of people. So reallocate some of your hiring budget into training your team: you’ll be amazed at what they can accomplish using services maintained by the cloud provider.
Set the right priorities, provide appropriate incentives, and your workforce will slowly but surely transform.
We still see some enterprises choosing cloud primarily for cost-saving reasons, though fewer than in past years — and with good reason. You don’t go to the cloud to spend less money, you go there to create more value.
This is why I like the image of cloud as “the killer app for IT”: as spreadsheets did for early PCs, cloud alone can more than justify the cost of your IT footprint.
Time-to-market, speed of innovation, agility all increase with cloud services — if you put together an organization that can take advantage.
Some companies get stuck on their cloud bill because it’s simply the only metric associated with their cloud adoption efforts — albeit a painful one — that they can see and understand. Metrics of success, like time-to-value, are not impossible to measure, but they take deliberate effort to institute and evaluate.
Prioritize experimentation, know your success criteria, and you’ll see cloud drive value far beyond its cost.
It’s easy to get stuck for months, or even years, in a phase of cloud limbo. In his most recent re:Invent keynote, Andy Jassy called out an organization that, while bragging about their cloud adoption, had really deployed only a few minor experiments. While that approach may serve as resume-padding for your technical team, it’s of no value to the business.
You will not get everything in the cloud right at first, but don’t let that keep you from starting. Pick a bite-size workload, make a serious attempt to run in on the cloud, figure out where it falls short, improve it, repeat, and expand the learning outward.
Get serious about the cloud, and serious results will follow.
One reason enterprises get frozen for months or years on the doorstep of cloud? They recognize that change can be complicated, and they don’t want to get it wrong.
For this reason, it makes sense to bring in trusted partners during your cloud journey — people who have traveled the road before and know the pitfalls as well as the shortcuts.
(Incidentally, this is a great argument for the public cloud itself. Do you really think you can run a data center better than AWS, or build better developer tooling than Microsoft? Concerns about lock-in are historically reasonable and have their place, but ultimately you have to partner with providers who can help you appropriately leverage your risk.)
Find cloud and consulting partners you trust, then give them the latitude to help you achieve lasting and positive change.
There’s a strain of advice in technical decision-making that basically says: for most problems, go with a tried-and-true solution that’s been around forever. Nobody ever got fired for writing another Rails app, or whatever.
This is good advice in a world where big, disruptive forward leaps don’t occur. Cloud changes the calculus because some of the newer technologies radically decrease the scope of things you have to manage, while increasing the underlying scale and power of what you’re delivering.
AWS’s Amplify service, for example, has rapidly moved the goalposts for real-time mobile and back-end development. The low-code, schema-centric approach to API development is great; the tight integrations with cloud-native datastores are better. And because the AppSync back end is a cloud service, it improves over time without you having to do a thing. It may take you a few weeks to get fully up to speed. But in this case, some time invested in learning pays dividends indefinitely.
The point is that your feelings can betray you. You need to dispassionately evaluate what tools and services give you the biggest bang for your buck.
The most costly resource you touch on a daily basis isn’t your production database cluster, your app server farm, or your logging stack. It’s your calendar.
You and your team have limited cycles to spend building and maintaining technology. Don’t waste them on re-implementing the wheel.
To provide an unfortunate personal example, I spent a good deal of time at a previous job building out a full stack deployment framework and pipeline tool more or less from scratch. The deployment pipeline did eventually provide some value to the business, but at the cost of thousands of engineering hours.
If my team and I had relied more heavily on existing services, we could have delivered the same functionality in much less time, and taken on higher-value projects instead of constantly having to justify to leadership what we were doing.
Cloud-native development, due to the high priority it places on consuming pre-built services, puts more of that precious time back on your calendar.
I’ve been trying to encourage the concept of “test code locally and services in the cloud” for awhile. As serverless development crawls higher and higher up the stack, there’s less and less value in maintaining local development environments separate from the cloud.
Iterate on code locally, if you need to; test it against roles and resources deployed to a remote stack (not your production stack, obviously!). The earlier in your development lifecycle you do this, the faster you will catch issues with permissions and configuration, and the fewer surprises you will have in production.
The vendor I see with the best understanding of this is Stackery. I love their concept of “cloudlocal” development, though I doubt that name will catch on. Their CLI is the only tool I am aware of that currently supports running a local AWS Lambda function under the role associated with the cloud-deployed version of that function. (They inject a trust relationship into the role when you invoke the local function.) I predict that you will see the other big serverless deployment frameworks, AWS SAM and the Serverless Framework, provide better support for this soon. It’s where things are going.
You’d think there would be enough real problems in our jobs without making up fake ones. Unfortunately, software engineers are great at coming up with solutions to problems that don’t really matter.
For example, does your app really need to be multi-region? If the entire AWS us-east region goes down, does your app need to maintain availability? How much time will you spend maintaining transactional consistency across multiple regions, how much money will you spend replicating your data, versus just gracefully going down for a few minutes once in a blue moon?
Portability is a huge, fake problem. As some have said, if you’re using Kubernetes, you must be clever: that’s a requirement, not a compliment. Coordinating service meshes, service proxies, ingress routing, and all the rest is a massively complex task.
I even see people architecting serverless apps with complex layers of abstraction, so they can swap technologies in and out if they ever need to choose a different cloud provider.
A truly wise technical decision-maker understands that the scope of things you can be an expert in is quite small. Leverage cloud services to get production-hardened architecture out of the box, and build your unique features on top.
If you must use Kubernetes, look into a managed service like EKS on Fargate or SpotInst Ocean that abstracts away much of the gross stuff (and can orchestrate your containers on ultra-low priced spot instances).
Not to mention, you gain a lot of advantages by integrating with your provider more tightly. Until the cloud providers indicate that they’re going to lock in predatory prices, which hasn’t happened so far, your time may be more valuably spent on feature delivery.
My rubric for making these decisions is simple: don’t be code-wise, cloud-foolish.
As a technical decision-maker, someone who is hands-on with code, YOU are uniquely suited to identify which technology has the greatest impact on your problem domain.
If you are not getting buy-in on a design choice that seems to have obvious benefits, see if you can find some time to prototype it out in the cloud — it’s often quick and cheap.
As a technical decision-maker, you can make a powerful argument to the business: here, I already built the thing you need. It runs fast and is easy to maintain.
Whether you are a business or a technical decision-maker, the success of your cloud migration depends ultimately on your willingness to get comfortable with being uncomfortable: to push the boundaries of what you can build, to reject the comfort of familiarity bias, and to trust vendors and partners that model the way forward.
Though my time at Trek10 is drawing to an end, I continue to believe strongly in the vision and expertise I’ve found here, and I recommend reaching out to them for help as you apply these guidelines.
Control Tower today is not the same Control Tower that you may have been introduced to in the past.