Updated: Cloud
What do a Toyota Supra, a Honda (Acura) NSX, an 11th Gen Thunderbird, or a Cadillac Allanté have in common? They’re cool cars that featured state-of-the-art technology, got a lot of attention, and developed a devout fan following. But none of them sold very well, nor were they intended to. Their main purpose was to raise the respective brand’s image: if people watched an ad for these cars or saw one in the showroom, they were more inclined to buy a Toyota Camry, Honda Accord, Ford Taurus, or Cadillac Seville.
If that reminds you of AWS Lambda, you may be onto something. I love building serverless applications to the point of having joined the AWS serverless team to help take the product to the next level. I also drive a Cadillac Allanté, full of advanced tech (at the time) of somewhat questionable value, like a switchable-unit digital dash, dynamic suspension, and the ability to unlock your doors from the trunk lock.
Just like in the car dealer showrooms, Lambda is right in the front, demo’d and admired a lot. But in the reality of large cloud migrations it tends to make room for vast armies of EC2 instances to host legacy workloads or ERP systems. Those EC2 instances aren’t quite as sexy, but they are well-understood, reliable, and have predictable cost, kind of like a Toyota Camry or Honda Accord. And they generate a lot more revenue, meaning Lambda, along with the Supras and NSXs of the world, has done its job well.
Recent on-line discussions have questioned the future of serverless due to slow update cycles (causing language version lag, which is now much better), limited customer uptake, and the near-elimination of AWS’ serverless developer advocacy team. So, what if commercial success was never the main intent of Lambda?
The Halo Effect
A Halo Product (like a Halo Car) is a product put out by a company to benefit from the Halo Effect, which Wikipedia describes as:
Consumer bias toward certain products because of favorable experience with other products made by the same company.
The Halo Effect is a cognitive bias that makes people believe that one thing is good because another thing that’s seen as related (e.g., produced by the same brand) is perceived as good. Widely known examples are Halo Cars like those cited above: those shiny models that you see in all the ads and in the front of the showroom as you walk to the back to see if the economy model is running a campaign.
I experienced this effect every time I took my Allanté to the dealer (it had a 7-year warranty and no shortage of things that could be fixed): other customers always came by, all excited, to ask if this was a new Cadillac (by that time Cadillac had already stopped building the Allanté, so my car was the stand-in for the showroom model). Sadly, Cadillacs have a habit of outliving their owners (not through the longevity of the car, but the age of their owners), so I am not sure whether the ploy worked out for GM.
My halo car being readied for shipping. It’s still a head-turner, but not the best daily driver
The limited sales are usually not a huge issue for the manufacturer because each Halo car yields a minimal profit or even incurs a loss (more on that below). Being the owner of a small collection of Halo cars (my Integrale certainly qualifies, but my DeLorean was more of a fata morgana than a halo), I understand the appeal but also the consequences of owning a halo product. Those head-turns also come along with a long list of issues, including below-average reliability, high maintenance cost, limited part availability, or low resale value (until prices suddenly skyrocket when the cars become collectibles).
Deploying to the Cloud: Shiny Tech
The success of the cloud lies in abstracting stuff away, mainly that “undifferentiated heavy lifting” (which isn’t so undifferentiated when you think about reliable operations and cost optimization). I have a long love affair with hiding machinery from developers: I was so excited about Google AppEngine back in the day (still have the T-Shirt) that at a 2009 JavaOne panel I deployed an (admittedly simple) Java app to the cloud instead of showing slides. A much younger Jeff Barr also demo’d an application, but I can’t remember what it ran on as S3 got more airtime than the application run-time (it could not have been Beanstalk or Lambda).
The idea of just deploying code and having it run “somewhere” was as appealing back then as it is now. I have gone as far as stating that:
Serverless is the way the cloud wants you to write applications.
Alas, initial AppEngine adoption suffered from a programming model that featured no SQL database (not to be confused with a NoSQL database) and had some birthing pains running popular Java frameworks like Spring. Nevertheless, the service still lives on 16 years after initial release—I keep hearing rumors that it likes hanging out with Elastic Beanstalk in the retirement home.
Initial adventures with Lambda are no less magical (IAM aside) today than they were with AppEngine back then. But just like a Halo car, it comes with some special needs.
Some Halos become very valuable, thanks to their rarity.
Halos: Where There is Light, There is Shadow
I discussed The Serverless Illusion in a prior post: fine-grained applications reduce infrastructure complexity but increase application complexity. And although no one wants to manage infrastructure, it’s a skill that is easier to find than distributed system developers.
The extensive collection of serverless “patterns” is well-intended but also perfectly highlights the remaining complexity that developers have to deal with. Connecting two functions with a queue takes about 50 lines of CDK code, excluding the application code or the referenced Lambda-to-SQS or SQS-to-Lambda pieces. This is sample code that may hard code settings or make tacit assumptions. Providing a slew of helpful examples feels a bit like the 7-year warranty on my Cadillac Allanté: you appreciate the service but can’t quite suppress the uneasy feeling.
The remaining complexity is why the heavy-duty work by many serverless heroes and folks like Mike Roberts is so essential. The same goes for the awesome AWS serverless developer advocate and specialist team, who brought us things like Serverlesspresso, but now work somewhere else (apparently Stripe and Datadog were hiring and lucky).
If you own a Halo product, you better be part of a community (who does awesome stuff like this)
So, if you are the proud owner or user of a Halo product, you better have some tools ready or know a good repair shop. The tight-knit serverless community (in which I count many friends) is not unlike that of Halo cars: being in a club or well connected in the community helps you get advice or those hard-to-find spare parts / code samples. In a way, the community is part of what you buy into and also part of the enjoyment (at least for me, for cars and cloud). There’s a slight risk, though:
Folks in your community will unconditionally love their Halo product, perhaps letting you forget how rare it actually is (and the challenges it comes with).
Cool Tech Doesn’t Always Age Well
Halo products usually contain some cutting-edge tech. The Honda NSX was the first-ever production car with an aluminum semi-monocoque chassis and featured an all-aluminum engine. My Allanté is a lot more conservative under the hood (pushrod V8), but had forward-looking features like a digital instrument cluster along with a debug mode that can be activated via the center console (not shown at 103 mph). The DeLorean is famous for its stainless steel body, which found no repeat in automotive history for about 42 years until the arrival of the steel ball-bouncing Cybertruck (ugh).
But not all that tech aged well. While stainless body panels withstand the test of time (and the countless fingerprints from onlookers; I am always tempted to give them a cooking pot as it feels much the same), digital instrument clusters were a decidedly ’80s thing until they became so high-quality digital that they now look analog.
The Allanté instrument cluster has a hint of AWS Console to it
The same question can be applied to technologies like Lambda. John O’Hara’s equally provocative and insightful talk at QCon London reminded us that fine-grained architectures gained popularity when single servers were significantly smaller, requiring us to scale out in order to handle traffic. Nowadays, you can order a server with 240 cores and 16TB RAM online. If your functions run on 256MB each, that’ll make roughly 64,000 instances, perhaps outdoing your default quota of 1,000 concurrent Lambda executions per Region, as long as your functions are happy with a small CPU slice each.
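The back-of-the-envelope math behind those numbers is easy to reproduce. The sketch below assumes the rounding that yields the 64,000 figure (1 TB counted as 1,000 GB, 1 GB as 1,024 MB) and ignores any memory the operating system would need; the constant and function names are made up for illustration.

```python
# Rough capacity check for one large server versus the default Lambda quota.
# Assumption: 1 TB = 1,000 GB and 1 GB = 1,024 MB; OS overhead is ignored.

SERVER_RAM_GB = 16 * 1000          # the 16 TB server mentioned above
SERVER_CORES = 240
FUNCTION_MEMORY_MB = 256           # memory per function instance
DEFAULT_CONCURRENCY_QUOTA = 1000   # default Lambda quota cited above


def instances_that_fit(ram_gb: int, function_mb: int) -> int:
    """Number of function instances that fit into the given amount of RAM."""
    return (ram_gb * 1024) // function_mb


instances = instances_that_fit(SERVER_RAM_GB, FUNCTION_MEMORY_MB)
instances_per_core = instances / SERVER_CORES
quota_multiple = instances / DEFAULT_CONCURRENCY_QUOTA

print(f"instances in RAM:   {instances:,}")
print(f"instances per core: {instances_per_core:.0f}")
print(f"multiple of quota:  {quota_multiple:.0f}x")
```

The per-core figure is the catch: a few hundred functions sharing one core only works if each of them is mostly idle or waiting on I/O.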
There are many nuances here like failure recovery times. Also, your on-line facing services that serve spiky marketing campaigns will surely benefit from having the provider foot the hardware bill for the rest of the year. But modern large servers have sufficient capacity to run most enterprise applications in memory, without the need for asynchronous, event-driven architectures and provisioned concurrency. In a future post, I’ll dive into how loosening constraints like hardware size shapes our architecture choices.
Serverless Scope Creep to the Rescue
About two years ago, in the pre-GenAI era, there was some upheaval in the community when AWS relabeled every other service as serverless. Aurora, Redshift, and Neptune can be had in a serverless flavor alongside SQS, SNS, S3, and AppSync. If you come to think of it, all services aside from EC2 (or things that explicitly deploy onto EC2) are serverless by the definition of the term, meaning that you don’t have to deal with the servers that provide the service.
The broadening of the term serverless had the distinct advantage of no longer tying serverless to Lambda alone. You could now be “serverless” with decidedly less Halo-esque services like ECS Fargate that support a more traditional programming model and local storage. Perhaps some folks at AWS knew that limiting a powerful term to a relatively narrow service could work to their disadvantage.
Do Halo Economics Apply to Lambda?
I hinted at Halo products usually not being designed to yield a profit directly, as their main purpose is to generate profit through other products. Now, if Lambda is a Halo service, that leads to some interesting thoughts.
First off, Lambda economics are interesting, and likewise apply to serverless integration services like EventBridge. In many scenarios these services are ridiculously cheap, whereas in others they can be outrageously expensive. That happens because the services are higher-level (read: closer to the application), but aren’t priced by application value. If EventBridge transports an insurance application worth many thousands of dollars, $1 per million events is a steal. If it’s a Web log entry for an image load, it’s a very different story.
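To put rough numbers on that contrast, here is a minimal sketch that expresses the transport cost as a share of what an event is worth. Only the $1 per million events comes from the text above; the two event values are hypothetical, picked purely for illustration.

```python
# Transport cost as a fraction of an event's business value.
PRICE_PER_MILLION_EVENTS = 1.00  # USD, the figure quoted above


def transport_cost_share(value_per_event_usd: float) -> float:
    """Fraction of an event's business value spent on transporting it."""
    cost_per_event = PRICE_PER_MILLION_EVENTS / 1_000_000
    return cost_per_event / value_per_event_usd


# Hypothetical values: an insurance application worth $5,000 and a
# Web log entry worth a thousandth of a cent.
insurance_application = transport_cost_share(5_000.0)
web_log_entry = transport_cost_share(0.00001)

print(f"insurance application: {insurance_application:.8%} of its value")
print(f"web log entry:         {web_log_entry:.0%} of its value")
```

The price per event is identical in both cases. Only the value it is measured against changes, which is what happens when a service sits close to the application but is priced like infrastructure.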
But there’s a notable scale aspect:
The more successful (and perhaps stable) your serverless app is, the less attractive pricing becomes.
AWS offers modest discounts on Lambda duration (not request) charges for heavy users, but essentially it’s flat pricing, meaning there are no economies of scale for the user.
Now, each serverless function invocation is intended to have business value, so higher cost would indicate more users and perhaps more profit for your organization. That marketing-approved rhetoric comes with some important footnotes, though. First, most other processes harvest economies of scale. If my company sells 1,000,000 widgets instead of 1,000, I can negotiate better prices with my suppliers and manufacture more efficiently. So, my cost per widget goes down, and my profit grows more than 1,000-fold.
Whereas economies of scale constrain you when you’re small, when you get big, you kind of want them.
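The widget example can be sketched in a few lines. All numbers below are hypothetical (a $1.50 sales price, a $1.00 base unit cost, and a 10% unit cost reduction for every tenfold increase in volume); they only serve to contrast flat pricing with a cost curve that bends down at scale.

```python
import math

SALES_PRICE = 1.50      # hypothetical revenue per widget, USD
BASE_UNIT_COST = 1.00   # hypothetical cost per widget at 1,000 units
BASE_UNITS = 1_000


def unit_cost_flat(units: int) -> float:
    """Flat pricing: the unit cost never changes with volume."""
    return BASE_UNIT_COST


def unit_cost_with_scale(units: int, discount_per_10x: float = 0.10) -> float:
    """Unit cost drops by a fixed percentage for every 10x in volume."""
    tenfold_steps = math.log10(units / BASE_UNITS)
    return BASE_UNIT_COST * (1 - discount_per_10x) ** tenfold_steps


def profit(units: int, unit_cost: float) -> float:
    return units * (SALES_PRICE - unit_cost)


small = profit(1_000, BASE_UNIT_COST)
growth_flat = profit(1_000_000, unit_cost_flat(1_000_000)) / small
growth_scale = profit(1_000_000, unit_cost_with_scale(1_000_000)) / small

print(f"profit growth with flat unit cost:     {growth_flat:,.0f}x")
print(f"profit growth with economies of scale: {growth_scale:,.0f}x")
```

With a flat unit cost, profit grows exactly in step with volume. Only the falling unit cost pushes it past the 1,000-fold mark.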
Second, not each and every function provides immediate business value. For example, a surge in quotations due to a campaign may drive revenue, but isn’t guaranteed to. And, if you’re unlucky, you may be paying for invalid requests, although that’s fixed now.
One-off or spiky tasks like campaigns tend to do very well on Lambda, which is why so many case studies come from this space. But while those applications make great showcases, they are rarely significant revenue drivers for the provider. Large corporate workloads bring steady usage (and revenue), but their load profiles don’t give the provider a lot of opportunity for overprovisioning to drive the cost down. Small users likely stumble over the complex distributed programming model, so that there isn’t that much of a long tail either.
I can’t speak publicly about service profitability or revenue share. If Lambda is indeed a Halo product, AWS may not mind if it doesn’t make a huge profit. But at a data-driven company, service revenue and profit are major determinants of headcount allocation and talent attractiveness. It’s hard to get promoted for running a loss leader.
Are Halo Products a Good Deal, Then?
If Halo products aren’t profitable, you could argue that buying one means you are getting a good deal. A Honda NSX provided a lot of cool tech and world-class handling for half the price of a Ferrari. Likewise, I got a lot of cool (depending on your definition of cool) for my money when I bought my (used) Allanté.
I also get to run my Serverless Loan Broker for free. But getting a lot of features for the money doesn’t necessarily imply a low TCO. For example, the body-color-matching tail lights on my Allanté are no longer available. I am very lucky that I coughed up $450 many, many years ago to replace my left one. So, it’s good to look at Halo products for the long run: do you have sufficient skills and resources to keep it as shiny as it was when you first saw it?
I drive my Allanté in nice weather in the foothills and use serverless mainly for demos and to order coffee. Both make me happy, but both may be halos.