Wed, 14 Feb 2018
Howdy, I’m Jared Short at Trek10, and this is ‘Think FaaS,’ where we learn about the world of serverless computing in less time than it takes to run an AWS Lambda function. So put five minutes on the clock - it’s time to ‘Think FaaS.’
My fellow audience members, the State of our State is strong-ly consistent!
When we talk about state in serverless computing, there are two fundamental things we need to think about if we want to Bob Ross this architectural landscape: state in functions, and state in data stores.
You’ll often hear that FaaS “functions” are stateless. While this is somewhat true, there are caveats that can get you into trouble in a hurry if you aren’t aware of them.
The first important thing to realize is that nearly all of the providers use containers behind the scenes to power their FaaS offerings.
When the platform needs to serve a request, it shuttles the request to a container running your code. You don’t care where or how this happens. What you do care about is that on most of the platforms, once a container with your code has been spun up to serve a particular kind of request, it may go on to serve one or more requests. This is called a “warm container”; in the event that a request has to be served by a newly started container, we call it (no surprise here) a “cold start”. Cold starts certainly impact performance, but what we care about today is how they impact state.
On a cold start on the various FaaS providers, you typically get a clean slate: fresh /tmp scratch space, clean memory, etc. Great! But where we can get into trouble (and also do some neat tricks) is that state can persist across invocations of the function. Files that you drop in the scratch space, or variables that you hoist outside the main event loop in memory, will be accessible to the second, third, and fourth runs and so on.
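To make that concrete, here’s a minimal Python sketch of a Lambda-style handler (the names are illustrative, not from any particular project) showing what actually survives between invocations of a warm container:

```python
# Anything defined at module level runs once per cold start and then
# persists for the life of the warm container.
invocation_count = 0  # shared by every invocation this container serves

def handler(event, context):
    global invocation_count
    invocation_count += 1  # keeps climbing while the container stays warm

    # Files written to the scratch space stick around between runs too.
    with open("/tmp/scratch.txt", "a") as f:
        f.write("still here\n")

    # On a cold start this returns 1; on warm invocations it keeps counting.
    return {"warm_invocations": invocation_count}
```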
How can this be bad?
For example, you may be doing image processing in the scratch space. During testing no issues arise, but surprise, surprise, in production at scale that scratch space fills up completely and hoses the prod functions. You could also accidentally hoist some variables outside the main event loop and start gobbling up memory, or calculate things inaccurately if you assumed an array or list was cleared out on each invocation.
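As a quick sketch of that second failure mode (the names here are made up), compare a handler that leans on module-level state with one that keeps its state per request:

```python
# Anti-pattern: this list is created once per container, not once per
# request, so items from earlier invocations leak into later ones and
# memory grows until the container is recycled.
results = []

def leaky_handler(event, context):
    results.append(event["item"])   # accumulates across warm invocations
    return {"count": len(results)}  # wrong as soon as the container is reused

# The fix: build per-request state inside the handler instead.
def safe_handler(event, context):
    results = [event["item"]]       # fresh list on every invocation
    return {"count": len(results)}
```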
It ain’t all bad though!
If you are aware of the caveats, you can bend them to your will. Scratch space can be used to store frequently accessed S3 files or indexes in a data pipeline instead of fetching them from S3 every time. Caching variables outside the main event loop can be used to init environments, decrypt variables in memory once, and cache DynamoDB or other data store calls. Anywhere you can smartly and safely cut external requests out of your function, you will improve response times and execution times (and costs! remember, time is money) and reduce load on those services, just like in any other distributed system.
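Here’s a rough sketch of those tricks using boto3; the bucket, key, and lookup function are hypothetical stand-ins, not anything from a real pipeline:

```python
import os
import boto3

s3 = boto3.client("s3")  # created once per cold start, reused while warm
_cache = {}              # in-memory cache that survives warm invocations

INDEX_PATH = "/tmp/lookup-index.json"  # scratch-space copy of an S3 file

def expensive_lookup(key):
    # Stand-in for a DynamoDB GetItem or similar data store call.
    return f"value-for-{key}"

def handler(event, context):
    # Pull the index from S3 only on a cold start; warm invocations
    # reuse the copy already sitting in /tmp.
    if not os.path.exists(INDEX_PATH):
        s3.download_file("my-data-bucket",       # hypothetical bucket
                         "indexes/lookup.json",  # hypothetical key
                         INDEX_PATH)

    # Likewise, memoize expensive lookups in memory across invocations.
    key = event["key"]
    if key not in _cache:
        _cache[key] = expensive_lookup(key)
    return {"value": _cache[key]}
```

Just remember the next point before you lean on this too hard.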
Always remember: you absolutely cannot count on any kind of cache or other trick persisting reliably. Side note, some FaaS providers are also starting to offer more opinionated ways to handle persistence and state in their offerings, such as Azure’s Durable Functions.
So, how about data stores?
Data stores are a huge topic, really a whole other podcast of their own, but let’s focus on what we look for in a data store in a serverless context.
Number 1, it scales easily and readily, even with bursty loads.
What good is having a solid FaaS platform running your code if your data store can’t scale to handle what your application can throw at it? Traditional relational databases are a tough sell here. That’s not to say you can’t do it, it’s just challenging: you’d best be ready to manage clusters, read replicas, sharding, connection pooling… I’d just rather not! What you’re looking for are stores designed with bursty loads in mind, with the server model itself completely abstracted away. Things like DynamoDB, FaunaDB, and the upcoming serverless Amazon Aurora are close to the serverless way.
Number 2, pay directly for your usage.
Now, obviously this is a core tenet of FaaS providers, but why should your data store be any different? I’d suggest it shouldn’t be. Your serverless infrastructure costs at every tier should scale roughly linearly with usage. You don’t want to be stuck paying for a pile of read-replica capacity on a relational database; you need to know that if you throw traffic at your data store, it will cost a predictable amount of cash to get through the spike (the cost controls and monitoring for those spikes are a story for another time). The usual suspects all meet this need, with DynamoDB and FaunaDB being good choices once again.
Number 3, replication and locality.
We are starting to get into the weeds a bit for this podcast, but when you get into the thick of it, understanding how your data is replicated (across availability zones and regions) is important. Is your API regionally active/active or active/passive? How do you ensure your data’s integrity? Is the data you need a low-latency hop away from your functions? Simply put, ensure your data store offers replication that matches your availability model. Eventual consistency across regions, with low-latency consistent reads in the local region, can play well in many situations.
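For a concrete taste of that trade-off, here’s a small boto3 sketch (the table and key names are hypothetical) contrasting DynamoDB’s eventually consistent default reads with strongly consistent ones:

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Default reads are eventually consistent: cheaper and lower latency,
# but they may briefly miss the very latest write.
eventual = dynamodb.get_item(
    TableName="my-table",
    Key={"id": {"S": "abc-123"}},
)

# Strongly consistent reads return the latest committed value within
# the region, at roughly double the read-capacity cost.
strong = dynamodb.get_item(
    TableName="my-table",
    Key={"id": {"S": "abc-123"}},
    ConsistentRead=True,
)
```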
Time’s up, and the State of the State remains strong-ly consistent, or, even better, eventually consistent, and that’s it from me! You can follow @Trek10inc or myself, @shortjared, on Twitter. And I hope you’ll join me next time for another episode of ‘Think FaaS.’