Modern businesses rely on applications, and they rely on continued innovation in those applications to drive their business.
This drive for innovation creates a need for improved techniques for validating that an application will work as expected. But constant innovation means a constant chance for problems, and testing applications at scale is not an easy task. This is where SpeedScale comes into play. SpeedScale assists in stress-testing applications by recreating real-world traffic loads in a test environment.
Today on Modern Digital Business.
{{useful-links-research-links}}
{{about-lee}}
{{architecting-for-scale-ad}}
{{signup-dont-miss-out}}
Transcript
Lee:Modern businesses rely on applications, and they rely on
Lee:continued innovation in those applications to drive their business.
Lee:But as these applications evolve and become more complicated, testing
Lee:them also becomes more challenging.
Lee:Testing applications at scale is not an easy task.
Lee:Today, we're going to look at a company focused on easing the burden
Lee:of testing applications at scale.
Lee:Are you ready?
Lee:Let's go.
Lee:Modern businesses rely on applications and they rely on continued innovation in those
Lee:applications to drive their business.
Lee:This drive for innovation creates a need for improved techniques for validating
Lee:that an application will work as expected.
Lee:But constant innovation means a constant chance for problems, and testing
Lee:applications at scale is not an easy task.
Lee:This is where SpeedScale comes into play.
Lee:SpeedScale assists in stress testing applications by recreating real world
Lee:traffic loads in a test environment.
Lee:Nate Lee is co-founder of SpeedScale and he is my guest today.
Lee:Nate, welcome to Modern Digital Business.
Nate:Thanks, Lee.
Nate:Glad to be here.
Lee:You know, I think we finally have this worked out, after a
Lee:couple of delays and an internet outage. I think we're finally going
Lee:to do this podcast, don't you?
Lee:What do you think?
Nate:No, it's always exciting, eventful, leading up to something like this.
Nate:But yeah, with the power outages, and we're recording
Nate:this between Thanksgiving and Christmas, the holiday season,
Nate:and everybody's kind of hectic.
Nate:A lot of our customers are retail, so they're going through
Nate:code freezes and trying to, you know, hold their breath,
Nate:pat their head, rub their belly, to make sure nothing goes
Nate:down at a critical time.
Lee:I remember those days at Amazon retail, and this time of the
Lee:year was always a very busy time. Yeah, a lot of holding your breath.
Lee:You didn't do much change, but everyone was really busy.
Lee:It was a very busy time.
Nate:Yeah.
Nate:Yeah.
Nate:Actually, I got a funny story about Amazon and the holiday rush
Nate:period. We were talking to Heavybit, one of the venture
Nate:capital firms that kind of specializes in Kubernetes dev tools,
Nate:and one of the gentlemen was telling us that they were
Nate:at Amazon working on SRE stuff, asking, how are we gonna
Nate:get ready for the holiday season?
Nate:Like, we have to run a gigantic load test.
Nate:And it kind of speaks to the genesis of SpeedScale, right?
Nate:It's very difficult to run these sorts of high-traffic
Nate:situations without a perfect carbon-copy replica of production,
Nate:because a lot of the load, and whether I can handle
Nate:it or not, is critically dependent on having production-like hardware.
Nate:They said, well, what if we run a gigantic sale?
Nate:We can basically just simulate what we're gonna be encountering in
Nate:production during the holiday season.
Nate:And so they were like, yeah, that's a good idea.
Nate:What are we gonna call it?
Nate:And they decided to call it Prime Day.
Nate:So Amazon Prime Day, which is a pretty big deal, right?
Nate:That's really just a veiled dress rehearsal for the Black Friday
Nate:season and Christmas holiday shopping.
Nate:But, like a few of the ideas that Amazon's put through, it actually ended
Nate:up being a huge barn burner of an event.
Lee:Yeah.
Lee:Prime Day came after I left Amazon.
Nate:Okay.
Lee:Well, definitely one of the things we always used to do
Lee:is we had test days where it's like, what happens if we
Lee:take this data center offline?
Lee:What happens when we cut this cable?
Lee:We did that sort of testing in production all the time.
Lee:The theory was everything should continue to work at scale,
Lee:with no issues whatsoever.
Lee:But we had to do it in production.
Lee:It's the only way to get that volume of traffic,
Lee:until we have someone like SpeedScale.
Lee:Why don't you tell us a little bit about exactly what SpeedScale is and what it does?
Nate:Yeah, so SpeedScale's a production traffic replication
Nate:service, and we help engineers simulate production
Nate:conditions using actual traffic.
Nate:There's kind of been a long history of these sorts of tools.
Nate:I think you were referring to Chaos Monkey,
Nate:part of the Simian Army. I think it had come from the Netflix days, where they were
Nate:randomly executing these daemons to take down services and then seeing what fails.
Nate:And then of course, Gremlin's got a productized version,
Nate:specifically focusing on chaos, right?
Nate:Running these game days
Nate:and experiments to take down aspects of the servers.
Nate:And I think they're tiptoeing around, how do I
Nate:run these experiments but also not affect production?
Nate:But SpeedScale's approach is slightly different: we actually
Nate:capture the traffic and then allow you to run that traffic in
Nate:a safe manner in lower environments.
Nate:Another way to think about this is shifting left
Nate:what you're gonna encounter in production, but doing it in a safe way
Nate:in these lower environments.
Lee:So you record production traffic and then replay it in
Lee:a staging or a test environment.
Nate:That's right.
Nate:A lot of this is possible now because of the advent of cloud environments, right?
Nate:You can spin up these ephemeral environments. That was always a promise
Nate:of cloud: you know, just use what you need, and spin up
Nate:these environments at a moment's notice.
Nate:And I think the reality of it is, well, these environments are expensive.
Nate:The costs can actually skyrocket, and the environments don't actually stay
Nate:ephemeral; we end up keeping them on for long periods of time.
Nate:And people, especially given the
Nate:current economic state, are looking for ways to reduce their costs.
Lee:Your customers really are building modern applications, or have modern
Lee:application development.
Lee:I'm talking about things like
Lee:cloud-native applications; they're undoubtedly cloud-based applications
Lee:where they can do these replicated environments a lot more easily.
Lee:So with that sort of mindset, what challenges do you find exist for your
Lee:customers in managing those applications?
Lee:What are some of the problems they come to you with?
Nate:Yeah.
Nate:You know, I think that's kind of the key qualifier:
Nate:what do our customers come to us with?
Nate:There are a variety of challenges in developing
Nate:in the modern cloud. You know, security is always of paramount concern,
Nate:and making sure that scale is proper.
Nate:But our customers typically are coming to us with the specific challenge
Nate:of environments, and that's something that's been kind
Nate:of a common thread that we've noticed.
Nate:Environments themselves aren't the issue.
Nate:When I say environments, more specifically I mean the
Nate:data and the downstream constraints of those environments.
Nate:They can always spin up a carbon-copy replica of production,
Nate:a full end-to-end environment at a lower scale, right?
Nate:But even if you do,
Nate:the problem is that, (a) it's expensive, because there are so many moving parts
Nate:and databases and stuff like that, and
Nate:(b) if it's not seeded with the proper data they need in order to exercise their
Nate:applications, it's really quite useless.
Nate:And that's exactly where the challenge is.
Nate:So, how are my clients hitting my app that I'm trying to test, and how
Nate:does my app send these downstream calls
Nate:to the third-party backends or the systems of
Nate:record or the other internal APIs?
Nate:And what do those systems need to be seeded with, data-wise,
Nate:in order to respond accurately?
Nate:So capturing state and managing idempotence becomes a huge headache, actually.
Nate:And that's one of the reasons why we developed SpeedScale:
Nate:we want engineers to be able to come into a self-service portal and
Nate:understand, okay, what does my app do?
Nate:Like, how does it behave currently?
Nate:And then how do I recreate this situation
Nate:in a cloud-native environment without a lot of hassle?
Nate:The current state of the art is usually
Nate:using a conventional tool, something that can
Nate:actually drive the transactions.
Nate:And on a very simple level, it could be something like Postman or Insomnia.
Nate:At a more sophisticated level, maybe you're replaying
Nate:large reams of traffic using something like k6.
Nate:But again, what we hear typically going on is you're doing those
Nate:sorts of transactions and exercising your application in a full staging
Nate:environment where everybody else is using it at the same time.
Nate:Right?
Nate:And so you don't know if somebody's pushed their alpha version of an
Nate:application in and you're getting these failures
Nate:because somebody is doing some tests at the same time you are, or if
Nate:you truly do have a bug and you should be paying attention to it and fixing it.
Nate:So, specifically: backend environments, the right source
Nate:of data, and then also simulating the inbound calls into your application.
Nate:Those are the challenges we typically see in modern cloud development.
Nate:And it's really about having the right traffic.
Nate:If you're focusing on just one area, or one type of transaction, like,
Nate:you know, gold medallion members, when you're really trying to test platinum
Nate:medallion members, right, you could be missing a lot of code coverage.
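To make the coverage point above concrete, here is a minimal, hypothetical sketch of filtering a traffic capture down to one transaction type before replaying it. The field names and membership tiers are invented for illustration; they are not SpeedScale's actual schema.

```python
# Hypothetical sketch: select which recorded transaction types to replay so a
# test run covers the segment you actually care about. Fields are invented.
capture = [
    {"member_tier": "gold", "path": "/upgrade"},
    {"member_tier": "platinum", "path": "/upgrade"},
    {"member_tier": "platinum", "path": "/lounge"},
]

def select_traffic(calls, **filters):
    """Keep only the recorded calls whose fields match every filter."""
    return [c for c in calls
            if all(c.get(k) == v for k, v in filters.items())]

platinum_only = select_traffic(capture, member_tier="platinum")
print(len(platinum_only))  # 2
```

Replaying only the gold-tier capture here would exercise one of three recorded calls, which is the missed-coverage risk being described.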
Lee:So I imagine the typical QA development environment is kind
Lee:of what you were describing, with that kind of chaos going on because
Lee:everyone's doing everything in it.
Lee:But, you know, in like a full CI/CD pipeline, where you might
Lee:have a validation-at-scale test as part of the pipeline,
Lee:I imagine in that case you
Lee:could afford to spin up, for a short period of time, a full-fledged
Lee:production environment, and use something like SpeedScale to execute,
Lee:to test the environment at scale, to make sure everything works as anticipated.
Lee:But I imagine the problem with that sort of scenario, though, is as
Lee:you're making deployments and making changes, exactly what the script
Lee:is from SpeedScale, the scripted
Lee:traffic that you're getting in, will change as time goes on.
Lee:How do you keep that up to date?
Lee:Do you constantly take new scripted traffic and replay that?
Lee:Is that how you do it?
Nate:Yeah, yeah.
Nate:So it's really kind of shifting the paradigm.
Nate:The way SpeedScale was developed: we've all got backgrounds in companies
Nate:like New Relic and Observe Inc. and ITKO, which really kind of founded the
Nate:concept of service virtualization, which is a fancy way to say service mocking.
Nate:But with that background, we inherently understood that it's really slow
Nate:to develop these scripts.
Nate:So we don't actually take a script-based approach in running this traffic.
Nate:What we actually do is run traffic snapshots.
Nate:Since we are capturing all this traffic, we develop
Nate:a snapshot and generate two things.
Nate:One is the inbound traffic.
Nate:We generate, like, a script, if you will.
Nate:It's really just a JSON snapshot file, is what we call it.
Nate:And there's no actual scripting involved.
Nate:It's auto-generated from real traffic.
Nate:A key point in this, for the listeners, is that we are redacting PII as we
Nate:capture the traffic, because you don't wanna be spewing sensitive
Nate:information when you're replaying it.
Nate:So data loss prevention is actually a very big piece of this.
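As an illustration of the kind of scrubbing being described, here is a toy sketch of redacting sensitive fields from captured traffic before it lands in a snapshot. The key list and the email pattern are illustrative assumptions, not SpeedScale's actual data-loss-prevention rules.

```python
import json
import re

# Illustrative-only redaction rules; real DLP policies would be configurable.
SENSITIVE_KEYS = {"ssn", "card_number", "password", "email"}
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact(value):
    """Recursively mask sensitive keys and email-shaped strings."""
    if isinstance(value, dict):
        return {k: "***REDACTED***" if k.lower() in SENSITIVE_KEYS
                else redact(v) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v) for v in value]
    if isinstance(value, str):
        return EMAIL_RE.sub("***REDACTED***", value)
    return value

captured = {"user": {"email": "a@b.com", "card_number": "4111"},
            "note": "contact a@b.com"}
print(json.dumps(redact(captured)))
```

The point is that masking happens before the snapshot is written, so replays never carry production PII.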
Nate:But anyways, so the snapshots are auto-generated. As well, from the
Nate:traffic we can kind of reverse engineer what backends you need in order to
Nate:exercise a particular service.
Nate:So not only do we generate a traffic snapshot,
Nate:and you can replay the inbound traffic, but we also generate a mock
Nate:server in a pod, if you will, and that mock server in a pod can be spun up.
Nate:And what this really does is vastly narrow the scope of the
Nate:environment that you need to spin up.
Nate:So we're actually just spinning up N plus one and N minus one.
Nate:We're spinning up your API and only its neighbors, instead of the
Nate:full-blown end-to-end environment.
Nate:And so it's like a little microcosm of your API, but your API, for all
Nate:intents and purposes, thinks it's in a fully integrated end-to-end environment.
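A rough way to picture the "mock server in a pod" idea: build responders out of recorded request/response pairs, so the service under test sees its neighbors answer exactly as they did during capture. The snapshot format below is invented for illustration and is not SpeedScale's actual JSON schema.

```python
import json

# A tiny, invented traffic snapshot: recorded calls and their responses.
SNAPSHOT = json.loads("""
[
  {"request": {"method": "GET", "path": "/inventory/42"},
   "response": {"status": 200, "body": {"sku": 42, "in_stock": true}}},
  {"request": {"method": "POST", "path": "/charge"},
   "response": {"status": 201, "body": {"charged": true}}}
]
""")

# Index recorded responses by (method, path) for lookup at replay time.
MOCKS = {(e["request"]["method"], e["request"]["path"]): e["response"]
         for e in SNAPSHOT}

def mock_backend(method, path):
    """Answer the way production's neighbors did during capture."""
    resp = MOCKS.get((method, path))
    if resp is None:
        return 404, {"error": "no recorded traffic for this call"}
    return resp["status"], resp["body"]

print(mock_backend("GET", "/inventory/42"))  # (200, {'sku': 42, 'in_stock': True})
```

Because the mocks are derived from the same capture as the inbound traffic, only the service under test needs to be deployed; its neighborhood is simulated.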
Lee:So you're essentially working service by service
Lee:versus at the application level.
Lee:You're not
Lee:scripting user traffic into the system; you're scripting traffic into a particular
Lee:service, in and out of the service, with the data that goes with it.
Lee:So you only have to bring up the service and what's
Lee:around it, and you don't have to bring up the entire application.
Nate:Well, you really only need to bring up just the app,
Nate:and SpeedScale's taking care of the rest, really.
Nate:Yeah.
Nate:So we're generating all the inbound traffic for you.
Nate:There's no scripting involved.
Nate:Basically, we have what's called the traffic viewer, and you use that to browse
Nate:the type of traffic you want to invoke.
Nate:And once you select the traffic that you wanna invoke, we basically take a
Nate:look at all the traffic around it and say, okay, well, when you run this call
Nate:inbound, as a result of that, your application calls, you know, a DynamoDB
Nate:database and then these other two APIs, and then you make a call to a third
Nate:party, I don't know, let's say Stripe or Google Maps or something like that.
Nate:And so we will automatically generate a mock server based off of reverse
Nate:engineering how your app works, and make sure everything is there that you need.
Nate:So yeah, you got it.
Nate:The concept is we're virtualizing your neighbors, so that you can do consistent,
Nate:scientific dry runs of your API as part of CI. And it's
Nate:also a huge reduction in cloud costs, because you're not spinning up a big end-to-end
Nate:environment of literally everything that is included in your app every time.
Nate:And, to be honest, that's also sometimes not possible,
Nate:because of the connections that you do have to third parties. Almost
Nate:everybody integrates with, like, a payment or maybe a background
Nate:check organization, or a mapping or a messaging solution that's a third party.
Nate:And so many times, these wires that hang out of the cloud, as I
Nate:call them, those are difficult
Nate:to simulate. You have to call the vendor and ask for a sandbox.
Nate:And if you wanna do a load test, forget it.
Nate:You know, that's not gonna go to a hundred TPS, right?
Nate:They're just standing up the sandbox to give you, you know,
Nate:onesie-twosie transactions, whereas...
Lee:There's no performance testing or anything like that.
Nate:exactly.
Nate:Exactly.
Nate:But if we're simulating that, using traffic to auto-generate a
Nate:mock server in a pod for you, the possibilities go up.
Lee:Cool, cool.
Lee:Now, I can see how this works for APIs.
Lee:And you can include database activities, such as to DynamoDB,
Lee:as an API call, essentially, is what it is.
Lee:But what about native databases, or native data that's stored in the service?
Lee:You know, like a MySQL database might be part of the service, or a
Lee:cache, a Redis cache or something like that.
Lee:Do you simulate those as well, or what do you do in those cases?
Nate:Yeah, so for those we can actually see the traffic going
Nate:through, but we can't simulate it for all data sources.
Nate:We do have ongoing support in development for things like Postgres and MongoDB.
Nate:We've got the full list of supported technologies on our documentation
Nate:page, which is docs.speedscale.com.
Nate:But really, the beauty is in being able to provision these
Nate:backends if they're API-based, right?
Nate:Usually it's all fairly standard.
Nate:If you communicate with a system of record via API, we can also handle that,
Nate:something like Elasticsearch, for example.
Nate:But if it is a local data source, or something like MySQL, sorry,
Nate:MS SQL, that's got a proprietary, non-open standard, you would
Nate:probably wanna provision that locally
Nate:by yourself as part of that kind of simulated microcosm.
Nate:So with most of these cloud-native environments, you can specify
Nate:either, like, you know, the environment script, or the YAMLs
Nate:to properly stand those things up,
Nate:in addition to the SpeedScale simulations.
Lee:Makes sense.
Lee:Let's talk about resiliency a little bit.
Lee:You know, resiliency is an interesting aspect when it comes
Lee:to cloud-based applications, because built into the DNA of the cloud
Lee:is that the cloud is designed to break, right?
Lee:I mean, the whole fundamental aspect of the cloud is, if a
Lee:server isn't working right, just terminate and restart it.
Lee:And that mindset extends throughout the entire cloud ecosystem,
Lee:where everything is designed to fail,
Lee:you know, with retry, with redundancy built in, so that you
Lee:can lose components; components can go away, come back, and your entire
Lee:system as a whole continues to work.
Lee:What does SpeedScale do to help with that sort of resiliency testing?
Lee:Are there ways you can simulate those sorts of environments?
Nate:Yeah, to an extent.
Nate:Well, first of all, before I jump into that, I think a lot
Nate:of people have kind of a false level of comfort with the resiliency that's
Nate:inherently built into the cloud.
Nate:I think what people realize is, oh look, the, you know, the
Nate:startup times of the Lambda serverless instances are actually quite long.
Nate:And how do we get past that, right?
Nate:Or, hey, horizontal pod autoscaling rules actually take quite a
Nate:while to understand that a pod is down and then spin up another pod.
Nate:Like, it waits and it retries a couple times, and meanwhile, you
Nate:know, you're bleeding thousands of dollars because your
Nate:mobile ordering app is down.
Nate:So I think it's a little bit of a false sense of comfort, or protection.
Nate:And that's what we can really help simulate.
Nate:And what we do with that is, again, capturing traffic in order to
Nate:understand how users run your application.
Nate:But once we do have that traffic, engineers can multiply it,
Nate:and that empowers the engineers to run these what-if scenarios.
Nate:Like, what if I had a hundred-x traffic, or what if I had, you know, a thousand-x
Nate:traffic for 30 seconds, and run more of a soak or sanity test?
Nate:These are all things that are available with a few mouse clicks
Nate:once we have that baseline of traffic.
Nate:The traffic captures kind of how your application
Nate:is exercised, and as well,
Nate:we've got the necessary backends ready to be spun up in a mock server.
Nate:So it's kind of like a turnkey simulation that you can run.
Nate:And so when people do have DR rules or HPA rules, they can actually
Nate:verify that things are going to fail over as expected, or scale as expected.
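The multiply-the-traffic idea can be pictured as replaying N staggered copies of a capture over the same time window. This is a toy model assuming a simple timestamped call list; the function name and fields are invented, not SpeedScale's replay engine.

```python
# Toy "what if 100x traffic" amplification of a recorded capture.
recorded = [{"path": "/checkout", "at_ms": 0},
            {"path": "/cart", "at_ms": 250}]

def amplify(traffic, factor, window_ms=1000):
    """Interleave `factor` copies of the capture across the same window."""
    copies = []
    for i in range(factor):
        offset = (window_ms // factor) * i   # stagger each copy slightly
        copies.extend({**call, "at_ms": call["at_ms"] + offset}
                      for call in traffic)
    return sorted(copies, key=lambda c: c["at_ms"])

plan = amplify(recorded, factor=100)
print(len(plan))  # 200 calls instead of 2
```

Replaying the amplified plan against the service (with its neighbors mocked) is what lets HPA or failover rules be verified before a real traffic spike arrives.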
Nate:Another aspect within resiliency that
Nate:simulation can help catch is your resource consumption.
Nate:So if you're making logic changes to your services, or
Nate:you make this calculation change, and for some reason, let's say, it causes
Nate:CPU to skyrocket, or you've got a memory leak in your code and
Nate:it begins to rise over time.
Nate:The state of the art in catching issues like that really is to
Nate:just go ahead and release, and then pay really close attention to Datadog
Nate:or New Relic or AppDynamics, right?
Nate:And rely on those observability tools to give you an early warning.
Nate:And then it's kind of all-hands-on-deck reacting, or trying to shut
Nate:down that pod over and over again whenever it starts creeping up.
Nate:Those sorts of changes can actually be proactively caught by, you know,
Nate:running these traffic simulations.
Nate:So by simulating the inbound traffic and the mock server
Nate:pods, those are your controls,
Nate:and really the only thing that changes is your application as you make changes.
Nate:And that's another kind of reason not to use these crowded, chaotic
Nate:staging environments: there's so much noise in the system,
Nate:and other people are doing things, and
Nate:staging can break quite frequently. And I know you've written actually about this.
Lee:Yep.
Nate:And so that's another kind of argument for using these
Nate:production simulations in a very kind of sterilized lab environment, if you will,
Nate:where, you know, the only thing that's changing is your code.
Nate:So it's a way to consistently iterate and experiment, and make changes.
Nate:And so that's another way you can improve your resiliency.
Nate:You can make sure that you're optimizing all the resources at hand and
Nate:you're not, you know, irresponsibly allocating memory and then just hoping
Nate:that horizontal autoscaling rules or the cloud's scalability will cover for you.
Nate:Right?
Nate:You might not be economical with your code.
Lee:Right, right.
Lee:That makes sense.
Lee:You can also do controlled failures too, right?
Lee:You can do game day testing, if you will, during these simulation runs,
Lee:so you can see, you know, your normal traffic works fine,
Lee:but what happens if three servers go down while that's going on?
Lee:The DR rules you're talking about certainly cover that, but
Lee:this is kind of a way of injecting what-if scenarios, and getting
Lee:useful information that you can feed back to the development org:
Lee:hey, it didn't quite work the way we expected in this scenario.
Lee:What if we changed the rules a little bit and adjusted for a
Lee:higher likelihood of success?
Nate:That's right.
Nate:Yeah.
Nate:We can, you know, generate the inbound traffic
Nate:into, you know, just an API.
Nate:But you can also use that in isolation.
Nate:You can use our traffic generation capabilities to hit you at the front
Nate:door, like an ingress or an API gateway, and test your entire application.
Nate:So you can actually piecemeal out the solutions; like, you
Nate:know, we've got the traffic generation piece and the mock server piece.
Nate:Some people spin up our mocking pod and just leave it on full-time,
Nate:because they need to simulate the third-party components.
Nate:That's the cool part about having the traffic patterns as a snapshot: once we
Nate:do have the traffic, we can play with the traffic. We can start to slow things down.
Nate:So we can say, hey, we're mocking Stripe.
Nate:What if Stripe goes down?
Nate:Then we can just tell that traffic replay to be a black hole and not respond.
Nate:We can also tell it to respond with 20-second latency.
Nate:And then you can start checking: does
Nate:my application time out gracefully?
Nate:Does it wait the whole time?
Nate:We can also speed up the traffic.
Nate:I've actually heard of cases of applications failing because the
Nate:backends get improved and they start responding faster, and then
Nate:your application becomes the bottleneck and starts crashing.
Lee:So it's even a development tool, right?
Lee:This is, you know, when you're trying to build your application
Lee:and build the resiliency in, or you're trying to build what-if scenarios
Lee:in, you can take the scripted traffic in your development environment
Lee:and fool around with it and do different things there.
Lee:I'm assuming these are all rational use cases for SpeedScale, correct?
Nate:Yeah, exactly.
Nate:They're out of the box, kind of.
Nate:And again, just to reemphasize: while under the hood we
Nate:are developing, you know, JSON and scripts and stuff, there's no scripting involved.
Nate:It's literally just a UX dashboard where you peruse all the API-level calls that
Nate:we've been picking up and desensitizing.
Nate:And you can see basically the ins and outs of all the traffic of a
Nate:particular API you're trying to test.
Nate:You tell us, hey, I wanna generate a snapshot, and I want this snapshot
Nate:to have this set of inbound traffic that you're gonna rerun, and also this
Nate:set of mocked traffic that I wanna run, and you get this kind of turnkey
Nate:ephemeral environment, a lab environment, that you can run over and over again.
Nate:If production happens to update, then you can just go out and grab
Nate:another set of traffic, right?
Nate:The paradigm's completely changed now.
Nate:There's no scripting involved.
Nate:There's no maintenance of the script, no updating the script like a normal
Nate:testing organization would have to do.
Nate:It's literally: go
Nate:out and grab a new snapshot,
Nate:wait two minutes for it to be auto-generated, and
Nate:then run that new snapshot.
Nate:Or it can be automated via GitHub or API call.
Nate:You can say, hey, grab the last 15 minutes of traffic, run it again.
Nate:And it can all be done as part of the CI pipeline as well.
Lee:Yeah, so one of your use cases is, like you said, CI/CD pipelines;
Lee:another use case is development.
Lee:Another use case, I'm assuming, is QA departments who just want
Lee:to see what-happens-if scenarios, and they just poke around and
Lee:make changes dynamically, just to try to see what's going on, whether
Lee:that's a QA department, as I said, or the development
Lee:organization going through a QA process.
Lee:It doesn't matter; it's a step to validate.
Lee:So those are, like, three distinct use cases: an automated
Lee:pipeline, QA doing random testing, and development using it to harden
Lee:the application, or even as part of the development process itself. Are there
Lee:use cases that are not represented by those three that
Lee:this is useful for?
Nate:Yeah.
Nate:Yeah.
Nate:There are, within those three use cases. I mean, I guess you could
Nate:break it up into specific phases of testing.
Nate:The traffic replays can really be curated in a
Nate:way where you're checking for functionality or contract changes, right?
Nate:You can look at it more as, like, an integration test.
Nate:You can also multiply the traffic and look at it more as a load test.
Nate:So that's where the concept gets interesting: load testing at a
Nate:regular interval as part of CI.
Nate:I've heard people call it performance assurance.
Nate:I've heard people call it continuous performance testing.
Nate:And really, the linchpin to all of that is the mocks,
Nate:because when you're doing load testing, typically everybody has to be finished
Nate:with their application code, like their particular piece of it, right?
Nate:And then they have to curate a performance environment that's,
Nate:you know, one-tenth the size of staging, so they can extrapolate
Nate:the results and multiply them by ten.
Nate:Now, if we're mocking the backends, and they're performant and
Nate:can do a thousand TPS, then really that constraint goes away.
Nate:And now you can understand, well, this one piece, this payment API or
Nate:this fulfillment API I'm working on, needs to go up to 800 transactions per second.
Nate:You can do that without having to wait for the full end-to-end environment,
Nate:without having to tell the DBA, hey, I'm gonna be hammering the database,
Nate:you know, please don't get mad at me, kind of thing.
Nate:And so that can all be done in a self-service way.
Nate:Now, you've written about all these different microservice teams
Nate:that are disparate and siloed, but they all have to
Nate:be communicating tightly, right?
Nate:And you've written about the ability for them to have some sort
Nate:of self-service way to understand how they interconnect with everyone,
Nate:and also understand the integrations, and then spin up these environments.
Nate:And SpeedScale literally does that: it allows somebody to jump into
Nate:this API or that API, view the traffic, and we'll show them a service map.
Nate:Then they say, well, when I run this, this is how I exercise my application,
Nate:and they can actually just grab the traffic that's relevant to them.
Nate:And so in that way, beyond just the CI and the
Nate:development enablement, and then the QA kind of what-if testing that they
Nate:can do, they can also take that traffic and point it at different endpoints, right?
Nate:So they can actually do performance benchmarking.
Nate:One of the stories that we've had from a customer is, you know,
Nate:AWS came out with a new Graviton processor,
Nate:and they were like, well, is that really gonna be any faster
Nate:than what we're currently on?
Nate:And so they were able to benchmark: well, this is business-as-usual traffic;
Nate:let's test it on the Graviton processors.
Nate:And they did find out that there was, like, an X percent faster throughput.
Nate:So they ended up, yeah.
Nate:So you can use it to benchmark in a conventional load-testing sense.
Nate:There's also the use case that I call parity testing,
Nate:to check for parity when you're doing migrations, like from EC2
Nate:to Kubernetes. If your application fundamentally is gonna remain the
Nate:same, but you're just re-platforming, you could capture business-as-usual
Nate:traffic coming into your EC2 app.
Nate:And then once you're done re-platforming, like moving to Kubernetes,
Nate:you can do a sanity check.
Nate:And before you
Nate:fork all the traffic over and kind of do the grand opening,
Nate:you can run the old traffic that you would normally get on EC2, run it
Nate:against the Kubernetes platform, and say, hey, am I getting the same response times?
Nate:Are things scaling properly?
Nate:Did the functionality get preserved as we moved over?
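The parity-testing idea reduces to replaying the same recorded calls against both platforms and diffing the answers before cutting over. Everything below (the call list, the two stub backends, the report shape) is a hypothetical sketch, not an actual migration harness.

```python
# Toy parity check: replay identical recorded calls against the old and new
# platforms, then report any differences in status or body.
def replay(calls, backend):
    return [backend(c) for c in calls]

def parity_report(calls, old_backend, new_backend):
    """List every call whose old-platform and new-platform answers differ."""
    mismatches = []
    for call, old, new in zip(calls,
                              replay(calls, old_backend),
                              replay(calls, new_backend)):
        if old != new:
            mismatches.append({"call": call, "old": old, "new": new})
    return mismatches

calls = ["/inventory/42", "/charge"]
ec2 = {"/inventory/42": (200, "ok"), "/charge": (201, "ok")}.get   # old platform stub
k8s = {"/inventory/42": (200, "ok"), "/charge": (500, "err")}.get  # new platform stub
print(parity_report(calls, ec2, k8s))  # one mismatch, on /charge
```

An empty report is the "grand opening" green light; a non-empty one pinpoints exactly which calls regressed during the re-platform.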
Nate:The last piece, actually, is when you're doing Docker environment
Nate:development, where you can run Docker locally on your laptop, or Docker
Nate:Compose, Minikube, MicroK8s, that kind of thing.
Nate:So all of these concepts, all these mock server pods and traffic
Nate:generator pods, can actually be spun up locally on your laptop.
Nate:So now there's an argument for, hey, I don't need the full-blown
Nate:end-to-end environment.
Nate:I can just simulate my neighbors, get SpeedScale to generate those pods,
Nate:and then run them locally on my laptop.
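The "simulate my neighbors" idea — train a mock from recorded traffic and run it locally in place of the real downstream service — can be reduced to a toy version: a lookup from recorded request/response pairs. This is only a sketch of the concept, with an invented class and record shape, not SpeedScale's implementation:

```python
# Toy "service mock trained from traffic": record request/response pairs
# once, then answer future requests from the recording instead of calling
# the real neighbor service.

class RecordedMock:
    def __init__(self):
        self._responses = {}

    def record(self, method, path, response):
        # Key on method + path; a real tool would also match headers/bodies.
        self._responses[(method, path)] = response

    def handle(self, method, path):
        # Unrecorded traffic gets a 404-style fallback rather than an error.
        return self._responses.get((method, path),
                                   {"status": 404, "body": "not recorded"})

# "Training" phase: pairs captured from real traffic.
mock = RecordedMock()
mock.record("GET", "/users/42", {"status": 200, "body": '{"name":"Ada"}'})

# "Replay" phase: the service under test talks to the mock instead of the
# real neighbor service.
print(mock.handle("GET", "/users/42")["status"])  # prints 200
print(mock.handle("GET", "/users/99")["status"])  # prints 404
```

In practice the mock would run as a pod (or local process) behind the same hostname the service under test already calls, which is what removes the need for a full end-to-end environment.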
Lee:One of the biggest complaints I hear about microservice architectures is
Lee:that the laptop development environment is so difficult to set up and
Lee:manage, and this is a tool that'll help make that a lot easier.
Nate:Yep, yep.
Lee:Can that be done offline as well, or is it still only an online tool?
Nate:So we're about to launch a command line version of this that doesn't
Nate:require an internet connection.
Nate:So it'll be free, and you can generate those pods and then run them
Nate:locally in something like a Minikube environment.
Lee:So you talked a little bit about what motivated you to start SpeedScale,
Lee:but why you? Why did you start SpeedScale?
Nate:We were writing device drivers.
Nate:And Matt had actually developed, what would we call it, like a visual
Nate:driver development kit that allowed us to develop these drivers more
Nate:quickly.
Nate:And then Ken developed a simulator, kind of like a stub code harness,
Nate:that you could drop the driver into, and it would test the inputs and
Nate:outputs to it.
Nate:So all three of us have kind of been in this mindset of, you know,
Nate:better testing, faster development, and those two got into the
Nate:observability space first with Wily, and then New Relic and Observe Inc.
Nate:Meanwhile, I kind of took a different path.
Nate:I had actually been with ITKO. Actually, Ken worked at ITKO; he's the
Nate:one who pulled me in. But ITKO had developed this concept of service
Nate:virtualization, and that was back in the SOA days.
Nate:There was just a huge mix of legacy queuing technologies, like MQ and
Nate:TIBCO and AMQP.
Nate:And then, you know, SOAP services were just becoming a thing.
Nate:So developing these service mocks was a hugely complex affair.
Nate:You had to, you know, redirect app servers and bounce them and do a lot
Nate:of networking to get these mocks up and running.
Nate:And we had always been enamored with the concept, but really
Nate:dissatisfied with the process of developing these service mocks,
Nate:because done properly, they're a huge enabler.
Nate:They're a huge value add, because they can accelerate the dev process.
Nate:You can develop in parallel, you can simulate all these conditions, and
Nate:so on and so forth.
Nate:But service mocks kind of got a bad reputation, because you usually have
Nate:to hand-script the responses one by one.
Nate:If you want a backend to simulate whatever, you have to seed it with the
Nate:right data.
Nate:It has to be programmed, onesie-twosie, to respond.
Nate:So this is a long-winded way of saying, when the cloud came about, with
Nate:Kubernetes and cloud data warehouse storage, we realized, oh, we can do
Nate:this very quickly.
Nate:There are proxies, there are already network taps that we can take
Nate:advantage of, and then we can use the traffic to train the models.
Nate:Once the mock pods and the inbound traffic can be simulated, the rest of
Nate:it is just an orchestration problem.
Nate:And, you know, with Terraform scripts, Helm charts, and YAML, all that
Nate:stuff is pretty well known as well.
Nate:So it was a matter of desire and background, and then the cloud data
Nate:technology has actually just been a huge enabler.
Lee:Yeah, and I've known Ken for many years, but I'm so glad I met you guys,
Lee:and I'm really excited to see what you guys are going to accomplish as
Lee:you go along.
Lee:So the natural question that always comes up at this time of year is,
Lee:what's next year like? So what are your plans for next year? What are
Lee:you gonna do in 2023? What does SpeedScale look like in 2023?
Nate:We've been working with some great partners in 2022, really refining the
Nate:ergonomics of the product, and it's been a huge kind of developer
Nate:productivity accelerator.
Nate:In 2023, we're gonna release a free version of SpeedScale.
Nate:We know kind of what aspects people love; we just wanted to be careful
Nate:about understanding where the real, exceptionally useful features are,
Nate:and then which of those features could be command line driven, and which
Nate:actually need a full-blown UI.
Nate:So the freemium tool is gonna be mostly command line based.
Nate:But once you start needing enterprise-level things like single sign-on,
Nate:more sophisticated redaction, and visual reports, that's when you would
Nate:kind of have a paid tier.
Nate:So we expect the free tier to be a great value add for engineers that
Nate:need mocking and traffic generation.
Nate:And then there's also gonna be a lot of momentum around publicizing
Nate:SpeedScale from a marketing perspective.
Nate:We hope to really listen to the engineering community and understand
Nate:where we can provide the most lift, and iterate quickly to develop
Nate:those features.
Nate:But already we're getting stories of, you know, taking two-week load
Nate:testing sprints down to three hours and improving API performance by
Nate:30x, and we just wanna continue that.
Lee:So if any listener is interested in learning more about SpeedScale,
Lee:where should they go?
Nate:Yeah, they can just go to SpeedScale.com, spelled exactly like it
Nate:sounds, one word.
Nate:We also have a community on Slack at SpeedScale.com, where they can
Nate:talk directly to the founders or the engineers and ask questions.
Nate:And then if they go to SpeedScale.com/free-trial, they're able to
Nate:download the product and try it locally.
Lee:And I'll make sure those links are in the show notes as well, so people
Lee:can see 'em there.
Lee:Great. So, anything else you wanna add before we wrap it up here? We
Lee:managed to make it all the way through the episode without losing the
Lee:internet again.
Lee:That's fantastic.
Nate:No, no, that was it.
Nate:Always a pleasure to talk and, you know, kind of commiserate over the
Nate:technical problems of the modern cloud with you, Lee. It's always great.
Lee:Definitely.
Lee:I always love talking with you, Nate.
Lee:Thank you.
Lee:My guest today has been Nate Lee, co-founder of SpeedScale.
Lee:Nate, thank you very much for joining me on Modern Digital Business.