Modern businesses rely on applications, and they rely on continued innovation in those applications to drive their business.

This drive for innovation creates a need for improved techniques for validating that an application will work as expected. But constant innovation means a constant chance for problems, and testing applications at scale is not an easy task. This is where SpeedScale comes into play. SpeedScale assists in stress-testing applications by recreating real-world traffic loads in a test environment.

Today on Modern Digital Business.

{{useful-links-research-links}}

{{about-lee}}

{{architecting-for-scale-ad}}

{{signup-dont-miss-out}}

Transcript
Lee:

Modern businesses rely on applications, and they rely on continued innovation in those applications to drive their business. But as these applications evolve and become more complicated, testing them also becomes more challenging. Testing applications at scale is not an easy task. Today, we're going to look at a company focused on easing the burden of testing applications at scale. Are you ready? Let's go.

Lee:

Modern businesses rely on applications, and they rely on continued innovation in those applications to drive their business. This drive for innovation creates a need for improved techniques for validating that an application will work as expected. But constant innovation means a constant chance for problems, and testing applications at scale is not an easy task. This is where SpeedScale comes into play. SpeedScale assists in stress-testing applications by recreating real-world traffic loads in a test environment. Nate Lee is co-founder of SpeedScale, and he is my guest today. Nate, welcome to Modern Digital Business.

Nate:

Thanks, Lee. Glad to be here.

Lee:

You know, I think we finally have this worked out. After a couple of delays and an internet outage, I think we're finally going to do this podcast, don't you? What do you think?

Nate:

No, it's always exciting and eventful leading up to something like this. But yeah, with the power outages, and we're recording this between Thanksgiving and Christmas, the holiday season, everybody's kind of hectic. A lot of our customers are retail, so they're going through code freezes and trying to, you know, hold their breath, tap their head, rub their belly, to make sure nothing goes down at a critical time.

Lee:

I remember those days at Amazon retail. This time of the year was always a very busy time, and yeah, a lot of holding your breath. You didn't do much change, but everyone was really busy. It was a very busy time.

Nate:

Yeah. Actually, I've got a funny story about Amazon and the holiday rush period. We were talking to, I think it was Heavybit, one of the venture capital firms that kind of specializes in Kubernetes dev tools. And one of the gentlemen was telling us that they were at Amazon working on SRE stuff, asking, how are we going to get ready for the holiday season? We have to run a gigantic load test. And it kind of speaks to the genesis of SpeedScale, right? It's very difficult to run these sorts of high-traffic situations without a perfect carbon-copy replica of production, because a lot of the load, and whether I can handle it or not, depends critically on having production-like hardware.

They said, well, what if we run a gigantic sale? We can basically just simulate what we're going to be encountering in production during the holiday season. And so they were like, yeah, that's a good idea. What are we going to call it? And they decided to call it Prime Day. So when you have Amazon Prime Day, which is a pretty big deal, right? That's really just a veiled dress rehearsal for Black Friday season and Christmas holiday shopping. But like a few of the ideas that Amazon's put through, it actually ended up being a huge barn burner of an event.

Lee:

Yeah. Prime Day came after I left Amazon.

Nate:

Okay.

Lee:

Well, definitely one of the things we always used to do is we always had test days, where it's like, what happens if we take this data center offline? And what happens when we cut this cable? We did that sort of testing in production all the time. The theory was everything should continue to work at scale, with no issues whatsoever. But we had to do it in production. You know, it's the only way to get that volume of traffic, until we have someone like SpeedScale. Why don't you tell us a little bit about exactly what SpeedScale is and what it does?

Nate:

Yeah, so SpeedScale's a production traffic replication service, and we help engineers simulate production conditions using actual traffic. There's kind of been a long history of these sorts of tools. I think you were referring to Chaos Monkey and the Simian Army. I think it had come from the Netflix days, where they were randomly executing these daemons to take down services and then seeing what fails. And then of course Gremlin's got a productized version, specifically focusing on chaos, right? Running these game days and experiments to take down aspects of the servers. And I think they're tiptoeing around: how do I run these experiments, but also not affect production?

But SpeedScale's approach is slightly different. We actually capture the traffic and then allow you to run that traffic in a safe manner in lower environments. Another way to think about this is shifting left what you're going to encounter in production, but doing it in a safe way in these lower environments.

Lee:

So you record production traffic and then replay it in a staging or a test environment.

Nate:

That's right.

Lee:

A lot of this is possible now because of the advent of cloud environments, right? You can spin up these ephemeral environments. That was always the promise of the cloud: just use what you need, and spin up these environments at a moment's notice. And I think the reality of it is, well, these environments are expensive. They can actually skyrocket in cost, and they don't actually stay up ephemerally; we end up keeping them on for long periods of time. And people, especially given the current economic state, are looking for ways to reduce their costs.

Your customers really are building modern applications, doing modern application development. I'm talking about things like cloud native applications. They're undoubtedly cloud-based applications, where they can do these replicated environments a lot easier. So in that sort of mindset, what challenges do you find exist for your customers in managing those applications? What are some of the problems they come to you with?

Nate:

Yeah. You know, I think that's kind of the key qualifier: what do our customers come to us with? There are a variety of challenges in developing in the modern cloud. Security is always of paramount concern, and, you know, making sure that scale is proper. But our customers typically are coming to us with the specific challenge of environments, and that's something that's been kind of a common thread that we've noticed.

Nate:

Environments themselves aren't the issue. When I say environments, more specifically I mean the data and the downstream constraints of those environments. So, they can always spin up just a carbon-copy replica of production, a full end-to-end environment at a lower scale, right? But even if you do, the problem is that, A, it's expensive, because there's so many moving parts and databases and stuff like that. And B, if it's not seeded with the proper data that they need in order to exercise their applications, it's really quite useless.

Nate:

And that's where the challenge is, exactly. So, how are my clients hitting my app that I'm trying to test, and how does my app send these downstream calls to the third-party backends, or the systems of record, or the other internal APIs? And what do those systems need to be seeded with, data-wise, in order to respond accurately? So capturing state, managing idempotence, it becomes a huge headache actually.
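The idempotence headache Nate mentions can be made concrete with a small sketch: when replaying captured traffic, requests that carry the same idempotency key should not re-trigger side effects. The record shape and field names here are illustrative assumptions, not SpeedScale's format:

```python
# Illustrative sketch (not SpeedScale's implementation): during replay,
# duplicate requests with the same idempotency key return the cached
# response instead of re-running the side effect.

def replay(requests, handler):
    """Replay captured requests, deduplicating on idempotency key."""
    seen = {}          # idempotency key -> cached response
    responses = []
    for req in requests:
        key = req.get("idempotency_key")
        if key is not None and key in seen:
            responses.append(seen[key])   # duplicate: no side effect re-run
            continue
        resp = handler(req)
        if key is not None:
            seen[key] = resp
        responses.append(resp)
    return responses

# Example: a handler with a visible side effect (a call counter).
calls = {"n": 0}

def charge(req):
    calls["n"] += 1
    return {"status": "charged", "amount": req["amount"]}

captured = [
    {"idempotency_key": "a1", "amount": 10},
    {"idempotency_key": "a1", "amount": 10},   # duplicate in the capture
    {"idempotency_key": "b2", "amount": 25},
]
out = replay(captured, charge)
```

The side effect fires only twice even though three requests were replayed, which is the property a replay engine has to preserve.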

Nate:

And that's one of the reasons why we developed SpeedScale: we want the engineers to be able to come into a self-service portal and understand, okay, what does my app do? Like, how does it behave currently? And then, how do I recreate this situation in a cloud native environment without a lot of hassle?

Nate:

The current state of the art is usually using a conventional tool, something that can actually drive the transactions. On a very simple level, it could be something like Postman or Insomnia. At a more sophisticated level, maybe you're replaying large reams of traffic using something like k6. But again, what we hear typically is going on is, you're doing those sorts of transactions and exercising your application in a full staging environment where everybody else is using it at the same time.

Nate:

Right? And so you don't know if somebody's pushed their alpha version of an application in and you're getting these errors because somebody is doing some tests at the same time you are, or if you truly do have a bug and you should be paying attention to it and fixing it. So, specifically: backend environments, the right source of data, and then also simulating the inbound calls into your application. Those are the challenges we typically see in modern cloud development.

Nate:

And it's really about having the right traffic. If you're focusing on just one area, or one type of transaction, like, you know, gold medallion members, when you're really trying to test platinum medallion members, right, you could be missing a lot of code coverage.

Lee:

So I imagine the typical QA development environment is kind of what you were describing, with a kind of chaos going on because everyone's doing everything in it. But, you know, in like a full CI/CD pipeline, where you might have a "let's do a validation-at-scale test" step as part of the pipeline, I imagine in that case you could afford to spin up, for a short period of time, a full-fledged production environment, and use something like SpeedScale to test the environment at scale, to make sure everything works as anticipated.

But I imagine the problem with that sort of scenario, though, is as you're making deployments and making changes, exactly what the scripted traffic from SpeedScale is will change as time goes on. How do you keep that up to date? Do you constantly take new scripted traffic and replay that? Is that how you do it?

Nate:

Yeah, so it's really kind of shifting the paradigm. The way SpeedScale was developed, we've all got backgrounds in companies like New Relic, and ITKO, which really kind of founded the concept of service virtualization, which is a fancy way to say service mocking. With that background, we inherently understood that it's really slow to develop these scripts. So we don't actually take a script-based approach in running this traffic. What we actually do is run traffic snapshots. Since we are capturing all this traffic, we develop a snapshot and generate things. One is the inbound traffic: we generate like a script, if you will. It's really just a JSON snapshot file, is what we call it. And there's no actual scripting involved; it's auto-generated from real traffic.

Nate:

A key point in this, for the listeners, is that we are redacting PII as we are capturing the traffic, because you don't want to be, you know, spewing sensitive information when you're replaying it. So data loss prevention is actually a very big piece of this.
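As a rough illustration of the kind of redaction Nate describes before a request lands in a snapshot file. The field names and patterns here are invented for the example; SpeedScale's actual DLP rules are certainly more sophisticated:

```python
import json
import re

# Illustrative PII scrub applied to a captured request before it is
# written into a traffic snapshot. Key names and regexes are assumptions.

SENSITIVE_KEYS = {"ssn", "card_number", "password"}
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact(value):
    """Recursively mask sensitive keys and email-shaped strings."""
    if isinstance(value, dict):
        return {k: "***REDACTED***" if k in SENSITIVE_KEYS else redact(v)
                for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v) for v in value]
    if isinstance(value, str):
        return EMAIL_RE.sub("***REDACTED***", value)
    return value

captured = {
    "path": "/checkout",
    "body": {"card_number": "4111111111111111",
             "note": "receipt to alice@example.com",
             "amount": 42},
}
snapshot_line = json.dumps(redact(captured))
```

The replayed traffic then carries the shape and volume of production without the sensitive payloads.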

Nate:

But anyway, so the snapshots are auto-generated, and from the traffic we can kind of reverse-engineer what backends you need in order to exercise a particular service. So not only do we generate a traffic snapshot you can replay of the inbound traffic, but we also generate a mock server in a pod, if you will, and that mock server in a pod can be spun up. And what this really does is vastly narrow the scope of the environment that you need to spin up. So we're actually just spinning up N plus one and N minus one: we're spinning up your API and only its neighbors, instead of the whole full-blown end-to-end environment. And so it's like a little microcosm of your API, but your API, for all intents and purposes, thinks it's in a fully integrated end-to-end environment.
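The "mock server in a pod" idea can be sketched in a few lines: a tiny HTTP server seeded with recorded request/response pairs stands in for a neighbor service, so the API under test thinks its downstream dependency is live. This is a hand-rolled toy, not SpeedScale's generated mock:

```python
import http.server
import threading
import urllib.request

# Toy mock backend seeded from "recorded traffic": each known path
# answers with the captured response body. Paths and payloads invented.

RECORDED = {
    "/v1/price?sku=123": b'{"sku": "123", "price": 999}',
}

class MockBackend(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        body = RECORDED.get(self.path)
        if body is None:
            self.send_error(404, "not in snapshot")
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep output quiet
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), MockBackend)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The "service under test" now calls the mock instead of the real neighbor.
with urllib.request.urlopen(f"http://127.0.0.1:{port}/v1/price?sku=123") as r:
    reply = r.read()

server.shutdown()
```

Point the service's downstream URL at the mock's address and it behaves as if the real neighbor were there, which is exactly the N-minus-one half of the microcosm.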

Lee:

But you're essentially doing this at a service-by-service level versus an application level. So you're not scripting user traffic into the system; you're scripting traffic into a particular service, in and out of the service, with the data that goes with it. So you only have to bring up the service and what's around it, and you don't have to bring up the entire application.

Nate:

Well, you really only need to bring up just the app, and SpeedScale's taking care of the rest, really. Yeah. So we're scripting all the inbound traffic for you; there's no scripting involved on your side. Basically, we have what's called the traffic viewer, and you use that to browse the type of traffic you want to invoke. And once you select the traffic that you want to invoke, we basically take a look at all the traffic around it and say, okay, well, when you run this call inbound, as a result of that, your application calls, you know, a Dynamo database and then these other two APIs, and then you make a call to a third party, let's say Stripe or Google Maps or something like that. And so we will automatically generate a mock server, based off of reverse-engineering how your app works, and make sure everything is there that you need.

So yeah, you got it. The concept is, we're virtualizing your neighbors so that you can do consistent, scientific dry runs of your API as part of CI. But it's also a huge reduction in cloud costs, because you're not spinning up a big end-to-end environment of literally everything that is included in your app every time. And, to be honest, that's also sometimes not possible, because of the connections that you do have to third parties. Almost everybody integrates with a payment provider, or maybe a background check organization, or a mapping or messaging solution that's a third party. And so many times, these wires that hang out of the cloud, as I call them, those are difficult to simulate. You have to call the vendor and ask for a sandbox. And if you want to do a load test, forget it. You know, that's not going to go to a hundred TPS, right? They're just standing up the sandbox to give you, you know, onesie-twosie transactions, whereas...

Lee:

You can't use it for performance testing or anything like that.

Nate:

Exactly. Exactly. But if we're simulating that, using traffic to auto-generate a mock server in a pod for you, the possibilities go up.

Lee:

Cool, cool. Now, I can see how this works for APIs. And you can include database activities, such as to DynamoDB, which essentially is an API call. But what about native databases, or native data that's stored in the service? You know, like a MySQL database might be part of the service, or a cache, a Redis cache or something like that. Do you simulate those as well, or what do you do in those cases?

Nate:

Yeah, so for those we can actually see the traffic going through, but we can't simulate it. For other data sources, we do have ongoing support developing for things like Postgres and MongoDB. We've got the full list of supported technologies on our documentation page, which is docs.speedscale.com. But really, the beauty with being able to provision these backends is if they're API-based, right? Usually it's all fairly standard. If you communicate with a system of record via API, we can also handle that, something like Elasticsearch, for example. But if it is a local data source, or something like MySQL, or, sorry, MS SQL, that's got a proprietary, non-open standard, you would probably want to provision that locally, by yourself, as part of that kind of simulated microcosm. So, with most of these cloud native environments, you can specify either, you know, the environment script or the YAMLs to properly stand those things up, in addition to the SpeedScale simulations.

Lee:

Makes sense. Let's talk about resiliency a little bit. You know, resiliency is an interesting aspect when it comes to cloud-based applications, because built into the DNA of the cloud is that the cloud is designed to break, right? I mean, the whole fundamental aspect of the cloud is, if a server isn't working right, just terminate and restart it. And that mindset extends throughout the entire cloud ecosystem, where everything is designed with retry, with redundancy built in, so that you can lose components, components can go away and come back, and your entire system as a whole continues to work. What does SpeedScale do to help with that sort of resiliency testing? Are there ways you can simulate those sorts of environments?

Nate:

Yeah, to an extent. I mean, first of all, before I jump into that, I think a lot of people have kind of a false level of comfort with the resiliency that's inherently built into the cloud. I think what people realize is, oh look, the startup times of the Lambda serverless instances are actually quite long, and how do we get past that, right? Or, hey, horizontal pod autoscaling rules actually take quite a little while to understand that a pod is down and then spin up another pod. Like, it waits and it retries a couple of times, and meanwhile, you know, you're bleeding thousands of dollars because your mobile ordering app is down. So I think it's a little bit of a false sense of comfort, or protection.

Nate:

And that's what we can really help simulate. And what we do with that is, again, capturing traffic in order to understand how users run your application. But once we do have that traffic, engineers can multiply it, and that empowers the engineers to run these what-if scenarios. Like, what if I had a hundred-x traffic? Or what if I had, you know, a thousand-x traffic for 30 seconds, and run more of a soak or sanity test? These are all things that are available with a few mouse clicks once we have that baseline of traffic.
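A what-if scenario like "a hundred-x traffic" boils down to a simple transform on the captured snapshot. A sketch, assuming records are (timestamp, request) pairs with timestamps in seconds from the start of capture:

```python
# Illustrative "traffic multiplier": replicate each recorded call N times
# and optionally compress the replay window. Record shape is assumed.

def amplify(snapshot, factor, speedup=1.0):
    """Return a replay plan with `factor` copies of each call, with
    inter-arrival times divided by `speedup`."""
    plan = []
    for ts, request in snapshot:
        for _ in range(factor):
            plan.append((ts / speedup, request))
    plan.sort(key=lambda item: item[0])   # keep replay in time order
    return plan

captured = [(0.0, "GET /cart"), (1.0, "POST /checkout")]

# "What if I had 100x traffic, replayed at double speed?"
plan = amplify(captured, factor=100, speedup=2.0)
```

A real replay engine would also jitter the copies and vary payloads, but the baseline-times-N idea is the core of the what-if run.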

Nate:

The traffic captures kind of how your application is exercised, and we've got the necessary backends ready to be spun up in a mock server. So it's kind of like a turnkey simulation that you can run. And so when people do have DR rules or HPA rules, they can actually verify that things are going to fail over as expected, or scale as expected.

Nate:

Another aspect within resiliency that simulation can help catch is your resource consumption. So say you're making logic changes to your services, or you make this calculation change, and for some reason it causes CPU to skyrocket, or you've got a memory leak in your code and it begins to rise over time. The state of the art in catching issues like that really is to just go ahead and release, and then pay really close attention to Datadog or New Relic or AppDynamics, right? And rely on those observability tools to give you an early warning. And then it's kind of all hands on deck, reacting, or trying to shut down that pod over and over again whenever it starts creeping up. Those sorts of changes can actually be proactively caught by running these traffic simulations.

Nate:

So by simulating the inbound traffic and the mock server pods, those are your controls, and really the only thing that changes is your application as you make changes. And that's another kind of reason not to use these crowded, chaotic staging environments: there's so much noise in the system, other people are doing things, and staging can break quite frequently. I know you've actually written about this.

Lee:

Yep.

Nate:

And so that's another kind of argument for using these production simulations in a very sterilized lab environment, if you will, where the only thing that's changing is your code. So it's a way to consistently iterate, experiment, and make changes. And that's another way you could improve your resiliency: you can make sure that you're optimizing all the resources at hand, and you're not, you know, irresponsibly allocating memory and then just hoping that horizontal autoscaling rules or cloud scalability will cover for you. Right? You might not be economical with your code.

Lee:

Right, right. That makes sense. You can also do controlled failures too, right? You can do game day testing, if you will, during these simulation runs, so you can see: your normal traffic works fine, but what happens if three servers go down while that's going on? The DR rules you're talking about certainly cover that, but this is kind of a way of injecting what-if scenarios, and getting useful information that you can feed back to the development org: hey, it didn't quite work the way we expected in this scenario; what if we changed the rules a little bit, adjusting so there's a higher likelihood of success?

Nate:

That's right. Yeah. We can generate the inbound traffic into, you know, just an API, but you can also use that in isolation. You can use our traffic generation capabilities to hit you at the front door, like an ingress or an API gateway, and test your entire application. So you can actually piecemeal out the solutions: we've got the traffic generation piece and the mock server piece. Some people spin up our mocking pod and just leave it on full-time, because they need to simulate the third-party components.

That's the cool part about having the traffic patterns as a snapshot: once we do have the traffic, we can play with the traffic. We can start to slow things down. So we can say, hey, we're mocking Stripe; what if Stripe goes down? Then we can just tell that traffic replay to be a black hole and not respond. We can also tell it to respond with 20-second latency. And then you can start checking: does my application time out gracefully? Does it wait the whole time? We can also speed up the traffic. I've actually heard of cases of applications failing because the backends get improved and they start responding faster, and then your application becomes the bottleneck and starts crashing.

Lee:

So even as a development tool, right? When you're trying to build your application and build the resiliency in, or you're trying to build what-if scenarios in, you can take the scripted traffic in your development environment and fool around with it and do different things there. I'm assuming these are all rational use cases for SpeedScale, correct?

Nate:

Yeah, exactly.

Nate:

They're out of the box, kind of. And again, just to reemphasize: while under the hood we are developing, you know, JSON and scripts and stuff, there's no scripting involved for you. It's literally just a UX dashboard where you peruse all the API-level calls that we've been picking up and desensitizing, and you can see basically the ins and outs of all the traffic of a particular API you're trying to test. You tell us, hey, I want to generate a snapshot, and I want this snapshot to have this set of inbound traffic that you're going to rerun, and also this set of mocked traffic that I want to run, and you get this kind of turnkey ephemeral lab environment that you can run over and over again.

If production happens to update, then you can just go out and grab another set of traffic, right? The paradigm's completely changed now. There's no scripting involved, no maintenance of the script and updating the script like a normal kind of testing organization would have to take a look at. It's literally: go out and grab a new snapshot, wait two minutes for it to be auto-generated, and then run that new snapshot. Or it can be automated via GitHub or an API call. You can say, hey, grab the last 15 minutes of traffic, run it again. And it can all be done as part of the CI pipeline as well.
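The "grab the last 15 minutes of traffic" step might look like this inside a CI job, assuming a simple list of timestamped records (the shape is an assumption for illustration, not SpeedScale's API):

```python
from datetime import datetime, timedelta, timezone

# Illustrative CI step: trim a captured snapshot down to a recent window
# before replaying it against the freshly built service.

def recent_traffic(snapshot, now, window=timedelta(minutes=15)):
    """Keep only records captured within `window` of `now`."""
    cutoff = now - window
    return [rec for rec in snapshot if rec["ts"] >= cutoff]

now = datetime(2022, 12, 1, 12, 0, tzinfo=timezone.utc)
snapshot = [
    {"ts": now - timedelta(minutes=40), "req": "GET /old-promo"},
    {"ts": now - timedelta(minutes=10), "req": "GET /cart"},
    {"ts": now - timedelta(minutes=1),  "req": "POST /checkout"},
]
to_replay = recent_traffic(snapshot, now)
```

Because the window is re-cut on every pipeline run, the replay tracks whatever production is doing now, which is what removes the script-maintenance burden.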

Lee:

Yeah, so one of your use cases is, like you said, CI/CD pipelines; another use case is development. Another use case, I'm assuming, is QA departments who just want to see what-happens-if scenarios, and they just poke around and make changes dynamically to try to see what's going on. Whether that's a QA department, as I said, or the development organization going through a QA process, it doesn't matter, but it's a step to validate. So those are like three distinct use cases: an automated pipeline, QA doing random testing, and development using it to harden the application, or even as part of the development process itself. Are there use cases that are not represented by those three that this is useful for?

Nate:

Yeah. Within those three use cases, I guess you could break it up into specific phases of testing. The traffic replays can really be curated in a way where you're checking for functionality or contract changes, right? You can look at it more as an integration test. You can also multiply the traffic and look at it more as a load test. So that's where the concept gets interesting: load testing at a regular interval as part of CI.

Nate:

I've heard people call it performance assurance; I've heard people call it continuous performance testing. And really, the linchpin to all of that is the mocks, because when you're doing load testing, typically everybody has to be finished with their application code, their particular piece of it, right? And then they have to curate a performance environment that's, you know, one-tenth the size of staging, so they can extrapolate the results and multiply them by ten. Now, if we're mocking the backends and they're performing and can do a thousand TPS, then really that constraint goes away. And now you can understand: well, this one piece, this payment API or this fulfillment API I'm working on, needs to go up to 800 transactions per second. You can do that without having to wait for the full end-to-end environment, and without having to tell the DBA, hey, I'm going to be hammering the database, you know, please don't get mad at me. And so that can all be done in a self-service way.

Nate:

Now, you've written about all these different microservice teams that are disparate and siloed, but they all have to be communicating tightly, right? And you've written about the ability for them to have some sort of self-service way to understand how they interconnect with everyone, and also understand the integrations, and then spin up these environments. And SpeedScale literally does that: it allows somebody to jump into this API or that API, view the traffic, and we'll show them the service map. Then they say, well, I run this, I exercise my application, and they can actually just grab the traffic that's relevant to them. And so in that way, beyond just the CI and the development enablement, and then the QA kind of what-if testing that they can do, they can also take that traffic and point it at different endpoints, right? So they can actually do performance benchmarking.

Nate:

One of the stories that we've had from a customer is like, you know,

Nate:

they were on Graviton, AWS came out with a new Graviton processor.

Nate:

And they were like, well, is that really gonna be any faster

Nate:

than what we're currently on?

Nate:

And so they were able to benchmark like, well, this is business as usual traffic.

Nate:

Let's test on the new Graviton processors.

Nate:

And they did find out that there was like a X percent faster throughput.

Nate:

Um, so they ended, yeah.

Nate:

So you can use it to, to benchmark in a conventional load testing sense.

Nate:

Um, there's also the use case that I call parity testing.

Nate:

To check for parity, and, uh, when you're doing migrations, like from EC2

Nate:

to Kubernetes, if your application fundamentally is gonna remain the

Nate:

same, but you're just re-platforming, you could capture business as usual

Nate:

traffic coming into your EC2 app.

Nate:

And then once you're done re-platforming, like moving to Kubernetes,

Nate:

you can do a sanity check.

Nate:

And before you

Nate:

fork all the traffic over and kind of do the grand opening.

Nate:

You can run the old traffic that you would normally send to EC2, run it

Nate:

against the Kubernetes platform and say, Hey, am I getting the same response times?

Nate:

Are things scaling properly?

Nate:

Does the functionality get preserved as we move over?
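The parity check Nate describes — replaying business-as-usual traffic against both the old and new platform and comparing results — might look roughly like this sketch. The `check_parity` function and its result fields are hypothetical names for illustration, not part of any SpeedScale API:

```python
# Rough sketch of "parity testing": replay the same recorded
# requests against the old (EC2) and new (Kubernetes) deployments,
# then check that responses match and latency hasn't regressed
# beyond a tolerance.

def check_parity(old_results, new_results, latency_tolerance=1.2):
    """Each result is a dict: {"path", "status", "latency_ms"}.
    Returns a list of human-readable parity failures."""
    failures = []
    for old, new in zip(old_results, new_results):
        if old["status"] != new["status"]:
            failures.append(
                f'{old["path"]}: status {old["status"]} -> {new["status"]}')
        elif new["latency_ms"] > old["latency_ms"] * latency_tolerance:
            failures.append(
                f'{old["path"]}: latency {old["latency_ms"]}ms -> {new["latency_ms"]}ms')
    return failures


# Invented sample results for the two platforms.
ec2 = [{"path": "/orders", "status": 200, "latency_ms": 40},
       {"path": "/users",  "status": 200, "latency_ms": 25}]
k8s = [{"path": "/orders", "status": 200, "latency_ms": 42},
       {"path": "/users",  "status": 500, "latency_ms": 24}]

print(check_parity(ec2, k8s))  # flags the /users status change
```

Running a check like this before "the grand opening" turns the cutover into a sanity-checked step rather than a leap of faith.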

Nate:

The last, just the last piece actually is, um, in particular, when you're doing

Nate:

like Docker environment development,

Nate:

you can run, like, Docker locally on your laptop, or Docker

Nate:

Compose, Minikube, MicroK8s, that kind of thing.

Nate:

so all of these concepts, all these mock server pods, traffic

Nate:

generator pods can actually be spun up locally on your laptop.

Nate:

So now there's an argument for, like, Hey, I don't need the full-blown

Nate:

end-to-end environment.

Nate:

I can just simulate my neighbors, get SpeedScale to generate those pods and

Nate:

then run them locally on my laptop.
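As a rough sketch of that local workflow — replaying captured traffic against a service on your own laptop and measuring achieved throughput — here is an illustrative Python loop. The `replay` helper is invented for the example; a real traffic generator would use HTTP and concurrency:

```python
# Sketch of a local traffic generator in the spirit of running
# generator/mock pods on a laptop: fire recorded requests at a
# target for a fixed duration and report transactions per second.

import time

def replay(requests, target, duration_s=1.0):
    """Replay `requests` in a loop against `target` for roughly
    `duration_s` seconds; return the achieved TPS."""
    count = 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration_s:
        for req in requests:
            target(req)  # in reality, an HTTP call to the service
            count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed


# Invented captured traffic and a stand-in for the local service.
recorded = [{"method": "GET", "path": "/fulfillment/7"}]
tps = replay(recorded, target=lambda req: {"status": 200}, duration_s=0.2)
print(f"{tps:.0f} TPS")
```

The point of the sketch is the shape of the loop: because the neighbors are simulated, the only thing under load is the one service you are developing.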

Lee:

One of the

Lee:

biggest complaints I hear about microservice architectures is that the

Lee:

laptop development environment is so difficult to set up

Lee:

and manage, and this is a tool that'll help make that a lot easier.

Nate:

Yep.

Nate:

Yep.

Nate:

We

Lee:

Can this be done offline as well, or is it still only an online tool?

Nate:

So we're about to launch a command line version of this that, uh,

Nate:

doesn't require an internet connection.

Nate:

So it'll be free and you can generate those pods and then run them locally.

Nate:

In, like, a Minikube environment, something like that.

Lee:

So you talked a little bit about what motivated you to start

Lee:

SpeedScale, but why you? Why did you start SpeedScale?

Nate:

We were writing device drivers.

Nate:

And, uh, Matt had actually developed a, uh, it was basically

Nate:

like a, what would we call it?

Nate:

Um, like a visual kind of driver development kit that allowed us to develop

Nate:

these drivers more quickly.

Nate:

And then Ken developed a simulator that was kind of like

Nate:

a stub code harness that you could drop the driver in and it would test

Nate:

the inputs and outputs to it.

Nate:

So all three of us have kind of been in this mindset of like,

Nate:

you know, better testing, faster development, and those two got into the

Nate:

observability space first with Wily,

Nate:

Uh, and then New Relic and Observe Inc.

Nate:

meanwhile, I kind of took a different path.

Nate:

I had actually, um, been with iTKO. Actually, Ken worked at iTKO,

Nate:

he's the one who pulled me in. But iTKO had developed this

Nate:

concept of service virtualization, and that was back in the SOA days.

Nate:

There was just a huge mix of, like, legacy queuing technologies

Nate:

like MQ and TIBCO and AMQP.

Nate:

And then, you know, SOAP services were just becoming a thing.

Nate:

So developing these service mocks was a hugely complex affair.

Nate:

You had to, you know, redirect web app servers and bounce them

Nate:

and do a lot of networking to get these mocks up and running.

Nate:

Um, and we had always been, uh, enamored with the concept, but really

Nate:

dissatisfied with the process of developing these service mocks, cuz

Nate:

done properly, they're a huge enabler.

Nate:

They're a huge value add cuz they can accelerate, um, the dev process.

Nate:

You can develop in parallel, you can simulate all these

Nate:

conditions and so on and so forth.

Nate:

But service mocks kind of got

Nate:

a bad reputation, because you usually have to hand-script the responses one by one.

Nate:

If you want a backend to simulate whatever, you have to

Nate:

seed it with the right data.

Nate:

It has to be programmed to respond onesie-twosie.

Nate:

So, uh, this is a long-winded way of saying, um, when the cloud

Nate:

came about, with Kubernetes and cloud data warehouse storage, we realized,

Nate:

oh, we can do this very quickly.

Nate:

There's proxies, uh, there's already network taps that we

Nate:

can take advantage of, and then we can use the traffic to train the models.

Nate:

Once the mock pods and the inbound traffic can

Nate:

be simulated, the rest of it is just an orchestration problem.

Nate:

And, you know, with Terraform scripts, Helm charts, and YAML, all that stuff

Nate:

is pretty well known as well.

Nate:

So, it was a matter of desire and background.

Nate:

And then the cloud data technology has actually just been a huge enabler.

Lee:

Yeah, and I've known Ken for many years, but I'm so glad I met

Lee:

you guys, and I'm really excited to see what you guys are going to

Lee:

accomplish as, uh, as you go along.

Lee:

So the natural question that always comes up at this time of

Lee:

year is, what's next year like? And so, what are your plans for next year?

Lee:

What are you gonna do in 2023? What does SpeedScale look like in 2023?

Nate:

We've been working with some great partners in 2022,

Nate:

really refining the ergonomics of the product, and it's been a huge kind

Nate:

of developer productivity accelerator.

Nate:

In 2023, we're gonna release a free version of SpeedScale.

Nate:

We know kind of what aspects people love, uh, we just wanted to be

Nate:

careful about understanding where the real, you know, exceptionally

Nate:

useful features are, and then which of those features could be

Nate:

command line driven, and which of those actually need, like, a full-blown UI.

Nate:

Um, so the freemium tool is gonna be mostly command line based.

Nate:

Um, but once you start needing, uh, you know, enterprise-level

Nate:

things like single sign-on and more sophisticated, uh, redaction

Nate:

and visual reports, that's when you would, you know, kind of have a paid tier.

Nate:

So we expect the free tier to be, uh, a great value add for engineers that

Nate:

need mocking and traffic generation.

Nate:

And then there's also gonna be a lot of momentum around kind of publicizing

Nate:

SpeedScale from a marketing perspective.

Nate:

We hope to, uh, really kind of listen to the engineering community and

Nate:

understand, uh, where we can provide the most lift and, uh, iterate quickly

Nate:

to develop those features in.

Nate:

But already we're getting stories of, you know, taking two-week load

Nate:

testing sprints down to three hours and improving API performance by 30x.

Nate:

And we just wanna continue that.

Lee:

So, so if any listener is interested in learning more about

Lee:

SpeedScale, where should they go?

Nate:

Yeah, they could just go to SpeedScale.com.

Nate:

Uh, spelled exactly like it sounds, uh, one word.

Nate:

Uh, we also have a community on Slack via SpeedScale.com, where they

Nate:

can talk directly to the founders or the engineers, ask questions.

Nate:

Um, and then if you go to SpeedScale.com/free-trial, they're

Nate:

able to download the product and try it, um, locally.

Lee:

And I'll make sure those links are in the show notes as well

Lee:

too, so people can see 'em there.

Lee:

So, great.

Lee:

So, um, anything else you wanna add before we wrap it up here? And, uh, we

Lee:

managed to make it all the way through the episode without losing the internet again.

Lee:

That's, that's fantastic.

Nate:

Yeah, I mean, no, no.

Nate:

That was it.

Nate:

Always a pleasure to talk and, uh, you know, kinda commiserate

Nate:

over the technical problems of the modern cloud with you, Lee. It's always great.

Lee:

Definitely.

Lee:

I always love talking with you, Nate.

Lee:

Thank you.

Lee:

My guest today has been Nate Lee, who is the co-founder of SpeedScale.

Lee:

Nate, thank you very much for joining me on Modern Digital Business.