Episode 64: Shakemups

Amos King

Chris Keathley

Anna Neyzberg

The Elixir Outlaws now have a Patreon. If you’re enjoying the show then please consider throwing a few bucks our way to help us pay for the costs for the show.

Support Elixir Outlaws

Episode Transcript

Amos: Welcome to Elixir Outlaws, the hallway track of the Elixir community.

Chris: Well hello there.

Amos: Hello there. *sigh* Went and got caught, went to get coffee, and there was only like an ounce left in the coffee pot. So I'm just going to enjoy this last ounce real quick while you eat your banana.

Chris: The last ounce of coffee.

Amos: *aaaah*. It's, uh, typically the worst ounce in a giant French press.

Chris: Hmm. Um-hmm. Oh, it's so, so silty down there at the bottom.

Amos: Oh yeah.

Chris: So silty down here at the bottom.

Amos: And it's been sitting there since 7:30 this morning when I made it. So it's three hours old. So we're just hanging out waiting for Anna.

Chris: I'm working on the things that are most important to be working on.

Amos: I need to do that.

Chris: You always need to do that. You need to be constantly doing that.

Amos: I didn't do-

Chris: Here's the trick is, you never, never stop reviewing. I do little mini reviews sometimes. I'll do like a little like a quickie. I'll do like a little 15 minute review sometimes.

Amos: Oh nice.

Chris: But you know, the Friday review for me, I do my reviews on Friday and my Friday reviews, I mean, that's like a two hour affair.

Amos: I need to do that, but

Chris: You gotta, you gotta, you gotta make the time. That's the thing is the system only works if you make the time.

Amos: Um, the other thing I think though, is that if, so, if you do it regularly, I don't think it will always take two hours.

Chris: I do it regularly. Almost always takes me two hours.

Amos: Really?

Chris: At least an hour.

Amos: You're not encouraging me.

Chris: Well, here's the thing, the real question to you. You're like, how can I spend two hours doing this? And I'm like, how can you afford not to spend two hours doing this.

Amos: Fair.

Chris: That's my rebuttal to you. Cause like, I just think it's like, for me, for me personally, you know, I need to be constantly working on the stuff that's most important. And so if I take two hours on my Friday, which is a very quiet day for me typically, like a very, very quiet day for me, you know, like that's such a time well spent just because it's like, I know that I'm going to go into the next week or the weekend knowing exactly like what the most important things are. Like I'm going to be coming off the tail of like a week. I'm going to be able to actually sort of evaluate like what I've got coming up and make a real honest, like, appraisal of the situation. And if it takes me two hours to do that, then it's like, that's totally worth it because I make up that time so much in the rest of the week.

Amos: Yeah.

Chris: I can go to my to-do list, like my OmniFocus, and be like, okay, I'm at my work computer. What's the next thing? And I see like 10 to 15 things. If I see more than 10 to 15 things I know I've like screwed up somewhere. And like, there are different projects that I could be working on, but they're all actionable. Okay. I'm going to do that because that's, you know, that's the mood I'm in. Or like I know something has come up and like, this is the most important thing now.

Amos: That's fair. That's fair. It's I mean, same thing whenever like writing software, right. Is that, I want, I try to try to do things now. They may take me a little more time to combat a feature or a fix, but I wanna make sure that I'm saving myself that time later and then some. Paying dividends.

Chris: Right, right. Yup. That's the thing. I think like at the end of the day, you know, you gotta spend the time, like I said, how could you, how can you afford not to look? How can you afford not to spend that time? That's the way I think about it.

Amos: Fair fair.

Chris: And a lot of times too, like, you know, I mean, I've just done it long enough now that like I'm in the habit, and sometimes I dip out of the habit and then I realize like my life is in shambles and I get back into it again. But I mean, sometimes you, you develop all these tricks, like one time, like, like a lot of times, like I have to, the review process is so important because if you see the same things coming up and you know they're important, you have to ask yourself the question, like, why is this not getting done? I mean, the first question is obviously like, is this actually important? And you have to be pretty honest about that and be willing to like throw stuff away.

Amos: Right.

Chris: But if you know it's important and you know it's not getting done, you have to be like, okay, well, why? And for me, it's almost always that I have not identified the actual first task. Like we need to build some shelves in our bathroom, in our master bathroom. And the first task I had on there was like, go buy lumber to make shelves. And I realized, and I, and, but it's like been on there for like two weeks. And I'm like, why is this not done? Besides the fact that it was like the holidays and everything. It's like, why has this not actually happened? Because that's not actually the first step, the first step is go find a tape measure and measure, like, the width that you actually need and like get exactly what you need. So you can go to the store. And then it's like, as soon as you do that, and that's just like such an easy thing, then like the ball starts rolling. Or like we needed to paint, we were painting some of my, in my son's room. He wanted like a green wall. So we painted him a wall and we had some paint or whatever. I need to go get some more. And so it was like, the first thing was like, go get paint. And I realized, again, like that wasn't getting done. And I was like, why isn't it actually getting done? It's like, oh, cause I don't actually know what color paint it is. And the actual first task was like, ask Andrea for the name on the paint swatch. Like that was like the first, that was the actual first task. And as soon as I did that, I was able to do like all the rest of it. That's why you do the review. That's like why you spend the two hours to do the review.

Amos: I just, I just did shelves too. So I feel you. It wasn't getting done. And I realized it was because I brought the shelves up to where I needed to put them together and laid all of the stuff out, but I didn't actually bring the tools upstairs. So then it was like, man, I'm not going to do that right now. I got something else to do. And then it was like two days later when I thought, you know what, I'm going to take these screwdrivers upstairs. And as soon as I did that, I put together the shelves.

Chris: Yeah, exactly, exactly. So I don't know. I think that's why I, that's why I am so- that's why I say, how can you afford not to do it? It's like, how can you afford not to just take that out? I mean, most of the time it'll only take me about an hour, but occasionally it takes two hours and I blocked two hours out to do it because like, sometimes it's hard. Sometimes the review is hard. It's like emotionally hard. It's mentally hard because you really need to go through all of it.

Amos: And you gotta think about it deeply.

Chris: And close projects that you're not going to do.

Amos: So speaking of getting started, the reason why we didn't get started so far is because Anna wasn't here and now she's here.

Anna: Hi!

Chris: I didn't know her mic was working yet.

Anna: What were you guys talking about?

Chris: OmniFocus.

Amos: Yeah. Getting, getting things done, doing a weekly review of the tasks that you have and cleaning up saying what's the most important, is there, are there other things that need to get done? Hey, this task has been sitting here for a while. How do we, why has it been sitting here for a while and what we discussed there at the end, I think that's when you showed up, was that figuring out what the first thing to do is, is often the reason why things aren't getting done, because you're not sure. Or you just haven't thought about it. It doesn't mean that you aren't sure. Like I knew the screwdrivers needed to be upstairs, but I just hadn't committed to doing that part of it yet, I guess.

Anna: How have you all been? I haven't talked to you all in a while.

Chris: Good. Fine. Working.

Amos: Fantastic.

Chris: I've been dealing with, uh, alarms and rewriting a bunch of our alerting rules.

Amos: Oooh, at work alarms?

Chris: Yeah, no, no, no. My household alarms, like when my kids wake up and when I wake up, we've been putting together a new, uh, a new action plan for getting out the door in the morning. We have a new checklist.

Amos: I just go hit the, hold, the test button down on the fire alarm. Cause they all get up and run outside and it's like, oh, I guess you're awake. You can get ready for-

Chris: Nailed it. It's like, it's like a, it's like Elixir, it's like ElixirConf last year. Fire Conf.

Amos: Oh, perfect.

Chris: Topical.

Anna: Aww, man. Are y'all both- are you both going to Lone Star?

Amos: Yeah. We're recording there. All of us.

Anna: That's right. Okay. Just checking.

Amos: Yeah. Yeah. It's going to be awesome.

Chris: I'm excited about that. It'll be fun. So I'm ready to start this year.

Amos: We're going to have Eric and Justus on.

Chris: Oh, right. Yeah.

Amos: Yeah. From Elixir Wizards. So we'll have them on, uh, since they just had Keathley on, we thought we'd return the favor. Um, no, I'm, I'm looking forward to having a conversation with them and also like a lot of, a lot of the talks seem really good. I'm looking forward to, um, Samuel Mullins's talk on telemetry, but I've been working on telemetry like the last few days. So it makes sense.

Anna: Cool.

Chris: Telemetry's great. We talked about this last, no, two weeks ago. We talked about this at some point. Telemetry is just great.

Amos: Yeah. Yeah. Uh, the documentation is tough, but I know they're working on it.

Chris: Yeah. You know, listen, it's, it's a, it's a work in progress. Okay.

Amos: Yep. Yep. And they're working on it. I was talking to Bryan Naegele about it last night-

Chris: In our super-secret-

Amos: In our super-secret secret channel. Yeah. The, uh, friends of the show channel. And he, uh, he, he said that there, that the EEF working group on metrics is, is trying to that's one of their main goals is to get documentation in a, in a state that's a little easier to, to figure out how to get things going. I think that's the biggest problem for me was like, you need multiple (inaudible) to actually get telemetry to be useful, like out of the box by itself, it's not that useful, right. So you have to add other types of plugins on top of it. And that's where it's like, okay, well where do I, what do I add? What do I need? What am I after? And then Grafana, I'm just going to say is beautiful.

Chris: It's really good. I really like Grafana.

Amos: Yeah me too. It's pretty, so pretty.

Chris: I mean, it's also useful. I'm going to go ahead and say, it's a useful tool as well.

Amos: It is very useful. It is very useful. But yeah. Getting, getting Grafana and Prometheus and metrics to talk, once I figured it out, it was like, oh yeah, this is, this is really not that difficult. But getting down to figuring out all the moving pieces I needed was, was a little rough.

Chris: Yeah. Yeah. It's, it's a lot to tie it all together. And there's a lot of Docker containers you need to run, but eventually you run enough Docker containers and it all works. You get all that YAML right. Get all the indentation correct. And eventually it comes together and it's so beautiful. You can see just all the indentations all coming together. It's amazing.

Amos: How about you, Anna? What have you been up to? We haven't, we haven't spoken with you in a while. I know last time you tried to connect and then your microphone wasn't working, so-

Anna: Yeah, it's a bummer. Um, things are good. I don't know. I'm excited for Lone Star. I'm heading to Patagonia in a couple of weeks for a trip, um, with the fam, but that's about it.

Chris: That sounds awesome.

Amos: I'm jealous.

Anna: I went a couple of years ago with some friends. Um, so I'm excited to head back down there.

Chris: That's rad.

Anna: Yeah. Um, but yeah, I'm excited to see you all in person at Lone Star. I don't know. Things are good. It's raining like crazy today. It took forever to get here. Cause it's like pouring.

Chris: Street's flooded.

Anna: Yeah. Basically. That's what happens in San Francisco when it rains.

Chris: You get like an inch of rain and all of a sudden it's like, whoa, whoa, whoa, whoa, whoa, whoa. Y'all, there's water coming from the sky. I don't know what's happening anymore. This is not what I signed up for. And also now all the streets are flooded because like the, not the, uh, the sewer system, like isn't equipped to like handle that amount of water going through it, right.

Anna: Yeah.

Chris: Cause like the hills and everything else.

Anna: Yeah, I got an advisory on my phone this morning at like 7:30 as I was leaving. And it was like, watch out for flash flooding. I was like, cool, cool.

Chris: That's great. Awesome.

Amos: That happens all the time here too. But it's because we get tons of water.

Anna: Yeah. We're not as good about that. I mean, we need the water, so I don't, I'm not complaining, but almost as bad as LA when it rains.

Chris: Just the world shuts down.

Anna: Not quite as bad as San Francisco, but yes. Yeah. That's what's going on. Trying to spin up Elixir Bridge again soon. Probably when I get back, it's been a while. I don't have anything exciting to report. I wish I did.

Chris: No, no worries.

Amos: And you are both speaking at Lone Star outside of just doing the podcast, correct?

Anna: Right.

Chris: I don't think I am.

Amos: Oh, Chris is not, just Anna.

Chris: I don't maybe, maybe I am. I don't think, I don't know. Just had a mild panic attack.

Amos: Breathe. Breathe.

Chris: Lone Star Elixir.

Amos: He's going to double-check. So what is your talk about, Anna? While Chris makes sure that he's not dying.

Anna: I'm still determining what it's going to be about.

Amos: Perfect. Should I ask you, uh, like February 26th?

Anna: No, well, I have some higher level ideas that are coming together. Um, I'm trying to determine exactly what direction I want to take it in. Um, so I don't quite know yet.

Amos: Any hints? Are you, are you keeping these close to your heart right now?

Anna: Well, I just don't know yet, exactly. Um, exactly what direction I'm going to go in. Um, so yeah, I can't give you a lot. Yet.

Amos: Alright. That's cool.

Anna: I'm excited about it. I think it'll be fun. Um, Keathley set a very high bar.

Chris: Oh my gosh!

Anna: So thanks. Thanks for that Keathley.

Chris: Uh, my bad. I don't know how to take that.

Amos: Anna, Anna, don't make me cry.

Chris: I figured, like, I'm going to get one shot at a keynote ever. So I gotta, you know, I like, I used all my best material. Like I always, I was pulling material out that like I had been saving for a long, I have a document of jokes and like, I'm not even kidding. Like I realize y'all are laughing, but I'm not actually kidding, like, I keep a document, like, uh, it's called like the story file. Cause it's in markdown. And I write down anecdotes and funny bits and stuff that I could use in talks or on this show. But most of the show is just off the cuff.

Amos: I need that. I think I'm going to keep a big uh-

Anna: I wanna see what's in that file.

Chris: Foot marbles was in that file. That was, that's a callback. That's an old joke.

Amos: Wow. Uh-

Chris: Foot marbles was in the file.

Anna: Wait, how do things, how do things make it into the file?

Chris: I go for runs. And then I think about funny things or I just like notice stuff in the world and I make a habit of trying to like find funny stuff about things that I notice in the world. And then I write them down. But actually if you want to know the real way they, they enter the document, it's through Vim. There was a document in Vim, and then I typed the words.

Anna: That's not quite what I was going for, thank you.

Chris: No, I don't know. Where does anybody find humor, right. Like you look around and you just think like that's funny, so- and you just try to figure out why it's funny. And then I workshop it. And then occasionally like when it makes sense, I pull those jokes into talks and that's my secret.

Anna: That's your secret.

Chris: That's the, that's the secret behind the process right there. So some of those jokes, I was sitting on, some of that material for that talk I was sitting on for literally years. Cause I was like, this is really funny, but it's too good for this one talk. This is too good for an ElixirConf audience. I'm not wasting this joke on an ElixirConf audience.

Amos: Ouch!

Anna: Keathley!

Amos: I'm saving this for a keynote.

Chris: Yeah. Well, you know, I mean, no one else is ever going to keynote an ElixirConf. So I figured like why, why waste it there?

Amos: I wonder if DockYard is going to still sponsor ElixirConf?

Chris: Hmm.

Anna: Yeah, like what's going on there?

Chris: I think they just let go of a bunch of people.

Anna: I know, I saw that.

Amos: I don't know how many is a bunch, but um, I, well, Brian, Brian, for a few years, has been the outgoing CEO, and it finally happened, he stepped down. So there'll probably be some changes. I don't know what they'll be.

Chris: Yeah.

Amos: They still do have Elixir people. They're not, they didn't like clean house and get rid of everybody. So.

Chris: Yeah. I don't know. Yeah. Idle speculation. I'm not sure what, so all I know is that, uh, there are people looking for work right now. So, you know, if you need Elixir people, there's probably, uh, people looking for work. So, uh, I mean, they might not be hired by the time that this comes out.

Anna: That's true.

Amos: Probably. Especially with the way I submit them to be edited. Sorry.

Chris: Yeah. Well, you know, yeah, yeah. We're at, we're at the, we're at the whims of Amos's schedule here and memory and to-do list, which if you would review, this would probably get done more often.

Amos: If I would review, it would have been done.

Anna: I was just interested that the Plataformatec thing happened around the same time. Yeah.

Chris: Yeah, Plataformatec got bought. That's an interesting thing-

Amos: By Nubank, right?

Anna: Yup.

Chris: The bonk.

Amos: Nabonk.

Chris: Nabonk. I don't actually know how to say it. I assume it's "new bank."

Anna: Which is interesting, right?

Chris: So the working theory is that because Nubank is Clojure, uh, they're attempting to kill off Elixir in Brazil.

Anna: Wait really?

Chris: Well, that's just, I mean, that's as good a theory as any. As long as we're just doing idle speculation. The dynamic language, 2.0, wars of Brazil are being waged in front of our very eyes. We were here for it. We were right here for it. We got to witness it. Where were you-

Amos: From a different country.

Chris: when Elixir died in Brazil? Thoughts and prayers.

Amos: So, so that means that there would probably be some Elixir developers from Brazil who could be hired if-

Chris: I believe so. I believe there is a large amount of the Elixir people who are looking for work.

Anna: Is that a good sign or not for the community? Like, I'm just curious.

Chris: I will say like, man, it's a, there's a lot of, there's a lot of shakemups happening right now.

Anna: Yeah. As far as what that means for the community. It's interesting.

Chris: Yeah. I don't know. I feel like something else changed recently too. Like there was some other announcement, somebody else, I don't know. Some other kind of like company was shifting around or something like that. It's, I don't know. It's in the, I guess it's Q1. Everybody's got money again to make decisions. So I don't know. The right time for it.

Amos: Or, or they realize how much they owe in taxes.

Chris: Or something. Yeah. Oh man, I've got to cut this loose.

Anna: How are things going at Bleacher Report, Keathley?

Chris: Good. Um, yeah. Good. I'm, uh, mostly doing a lot of alerting and monitoring and building tools to do alerting and monitoring. Um, mostly around Kafka stuff. Um, Kafka consumers are notoriously really hard to, to monitor. I think this is true of most queuing things. They're really hard to monitor. Um, and they're hard to monitor because, um, for the same reasons that like, if you're trying to measure latency or rather if you're trying to measure, well, latency as well, but if you're trying to measure like error rates coming back from a service, the service itself has a hard time telling you what the error rates are. Um, like it's a hard, it's, it's hard for a service to detect that it's having problems. Like that's sort of determined, cause it's sort of determined by the consumer of that thing. Like the caller kind of determines, not always, not for all metrics, but it determines a lot of the actual health of the downstream thing because often there's like a bunch of intermediaries in between you and the downstream thing, right. And so any number of things could be going wrong. And so if you only reported error rates from the service that served the responses, uh, you might miss a whole swath of things. So you need to do both, like you, you do want that, you know, services should report errors that they notice. Uh, but your callers also need to report errors that they notice, right? So this is true of Kafka stuff as well, except it's actually kind of like harder because what you care about is throughput in a lot of cases, and throughput's actually not really descriptive enough. What you care about is throughput and like successful throughput. Like you don't want to be skipping messages necessarily. Like you might opt to skip messages. If you have like a blocking error or something like that, you'd rather like skip over them. But like it turns out you can process stuff really quickly if you don't do any work, right? Like if you just don't write to the database, you can process work super quickly. Or if you don't produce to a downstream topic, you can, you can like consume really fast. Kafka and Kafka consumers are really good at that. That's not actually what we care about, right? Like we care about successful things being done. And we care about the throughput implications on that.

Chris: But there's a scenario in Kafka land and in other queuing things. Um, you know, you can interchange the word Kafka with something, with your queue of, with your, your, your log queue thing of choice. But, uh, there's a scenario where if a Kafka consumer dies or crashes while it's processing something, it may not have checkpointed where it's at. So what it does is it fetches like messages 0 through 10 and tries to process those. And then when it's done with that, it fetches messages 11 through 20 and then processes those or whatever. However it, you know, fetches in batches, and it tries to process all those messages. And if it crashes in the middle of them, it needs to, well, it has a bunch of different, like it hasn't checkpointed yet. And if it crashes in the middle of it, it hasn't necessarily checkpointed where it's at and then it needs to replay those. And because of the way it replays, you might end up replaying the same message that causes a crash over and over and over and over and over again, but never make forward progress. And if all you did was monitor the Kafka consumer and the amount of things it was processing, and all you looked at was like, how many things have you processed? It might look like it's fine to you. Even though all it's doing is processing the same things over and over again, you know, devoid of any other signal, like we've got a lot of errors or this error keeps occurring or whatever. Like if you had no other metric to use, right, if all you saw was like messages being processed, you would see, uh, you wouldn't necessarily see a problem. Now there are other signals to like, look at, like, you can look at commits and there's like Kafka specific things that aren't that interesting. But every, every application or every consumer now needs to like export those metrics and you need to build monitoring around that. And it becomes really hard. It's like really, it's actually really hard to know, uh, how the consumer is doing. And that only tells you like how many, if they're even alive. Right. The other thing that can happen is that you could be producing into Kafka a lot faster than you could be consuming.
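The replay loop Chris describes can be sketched in a few lines of Python. This is a toy simulation under stated assumptions, not real Kafka client code: `run_consumer`, its parameters, and the "commit only after a whole batch succeeds" policy are all hypothetical and chosen only to show why a bare "messages processed" counter can keep climbing while the committed offset never moves.

```python
def run_consumer(total_messages, poison_offset, batch_size, restarts):
    """Toy model of a consumer that checkpoints only after a full batch succeeds.

    All names here are hypothetical; this is not a Kafka client API.
    Returns (processed_counter, committed_offset).
    """
    committed = 0   # next offset to fetch after a restart (the checkpoint)
    processed = 0   # naive "messages processed" metric
    for _ in range(restarts):
        # Each restart fetches a batch starting at the last committed offset.
        batch = range(committed, min(committed + batch_size, total_messages))
        try:
            for offset in batch:
                if offset == poison_offset:
                    raise RuntimeError(f"crash while handling offset {offset}")
                processed += 1
            committed = batch.stop  # checkpoint only after the whole batch
        except RuntimeError:
            # The consumer "dies" before checkpointing and will replay the batch.
            continue
    return processed, committed


if __name__ == "__main__":
    processed, committed = run_consumer(
        total_messages=20, poison_offset=13, batch_size=10, restarts=5
    )
    print(f"processed counter: {processed}, committed offset: {committed}")
```

In this run the processed counter ends up larger than the number of messages in the topic, while the committed offset stays stuck at the start of the second batch. Watching the committed offset alongside the processed count is one way to tell the two situations apart.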

Chris: And the consumer can't necessarily know that, like the consumer is this detached thing, that all it knows is like how many messages it's actually producing, uh, or sorry, consuming. It doesn't necessarily know how many messages are in the topic left to consume. And so because of that, you don't know how much you're lagging or if you're lagging at all. And so, uh, what I did is I built a little, like very simple, like Elixir service that sits and watches Kafka's internal metadata topics, which was fun to learn how to like decode all their like binary encoding stuff that they use to shove into this, uh, into this metadata topic. And, um, yeah, so it sits there and it watches that and it consumes that as fast as possible. And then it keeps a running count. Like it keeps a circular buffer of lag based on little time periods, based on minute-long time periods, for every consumer group, for every partition on every topic for every consumer, because it can just watch that. And so it can actually, from the outside world, tell you if you're starting to lag or not. Cause it also periodically checks to see where the head, uh, offset is and then just takes the difference of those two things. And so, because of that, it becomes sort of this like observer, this like higher level observer, and it can actually report an alert, and it sends metrics. It's like, telemetry, sends like time series, but it also can do like smart reporting because it can evaluate over a window because you have the circular buffer. So you can have like the sort of these more robust, like Kafka specific reporting and alerting rules about like, is this thing lagging? Is it catching up? Because it's also hard to like tell, cause like a very common thing that happens is like you'll get like a huge spike of, uh, of, of, of messages that go into a topic.

Chris: And because of that, if you look at it on a small enough time window, it would look like it's lagging. It'd look like the, the downstream consumer's lagging, but it's not, it's, you just produced like a thousand things really, really quickly all in like a batch and now it needs to catch up. So we have some smart alerting rules about like, if you're making progress at all, but you're just behind, you're not actually lagging. And we can tell that cause we look at a long enough time window. And so it helps that. You get into this like secondary problem where now you need to monitor the monitor, where like now that thing needs to be monitorable and you need to know like that, that thing's still running and consuming, which is like a whole different problem. But for our purposes, like it's been really, really useful. Um, it wasn't a ton of work to build. It was like more work to learn how to like dig through the Kafka source code and figure out how that junk works internally.
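As a rough illustration of the idea described above (lag as the difference between the head offset and the committed offset, kept in a circular buffer per consumer group, topic, and partition, and evaluated over a window), here is a minimal Python sketch. It is not the service Chris built; `LagTracker`, its methods, the key layout, and the "stuck" rule are assumptions made for the example.

```python
from collections import defaultdict, deque


class LagTracker:
    """Keeps a fixed-size window of offset samples per (group, topic, partition).

    Hypothetical sketch: names and the alerting rule are illustrative only.
    """

    def __init__(self, window=5):
        # deque(maxlen=...) behaves as a circular buffer: old samples fall off.
        self.samples = defaultdict(lambda: deque(maxlen=window))

    def record(self, key, head_offset, committed_offset):
        """Store one periodic sample, e.g. one per minute."""
        self.samples[key].append((head_offset, committed_offset))

    def lag(self, key):
        """Current lag: head offset minus the last committed offset."""
        head, committed = self.samples[key][-1]
        return head - committed

    def is_stuck(self, key):
        """Alert only if the consumer was behind for the whole window
        and its committed offset did not advance at all in that window."""
        window = self.samples[key]
        if len(window) < window.maxlen:
            return False  # not enough history to judge yet
        behind = all(head > committed for head, committed in window)
        no_progress = window[-1][1] <= window[0][1]
        return behind and no_progress


if __name__ == "__main__":
    tracker = LagTracker(window=3)
    spike = ("group-a", "events", 0)
    # A burst of 1000 messages: the consumer is behind but catching up.
    for head, committed in [(1000, 0), (1000, 400), (1000, 900)]:
        tracker.record(spike, head, committed)
    print(tracker.lag(spike), tracker.is_stuck(spike))
```

The design choice worth noting is that the rule looks at progress across the whole window instead of a single lag reading, so a consumer that is behind after a burst but still advancing does not trigger an alert.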

Amos: The metadata.

Chris: Yeah. Yeah. Well, it's like offset commit messages and also like how Kafka Connect works with that. And there's a bunch of other like rules that are very Kafka specific. And probably aren't interesting for this podcast, but like anyway, so, worked on that. It's open source ish, but yeah, we have a fork of it that we run internally. It, it started as an open source project and then we forked it and we use it internally, but it's had improvements that are specific to our needs. And I need to like, backport those at some point.

Amos: What's it called in open source, in case somebody is looking for it?

Chris: Uh it's on my GitHub, it's called Orwell. I'll put a link to it in the, in the show notes. I thought that that name was very, uh, that was one of my better names.

Amos: That's good.

Anna: Uh-huh.

Chris: Well cause Kafka, like, it's good but I'm not gonna explain the joke. It's good.

Anna: I get it. I get it.

Amos: Should I explain your joke?

Chris: No! No! Don't ever explain my jokes.

Amos: They're so fun to explain though.

Chris: Don't you ever.

Amos: I just like to see your face when I start explaining.

Chris: So that's been fun. So even, but I like I've been, but it's, you know, I have to tie all these alerts together and there's still some like improvements we need to make in terms of like the evaluation windows, et cetera. It doesn't matter, that's boring technical details, but there's, there's a lot of work we need to be doing around that still, um, to improve it. It's not, it's not necessarily the best thing ever yet. Although if people wanna play with it, they're more than welcome to.

Anna: Cool. That sounds-

Chris: It's been useful for us. It's been very useful for us.

Amos: Nice.

Chris: It's taken a lot of guesswork out of how the consumers are doing and their health. So.

Anna: That's cool.

Chris: I have a topic. If, if, if, if you, if you'll allow it.

Amos: I thought we, we had a topic. I have, I have a hard stop in about 15 minutes, but that's fine. You guys can keep going without me, but I do have a hard stop.

Chris: Oh! A hard out! Well, let's save my topic, my topic is good. And I want-

Anna: What's the topic?

Chris: We'll save it for next time, but I want to talk-

Anna: What is it though?

Chris: I want to talk a bit about, uh, it's not exactly Elixir specific, but I want to talk a bit about how people do testing and development, when they're working in with, when they're working with lots of different services, like what's your strategy and like, what are things that are working for you or not working for you? How, you know, how does that affect your testing specifically testing and like, you know, assurance that things are all still working together and like strategies for that. And then if you have any, and we can tie that into Elixir a little bit by just talking about Elixir specific ways of maybe doing that. But, uh, I'm mostly interested in like what y'all's experiences have been, but I feel like that's going to take more time than we have. I don't want to rush it. I want to actually get some good info out of this.

Amos: External, internal, or both?

Chris: Internal, specifically is what I'm worried about or interested in.

Amos: I think some of this stuff is both, but I'm writing it down so that I remember to think about it. Maybe I should put it in my queue for-

Chris: Put it in OmniFocus. Review it, tomorrow.

Amos: Review it, review it.

Chris: Sorry. That was some inside baseball. We record on Thursdays.

Amos: Two hours. Yeah.

Chris: Yeah. For two hours. We'll just, you should review all the things. Don't review that for two hours. It's going to take a really long time to do your review.

Anna: Wait, what is this? This is like your weekly review of all the things that need to get done.

Chris: Yeah. Have you ever read Getting Things Done?

Anna: Uh, no, but you told me about it.

Chris: I'm a, I'm a disciple.

Amos: Me too. I'm a slow disciple though. Like I, I fail pretty much regularly.

Chris: A Thomas if you will.

Amos: I'll explain that joke too. So, uh, what Keathley's saying is-

Chris: No, never mind!

Anna: Wait, I missed that joke, someone explain it to me.

Chris: Doubting Thomas, he questioned the Lord.

Anna: Oh, right.

Chris: Anyway. Any-who

Amos: That just made it all awkward.

Chris: I've made my decision. You really are the Rosie O'Donnell of this group, Amos.

Amos: (laughing) Please don't. Alright, I'm-

Chris: Contrasted with my Madonna. Don't make me make you Betty Spaghetti again.

Anna: Keathley, have you had a lot of coffee this morning?

Chris: I did. I super did. Oh man. It's bad.

Amos: Too much caffeine.

Anna: Didn't you quit coffee for a while?

Chris: I did. I did. And I can quit again whenever I want.

Anna: Uh-huh.

Amos: Alright, I have a short topic then.

Anna: Yeah, what's up?

Chris: A shorty.

Amos: I've been working on a lot of socket, socket communication stuff. And, and one of the things that I see happen is sockets, whenever there's an event going to go out, that there's a lot more data being generated than, than what's actually needed. Uh, and having lots of, lots of different things using like the same message. So instead of saying update this small thing, it's like here's half the data in the world, and we're just gonna send that every time something changes so that everybody can use the same message. That has a lot of performance problems that I've run into. Obviously. I don't know if this is a good one, but.

Chris: What you wanna talk about? What's your question?

Amos: Uh,-

Chris: New Soviet way. No, hang on. Let me, let me back up. Just don't do that. Done. You're right. This was short.

Amos: Well, I thought it would be short because I don't think there's a ton, but

Anna: You're right. Just don't do that. Just stop it.

Amos: So what are the, what are the things that, that you suggest?

Anna: Just don't do it.

Chris: Don't do that.

Amos: Okay, thanks.

Chris: What are you asking? What's the question? I mean obviously you have to do it.

Amos: Right, so- do what?

Chris: There's a reason that you are using this pattern. Well, I say you like the royal you, like, there's a reason you and whoever you work with are using this pattern, then you're locked into it. So ask the question, cause obviously the, the answer, the answer, the obvious answer that we're taking off the table is just don't do that.

Amos: So, so is there a reason to do that? Like, is there a reason to try to-

Chris: Always send the full view of the world?

Amos: Yeah.

Chris: Um-

Amos: And if not, what are, what are strategies like if you run into this to, to break it apart? Do you have, like, you've worked in a lot of distributed systems. It's not just a socket. It doesn't have to be a web socket. It could just be two pieces of software that have to share data.

Chris: Well, it's always easier to send the full view of the world because you don't have, you don't have to do reconciliation, uh, because you just sort of say that's the view of the world and if there's like a canonical source and also you have large time windows between when you send stuff like, okay, so like, I'm going to take, I'm going to use some words here-

Anna: Hopefully, that'd be good.

Chris: But I'm going to take, I'm going to, well, yeah, exactly. This is, this is an audio podcast. You know what? That was a good burn. That was a good burn. Okay. I respect that. I just want to acknowledge that that was good.

Amos: Keathley, I'm going to use ASL only for the next five minutes.

Chris: Alright, so, goddammit, what was I saying? So-

Anna: You said that you were going to use words.

Chris: Right. Right. Right. So, um, let's take, let's take things like, let's assume that, uh, that you have a service that is canonical and there's large time windows between use and data so that you're not worrying about this, like, if I send it twice in rapid succession, and one thing has changed within like a small enough time window that there could be pro-, like, you know, we're going to assume that you can do something stupid, like last write wins. Right. So we're gonna take all of the like delivery guarantees and like causal ordering that you would need to do this correctly off the table for just a second, cause it's complicated, right. And we'll just say that you're like sending it from a canonical source and they always have like an integer ID that tells you the version that you're on. Right. There's some stupid way to do last write wins. Right? If you send the full view of the world all the time, then the downstream thing is only out of date as long as it's not seeing the most recent update. Right? Like that makes it a lot easier. Because if you kept, if you held, if the downstream thing held a view of the world and you sent pieces, uh, or you sent operations, let's say, right. Like if you sent out like, Hey, this thing changed, right. This one field on your thing changed. Well then if you miss a message, you're now like wrong, but you're wrong in a, in a very subtle way. You're wrong in a way that like is not obvious because like, if you start getting messages, if you just drop one, but then you start receiving all the other messages, then you're like almost correct. And it is not obvious that you're just like completely wrong about that one piece of that state, right.
And because of that, uh, that's part of the reason why like sending the full view of the world is a lot easier most of the time, because you can tag it, like you can tag it with like a version and you could tag like the individual fields with a version as well, or some sort of like logical clock, right. Or whatever, like to do reconciliation, but you still run the risk of like, if you drop it, what happens? Whereas like, cause then like the downstream thing doesn't even know that it needs to request the full payload, unless you have some other like remediation steps to know that you're out of date somehow. So then like-
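
Editor's note: the following is a minimal sketch of the trade-off Chris describes, not code from the episode. It assumes a single canonical source that tags every full snapshot with an integer version, and it compares a consumer that keeps the highest-versioned snapshot (a naive last write wins) with one that applies field-level changes. The class names and the sample record are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SnapshotConsumer:
    """Holds the latest full snapshot; the highest version wins."""
    version: int = 0
    state: dict = field(default_factory=dict)

    def receive(self, version: int, snapshot: dict) -> None:
        # Anything older than what we already hold is ignored.
        if version > self.version:
            self.version = version
            self.state = dict(snapshot)


@dataclass
class DeltaConsumer:
    """Applies field-level changes and has no way to notice a gap."""
    state: dict = field(default_factory=dict)

    def receive(self, changes: dict) -> None:
        self.state.update(changes)


# Three successive versions of one upstream record (made-up data).
v1 = {"name": "report", "status": "queued", "owner": "amos"}
v2 = {"name": "report", "status": "running", "owner": "chris"}
v3 = {"name": "report", "status": "done", "owner": "chris"}

snap = SnapshotConsumer()
snap.receive(1, v1)
# Version 2 is lost in transit.
snap.receive(3, v3)
snap.receive(2, v2)  # A late arrival is ignored because 2 < 3.

delta = DeltaConsumer()
delta.receive(v1)
# The change {"status": "running", "owner": "chris"} is lost in transit.
delta.receive({"status": "done"})

print("snapshot consumer matches upstream:", snap.state == v3)
print("delta consumer matches upstream:", delta.state == v3)
print("delta consumer still believes owner is:", delta.state["owner"])
```

The delta consumer ends up almost correct: its status is current, but its owner field is stale and nothing in the messages it did receive says so.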

Amos: You're doing this for simplicity.

Chris: Yeah. I mean, how does the consumer know to go back to the service and be like, okay, actually I think I'm so far out of date that you just need to give me all of it. Like, how does it know that a piece of state is wrong? So it's a, it's a simplicity thing. Um, if you can get away with it, like if it's small, like relatively small, like let's say under a hundred fields, like, I don't know, just go for it. You know? And if it's way too big to send, like, you know, if like, if there's trans- if there's like extreme transport costs for that, then you need to start like working out other solutions, but you might be able to get away with just using like, message pack.
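
Editor's note: one common remediation for the question Chris raises, sketched here under assumptions the episode does not spell out. Every change carries a consecutive sequence number, and the consumer asks for a full snapshot whenever it sees a gap. `Upstream`, `ResyncingConsumer`, and `fetch_snapshot` are hypothetical names for illustration.

```python
from typing import Callable, Dict, Tuple


class Upstream:
    """Stand-in for the canonical service; numbers every change."""

    def __init__(self) -> None:
        self.seq = 0
        self.state: Dict[str, str] = {"status": "queued", "owner": "amos"}

    def change(self, **changes: str) -> Tuple[int, Dict[str, str]]:
        self.state.update(changes)
        self.seq += 1
        return self.seq, changes

    def snapshot(self) -> Tuple[int, Dict[str, str]]:
        return self.seq, dict(self.state)


class ResyncingConsumer:
    """Applies deltas in order and refetches everything on a gap."""

    def __init__(self, fetch_snapshot: Callable[[], Tuple[int, Dict[str, str]]]) -> None:
        self._fetch_snapshot = fetch_snapshot
        self.seq, snapshot = fetch_snapshot()
        self.state = dict(snapshot)
        self.resyncs = 0

    def receive(self, seq: int, changes: Dict[str, str]) -> None:
        if seq <= self.seq:
            return  # Duplicate or stale delta: nothing to do.
        if seq != self.seq + 1:
            # At least one delta was missed, so local state cannot be trusted.
            self.seq, snapshot = self._fetch_snapshot()
            self.state = dict(snapshot)
            self.resyncs += 1
            return
        self.state.update(changes)
        self.seq = seq


upstream = Upstream()
consumer = ResyncingConsumer(upstream.snapshot)

consumer.receive(*upstream.change(status="running"))  # Delivered.
upstream.change(owner="chris")                        # Dropped in transit.
consumer.receive(*upstream.change(status="done"))     # Gap detected here.

print("in sync:", consumer.state == upstream.state)
print("full resyncs performed:", consumer.resyncs)
```

Note the limit of this approach: the consumer only learns about the gap when a later message arrives, so until then it is stale without knowing it.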

Amos: My concern is database costs actually to pull all that data. Even if there's only a hundred fields, maybe it's complicated for the database to actually get it out for some reason. Or your database is just under heavy load a lot. And so the more you can reduce what your database is doing the better.

Chris: Sure. Well, and it also depends on what your tolerance for like being stale and wrong is like downstream. So if the penalty for being wrong is like really, really low, or, you know, if you have remediation steps already, maybe it's not as big a deal, uh, to send like diffs or to send like just little bits that have changed. Um, but if you can't, if you, but like, I don't know. I think it's a, it's a measure of like where you want to put your risk and where you want to put your, you know, like, yeah. And, and like, where do you wanna put your engineering time? And in some cases like, uh, I dunno, maybe you can fix it with like a view, like a database view.

Amos: Yeah.

Chris: Like maybe there's a different way to slice up that problem. If you need to send like less than a hundred things, I don't know, you, maybe you can just hack around it with like a better query or views or materialized views or something like that, uh, and like message pack. Like maybe you can just solve it sort of orthogonally, um, which is not as like a holistic solution, but the holistic solution incurs risk and, and introduces problems that you'll then need to solve, like how to do causality tracking. Like that's hard. So it's, it's all about where you want to spread your risk. And, and for me, like attaching an undue amount of risk to your data. We've talked about this before, or I've, I've been on this, on this kick before, but like attaching, attaching risk to your data is, I don't know that that's like scary to me, um, because it has a lot of implications because users make decisions based on that stuff or other services make decisions based on that stuff. And then they have potential to like send messages back upstream that manipulate data in such a way that now you've like, you've lost the ability you just sort of like do, to actually track why you're out, you're in bad shape.

Amos: Yeah.

Chris: Uh, and that becomes really tricky. So, uh, the simpler you can make that, I think the better. You obviously still run the risk, like run all of those same risks by sending the full thing and then allowing the client, or like the consumer of the thing to make decisions at all. Because like it could just be out of date and like you make decisions based on out-of-date state, uh, you know, you need the way to like correct for that or check it or whatever, or account for it in some way. Uh, but I think the risk, the subtlety is like, you know, you run less, you run, the risks are less subtle, if you, if you send the full thing. They're still there, but they're, but they're not as, yeah. They're more obvious when you're sort of wrong, I think, in general.

Amos: Okay.

Anna: Was that helpful to you Amos?

Chris: Did we do it?

Amos: It is. Um, some of the things that I'm running into that I think are, are, um, some of those, some of the pieces that we're, I'm dealing with are I would, I would say fit right into what you're talking about. And then some of them are just, there's really only one piece of data that's changing out of the hundred fields and then resending all hundred of them every time, even though there's only one that could possibly change at all. For example, I have a table of files loaded and they're generated. And so when you tell it to generate a file, there's a progress bar. As the file's generating, it puts it at the top of the table and all the other ones are down below it. Every time the progress bar gets updated, the socket is sending the entire table HTML back. So it's selecting all of these other files out that are already generated and done. They're not going to change. You just really want to update that file progress.

Chris: Yeah. And in that case, in that case, maybe like the risks of being out of date are just like so trivial that you don't care.

Amos: Yeah.

Chris: And you just send the diff, like, you know what I mean? Like you have to weigh all this stuff, right. Like in that scenario, like yeah. Maybe just send like the one, the one thing that updated, like don't bother trying to send the, don't try to, don't bother trying to rerender the entire thing, just resend the entire, like, thing back down. Cause there's no need to do that problem.
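
Editor's note: a rough sketch of Amos's file-table example, assuming JSON messages rather than the rendered table HTML he mentions. It builds one message carrying the whole table and one carrying only the progress of the file being generated, then applies the small message to a client-side copy. The event names, field names, and file data are invented for illustration.

```python
import json

# 99 finished files plus one that is still being generated (made-up data).
files = [
    {"id": i, "name": f"file_{i}.csv", "status": "done", "progress": 100}
    for i in range(1, 100)
]
files.insert(0, {"id": 100, "name": "file_100.csv", "status": "generating", "progress": 0})


def full_table_message(rows: list) -> str:
    """Resend every row, including the ones that can no longer change."""
    return json.dumps({"event": "table", "files": rows})


def progress_message(file_id: int, progress: int) -> str:
    """Send only the one value that actually changed."""
    return json.dumps({"event": "file_progress", "id": file_id, "progress": progress})


def apply_progress(rows: list, message: str) -> None:
    """Client side: patch the matching row in place."""
    update = json.loads(message)
    for row in rows:
        if row["id"] == update["id"]:
            row["progress"] = update["progress"]


client_files = [dict(row) for row in files]  # The client's copy of the table.

big = full_table_message(files)
small = progress_message(100, 42)
apply_progress(client_files, small)

print("full table message bytes:", len(big))
print("progress message bytes:", len(small))
print("client progress for file 100:", client_files[0]["progress"])
```

If a progress message is dropped here, the bar is briefly behind and the next update fixes it, which is the low-penalty case Chris describes.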

Amos: Right.

Chris: Because it just probably doesn't matter that much. But if it does, you know, then you care about it, but you can make that trade off, like in the moment, knowing like what the risks for that application are.

Amos: But make sure you know those.

Chris: Well, just think about it.

Amos: Yeah, yup.

Chris: Schedule some time in your OmniFocus-

Amos: To go on a walk.

Chris: And then think about it. And then take a nap.

Amos: Naps are important.

Chris: And then think about it.

Amos: That's right. Well, I've got to get out of here.

Anna: All right.

Amos: That's hard stop for me.

Chris: All right. Sounds good. Uh, don't forget to send us questions. I still want questions.

Amos: And we haven't gotten any questions yet, so yeah. Please. Tweet at us. Send it to our contact.

Chris: Send us questions. We will answer them.

Amos: Yep.

Chris: Probably.

Anna: Probably.

Amos: Maybe.

Anna: Keathley will answer them.

Chris: Well someone will answer them.

Anna: *sighs* So good.

Amos: Uh, well, thank you. Nice seeing you again Anna. I'm glad you're back.

Anna: Yeah, same. Thanks.

Amos: Thanks. Bye.

Chris: Alright. Later.
