Interested in going full-time bug bounty? Check out our blueprint!

April 30, 2026

Episode 172: Source Code Review Meta Analysis

Apple Podcasts podcast player badge

Spotify podcast player badge

Castro podcast player badge

RSS Feed podcast player badge

YouTube podcast player badge

Show Notes
Transcript

Episode 172: In this episode of Critical Thinking - Bug Bounty Podcast trying out a new structure of episode: a Meta Analysis of sorts of many Source Code Review techniques. This episode features tips gathered from Shubs, Rafax, and FSI. Justin highlights best approaches, patterns, and common pitfalls.

Follow us on twitter at: https://x.com/ctbbpodcast

Got any ideas and suggestions? Feel free to send us any feedback here: info@criticalthinkingpodcast.io

Shoutout to YTCracker for the awesome intro music!

====== Links ======

Follow your hosts Rhynorater, rez0 and gr3pme on X:

https://x.com/Rhynorater

https://x.com/rez0__

https://x.com/gr3pme

Critical Research Lab:

https://lab.ctbb.show/

====== Ways to Support CTBBPodcast ======

Hop on the CTBB Discord at https://ctbb.show/discord!

We also do Discord subs at $25, $10, and $5 - premium subscribers get access to private masterclasses, exploits, tools, scripts, un-redacted bug reports, etc.

You can also find some hacker swag at https://ctbb.show/merch!

Today’s Sponsor: Adobe - Get 10% bonus for valid AI vulnerabilities in Adobe Stock and Lightroom Web. Use code: CTBB063026 in your report.

Expires June 30, 2026.

====== This Week in Bug Bounty ======

Open-source security testing: the Bug Bounty guide to code analysis

https://www.yeswehack.com/learn-bug-bounty/open-source-guide-code-analysis?utm_source=youtube&utm_medium=sponsor-critical-thinking&utm_campaign=open-source-guide-code-analysis

====== Resources ======

Abusing Windows, .NET quirks, and Unicode Normalization to exploit DNN (DotNetNuke)

https://slcyber.io/research-center/abusing-windows-net-quirks-and-unicode-normalization-to-exploit-dnn-dotnetnuke/#:~:text=across%20different%20languages.-,A%20MUST%2DKNOW%20BEHAVIOUR%20OF%20PATH.COMBINE,-Another%20key%20implementation

====== Timestamps ======

(00:00:00) Introduction

(00:06:49) Tracing Data Flow, knowing where your playload is landing, and developer mistakes.

(00:17:33) Mapping the software

(00:24:46) Sniffing for blood

(00:31:54) Common Patterns and Pitfalls

Title: Transcript - Thu, 30 Apr 2026 13:40:24 GMT
Date: Thu, 30 Apr 2026 13:40:24 GMT, Duration: [00:51:02.75]
Speaker 1
Pay somebody to set up a server with this on it on Fiverr or freelancer.com, right? Like even if it's a couple hundred bucks, like hit up a developer that works with this software stack and just says, hey, uh, can you just like put this on the server and set it up? And then you've got it set up. All right, hackers. I'm looking at my inbox right now for HackerOne and I'm smiling ear to ear because I got paid out a crit and it came from that last CTBB hackalong that we did live in the Discord with a couple hundred of you guys present when I had to cut the stream and go report a vulnerability live. That resulted in a $7K crit from Adobe, guys. I'm so pleased. The Adobe team also added a $650 bonus to that report because I used the CTBB partnership code. Okay. So I want to remind you guys that Adobe is offering a 10% increase in bounties to you guys, the Critical Thinking listeners, because you guys rock. Okay. And that is targeting Adobe Stock and Lightroom Web. Adobe Stock and Lightroom Web, and it is a 10% bonus for valid AI vulnerabilities mapped to OWASP Top 10 for LLMs. Okay, so definitely go target those two products, look at the AI features and report some awesome bugs and you'll get that 10% bonus. We'll put the, the information down in the description. The code is CTBB063026. That's June 30th, 2026, because that's when the promotion ends. Okay, go tear them up, guys. They're a really good program. They're paying some great bounties and they're offering some special love to you guys, the CTBB listeners. All right, let's go back to the show. All right. Sup hackers, before we hop into the episode today, uh, we've got the This Week in Bug Bounty segment, just some rapid fire news for you. The only item on our list actually is very relevant to the episode and it is the Open Source Security Testing Guide, uh, by YesWeHack. Okay. Um, so we'll go ahead and link that down in the description, but we're going to spend a lot of time this episode talking about, uh, source code review and that sort of thing. And this, while this guide is focused specifically on open source, there are a lot of really applicable techniques here to any sort of source code review, right? A lot of the tools you need to know, a lot of the terms you need to understand. So give this a glance if you're getting into that space. There's also another really big added benefit to these write-ups like these nowadays, which is you can feed these into your, your AI. Hey, say, hey, check this out. Write up a nice little skill for this based off of this article and others. Um, one of the ones that I thought was really nice in this, uh, example was, um, mentioning that you can define, uh, bookmarks throughout the different pieces of code, uh, that makes it easier for you to jump around into different pieces and follow your own code flows inside of VS Code. Love that. So check out this article. It'll be in the description. All right, let's hop to the show. Alright, hackers. Today I am attempting something quite ambitious, which is I'm going to try to give you guys, um, some cross between a meta-analysis and a recipe and just random thoughts thrown together on source code review. Um, the idea originally for this episode was like, can I find a way to grab a bunch of different sources on source code review, analyze what kind of patterns we see across all of them. And kind of put those into one episode. So we did that. And then I kind of added a little bit of my thoughts here and there. We asked a couple experts in the industry, got their thoughts, pulled all that in. Um, and that is what this episode turned into. So I've spent the past 3 hours reading through this document, trying to tweak it, trying to get it to the spot where it needs to be. Um, and hopefully and try to abstract some of the stuff as much as I can. And hopefully you guys will be able to get some good comprehensive coverage on the topic of source code analysis. Okay. I think this is a really applicable topic right now because AI is, is augmenting the space a lot. And while I will give you some tips on how to use AI for source code analysis today, most of this is going to be focused on the good old here's how you do source code analysis. Source code review. And you can then take those and collaborate with Claude or your AI of choice to, to sort of superpower it. Okay. So first, let's, let's go ahead and jump right into it. The first thing that I want to challenge you with, if you're going to do source code review, is what is your goal for the source code review? Looking at a couple of the main purposes for source code will reveal a couple things. One, for example, if you look at Asynode or the SL Cyber team, they are often just looking for pre-auth RCE, right? That's also the kind of thing you see in Pwn2Own, right? If you know that you are just looking for pre-auth RCE, you know you have to start from an unauthenticated route, right? Or you have to bypass authentication. So that's, you know, if you've defined your goal in advance, hey, what I want is unauthenticated RCE, then you can start from there. If you are a bug bounty hunter and you're just looking to cash in, Maybe you can, you know, get bugs, get rewards for post-auth XSS or something like that, right? So it's just, it's helpful to know in your head, what am I going for here? And have that well-defined from the beginning. Okay. So next, I kind of wanted to throw a little theory out there before we jump into everything. I've been sitting here looking at all the vulnerabilities that sort of powered this episode and all the articles and stuff like that. And I think, I think that there are really 3 areas that almost all of the vulnerabilities that we find sort of, sort of come together, sort of narrow down to when you think about it. The first one is tracing the data flow better than the developer. The second one is knowing the nuances of the context where your payload is landing better than the developer. And then the third and last one is just, of course, the developer makes some sort of mistake and forgets a security control. So let's look at each one of those really quick and kind of talk about those from a high level so that we can understand where all the other vulnerabilities that, that you'll see sort of how it falls into that model. And you can start thinking about it from a more abstracted perspective. Okay. So tracing the data flow better. This often, you know, this is a pretty common scenario where the developer just doesn't understand that what they are working with at the time is user input. I'm gonna give you an example of this from a real vulnerability that I found. We, me and my buddy, we uploaded a zip file and this, it extracts and pulls a file out of that zip. And then inside of that file, there's XML. And inside of that XML, there's an attribute. Okay. And somewhere down in the pipeline, that attribute got embedded into a command line shell out and we were able to escape that, that command line argument and get RC via command line injection. Okay. And that's a classic scenario of like, oh, because it was so abstracted, because, you know, they had parsed the XML, the XML was different from the file. It was, you know, far away from the zip. And, you know, it was, it was in a job that was not even, you know, correlated with that original HTTP request. I think the developer just didn't realize that they were dealing with malicious input here. Okay. So this is one of the things that we're going to talk about a good bit in this, in this session is you just really need to get super good at tracing data flow. Like, I know that that is super dumb and basic to say, okay, but really that's, meat of it is you need to know if data is coming in, right? Where, where is it going all the way through the pipeline, right? If there's one skill that I can tell you to master, it's that just tracing where the data is going to go through all of the sources all the way to the sinks. Okay. Second one is knowing the nuances of where your payload is landing better than the developers. There's a really good example of this. Let me go ahead and pull it up here in one of the SL Cyber write-ups. I'm going to reference them a lot. Shout out to Shubs and the team for also helping with this episode. I'm gonna, I'm gonna share my screen here as well so that you guys can see this, this snippet from the, from the write-up. But essentially it says another key implementation here, implementation detail that must be well known by those writing any file and path operations in C# is how Path.Combine works. We've seen this throughout our careers as source code auditors where the usage of Path.Combine leads to critical vulnerabilities. If the second argument of Path.Combine usually the user input. If it is an absolute path, the previous argument is completely ignored and the absolute path is returned. The documentation does try to make this obvious— be this behavior obvious, clearly stating— and then proceeds to quote the documentation which says the exact same thing. Okay, um, so this is a perfect example of like, hey, the developer thinks that what they're doing with pathCombine is prepending you know, a certain path that locks the user into a directory. But really, you can supply an arbitrary directory as the second argument and it just removes the first argument, right? And so that's a classic example of like, well, if you understand the nuances of this technology better than the developers themselves, then you'll be able to find vulnerabilities like path traversal, or I guess it's not really even a traversal here, it's just arbitrary path injection here. Another classic example of this, like understanding the nuances of the payload or the context in which your input is getting injected is XXE, right? A lot of people don't realize when they're parsing XXE or parsing XXE, man, you can see in my hacker brain, when they're parsing XML, that XML has this crazy ability to like request files and like send HTTP requests and stuff like that. Like why would it have that? And the only way that you can exploit that is by knowing more about XML than the people using the XML, right? So number 2 here on the list is knowing the nuances of where your payload is landing better than developers. And this takes a lot of work. And AI can certainly help with this, right? You can, you can ask it to learn about a specific technology stack or even a specific function, how it's implemented. Right? We saw that with, with path, the path combine callout there. Yeah, just really need to know where your path, where your, your input is landing in what context. And we'll talk a little bit more later about the more complex that context, the better. Okay. And then last but not least on my list of 3 items is forgetting a security control. Okay. This one's basic. There are security controls in place. In the app, or maybe there aren't any at all and they just completely forgot about security altogether. But if somebody forgets a security control, then of course you can just plop in there and exploit it. Okay. So here are the main things abstracted up as high as I can that you should look for. Tracing the data better than, than the developers themselves, realizing that you control a piece of data deep, deep, deep, deep, deep in the app. Knowing the nuances of where your payload lands better than the developer. Right, understanding the context and then looking for forgotten security controls or, you know, the lack of security controls in general, okay? So those are some sort of abstracted stuff that I thought you guys might benefit from. I keep that under the set your goal heading because these are sort of things we need to think about as we're going into the app, okay? Now let's talk about getting our hands on the software. A lot of times we have open source software and that's great. We can just audit the source code directly from there. And you may even want to pick your target based off of the fact that you have access to source code. I've talked on an episode recently about hacking on SDKs. Those are really good attack surface because you have the source code for the SDK and you can just audit it and look for vulnerabilities in there. But if you need to get your hands on source code and it's not open source, there are a couple of ways you can do it. One, you got to look for Docker images supplied by the organization. That's a great way to do it. AWS, Azure, or GCP images also in the cloud environment. Sometimes they have those set up so that you can set up that software and get a trial or whatever, moving very quickly without having to talk to a sales team. We've seen that a couple of times. I figure I'll just shout this out as well. You also should look into packages and related packages. To the actual main, you know, app in pip and npm. We just saw recently with Cloud Code, you know, Cloud Code, we can look at the minified JavaScript or whatever, but they shipped their map files, right? And, you know, if, if you've got the map files, that makes your life a lot easier. So make sure you check the repositories, the packages, the package managers. That's what I'm saying. The package manage repos, and know that those are not necessarily gonna correlate with a GitHub repo. You know, there may not even be a GitHub repo, but even if there is a GitHub repo, note that there may be something different in the package manager there. So keep that in mind. Another option is reversing it, right? We oftentimes will get a binary or something like that, and it's reversible with like.peek or, Maybe if you got a JAR, you can convert it back to Java or whatever. You got to know that what type of language was used to create this binary and is it possible for me to reverse stuff out? Another shout lately is that we've been using Ghidra with a very, very, very dense binary and just hooking Claude into it via MCP. It's cracked, guys. It's freaking cracked. And I've actually seen a couple of people on Twitter being like, What the heck, my job is gone. You know, like, like really, really good reverse engineers just kind of saying sadly, like, man, this is my art. I've been like working on this art for a long time and now Claude can just come in there and just be like, hmm, hmm, it goes like this and then it goes, you know. And so even if you're not good at reverse engineering, hook Claude up to Ghidra and you're going to get some crazy results out of that. Okay. And then here was another shout that I, you know, I, I'll have to look up the source for this, where this came out, came from, because I didn't see it in one of the articles. It was actually from the notes of our researcher. But he, he noted down, pay somebody to set up a server with this on it on Fiverr or freelancer.com, right? Like, even if it's a couple hundred bucks, like, hit up a developer that works with this software stack and just says, hey, can you just like put this on the server and set it up? And then you've got it set up and that's the best $200 you've ever spent in your life. If you know how much of a pain it is to get stuff set up. Um, and then you can jump on that, that, um, server reverse, you know, grab the code out and apply some of the other techniques that we just mentioned. Um, but yeah, once you've got your hands on the code, then we go to the next step. Um, which is you, you really do need to set up the code and get a local environment. Guys, this is where AI is really super game-changing. Okay. AI knows how to set stuff up better than anything I've ever seen in my life, dude. And if there's one area where I save a lot of time, it is this, like where I just wouldn't bother with it before. Cause it's like, oh my gosh, this is so excruciatingly painful to set stuff up. Like I just can't do it. Right. If there's not a Docker image, you can just throw AI at it and AI will churn, churn, churn, churn, churn until it gets it working. Such a big, big win to use Claude for this sort of thing. And then as much as I have been on the fence about it, hooking a debugger into this stuff is really, really legendary, guys. I've played with it a little bit lately and I have to say, like, I used to be a proponent for just like try to yeet it at the main server, you know, look at the source code, try to figure it out. It's so unbelievably overpowered to set a, you know, a debug breakpoint and then just be able to see and inspect live variables. So if you're, if you're going to sit down and you're going to be serious about it, if you're going to reverse some code, get it spun up, attach debugger, and you're good to go. This is another area where AI can really, really help a ton. Okay. So we're, we're, we're through the boring stuff here. Figure out a way to get your hands on the source code and get that shit set up and get a debugger hooked into it. Okay. All right. Next, mapping the software. Okay. This is an essential first step. A lot of people say, okay, we, we prefer sync-based analysis. If you listen to some of the episodes back when Joel was a co-host, we debated a good bit like, hey, should we start at syncs? Should we start at sources? What's the idea here? Whatever the idea is, you should map the software first. Okay. And I personally think that moving from a source enumeration perspective at the beginning really gives you a, a strong understanding of the app. But either way, you need to understand the basic structure of the app before you, you start going in and trying to find, you know, the exec function or something like that where that you want to map up to a source. Okay. And once again, AI comes in super clutch here. It can build you graphs, it can lay, you know, slurp up the build files, which is one of the things I wanted to recommend is taking a look at the actual build files. And looking at the various services that they're sort of, you know, duct taping together in some of these enterprise apps. But it can look at all of that stuff and it can create a map for you of what the application looks like. And guys, that is so valuable, okay? One of the things I've learned by working really closely with Shubs a couple times is how much impact he gets out of finding weird little APIs that were hidden in a corner that aren't defined the same way, that have some exception, that are routed a little bit differently. And, you know, he turns that into magic time and time again. Okay. So if, you know, maybe your app is just vanilla and it's just normal, you know, Python Flask routes in a file, you know, but you got to make freaking sure that that's the case. Okay. And that there's not any other microservice or WebSocket server or like gRPC something something. you know, that's like hiding in a corner, uh, that you can go after. Okay. So that's something you really have to pay attention to. I noted, I noted just a second ago, um, that we should be talking about, um, you know, these route definitions and the routing. Well, reverse proxies is another big piece here. Are they using that reverse proxy to wall off certain endpoints or certain parts of the, the routing on the backend, right? Don't just look at the router and assume that all of that stuff is accessible. And then, you know, when you hit your, your, uh, you know, production server, you're like, oh, that route isn't accessible, blah, blah, blah. You've got to look at the reverse proxy. That's the first layer, right? And understand what they're trying to wall off. And that also can give you a nice treasure map for what you should get access to. Okay. Reverse proxy configuration is very important. Middleware configurations, equally as important. Okay. A lot of times auth is being applied at the middleware level. So figure out what decorator or additional piece that gets built into these routes is applying the auth middleware and check that it is holistically applied across all the routes. If it's not, that could be a good gadget. Okay. So make sure you're checking all those. You're looking for WebSockets, you're looking for gRPC, you're looking at the reverse proxies, you're looking at the routes definitions and don't skip the boring parts of all of this. Okay. The Grafana SSRF that I found in 2020, which is one that I refer to all the time because it's such a classic example of a clean exploit, okay? This exploit was found in the avatar route for Grafana, okay? This is the Grafana SSRF. And it would hit Gravatar on the backend and then got a redirect off of Gravatar to AWS metadata credentials, okay? And if I had skipped the avatar route on Grafana, then I would not have found that. And the avatar route is not typically the most interesting path, right? But we really need to be thorough about it. And it was an unauthenticated path, which is what drew me to it in the first place. So make sure you're being thorough and not skipping any of those boring, boring, quote unquote boring looking paths, okay? You know what's gonna be boring is when you don't submit any dang reports because you didn't look at the boring routes. So, Yeah, I mentioned that you should look for services that are explicitly not exposed by the reverse proxy. I also wanted to mention here that you can use stuff like Asset Notes or SL Cyber's, excuse me, Hyoketsu, which is H-Y-O-K-E-T-S-U. That's Japanese for verdict, Hyoketsu. And what that does is it scans the, it supports various, source code bases. But for example, it'll look for jars that are known open source libraries or reused code, right? And then you can not spend your time decompiling all those and trying to cross-correlate them. So this is a great way to get out all the fluff and look at the meat of what is the custom code from this application. So then you're not spending a bunch of time searching for, you know, zero days in some Apache JAR file, right? That being said, I also had in my notes here that sometimes when the single point of failure might be a third-party library, sometimes it, it, it's worth it to look into that a little bit. Okay. And it depends on what program you're working with. You know, if you're, if you're trying to, if they're really not going to pay for third-party bugs, if you're doing this from a bug bounty perspective, then obviously you need to not do that on the other hand. But I do often find that if they're using some rinky-dink little 200-star on GitHub library, there's a decent chance that there actually might be a problem with that. That could be another area where we could see some tooling get introduced soon is like SL Cyber Hyoketsu, sort of. I just wanted to go off in Japanese there. The Hyoketsu, something that looks like Hyoketsu, but for identifying open source libraries or, you know, reused code that is very low star on GitHub or on the package managers or whatever, right? That could be another really interesting attack surface to go after if they're using a not very popular dependency. Just a thought. Once again, just sort of jumping into the AI for this specific section. Really, this— we really want to be using AI for mapping this out. This is one of the biggest productivity gains you gain with AI right now is I feel like I can ingest a codebase in like freaking 30 minutes now, whereas it would take 4 or 5, 6 hours sometimes before to like understand, okay, this is here, this is there, you know, and I feel like I have a good good grasp of it 30 minutes in. Very, very helpful. Massive productivity gains. Okay, so we've covered, you know, set your goal, set up the software, get your hands on the software, that sort of thing, map the software. And now we're going to go into— now we understand the architecture, we understand the attack surface, and we're going to start doing the actual assessment. Okay. I always kind of call this segment this portion sniffing for blood, because there are just so many— how do I put this? There are so many signs, there are so many, like, patterns that will just give you a sketchy feeling if you've done this for a little while. And even if you haven't done this for a little while, if you've read a lot of write-ups, you'll look at something and you'll be like, that's a little weird, right? And I was trying to figure out how to abstract that up And the only way that I think I can describe it is when our user input and like enters a more complex environment. Okay. That's, I think that's really the root of it is our, our, you know, if our string input that comes from a query parameter or something like that is just compared to a string, right. You know, it's not that complex. There's not a lot of wiggle room there, right. It's not probably setting off your alarm sensors for like you know, this could be vulnerable. If instead our query parameter is put into a configuration file that controls, you know, has a whole language in this configuration file that's at our disposal, then, you know, the gears start turning a little bit here where you're like, wow, my simple user input just ended up in a really complex environment. Same thing we see with XSS, right? You know, when we're, One of the ones I love to point to is if our input is inputted inside of a string, inside of a script tag, inside of HTML, right? There are just so many layers of complexity there, right? How is that string used? We've got string features we can use with like backslash Unicode escapes. We've got hexadecimal definitions. You've got so many different things. And then you've got the script layer. Okay, now we're in JavaScript. You know, what can I do here? Let me break out with a single quote. Okay, we're in HTML. I gotta make sure that those, you know, angle brackets and the slash are escaped, even if the single and double quotes are, because I can just cut off the tag, right? There's just so many layers of complexity when our user input ends up in various contexts. And as that complexity continues to increase, that's where things get sketchy, okay? So that's the, one of the more high-level principles that I would like to convey to you is look for your input entering complex environments. What are some complex environments? Well, let's talk through some of those, okay? I created a little list here. Of course, we have the traditional ones like database strings, right? Like SQL injection. Why is that complex? There's various encodings we can use. There's, you know, inside of the SQL, or database string, right? There is a syntax there, double quotes, single quotes or strings, you know, that sort of thing. That's an increase in complexity and comes along with massive capability, right? If it gets— if it just gets complex, that's great. But we also have to be able to do something, right? If we can break out of our string in a database, we can query the database, write to the database, sometimes execute commands, right? Complexity plus capability, very important. Some other areas where we might see that complexity, I mentioned config files from before. I think configuration file injection is a super underrated vulnerability type. There's different language syntax, there's encodings, there's, you know, some of these languages you can, just like JavaScript or, you know, inside of the JavaScript string, you can write in escaped hex, right? There's all sorts of stuff, Unicode, right? So definitely configuration file injection. If our user input ends up in a URL, we've got traversals, we've got the various parts of the URL that we can create segments for. Question mark pops us into the query. Hashtag pops us into the hash, right? If we're defining the domain portion, the sign pops us into the username and password field, right? There, there, these areas where there are different grammars in place that's where we need to be, you know, living. And it takes an understanding of the technology that you're working with, right? Which is why that mapping and understanding the architecture of the application and the technology base that you're working with is super important. And answering some of those core questions like, okay, my input just ended up here. What capabilities does that have? Okay. And that brings it all back up to the top. When I was talking to you guys in the beginning about understanding the nuances of where your payload lands better than the developer who wrote it. Right? That's what we're talking about here with this complexity. Okay. So hopefully I didn't drive that home too hard. Here are a couple other ones that are interesting. If your input lands in an API request body, not just the path, right? That's also interesting. Can you escape out of the JSON? Freaking Franz Rosen told me to start doing this. I was like, dude, nobody ever like doesn't use a library to parse JSON or create JSON. And then I started looking for it and I found a couple of them and I was like, Ate my words there. So make sure you're looking for weird concatenation of JSON, XML, other entities, you know, even some of these binary protocols. I just talked last week about Protobuf structures, right? Binary Protobuf structures. Being able to input into these, very impactful sometimes. We're looking at templates, concatenation with templates. And one of the ones that Shubbs called out as well as an increasingly complex environment is any sort of dynamic code evaluation paths. This is something that I've seen Shubbs do really well. He mentioned a couple: XSLT, RhinoScript, JEL, any sort of sandbox environment where you are getting, you know, plopped into a sandbox. Those are really interesting to look at. So your, your alarm should be going off, boom, boom, boom, boom, whenever you see our input landing in dynamic code evaluation paths or increasingly complex environments. Okay. Hope I drove that home. Um, another thing to look for that kind of, uh, leads to that, that weird tingly spidey sense on where the vulnerability is, is code that seems to be implemented in a different way or a, you know, a legacy way from the main code base. So once you've mapped out the architecture, Um, you know, you understand what, where is the majority of the functionality living? It's living in this main API that's defined in this Flask, blah, blah. Um, but as soon as, like, even if there's another API completely, whichever one's not the like main API is just inherently interesting. So, um, that's another area to look for, for, um, for issues. Okay. I tried to, I tried to get as much of that abstracted as possible. And I think there are probably a lot of other areas where, where you can sniff for blood, right? Where you can sort of, where you get a spidey sense for where the vulnerabilities may be. But these are the ones that I came up with that I thought were most abstractable. Okay. Now let's talk a little bit about some common pitfalls and then we'll go into some patterns that sort of extend. Actually, you know what, let's jump over to the patterns first. We'll jump to the patterns and then we'll jump back to the actual things that seal the deal in some of these environments. Okay, so what kind of patterns do we see in addition to our input landing in increasingly complex environments, which often results in vulnerabilities? And we pulled these from a bunch of different research, a lot of the Asano write-ups, a lot of just stuff I have bookmarked on, on Twitter, a lot of talks across YouTube on source code analysis, that sort of thing. And we kind of put them into like top 5 bullet points here, um, to try to show you guys what, what we've got. So the first one that came up is parser differentials. Um, this is something that, uh, the team over at Yes We Hack, uh, has done an excellent job of writing up recently. So definitely check that out. We'll link that in the description, but, um, there are gonna be a lot of, uh, a, a lot of scenarios where parser differentials, um, uh, cause vulnerabilities. And one of the most common patterns I've seen here is there being a well-defined security layer and then there being a well-defined, you know, application layer and the security layer. I've noticed that if the security layer just gives you a yes, no, then it's, it's less effective. And that isn't always the case, right? But one of the things I really like to do is, you know, look at the security layer and the security layer doesn't pass the input to the application layer is what I'm trying to say. So example, request comes in, hits the security layer, the security layer says, okay, and then it passes the request to the application layer, right? Versus the security layer saying, here's the information to the application layer. Does that make a little sense? There's like a direct handoff between the app security layer and the application layer. And the reason for that is there are two preparation processes is what I'm calling it right now. I'm sure there's a different way to define it, but whenever you're parsing unstructured data, for example, we'll just give the example of a JSON parse, right? Let's say your security layer lives in a JavaScript environment and using JSON.parse. And I'm just pulling this out of my, out of my hat here. And your backend is using Python, which uses JSON.loads. Environments, right? These are two different parsing environments and there might be a differential between those two. And so the security layer can just say, sure, here, you know, this seems fine, pass it, you know, pass it back and say, all good. And then the routing layer, you know, sort of passes it to the application layer on the backend. Versus let's say a scenario where you have something more monolithic, a JavaScript security layer and a JavaScript backend, and it just passes the object directly from the security layer into the, the application layer. Okay, does that make sense? So there's, there's, there's different preparation processes across the frontend and the backend where the security is implemented in the front. I hope that makes sense. There's lots of different examples of parser differentials across lots of different environments. So just be on the lookout for those on how the data is being processed. Next was chaining. Okay, that's another common scenario that we see is that a lot of times these more complex vulnerabilities, especially in enterprise applications, are not single bugs. They are multiple chained bugs together. And this is where I start to think AI is going to have some problems. Certainly we should be using AI and really AI is great for this, right? We can tell it, hey, look, I've got this, I've got this, I've got this. Actually just told it like an hour ago, look, I've got this CSPT. I need an arbitrary JSON hosting primitive. Okay. Find me this, find it for me now. You know? And, and it, you know, goes across everything it knows across about the codebase. And it says, look, the most likely places for this to exist are here, here, here. And I'm going to work through them one by one. And it goes through it and you go through it at the same time and you're, you know, handing stuff back and forth, right? So I would really encourage you guys as you use AI, you know, one of the things it's not great at is yet is doing this multiple chain thing, right? So you have to be the multiple chain thing, make it document gadgets, make it document things that you, you know, that you've used in other pieces of, in other chains, right? This is the importance of like maintaining a nice ClaudeMD file that where the, the AI knows, okay, arbitrary JSON hosting for Justin. Is a very, very valuable gadget, right? And then bringing them together to complete the chains. And same for, same for just forcing it to find the other piece of the chain that you need, right? Then you don't have to go through all the mental duress of like, man, I really need an open redirect right now. Like, where the heck am I supposed to find an open redirect? And you kind of offload it to the AI, see what it comes out with. Okay. So good stuff there. Chaining is essential to high quality, um, uh, you know, source code auditing in my opinion. Okay. Next up is dependency auditing. I sort of mentioned to the, this to y'all before, but if you've got a dependency that looks a little sus, it might be a good idea to take a peek at that, especially if it's low stars or, um, the source code just looks a little different, right? This one was one I added in myself because I've been finding vulnerabilities here. And typically when they're in programs that value impact, they pay pretty dang well. So don't sleep on that. And I wish, you know, I hope all the guys like Ronnie and the people that do dependency confusion vulnerabilities, I hope they, that that becomes a little bit more widely accepted in the bounty space right now. Because man, it is impactful if you compromise a dependency and then boom, your whole source code's gone. Really crazy. Okay, next one is the validate then transform code pattern. We see this pretty often. You know, there's something, for me, the episode, I mean, I was pretty far into my, you know, professional hacking career when I did this, the DOM Purify episode with Kevin Mizu. But in that episode, for me, is where it became super concrete. What Kevin was emphasizing in that episode was that if you do anything, anything to the string after you put it through DOMPurify, then it is polluted. And you can very, very, very likely, you know, cause a problem. The DOMPurify sanitization step needs to be like, pretty much wrapped in the whatever is gonna write it to the DOM for this to actually work properly. And similarly, any sort of sanitization, any sort of validation checking that you do for vulnerabilities really needs to be right before the sink where it is, where it would potentially be abused, okay? I'm gonna say that again. When there's some sanitization or checks for malicious input, It needs to be right before the place where it would be maliciously used. Okay. If it's not, any, anything in that space is interesting, right? What happens between the check and the actual sync? That is one of the most common things that we see is, you know, code gets more complex over time. That, that little segment between the validation and the sync gets bigger when somebody adds in a little feature here or there, right? So make sure you're checking that space between the validation and the sync. Um, another big one that I've seen often is alternative sources. Okay. I'm going to tell you guys a quick story about this one. So I was banging my head up against an app. I was like, man, if I could get XSS here, this would be game-changing because of the way of the, the way the app works, right? On authenticated input, it would be great. Um, anyway, it would compromise the whole app because of warmability. Okay, so I was banging my head up against this Markdown editor and I'm like, surely there's something here I can find. Couldn't find it. Anyway, I notice that there's also a way for me to send text messages into this application and I'm like, all right, you know, let me, let me, let me try that out. So I whip out my phone, I write a, you know, message, an actual, HTML message on my phone and I send it and I typoed it. So I send it again and it pops on the admin panel and I'm like, frick yes. Um, so one of the takeaway from this being is look for all alternate sources into the place where you want your data to live. Okay. Um, surely there's going to be the traditional path, but there's also going to be maybe legacy endpoints, v1, v2, v0 endpoints. There's going to be alternate data streams. Is data coming in from SMS? Is data coming in from WebSockets? Is data coming in from, you know, being loaded in via a mass import from a configuration file, right? Maybe that one's missing the sanitization that's happening at the source rather than the sink. And that's why we sanitize more at the sink rather than at the source, because like there's, you know, you have to hit every single source. Right. Versus if you sanitize right before the sink, then, you know, you know, it's just going to go right into the sink. Does that make sense? Okay. I hope it does. So look for alternative sources for data to flow into the application and look for patterns of where they're putting that sanitization, if they're putting it at the source or at the sink. Yeah. And I just wanted to say, you know, in with regards to AI and these common patterns, You can take these patterns, we'll, we'll have them in the Hacker Notes, and you can tell AI, hey, look for these across the application. Look for sanitization that is far from the, the sink. Look for, you know, sus dependencies. Look for this, you know, alternative sources into the same spot. Look for the, look for these gadgets that I need. AI is just here to bolster all of that as you, as you do your security assessment. Okay. Now let's jump into a couple common pitfalls that sort of close the deal in a lot of situations. I ranted a little bit about this on a recent episode, but freaking casing, guys, let me tell you, casing causes a lot of dang problems. Okay. So just keep that in mind. Uppercase letter, lowercase letter, things are gonna parse those things, you know, the same or different, right? And it's good to know which one's which. Unicode normalization, much like, you know, the casing, is also something that often causes a lot of problems. That one's a little bit less easy to spot in the source code review. That is something that you could shout to AI and be like, yo, you see any like Unicode normalization issues here? Where Unicode is potentially being processed, could be a good piece. Once again, you'll see that more in increasingly complex environments. If you're handing off from one technology stack to another technology stack, what are the assumptions that are between these two technology stacks that may not hold true in all edge cases? Another one that's just really funny that we see really often 2, I'll give you, non-global replacement and bad regex. Like, they get their own shoutouts here because they are so common, okay? If somebody's trying to replace something, they very, very often forget the, you know, the global piece or the recursive element of it, right? You know, especially when they're sanitizing with a replace, that's just hacked, okay? So anytime you see a replacement on your input, just like stare at it for a couple minutes and be like, hmm, Hmm. And something will jump out to you. That's what I do. I just, I see replace. All right. I'm gonna be staring at this for the next 10, 15 seconds at least. Definitely a good use of time. Bad regex, like I mentioned a second ago, very common. Dot escapes, missing, you know, up arrow at the beginning or dollar sign at the end, super common. Look deep into this because there are really some things that people can miss here. One of the ones that I really like is a backslash inside of square brackets. Trying to escape stuff that you don't need to escape and accidentally putting the backslash character inside of square brackets and thinking that it's, you know, it's actually backslash escaping something, okay? Really, really fun scenario there. Encodings, another thing, you know, we mentioned Unicode, but you've got to master all of your encodings, of course. Um, yeah. Incomplete syntax checks is another really big one. For example, I mean, the big one that just comes to your mind is like, oh, if they're just looking for like script tag for XSS input, these are this, I'm giving you this example. I know, you know this, right? But let's generalize it a little bit. Let's abstract it a little bit. A little bit to any sort of incomplete syntax checks. Anytime you are looking for, you know, the, the code is looking for a specific syntax to disallow, right? That's when we need to think, okay, is that disallow list, is that block list really thorough or not? And are there any cases? Can I capitalize my script tag? Can I, you know, use an SVG on event handler, right? What else can I do here that will address this incomplete block list? Very common. These are the kind of things that seal the deal, right? Once you've found, you sniff the blood, you find an unauthenticated route from the legacy API, you go in there, you're looking at the code, bad sanitization, boom, XSS, boom, file read, whatever. This is one that I added to the list because we see it a lot with like RCE write-ups, which is, uh, you know, something that we used largely for this meta-analysis here, is dynamic function calling. So by this, what I mean, I mean, there's lots of ways that you can do this, but the one that sort of rings true for me in PHP, 'cause we covered it on an episode a long time ago with the Wordfence team, is this like really weird method invocation structure where you can just call a string in PHP. Right? You have a string and it has a function's name in it and you just call it, right? And it's like, oh wow, that works. Like, that's crazy. So it's really important to do your, your, your data tracing there and see if there's any sort of scenario where you're essentially doing an RPC, right? Like a remote procedure call, because that is, is where a lot of yuck comes in and a lot of flexibility for the attacker. Okay. So keep an eye out for any of those. Keep an eye out is what I meant to say. For any of those situations. As I'm saying that, keep an eye out is such a weird English phrase, right? Like, keep an eye out, like your eye. Anyway, I digress. Got a couple more here and then we'll wrap for this episode. Incomplete application, like I mentioned before, of security controls. That's another really big one. Ask the AI. This is something that I can really help you with. Where are the pivotal security controls here? If they are not globally applied as middleware, how, how do I check whether everything has this, you know, function call right at the beginning of its route handler, right? How do I check whether, you know, there are any weird places where, where this function is not being called properly or or maybe it's like in an if statement with another condition, or, you know, the way that this function is called seems non-standardized, right? Like I just mentioned the if statement just a second ago, maybe one of them is a truthy sort of thing, right? That's another big thing is truthy issues and typing checks. We need to be, it needs to be consistent across the board, right? Any sort of auth or Sanitization checks need to be consistently implemented across the board, and any variance is a very, very common pitfall and an area that you should look deeply into. Another one that came up here is making auth decisions off of an inconsistent source. Okay, like, as I said a moment ago, the middleware, if it's not globally applied, where, where are they getting their auth information? Is this coming from anywhere except this bearer or this cookie or this one function that processes it as a part of the HTTP request. If they are, um, then we need to know where that is and we need to double-click into that deeply. Okay. Um, so that's a really big one as well. Lastly, we have custom sanitizers. Okay. Um, a lot of times we have things like DOMPurify for a reason, right? There are going to be very, very complex environments where our input is landing. If you see a custom sanitizer, you need to be very, very locked in on that. And you may even be able to instrument it to fuzz it a little bit better. That's something that I've done in the past. And when you do, something often falls out. And it may not be a full bypass, but a partial bypass sometimes can be enough to give you another gadget in your chain. Which you can chain with something else. All right, y'all, that's just a bunch of me ranting about stuff. I, we're gonna have a ton of resources linked in the description from this episode that you can go to learn more about source code analysis. I tried to summarize it best I can. I wanted to, so really give a lot of respect to those people whose work is in the description. I really, really appreciate the write-ups that they did, but I wanna give a special thanks to Shubs, Rafax and FSI for giving direct feedback or direct input into this episode and really changing the way that we, that we thought about this, this episode from the meta-analysis perspective. So shout out to them. And I hope you guys gained some value out of that. Keep hacking, y'all. Keep popping bugs and zero days on these, on these code bases. All right. All right. Peace. And that's a wrap on this episode of Critical Thinking. Thanks so much for watching to the end, y'all. If you want more critical thinking content, uh, or if you want to support the show, head over to ctbb.show/discord. You can hop in the community. There's lots of great high-level hacking discussion happening there on top of the masterclasses, hackalongs, exclusive content, and a full-time hunters guild if you're a full-time hunter. It's a great time, trust me. All right, I'll see you there.