Google CTF 2025 – webz : Exploiting zlib's Huffman Code Table
velog.io | 103 points by rot22 2 days ago
It should mention that the bug only exists after some arbitrary "patch" was introduced. The current title makes it sound like the actual zlib has a security issue.
Seems like it's not just arbitrary, but crafted. Could not find it anywhere, for example, searching for "DISTS so we can remove overflow checks from" (with quotes ofc) brings up just this site, both in Google and Bing. It has typos, btw. It would be another issue if it came from https://chromium.googlesource.com/chromium/src/+/HEAD/third_..., but that's not the case.
Crafted for the Google CTF. Here's the challenge:
https://capturetheflag.withgoogle.com/challenges/pwn-webz
There's an attachment link, which I believe contains the patch (I haven't looked though):
https://storage.googleapis.com/2025-attachments/193040ef9e60...
The original title included "[CTF] Google CTF 2025", which would strongly hint (CTF = capture the flag) at the possibility of an artificial setting. That probably should have been included in the submission.
Not the author. The first sentence of the article does say this “webz is a zlib exploitation challenge from Google CTF 2025. The Google-zlib implementation provided in the challenge is not upstream; it’s a version with an arbitrary patch applied.”
It’s almost literally your comment, word for word.
Google CTFs are fascinating. Amazing questions, I always enjoy the write ups.
Unfortunately I’ve never been able to solve one, or even make meaningful progress.
Don't give up. You can do it.
You should start with the Beginner's Quest CTF, implement a writeup's solution without looking at the writeup's actual code, and play other CTF-style challenges such as OverTheWire's Bandit.
Great resources and sound advice. Thank you, I will take a look at the Beginner’s Quest for sure. I will also definitely follow the implementation advice. It just clicked. It’ll generate a ton of aha moments for sure.
I did Bandit years ago, along with many other wargames and CTFs (HTB, DEF CON, etc.). I still do CTFs every Friday, have worked in the field for over a decade, and have 3 CVEs (CVSS 7+, one 9) to my name. I think I’m missing something else entirely when it comes to Google CTF.
Maybe I need more theoretical knowledge (is that the right word here? By theoretical I mean more around pure cs and math) vs hands on real world (as in day to day) vulnerability research and exploitation.
Would love to hear some feedback to get better. There’s always more to learn in all directions.
I haven't seriously competed for a while - the team I used to play with is all but disbanded. Back in the day I used to complete a challenge, maybe two, very rarely three in the top tier CTFs - out of 20-30 challenges - so definitely you need a team. (I also often got zero challenges and nothing to show for my time.)
I don't have any references for this, but I remember reading that a couple of the bigger teams, those who would win often, had 30-40 players, so they could have one or two people working on each challenge in parallel. Of course, talent isn't equally distributed - my team usually had 10-12 people, of which maybe 3 would get us 60-70% of the points we earned.
(I was not one of them. My personal goal was 1/n of our points, so if we were 10 people playing and got 5000 points, I'd be content if I solved challenges worth at least 500. I made it about half the time.)
Anyway, I don't think CS theory is necessarily useful for this - with the exception of the crypto (more on this later). What you really need is a combination of four things:
1) Solid understanding of the elements of each challenge type:
For web or misc, that's how to use sockets, make HTTP requests; what you can and cannot do (can you send a request with unescaped characters? Can you send the wrong Content-Length header? How big a payload can you realistically send?); what basic algorithms exist, how fast they can run and how to use them; Linux permission models. For pwn that's exploitation techniques, ROP, memory protections. For reversing that's reverse engineering techniques, the use of Ghidra or IDA or radare2, sometimes writing processor definitions for them.
For crypto you need to understand linear algebra over finite fields at the very least.
2) Fast learning: You will need to learn a new crypto attack, or the intricacies and gotchas of a particular JS framework, a new language, or a new embedded processor. In [1] you needed to learn what PIL can and cannot parse, how Pickle works under the hood, and, at a shallow level, how PNG image compression works.
3) Iteration. Challenges often have multiple steps. Solving one is usually not enough. Read [1] - it's a great writeup that highlights that point.
4) Resilience. I worked on [2] for a day and a half. But I'm not super up on lattice reduction theory and I didn't know about BKZ reduction. Other people didn't know about it either, learned about it as they went and solved it. I didn't manage. So I didn't solve it. That happens a lot. Live with it and do your best.
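To make the "wrong Content-Length" question in point 1 concrete, here is a minimal sketch of building an HTTP request by hand so that the declared length can disagree with the body. The names `build_raw_request` and `send_raw` and the host `example.test` are made up for illustration; how a given server reacts to the mismatch is exactly what you would have to test.

```python
import socket


def build_raw_request(host, path, body, declared_length=None):
    """Build a POST request as raw bytes.

    declared_length lets the Content-Length header disagree with len(body),
    which ordinary HTTP client libraries normally will not allow.
    """
    if declared_length is None:
        declared_length = len(body)
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {declared_length}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def send_raw(host, port, payload, timeout=5.0):
    """Send the raw bytes over a plain TCP socket and return whatever comes back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


# Header claims 10 bytes, body only carries 3 (placeholder host, nothing is sent here).
request = build_raw_request("example.test", "/", b"abc", declared_length=10)
```

Only `build_raw_request` runs in the last line; calling `send_raw` needs a real target that you are allowed to probe.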
[1] https://emanuelmairoll.at/posts/hitcon2025-imgc0nv
[2] https://ibrahimadel.netlify.app/posts/filtermaze-google-ctf-...
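As a taste of the "linear algebra over finite fields" mentioned for crypto in point 1, here is a minimal sketch of Gaussian elimination modulo a prime. The function name `solve_mod_p` is made up, it assumes a square system with a unique solution, and it needs Python 3.8+ for the modular inverse via `pow(x, -1, p)`. Real challenges usually reach for a computer algebra system instead.

```python
def solve_mod_p(A, b, p):
    """Solve A x = b over GF(p) for a square matrix A and prime p."""
    n = len(A)
    # Augmented matrix [A | b], with every entry reduced mod p.
    M = [[x % p for x in row] + [bi % p] for row, bi in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular mod p")
        M[col], M[pivot] = M[pivot], M[col]
        # Division in GF(p) is multiplication by the modular inverse.
        inv = pow(M[col][col], -1, p)
        M[col] = [(x * inv) % p for x in M[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(x - f * y) % p for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


# 2x + y = 1 and x + 3y = 2 (mod 7)
print(solve_mod_p([[2, 1], [1, 3]], [1, 2], 7))  # → [3, 2]
```

The same row operations as over the rationals apply; the only change is that every division becomes a modular inverse.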
Legitimately, they are often too hard. Balancing the problems is quite challenging.
On top of that, the solutions often make the problems seem much more intimidating than they are (not that they are easy). Most solutions involve a lot of “happenstance”, where someone tried something and got an outcome that was useful, which they then built on top of. This makes the solutions look crazy complicated (“how would I have ever thought of this!?”), when in reality they are Rube Goldberg machines built out of duct tape and baling wire.
I’ve only solved a few Google CTF problems, and one of them was the one I wrote, lol. That was nearly a decade ago though.
Maybe I'm misgeneralizing, but this seems very similar in flavor to the webp vulnerability a few years back
> LZ77 decoding. This actually triggers the bug and causes integer overflow.
As I understand it, accumulating the tables is contingent on CTW.
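For anyone unfamiliar with the LZ77 step quoted above, here is a generic sketch of how back-reference decoding works and the kind of bounds check a decoder depends on. This is not zlib's inflate code and not the challenge's patch; the token format and the name `lz77_decode` are assumptions made up for illustration.

```python
def lz77_decode(tokens):
    """Decode a list of tokens: an int is a literal byte, a (distance, length)
    tuple copies `length` bytes starting `distance` bytes back in the output."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, int):
            out.append(tok)
            continue
        distance, length = tok
        # Without this check, a bad distance would read outside the output window.
        if distance <= 0 or distance > len(out):
            raise ValueError("distance points before the start of the output")
        start = len(out) - distance
        # Copy byte by byte so overlapping copies (length > distance) repeat the pattern.
        for i in range(length):
            out.append(out[start + i])
    return bytes(out)


print(lz77_decode([97, 98, (2, 4)]))  # → b'ababab'
```

In C, the equivalent of that distance check is what stands between a malformed stream and an out-of-bounds memory access, which is why removing or weakening such checks makes for a CTF challenge.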
Good god that's a wild read.
I wonder if AIs could catch that.