High-resolution efficient image generation from WiFi Mapping

arxiv.org

118 points by oldfuture 12 hours ago


fxtentacle - 10 hours ago

FYI, the images are not generated directly from the WiFi data. The WiFi data is used as additional conditioning for a regular diffusion image generation model. In other words, the WiFi measurements determine which objects to place where in the image, and the diffusion model then fills in any "knowledge gaps" with randomly generated (but visually plausible) data.

jychang - 11 hours ago

The image examples from the paper are absolutely insane.

Is this just extremely overfitted?

Is there a way for us to test this? Even if the model isn't open source, I'd pay $1 to send the researchers a capture from the WiFi card on my Linux box, have them generate a picture, and see if it's accurate.

esrh - 8 hours ago

This is my paper (first author).

I think the results here are much less important and surprising than what some people seem to be thinking. To summarize the core of the paper, we took Stable Diffusion (which is a 3-part system of an encoder, U-Net, and decoder) and replaced the encoder with one that takes WiFi data instead of images. This gives you two advantages: you get text-based guidance for free, and the encoder model can be smaller. The smaller model combined with the semantic compression from the autoencoder gives you better (SOTA resolution) results, much faster.

I noticed a lot of discussion about how the model can possibly be so accurate. It wouldn't be wrong to consider the model overfit, in the sense that the visual details of the scene are moved from the training data to the model weights. These kinds of models are meant to be trained & deployed in a single environment. What's interesting about this work is that learning the environment well has become really fast because the output dimension is smaller than image space. In fact, it's so fast that you can basically do it in real time... you turn on a data collection node in a new environment and can train a model from scratch online that gets decent results, with at least a little bit of interesting generalization, in ~10 min. I'm presenting a demonstration of this at MobiCom 2025 next month in Hong Kong.
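For a rough sense of what "smaller than image space" can mean, here is a back-of-the-envelope sketch. The shapes are assumptions, not figures from the paper: a 512x512 RGB image and a Stable Diffusion 1.x-style latent, where the autoencoder downsamples by 8 in each spatial dimension and keeps 4 channels.

```python
from math import prod

# Assumed shapes (not taken from the paper): a 512x512 RGB image and the
# latent an SD 1.x-style autoencoder produces for it (8x downsampling, 4 channels).
image_shape = (512, 512, 3)
latent_shape = (512 // 8, 512 // 8, 4)

image_dim = prod(image_shape)    # values a pixel-space model would have to predict
latent_dim = prod(latent_shape)  # values a latent-space model has to predict
ratio = image_dim // latent_dim

print(f"image space:  {image_dim} values")
print(f"latent space: {latent_dim} values")
print(f"reduction:    {ratio}x")
```

Under these assumptions the regression target shrinks from 786,432 values to 16,384, a 48x reduction, which is at least consistent with the claim that training gets much cheaper.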

What people call "WiFi sensing" is now mostly CSI (channel state information) sensing. When you transmit a packet on many subcarriers (frequencies), the CSI represents how the data on each frequency changed during transmission. So, CSI is inherently quite sensitive to environmental changes.
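As a toy illustration of that definition (not the paper's pipeline): if a known pilot symbol X[k] is sent on subcarrier k and Y[k] is received, a simple per-subcarrier estimate of the CSI is H[k] = Y[k] / X[k]. The two-path channel, the 64 subcarriers, and the `estimate_csi` helper below are all hypothetical, and the sketch ignores noise.

```python
import numpy as np

def estimate_csi(tx_symbols, rx_symbols):
    """Per-subcarrier least-squares channel estimate H[k] = Y[k] / X[k]."""
    tx = np.asarray(tx_symbols, dtype=complex)
    rx = np.asarray(rx_symbols, dtype=complex)
    return rx / tx

def two_path_channel(n_subcarriers, delay, reflection_gain=0.5):
    """Hypothetical channel: a direct path plus one reflection delayed by `delay` samples."""
    k = np.arange(n_subcarriers)
    return 1.0 + reflection_gain * np.exp(-2j * np.pi * k * delay / n_subcarriers)

rng = np.random.default_rng(0)
n_subcarriers = 64  # assumed value for illustration

# Known QPSK pilot symbols, one per subcarrier (unit magnitude, never zero).
tx = (rng.choice([-1, 1], n_subcarriers) + 1j * rng.choice([-1, 1], n_subcarriers)) / np.sqrt(2)

# Noise-free reception through the channel before and after a reflector moves.
h_before = two_path_channel(n_subcarriers, delay=3)
h_after = two_path_channel(n_subcarriers, delay=4)
csi_before = estimate_csi(tx, h_before * tx)
csi_after = estimate_csi(tx, h_after * tx)

# Sensing models typically consume the amplitude and/or phase per subcarrier.
amplitude_change = np.abs(np.abs(csi_after) - np.abs(csi_before))
print("max amplitude change across subcarriers:", amplitude_change.max())
```

Changing the reflection delay by a single sample reshapes the amplitude pattern across subcarriers, which is the kind of sensitivity to the environment described above.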

I want to point out something that most everybody working in the CSI sensing/general ISAC space seems to know: generalization is hard and most definitely unsolved for any reasonably high-dimensional sensing problem (like image generation and to some extent pose estimation). I see a lot of fearmongering online about WiFi sensing killing privacy for good, but in my opinion we're still quite far off.

I've made the project's code and some formatted data public since this paper is starting to pick up some attention: https://github.com/nishio-laboratory/latentcsi

equinox_nl - 10 hours ago

I'm highly skeptical about this paper just because the resulting images are in color. How the hell would the model even infer that from the input data?

nntwozz - 9 hours ago

One step closer to The Light of Other Days.

"When a brilliant, driven industrialist harnesses the cutting edge of quantum physics to enable people everywhere, at trivial cost, to see one another at all times: around every corner, through every wall, into everyone's most private, hidden, and even intimate moments. It amounts to the sudden and complete abolition of human privacy--forever."

nashashmi - 8 hours ago

Where is the color info coming from? It can’t come from WiFi. Is that being fed in using a photo?

malux85 - 10 hours ago

PSA: If you publish a paper that talks about high-resolution images, can you please include at least 1 high-resolution image?

I know that is a subjective metric, but by anyone’s measure a 4x4 matrix of postage-stamp-sized images is not high resolution.