I've read many posts arguing that **SELECT *** is bad and should be avoided, but most of them don't clearly explain why. Today, let's discuss why using **SELECT *** is not recommended, especially when working with large tables that contain millions of rows or many wide columns.
1. Unnecessary I/O
You don't always need every column. If you fetch all columns but use only a few of them, you waste resources, because the database reads unnecessary data from every page it touches (we'll discuss pages in future posts). **SELECT *** can also prevent index-only scans: when every column you need is covered by an index, the database can answer the query from the index alone, but asking for every column forces it to fetch the full row from the table as well.
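Here is a minimal sketch of that effect. The `users` table, its columns, and the covering index are made up for illustration, and the plan comments follow PostgreSQL's `EXPLAIN` conventions; the plan your database actually chooses depends on its statistics and visibility information.

```sql
-- Hypothetical table and covering index (names are assumptions for this sketch)
CREATE TABLE users (
    id         bigint PRIMARY KEY,
    email      text NOT NULL,
    full_name  text,
    bio        text,          -- wide column we rarely need
    created_at timestamptz NOT NULL DEFAULT now()
);

CREATE INDEX users_id_email_idx ON users (id, email);

-- Selecting only indexed columns lets the planner answer from the index alone
EXPLAIN SELECT id, email FROM users WHERE id < 1000;
-- Typically: Index Only Scan using users_id_email_idx on users

-- Selecting every column forces a visit to the table (heap) for the extra data
EXPLAIN SELECT * FROM users WHERE id < 1000;
-- Typically: Index Scan (or Seq Scan) on users
```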
2. Increased Network Traffic
Fetching columns you don't need also consumes extra network bandwidth. The more data you transfer, the greater the network load, which can slow down your application, especially with large result sets.
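One rough way to see how much each row carries over the wire is the estimated row width in the query plan. Continuing the hypothetical `users` table from the sketch above (PostgreSQL-style `EXPLAIN`; the widths shown are placeholder numbers and will differ on your data):

```sql
-- Estimated row width reflects how much data each row carries to the client
EXPLAIN SELECT * FROM users;
-- e.g. Seq Scan on users  (cost=... rows=... width=120)

EXPLAIN SELECT id, email FROM users;
-- e.g. Seq Scan on users  (cost=... rows=... width=40)
```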
3. Increased Application Memory Usage
As your result sets grow, your application needs more memory to hold data it never uses. That extra pressure on the application's memory can lead to performance issues.
4. Serialization and Deserialization Overhead
Before the server sends a result set, it serializes each row according to the wire protocol, so the more data you send, the more serialization work the server's CPU has to do. On the other side, the client has to deserialize everything it receives, which consumes CPU there as well, so unnecessary columns add overhead on both ends.
Conclusion
This post doesn't claim that **SELECT *** is inherently bad; rather, the real issue is not knowing when and where to use it. It's fine to experiment with it while learning, but in production environments, using **SELECT *** indiscriminately can be costly in terms of performance.
Telegram: otabekswe