Why I Switched From Pandas to Polars for Football Event Data
A faster, lighter way to explore real football data - without waiting on your tools.
While building my analytics pipeline, I hit a wall: loading 3,433 Parquet files (~12 million rows) took 44 seconds in Pandas and ballooned to 14 GB of RAM. Even a simple .describe() felt like watching paint dry.
Enter Polars. Same data: 3 seconds to load, 7 GB peak RAM, no manual column wrangling, and chained queries that read like SQL. My laptop finally kept pace with my curiosity.
If you use Pandas on match data, or on any large, wide table, this benchmark could spare you hours of waiting.
👉 See the technical deep dive and results here | View the code
Have thoughts, questions, or feedback? Let me know by responding or leaving a comment.