July 2024

Published on Jul 30, 2024

#personal #duckdb

A Month of Demos

TL;DR: It’s been quite a month. I finished offboarding from my previous company and will start a new role soon. In between, I have been experimenting with a lot of different things. Some of those projects are below:

Hugging Face Datasets Explorer

In July, I collaborated with the Hugging Face team to create the Datasets Explorer, which lets you explore datasets on the Hub with SQL. It’s entirely open source and available on the Chrome Web Store.


The ability to run any SQL query on any dataset on the Hub is pretty amazing. I wrote a blog post about it with a few examples. Here are a few of the Datasets Explorer’s best features:

  • Run any SQL query on any dataset on the Hub (public and private)
  • Read data from the Hub in chunks with infinite scrolling
  • Command + Enter shortcut to run the query
  • Automatically loads dataset configs and splits as views
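
For illustration, the view loading and chunked reads in the list above could look roughly like this in Python with DuckDB. This is a minimal sketch, not the extension’s actual code: the helper names (`view_name`, `create_view_sql`, `page_sql`, `load_views`) are hypothetical, it assumes the dataset is reachable through DuckDB’s `hf://` paths and the Hub’s auto-converted Parquet branch (`@~parquet`), and LIMIT/OFFSET paging is just one simple way to read in chunks.

```python
def view_name(config: str, split: str) -> str:
    """Turn a config/split pair into a safe SQL identifier (hypothetical naming)."""
    raw = f"{config}_{split}"
    # Replace anything that is not a letter or digit with an underscore.
    return "".join(ch if ch.isalnum() else "_" for ch in raw)


def create_view_sql(dataset: str, config: str, split: str) -> str:
    """Build a CREATE VIEW statement over the auto-converted Parquet files."""
    # Assumption: the Parquet files live under @~parquet/<config>/<split>/.
    path = f"hf://datasets/{dataset}@~parquet/{config}/{split}/*.parquet"
    return (
        f'CREATE OR REPLACE VIEW "{view_name(config, split)}" AS '
        f"SELECT * FROM read_parquet('{path}')"
    )


def page_sql(view: str, page: int, page_size: int = 100) -> str:
    """Build a query for one chunk of rows, as infinite scrolling might request."""
    offset = int(page) * int(page_size)
    return f'SELECT * FROM "{view}" LIMIT {int(page_size)} OFFSET {offset}'


def load_views(dataset: str, splits: list[tuple[str, str]]):
    """Create one view per (config, split) pair and return the connection."""
    import duckdb  # third-party package; needs network access to the Hub

    con = duckdb.connect()
    for config, split in splits:
        con.execute(create_view_sql(dataset, config, split))
    return con
```

With the `duckdb` package installed and network access, `load_views("user/dataset", [("default", "train")])` would register a `default_train` view, and `con.execute(page_sql("default_train", 0)).fetchall()` would fetch the first 100 rows. The dataset name here is a placeholder.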

Model Release Heatmap

Hugging Face Hub Stats

Powered by DuckDB WASM, it queries the hub-stats dataset on Hugging Face that I created. Essentially, I have a script that runs daily and updates the stats for all the models, datasets, and spaces on the Hub. It’s pretty amazing to be able to query semi-large datasets right in the browser.
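
As a rough idea of the kind of query a release heatmap needs, here is a small Python sketch that counts rows per day from a Parquet file with DuckDB. It is an assumption-heavy illustration rather than the site’s real query: the function names are hypothetical, the `createdAt` column name is a guess at the schema, and the path is whatever Parquet file holds the model stats.

```python
def releases_per_day_sql(parquet_path: str, date_column: str = "createdAt") -> str:
    """Build a query that counts rows per calendar day.

    Assumption: `date_column` is a timestamp column; the default name is a guess.
    """
    return (
        f'SELECT CAST("{date_column}" AS DATE) AS release_day, COUNT(*) AS releases '
        f"FROM read_parquet('{parquet_path}') "
        "GROUP BY 1 ORDER BY 1"
    )


def releases_per_day(parquet_path: str, date_column: str = "createdAt"):
    """Run the query and return (release_day, releases) tuples."""
    import duckdb  # third-party package, imported lazily

    return duckdb.sql(releases_per_day_sql(parquet_path, date_column)).fetchall()
```

Each returned tuple maps to one heatmap cell: the day picks the cell and the count sets its intensity. The same SQL string could be handed to DuckDB WASM in the browser.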

Hub Stats Dataset

US EV Charging Dataset