Show HN: Joule, a Git-like workflow for your Datasets (joule.host)
2 points by Allezxandre on Nov 3, 2019 | 4 comments


I don't begin to understand what you're offering. Pretty graphics, nice check-boxes. But what is your service?

(Yes, I have data sets... Yes I do AI stuff...)


The service is basically dataset hosting. Today, hosting code is quite easy: with Git, for instance, you can have your code up for sharing on GitHub in a few seconds. However, as soon as you need to make your dataset available, you can’t use Git anymore. So instead, I’ve seen people host their datasets on Google Drive and then write scripts to make downloading and stitching the Zip files easier.

I built this service to solve that: instead of hosting the datasets yourself, you just push them using Joule, and users who download your code only need to run a single command to download the dataset.
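
To make the manual workflow concrete, here is a minimal sketch of the kind of stitching script described above. It assumes the archive was split into zero-padded parts (dataset.zip.000, dataset.zip.001, ...) that have already been downloaded into one directory. The file names and the helpers stitch_parts and extract are hypothetical, not something Joule or Google Drive provides.

```python
import shutil
import zipfile
from pathlib import Path


def stitch_parts(parts_dir: Path, stitched: Path) -> Path:
    """Concatenate <name>.000, <name>.001, ... from parts_dir into `stitched`.

    Assumption: part suffixes are zero-padded, so a lexicographic sort
    restores the original byte order.
    """
    parts = sorted(parts_dir.glob(stitched.name + ".*"))
    if not parts:
        raise FileNotFoundError(f"no parts matching {stitched.name}.* in {parts_dir}")
    with stitched.open("wb") as out:
        for part in parts:
            with part.open("rb") as src:
                # Stream each part so large datasets never sit fully in memory.
                shutil.copyfileobj(src, out)
    return stitched


def extract(archive: Path, dest: Path) -> None:
    """Unpack the stitched Zip archive into dest."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
```

Every project that hosts a dataset this way ends up maintaining some variant of this, plus the download step, which is the chore a single pull command is meant to replace.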

EDIT: here’s a sample Git repository with a dataset hosted on Joule: https://gitlab.com/joule-host/that-state-of-the-art-research...


What advantages over Git LFS does it offer?


Git LFS is limited by your hosting solution. If you use GitLab, for instance, your repository cannot exceed 10 GB of storage space (including code and other resources). And on GitHub I believe that limit is 1 GB.

Many deep learning datasets that deal with pixel data grow past the 10 GB mark.



