Google Drive
Learn how to copy directories hosted on Google Drive, version them with DVC and push them to DagsHub storage.
Intro
Google Drive is one of the most popular platforms for hosting and sharing files. When using Google Colab, it becomes even more valuable by providing direct access to the Drive directory. However, Google Drive doesn't support file versioning, a significant drawback as a project scales. Moreover, hosting data in a different place from the codebase can be hard to manage and to integrate into a workflow.
Introducing The Dag Walker
We've received many requests from the community to help transfer and version files from Google Drive to DagsHub storage, so we decided to chip in and improve the process by automating parts of it.
The Dag Walker is a Colab notebook that automatically copies directories hosted on Google Drive to the Colab runtime, versions them with DVC, and pushes them to DagsHub storage. All you need to do is check some boxes and fill in your user details and directory paths, and... well, that's it - you're all set!
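For readers who want a feel for the steps the notebook automates, here is a minimal sketch in Python. It is not the Dag Walker's actual code. The function names, the remote name `origin`, and the commit message are hypothetical, and the sketch assumes the data directory sits at the root of a Git repository that already has DVC initialized. The `dvc` and `git` subcommands themselves are standard.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def copy_from_drive(drive_dir: str, repo_dir: str) -> Path:
    """Copy a directory from the mounted Drive into the repository working tree."""
    src = Path(drive_dir)
    dst = Path(repo_dir) / src.name
    # dirs_exist_ok (Python 3.8+) lets the copy be re-run without failing.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


def dvc_commands(data_dir: str, remote_url: str, user: str, token: str) -> List[List[str]]:
    """Build the commands that track data_dir with DVC and push it to the remote.

    Assumes data_dir is a name relative to the repository root, with no
    trailing slash, so that `dvc add` writes <data_dir>.dvc next to it.
    """
    return [
        ["dvc", "remote", "add", "-f", "origin", remote_url],
        # --local stores credentials in .dvc/config.local, which Git ignores.
        ["dvc", "remote", "modify", "origin", "--local", "auth", "basic"],
        ["dvc", "remote", "modify", "origin", "--local", "user", user],
        ["dvc", "remote", "modify", "origin", "--local", "password", token],
        ["dvc", "add", data_dir],
        ["git", "add", f"{data_dir}.dvc", ".gitignore"],
        ["git", "commit", "-m", f"Track {data_dir} with DVC"],
        ["dvc", "push", "-r", "origin"],
    ]


def run_all(commands: List[List[str]], repo_dir: str) -> None:
    """Run each command inside the repository, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

In Colab, you would typically mount Drive first with `from google.colab import drive; drive.mount("/content/drive")`, call `copy_from_drive` with a path under the mount point, and then pass the result of `dvc_commands` to `run_all`. Building the command list separately from running it makes it easy to inspect what will be executed before anything touches the repository.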
If you still want to use the mount capabilities Google Colab provides, we highly recommend hosting the DVC cache directory on Google Drive. This way, you can still version large files with DVC, avoid pulling the same file to the Colab runtime twice, and mount it easily. To do so, you can use the DagYard Colab notebook and check the Google Drive cache option. Behind the scenes, it will create a DVC cache directory in Google Drive and connect it to the working repository.
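As a rough illustration of what connecting a Drive-hosted cache to a repository can look like, the sketch below uses the standard `dvc cache dir` command. It is an assumption about the equivalent manual steps, not the DagYard notebook's actual implementation. The function names are hypothetical, and using `--local` (so the Colab-specific path is not committed to Git) is a choice made for this example.

```python
import subprocess
from pathlib import Path
from typing import List


def drive_cache_commands(cache_dir: str) -> List[List[str]]:
    """Build the command that points the repository's DVC cache at cache_dir."""
    # --local writes the setting to .dvc/config.local instead of .dvc/config.
    return [["dvc", "cache", "dir", "--local", cache_dir]]


def use_drive_cache(repo_dir: str, cache_dir: str) -> None:
    """Create the cache directory if needed and connect it to the repository."""
    Path(cache_dir).mkdir(parents=True, exist_ok=True)
    for cmd in drive_cache_commands(cache_dir):
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

With Drive mounted at `/content/drive`, a call such as `use_drive_cache("/content/my-repo", "/content/drive/MyDrive/dvc-cache")` would keep cached files on Drive across Colab sessions. Both paths here are placeholders.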