
Storing a lot of items

The problem

If you have a lot of items to store (millions of experimental sentences, tweets, audio fragments, individual calculations, etc.), it might be tempting to store them as individual files, so that the file system more or less reflects your data structure. This has an important downside, however: the file system becomes very slow. A simple command like ls is no longer instant, loading your data with a script might take ages, and, most importantly, it becomes practically impossible to move your files around, for example when they need to be moved to another disk for maintenance reasons.

The solution

There are multiple solutions:

  1. Store everything in larger files, where each line is one item.
  2. Use an SQLite or MySQL database. MySQL databases can be requested via the admin.
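
Both options can be sketched in a few lines of Python. This is only a minimal illustration, not a prescribed setup: it assumes each item is a small dictionary with an `id` and a `text` field, and the function names, file names, and table layout are made up for the example. For option 1 it uses one JSON object per line, so that items containing line breaks still fit on a single line.

```python
import json
import sqlite3


def write_jsonl(items, path):
    """Option 1: write all items to one file, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for item in items:
            # json.dumps escapes newlines inside strings, so each item
            # is guaranteed to occupy exactly one line.
            f.write(json.dumps(item, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Yield the items back one at a time, without loading the whole file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield json.loads(line)


def write_sqlite(items, path):
    """Option 2: store all items as rows in a single SQLite database file."""
    con = sqlite3.connect(path)
    try:
        with con:  # commits on success, rolls back on error
            # Hypothetical table layout: one row per item.
            con.execute(
                "CREATE TABLE IF NOT EXISTS items "
                "(id INTEGER PRIMARY KEY, text TEXT)"
            )
            con.executemany(
                "INSERT INTO items (id, text) VALUES (?, ?)",
                ((item["id"], item["text"]) for item in items),
            )
    finally:
        con.close()


def read_sqlite_item(path, item_id):
    """Look up a single item by id; returns None if it does not exist."""
    con = sqlite3.connect(path)
    try:
        row = con.execute(
            "SELECT text FROM items WHERE id = ?", (item_id,)
        ).fetchone()
        return row[0] if row else None
    finally:
        con.close()
```

Either way, millions of items end up in a single file on disk, which is easy to list, copy, or move. The line-per-item file suits cases where you read everything in order; the database is more convenient when you need to look up individual items.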

Further reading