Now where did I put that file?
Finding and reusing your data will be easier, both for you and for other researchers, if you give a little thought early in the process to how you will name your data files and what file formats you will use to store your data. If you are planning to archive or share your data, you will also want to consider best practices for describing your data.
The format of the electronic data files you work with during your research may be determined by the research equipment and computer hardware and software that you have access to. However, for long-term preservation and ease of sharing, best practices may dictate that the files be converted to a different format after your project has ended. Give some thought to this eventuality at the outset. Considerations include:
Stanford University Libraries - Data Management Services provides a useful overview of preferred file formats. From the Stanford resource:
Containers: TAR, GZIP, ZIP
Databases: XML, CSV
Geospatial: SHP, DBF, GeoTIFF, NetCDF
Moving images: MOV, MPEG, AVI, MXF
Sounds: WAVE, AIFF, MP3, MXF
Statistics: ASCII, DTA, POR, SAS, SAV
Still images: TIFF, JPEG 2000, PDF, PNG, GIF, BMP
Tabular data: CSV
Text: XML, PDF/A, HTML, ASCII, UTF-8
Web archive: WARC
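As a rough illustration of how the list above might be put to work, the following Python sketch flags files whose extensions fall outside it. The extension-to-category mapping and the function names `classify` and `needs_conversion` are assumptions made for this example. They are not part of the Stanford resource, and the mapping covers only some of the listed formats.

```python
from pathlib import PurePath

# Illustrative assumption: a partial mapping from file extensions to the
# categories in the list above. Formats that appear under more than one
# category (for example CSV and XML) are assigned to a single one here.
PREFERRED_EXTENSIONS = {
    ".tar": "Containers",
    ".gz": "Containers",
    ".zip": "Containers",
    ".csv": "Tabular data",
    ".xml": "Text",
    ".html": "Text",
    ".txt": "Text",
    ".tif": "Still images",
    ".tiff": "Still images",
    ".jp2": "Still images",
    ".png": "Still images",
    ".wav": "Sounds",
    ".aiff": "Sounds",
    ".mp3": "Sounds",
    ".mov": "Moving images",
    ".avi": "Moving images",
    ".shp": "Geospatial",
    ".nc": "Geospatial",
    ".warc": "Web archive",
}


def classify(filename):
    """Return the category for a file name, or None if its extension is not mapped."""
    suffix = PurePath(filename).suffix.lower()
    return PREFERRED_EXTENSIONS.get(suffix)


def needs_conversion(filenames):
    """Return the file names whose extension is not in the mapping."""
    return [name for name in filenames if classify(name) is None]
```

Running `needs_conversion` over a project's file names gives a starting list of files to review before archiving. An unmapped extension only means the file deserves a closer look, not that it must be converted.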
Additional helpful guidelines for selecting file formats can be found at these websites:
Before you begin your research, decide on a naming convention for your files. Document the naming convention you choose, and make sure that you and your collaborators follow it. It will save you time and will help others who may use your files in the future. Best practices include:
More considerations for naming files can be found at these websites:
At the beginning of a research project, it is important to create a stable folder structure in which you can organize materials. The specific folders will depend on your own research process. File organization could be based on how you plan to gather materials, which experiment or process generated them, when they were created, or other strategies. The key is to use folders that make sense to you and allow you to easily find your materials.
A simple method to designate a revision is to note it at the end of the file name. This way, files can be grouped by their name and sorted by version number. For example:
Bisondata_1.0 = original document
Bisondata_1.1 = original document with minor revisions
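The version-suffix convention (Bisondata_1.0, Bisondata_1.1) can be sketched in Python as follows. This is a minimal sketch, assuming names of the form `<name>_<major>.<minor>` with an optional file extension. The pattern and the helper names `parse_version`, `sort_by_version`, and `next_minor_revision` are hypothetical, chosen for this example. It also shows one pitfall worth planning for: a plain alphabetical sort places 1.10 before 1.9, so the version numbers are compared as integers instead.

```python
import re

# Assumed pattern: <name>_<major>.<minor>, optionally followed by an
# extension that starts with a letter (for example ".csv").
VERSIONED_NAME = re.compile(
    r"^(?P<stem>.+)_(?P<major>\d+)\.(?P<minor>\d+)(?P<ext>\.[A-Za-z][A-Za-z0-9]*)?$"
)


def parse_version(filename):
    """Split a name like 'Bisondata_1.1' into (stem, major, minor, extension)."""
    match = VERSIONED_NAME.match(filename)
    if match is None:
        raise ValueError(f"{filename!r} does not follow the <name>_<major>.<minor> pattern")
    return match["stem"], int(match["major"]), int(match["minor"]), match["ext"] or ""


def sort_by_version(filenames):
    """Group files by name, then order them by numeric version."""
    return sorted(filenames, key=parse_version)


def next_minor_revision(filename):
    """Return the file name for the next minor revision."""
    stem, major, minor, ext = parse_version(filename)
    return f"{stem}_{major}.{minor + 1}{ext}"
```

For instance, `sort_by_version(["Bisondata_1.10", "Bisondata_1.9", "Bisondata_1.0"])` orders the files as 1.0, 1.9, 1.10, and `next_minor_revision("Bisondata_1.0")` yields `"Bisondata_1.1"`.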