Although it is possible to register datasets with Crossref, we wouldn’t necessarily recommend it. If your university is creating a data repository, it’s much better for them to work with a different Registration Agency for this: DataCite.
DataCite develop and support tools and methods that make data more accessible and more useful. If someone in the scholarly community is trying to find data, they will search the DataCite database; they wouldn’t necessarily think to come to Crossref for that sort of content. Although we have a schema for datasets, it’s very basic and doesn’t contain rich metadata fields. Our schema and reference linking infrastructure are set up specifically to support and provide services around published content, rather than data.
I would recommend that you contact DataCite to discuss the needs of your data repository. You can find out more about this on our website.
I can’t figure out what the difference is between dataset_type="collection" and dataset_type="record", because these terms aren’t defined anywhere in the schema. To my mind, all databases are inherently ‘collections’, especially relational databases. What value is there in creating a DOI for an individual database ‘record’ (like a line in a spreadsheet)? Neither the markup guide nor the XML example for datasets explains this distinction.
Also, what is the difference between the ‘Database Level’ and the ‘Dataset Level’? I see in the schema that a ‘dataset’ is contained in a ‘database’, but what do those terms actually mean for registering DOIs and how do they relate to actual databases? It looks like most of the metadata you can record under the ‘dataset’ tag matches what you can record in the ‘database_metadata’ tag, except for funding, format, and citations.