Hi,
I am currently trying to collect and prepare data for deposit XML files. I have run into some problems when converting the references of the articles into the citation list.
The available citation data is fairly structured. (The journal articles were compiled with LaTeX, which produces .bbl files.) I am a little confused about the exact role of the citation element.
I think its purpose is to identify the cited publication.
- In the ideal case, it contains the DOI.
- When the data of the cited publication is available in structured form, it should describe the publication as precisely as possible.
- As a fallback, it should contain the citation data in the <unstructured_citation> element.
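To make this concrete, this is roughly how I picture the three cases. It is only a sketch: the key values, the DOI, and all bibliographic data are made-up placeholders, the element names and their order are my reading of the schema, and I have not validated the fragment.

```xml
<citation_list>
  <!-- Case 1: the DOI is known (made-up DOI) -->
  <citation key="ref1">
    <doi>10.5555/12345678</doi>
  </citation>
  <!-- Case 2: no DOI, but structured data is available (placeholder values) -->
  <citation key="ref2">
    <journal_title>Placeholder Journal</journal_title>
    <author>Surname</author>
    <volume>12</volume>
    <first_page>34</first_page>
    <cYear>2001</cYear>
    <article_title>Placeholder article title</article_title>
  </citation>
  <!-- Case 3: fallback, the whole reference as free text (placeholder text) -->
  <citation key="ref3">
    <unstructured_citation>A. Surname, Placeholder book title, Placeholder Publisher, 2001.</unstructured_citation>
  </citation>
</citation_list>
```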
I assume the following:
- Providing structured data is better than just using the <unstructured_citation> element.
- When the DOI is available, all of the other elements of the citation can be omitted.
- The unstructured data is necessary only when it contains data that is not described in the other elements.
Are my assumptions correct?
The root of my confusion is that the citation element is
- more verbose than necessary for identification (for instance, the ISSN implies the title, the first page implies authors),
- but is not as precise as it could be (for instance, it records only the surname of the first author, and the first page is in the schema but the last page is not).
I usually have the whole list of authors. In some cases my processed citations also include the publisher, the country, an article identification number, and a URL to the publication.
What is the preferred way of organizing the mentioned data in the XML?
Thank you in advance for your help!
Best wishes,
Imre