Data publishing

  1. Methods for publishing data

      Data files as supplementary material    Data repositories   Data papers   Data journals

  2. Data citation

  3. See also

  4. References

Not to be confused with Database publishing.

Data publishing (also data publication) is the act of releasing research data in published form for (re)use by others. It consists of preparing data or data sets for public use so that they are available to everyone to use as they wish.

This practice is an integral part of the open science movement.

There is broad, multidisciplinary consensus on the benefits of this practice.[1][2][3]

The main goal is to elevate data to first-class research outputs.[4]

A number of initiatives are under way; there are points of consensus, and some issues remain in contention.[5]

There are several distinct ways to make research data available, including:

  • publishing data as supplemental material associated with a research article, typically with the data files hosted by the publisher of the article
  • hosting data on a publicly available website, with files available for download
  • hosting data in a repository that has been developed to support data publication, e.g. figshare, Dryad, Dataverse, Zenodo. A large number of general and specialty (such as by research topic) data repositories exist.[6]
  • publishing a data paper about the dataset, which may be published as a preprint, in a journal, or in a data journal that is dedicated to supporting data papers. The data may be hosted by the journal or hosted separately in a data repository.

Publishing data both makes researchers' data available for others to use and enables datasets to be cited like other research publication types (such as articles or books), so that producers of datasets can gain academic credit for their work.

The motivations for publishing data range from a desire to make research more accessible, to enabling the citability of datasets, to research funder or publisher mandates that require open data publishing.

Methods for publishing data

Data files as supplementary material

A large number of journals and publishers support supplementary material being attached to research articles, including datasets. Though historically such material might have been distributed only by request or on microform to libraries, journals today typically host such material online. Supplementary material is available to subscribers to the journal or, if the article or journal is open access, to everyone.

Data repositories

There are a large number of data repositories, on both general and specialized topics. Many repositories are disciplinary repositories, focused on a particular research discipline. Repositories may be free for researchers to upload their data or may charge a one-time or ongoing fee for hosting the data. These repositories offer a publicly accessible web interface for searching and browsing hosted datasets, and may include additional features such as a digital object identifier for permanent citation of the data and links to associated published papers and code.

Data papers

A data paper is a “scholarly publication of a searchable metadata document describing a particular on-line accessible dataset, or a group of datasets, published in accordance to the standard academic practices”.[7]

Its ultimate aim is to provide “information on the what, where, why, how and who of the data”.[4]

The intent of a data paper is to offer descriptive information on the related dataset(s), focusing on data collection, distinguishing features, access and potential reuse rather than on data processing and analysis.[8] Because data papers are considered academic publications no different from other types of papers, they allow scientists who share data to receive credit in a currency recognizable within the academic system, thus "making data sharing count".[9] This not only provides an additional incentive to share data but also, through the peer review process, increases the quality of the metadata and thus the reusability of the shared data.

Thus data papers represent the scholarly communication approach to data sharing.

Despite their potential, data papers are not a complete solution to all data sharing and reuse issues and, in some cases, they are considered to induce false expectations in the research community.[10]

Data journals

Data papers are supported by a rich array of journals, some of which are "pure", i.e. dedicated to publishing data papers only, while others (the majority) are "mixed", i.e. they publish a number of article types including data papers.

A comprehensive survey of data journals is available.[11]

A non-exhaustive list of data journals has been compiled by staff at the University of Edinburgh.[12]

Examples of "pure" data journals are: Earth System Science Data, Journal of Open Archaeology Data, Open Health Data, Polar Data Journal, and Scientific Data.

Examples of "mixed" journals publishing data papers are: Biodiversity Data Journal, F1000Research, GigaScience, PLOS ONE, and SpringerPlus.

Data citation

Data citation is the provision of accurate, consistent and standardised referencing for datasets, just as bibliographic citations are provided for other published sources such as research articles or monographs. Typically the well-established Digital Object Identifier (DOI) approach is used, with DOIs taking users to a website that contains the metadata on the dataset and the dataset itself.[13][14]
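As a rough illustration of DOI-based referencing, the following Python sketch renders a dataset citation from a small metadata record. The `DatasetRecord` fields, the example values and the creator, year, title, publisher, identifier ordering are assumptions made for illustration and are not prescribed by the sources cited here; the only external convention relied on is that a DOI can be resolved through `https://doi.org/`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Minimal descriptive metadata for a published dataset (hypothetical schema)."""
    creators: tuple[str, ...]
    year: int
    title: str
    publisher: str
    doi: str  # bare DOI without the resolver prefix


def doi_url(doi: str) -> str:
    """Return the resolver URL for a bare DOI."""
    return f"https://doi.org/{doi}"


def format_citation(record: DatasetRecord) -> str:
    """Render the citation as: Creators (Year). Title. Publisher. DOI URL."""
    creators = "; ".join(record.creators)
    return (
        f"{creators} ({record.year}). {record.title}. "
        f"{record.publisher}. {doi_url(record.doi)}"
    )


# Made-up record used only to exercise the formatter.
example = DatasetRecord(
    creators=("Doe, J.", "Roe, R."),
    year=2020,
    title="Example survey measurements",
    publisher="Example Repository",
    doi="10.1234/example.5678",
)
print(format_citation(example))
```

Keeping the bare DOI in the record and adding the resolver prefix only at rendering time means the same record could be reused for other citation styles.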

Several organizations have been established with the aim of driving the data citation agenda. These include the following:[15]

  • CODATA Data Citation Standards and Practices Task Group
  • Data Preservation Alliance for the Social Sciences (Data-PASS)
  • DataCite
  • Data Citation Synthesis Group of FORCE11 (https://www.force11.org/group/joint-declaration-data-citation-principles-final)
  • Data Citation Working Group of the Research Data Alliance (https://rd-alliance.org/groups/data-citation-wg.html)

Data citation is an emerging topic in computer science and it has been defined as a computational problem.[16]

Indeed, citing data poses significant challenges to computer scientists and the main problems to address are related to:[17]

  • the use of heterogeneous data models and formats – e.g., relational databases, Comma-Separated Values (CSV), eXtensible Markup Language (XML),[18][19] Resource Description Framework (RDF);[20]
  • the transience of data;
  • the necessity to cite data at different levels of coarseness – i.e., deep citations;[21]
  • the necessity to automatically generate citations to data with variable granularity.
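The last three points can be made concrete with a small Python sketch that generates a citation either for a whole dataset or for a subset of it. It is a simplified illustration, not one of the citation systems described in the cited works: the `Snapshot` structure, the use of a version label to pin changing data, and the slash-separated subset path are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Snapshot:
    """A fixed version of a hierarchical dataset (hypothetical structure)."""
    title: str
    doi: str
    version: str  # pins the citation to one state of data that may change later
    content: dict[str, Any]


def select(content: dict[str, Any], path: tuple[str, ...]) -> Any:
    """Walk the nested dictionaries along `path`; raises KeyError if a key is absent."""
    node: Any = content
    for key in path:
        node = node[key]
    return node


def cite(snapshot: Snapshot, path: tuple[str, ...] = ()) -> str:
    """Cite the whole snapshot (empty path) or the subset reached by `path`."""
    select(snapshot.content, path)  # fail early if the subset does not exist
    base = (
        f"{snapshot.title}, version {snapshot.version}. "
        f"https://doi.org/{snapshot.doi}"
    )
    if not path:
        return base
    return f"{base} (subset: {'/'.join(path)})"


# Made-up dataset used only to show coarse and fine-grained citations.
snap = Snapshot(
    title="Example station archive",
    doi="10.1234/example.archive",
    version="2.1",
    content={"stations": {"A1": {"temperature": [1.5, 2.0]}}},
)
print(cite(snap))
print(cite(snap, ("stations", "A1", "temperature")))
```

The same function serves every level of granularity because the path is the only thing that varies; a real system would also need to handle other data models and formats, which this sketch does not attempt.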

See also

  • Data archiving
  • Registry of Research Data Repositories
  • Disciplinary repository

References

1. Costello MJ (2009). "Motivating online publication of data". BioScience. 59 (5): 418–427. doi:10.1525/bio.2009.59.5.9
2. Smith VS (2009). "Data publication: towards a database of everything". BMC Research Notes. 2 (113): 113. doi:10.1186/1756-0500-2-113. PMC 2702265. PMID 19552813
3. Lawrence, B.; Jones, C.; Matthews, B.; Pepler, S.; Callaghan, S. (2011). "Citation and Peer Review of Data: Moving Towards Formal Data Publication". International Journal of Digital Curation. 6 (2): 4–37. doi:10.2218/ijdc.v6i2.205. http://www.ijdc.net/index.php/ijdc/article/view/181
4. Callaghan, S., Donegan, S., Pepler, S., Thorley, M., Cunningham, N., Kirsch, P., Ault, L., Bell, P., Bowie, R., Leadbetter, A., Lowry, R., Moncoiffé, G., Harrison, K., Smith-Haddon, B., Weatherby, A., & Wright, D. (2012). "Making data a first class scientific output: Data citation and publication by NERCs environmental data centres". International Journal of Digital Curation. 7 (1): 107–113. doi:10.2218/ijdc.v7i1.218. http://ijdc.net/index.php/ijdc/article/view/208
5. Kratz J, Strasser C (2014). "Data publication consensus and controversies". F1000Research. 3 (94). doi:10.12688/f1000research.4518. http://f1000research.com/articles/3-94
6. Assante, M.; Candela, L.; Castelli, D.; Tani, A. (2016). "Are Scientific Data Repositories Coping with Research Data Publishing?". Data Science Journal. 15. doi:10.5334/dsj-2016-006
7. Chavan, V. & Penev, L. (2011). "The data paper: a mechanism to incentivize data publishing in biodiversity science". BMC Bioinformatics. 12 (15): S2. doi:10.1186/1471-2105-12-S15-S2. PMC 3287445. PMID 22373175. http://www.biomedcentral.com/1471-2105/12/S15/S2
8. Newman, Paul; Corke, Peter (2009). "Data papers — peer reviewed publication of high quality data sets". International Journal of Robotics Research. 28 (5): 587. doi:10.1177/0278364909104283. http://ijr.sagepub.com/content/28/5/587
9. Gorgolewski KJ, Margulies DS, Milham MP (2013). "Making data sharing count: a publication-based solution". Frontiers in Neuroscience. 7: 9. doi:10.3389/fnins.2013.00009. PMC 3565154. PMID 23390412
10. Parsons, M.A.; Fox, P.A. (2013). "Is data publication the right metaphor?". Data Science Journal. 12: WDS31–WDS46. https://www.jstage.jst.go.jp/article/dsj/12/0/12_WDS-042/_article
11. Candela, L., Castelli, D., Manghi, P. and Tani, A. (2015). "Data Journals: A Survey". Journal of the Association for Information Science and Technology. 66 (1): 1747–1762. doi:10.1002/asi.23358
12. https://www.wiki.ed.ac.uk/display/datashare/Sources+of+dataset+peer+review
13. Australian National Data Service: Data Citation Awareness (Accessed 20 March 2012)
14. Ball, A., Duke, M. (2011). ‘Data Citation and Linking’. DCC Briefing Papers. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/briefing-papers/
15. Data Citation Principles Workshop, May 16 - May 17, 2011, IQSS at Harvard University: Links (Accessed 20 March 2012)
16. Buneman, P., Davidson, S. and Frey, J. (2016). ‘Why data citation is a computational problem’. Communications of the ACM, To appear in September 2016. Available online: http://frew.eri.ucsb.edu/private/preprints/bdf-cacm-data-citation.pdf
17. Silvello, G. (2018). ‘Theory and Practice of Data Citation’. Journal of the Association for Information Science and Technology (JASIST) (AIS Review), vol. 69 issue 1, pp. 6-20, 2018. Available online (open access): https://onlinelibrary.wiley.com/doi/full/10.1002/asi.23917
18. Buneman, P. and Silvello, G. (2010). ‘A Rule-Based Citation System for Structured and Evolving Datasets’. IEEE Bulletin of the Technical Committee on Data Engineering, Vol. 3, No. 3. IEEE Computer Society, pp. 33-41, September 2010. Available online: http://sites.computer.org/debull/A10sept/buneman.pdf
19. Silvello, G. (2017). ‘Learning to Cite Framework: How to Automatically Construct Citations for Hierarchical Data’. Journal of the Association for Information Science and Technology (JASIST), Volume 68 issue 6, pp. 1505-1524, June 2017. Available online: http://www.dei.unipd.it/~silvello/papers/2016-DataCitation-JASIST-Silvello.pdf
20. Silvello, G. (2015). ‘A Methodology for Citing Linked Open Data Subsets’. D-Lib Magazine 21 (1/2), 2015. Available online: http://www.dlib.org/dlib/january15/silvello/01silvello.html
21. Buneman, P. (2006). ‘How to Cite Curated Databases and how to Make Them Citable’. In Proc. of the 18th International Conference on Scientific and Statistical Database Management, SSDBM 2006, pages 195–203, 2006.

