Challenges and opportunities for the re-use of New Data Types (NDTs) in a changing landscape


Published: Jul 26, 2024
Keywords:
new data types, data repositories, data re-use, research infrastructures
Dimitra Kondyli
https://orcid.org/0000-0003-1037-5415
Nicolas Klironomos
https://orcid.org/0000-0002-1485-6251
Abstract

Over the past fifteen years, technology has contributed to the emergence of new types of data, particularly big data, changing how social phenomena are observed, studied, and measured in the social sciences. The increasing digitization of social activities generates vast amounts of data that prompt reflection on how modern societies function. In addition, developments such as the recent COVID-19 pandemic and its mandatory social distancing have created a favourable environment for the generation of new types of data, with an emphasis on big data. Within this ongoing transformation of the data landscape, we pose questions about the environment of Data Repositories/Research Infrastructures and the means and methods of handling and managing these data. Social research appears to be shifting towards a more "data-driven approach", which requires new skills and capabilities at the intersection of the computational and social sciences. One of the major issues that arises is the potential for collaboration between data organizations and researchers/users of data to promote not only a culture of data sharing but also the reuse of such data. This work is based on primary and secondary sources generated within research projects carried out in collaboration with CESSDA ERIC (Consortium of European Social Science Data Archives, European Research Infrastructure Consortium), as well as on literature on the management of data from various sources, with an emphasis on legal/ethical and technical aspects.

Article Details
  • Section: Articles
Author Biographies
Dimitra Kondyli, National Centre for Social Research (ΕΚΚΕ)

Research Director at the Institute of Social Research, President of the SoDaNet Steering Committee, National Representative of SoDaNet at CESSDA ERIC

Nicolas Klironomos, National Centre for Social Research (ΕΚΚΕ)

Political Scientist, Scientific Associate
