OpenRefine

OpenRefine for data cleaning - 03.05.2024, Tallinn - By invitation

A practical workshop in which we clean manually entered data with OpenRefine software. First, we cover spreadsheet best practices and then apply that knowledge in OpenRefine. During the training, we explore the functionality of OpenRefine: cleaning data in larger batches and unifying data in one sweep. In addition, we introduce the option of downloading additional data from other databases, as well as extensions.

Autumn semester ELIXIR courses

ELIXIR Estonia is continuing with the data management-related lectures and workshops this semester. To get more information about these courses, read below and visit https://elixir.ut.ee/training.  

 

We ask you to register responsibly. If you can't attend the lecture, please let us know as soon as possible via email (elixir@ut.ee).

 

How to make your messy data usable? / OpenRefine - 30.11.2023

  • Time: 30.11.2023 13:00-17:00
  • Type: Workshop
  • Language: English
  • Duration: 4h
  • Location: Delta building, Narva mnt 18 room 1008, Tartu
  • Audience: People who need to clean messy data
  • Instructor: Diana Pilvar, MSc

A practical workshop on cleaning your messy data with OpenRefine software.

First, we will cover spreadsheet best practices. Then, we will put that knowledge into practice with OpenRefine. This course will explore the depths of OpenRefine software and see what it can offer. This will include cleaning the data in bigger batches and unifying the data in one sweep (transforms and expressions). Additionally, we will introduce the possibility of downloading additional data from other databases and different extensions OpenRefine software has.
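To give a feel for what "unifying the data in one sweep" means, here is a minimal Python sketch of key-based clustering. It is not OpenRefine code and not part of the course materials: the `fingerprint`, `cluster` and `unify` functions and the sample city names are hypothetical, and the key is only a simplified approximation of the idea behind key-collision clustering (trim, lowercase, strip accents and punctuation, sort unique tokens).

```python
import re
import unicodedata
from collections import Counter, defaultdict


def fingerprint(value: str) -> str:
    """Simplified key: trim, lowercase, drop accents and punctuation, sort unique tokens."""
    text = unicodedata.normalize("NFKD", value.strip().lower())
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", "", text)
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group spellings that share a key; keep only groups with more than one variant."""
    groups = defaultdict(list)
    for value in values:
        variants = groups[fingerprint(value)]
        if value not in variants:
            variants.append(value)  # variants stay in order of first appearance
    return {key: vs for key, vs in groups.items() if len(vs) > 1}


def unify(values, clusters):
    """Replace every variant with the most frequent spelling in its cluster."""
    counts = Counter(values)
    canonical = {}
    for variants in clusters.values():
        best = max(variants, key=lambda v: counts[v])  # on a tie, the first seen wins
        for variant in variants:
            canonical[variant] = best
    return [canonical.get(value, value) for value in values]


# Hypothetical messy column with inconsistent spellings.
cities = ["Tartu", "tartu ", "TARTU", "Tallinn", "Tallinn.", "Pärnu", "Parnu"]
clusters = cluster(cities)
cleaned = unify(cities, clusters)
print(clusters)
print(cleaned)
```

In a real cleaning session you would review each cluster before merging, because a shared key does not guarantee that two values mean the same thing.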

Learning outcomes for the participants: 

  • Describe spreadsheet best practices
  • Compare Excel and OpenRefine
  • Apply transforms (cell editing, column editing, transposing) in OpenRefine
  • Write simple expressions in OpenRefine
  • Match your dataset with that of an external source 

To avoid missing out on a course next time, subscribe to our newsletter at https://lists.ut.ee/wws/subscribe/elixir.news?previous_action=edit_list_request
Applications are accepted manually within a few days. 

Autumn semester Data Management courses

ELIXIR Estonia is continuing with the data management-related lectures and workshops this semester. To get more information about these courses, read below and visit https://elixir.ut.ee/training

 

Face-to-Face (F2F) lectures will take place at the Delta building in Tartu (Narva mnt 18, Tartu). 

ONLINE workshops will be held through Zoom (the meeting link will be sent a couple of days before the workshop). 


 

  • 13.09.2022 - Metadata and README (lecture, F2F, 2h) CLOSED

Register: https://forms.gle/6y2FaHMN7nrbA5bR7 

More information: https://elixir.ut.ee/node/448 

  • 27.09.2022 - Licensing Research Outputs (lecture, F2F, 2h) CLOSED

Register: https://forms.gle/ahVcG64FPJvTwuu68 

More information: https://elixir.ut.ee/node/450 

  • 4.10.2022 - How to Make Your Messy Data Usable? - PART 1 (workshop, ONLINE, 2h + independent work) CLOSED

Register: https://forms.gle/bWiBGrpanoot3YR59 

More information: https://elixir.ut.ee/node/456 

  • 25.10.2022 - Data Visualization I - Figures (lecture, F2F, 2h) CLOSED

Register: https://forms.gle/rjiGvJj32qweG5mj6 

More information: https://elixir.ut.ee/node/452 

  • 1.11.2022 - Crash Course of GDPR (lecture, F2F, 2h) - Cancelled

Register: https://forms.gle/dNVQPs3y35k5Ua8y8

More information: https://elixir.ut.ee/node/454 

  • 8.11.2022 - How to Make Your Messy Data Usable? - PART 2 (workshop, ONLINE, 1h + independent work) CLOSED

Register: https://forms.gle/2uMN9RN8LEyT8doM7 

More information: https://elixir.ut.ee/node/456 

 

We do ask you to register for the lecture responsibly. If you can’t attend the course, please let us know as soon as possible via email at elixir@ut.ee

How to Make Your Messy Data Usable? - Beginner and Advanced workshop - 04.10.2022/8.11.2022

On the 4th of October 2022, ELIXIR-Estonia will be holding a data management online course: How to Make Your Messy Data Usable? - PART 1. The course will be held in English and has two parts. The first is a 1.5-hour online lecture on how to make a spreadsheet usable for other people, held on the 4th of October at 10:00 on Zoom. The second is a practical workshop on cleaning your messy data with OpenRefine software, delivered as a video lecture you can follow in your own time. Additionally, we will hold 2 Q&A sessions on Zoom, where you can talk about any problems you encountered with the OpenRefine software.

This course will cover how to name your files and variables, use version control, compile a data dictionary, and handle empty cells. In the second part of the lecture, OpenRefine software is introduced, which lets you clean up messy data easily. For the more practical aspect of using the OpenRefine software, we will share a video that teaches the basics. You can watch it anytime and do the lessons yourself. On two days (7.10 and 10.10), there will be a 1h slot (10:00-11:00) on Zoom, when you can come and ask any questions you have about tables and OpenRefine software. 
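As an illustration of what a data dictionary and a check for empty cells can look like in practice, here is a minimal Python sketch. It is not taken from the course materials. The sample table, the `MISSING` markers and the `guess_type` and `data_dictionary` helpers are all assumptions made for this example.

```python
import csv
import io

# Hypothetical sample table; in practice this would be read from a file.
SAMPLE = """sample_id,collection_date,weight_g,notes
S001,2022-04-04,12.5,
S002,2022-04-05,,re-weighed
S003,,13.1,NA
"""

# Assumed markers that count as an empty cell in this sketch.
MISSING = {"", "NA", "N/A", "-"}


def guess_type(values):
    """Return 'integer', 'decimal' or 'text' for a list of non-missing strings."""
    if not values:
        return "unknown"
    for label, parse in (("integer", int), ("decimal", float)):
        try:
            for value in values:
                parse(value)
            return label
        except ValueError:
            continue
    return "text"


def data_dictionary(csv_text):
    """Build one entry per column: name, guessed type, missing count, example value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for column in reader.fieldnames or []:
        cells = [(row[column] or "").strip() for row in rows]
        present = [cell for cell in cells if cell not in MISSING]
        entries.append({
            "variable": column,
            "type": guess_type(present),
            "missing": len(cells) - len(present),
            "example": present[0] if present else "",
            "description": "",  # to be filled in by the person who knows the data
        })
    return entries


dictionary = data_dictionary(SAMPLE)
for entry in dictionary:
    print(entry)
```

The description column is deliberately left blank: a script can count and classify values, but only the data owner can say what a variable means and in which units it was measured.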

 

Workshop information for PART 1

Lecture: 4th of October, 2022 at 10:00 (lecture, 1.5h)

Q&A session: 7.10 and 10.10 at 10:00 (Q&A, feedback, 1h)

Place: ZOOM (link will be sent to your email)

Register: CLOSED

Registration closes at 23:59 on 30.09.2022 or when the course gets full.

 

Learning outcomes for the participants: 

  • Compile a data table that abides by the FAIR Principles
  • Recognize what a clean table for others to use looks like
  • Explain how to use OpenRefine to clean the messy data


 

On the 8th of November 2022, ELIXIR-Estonia will be holding a data management online course: How to Make Your Messy Data Usable? - PART 2. The course will be held in English and has two parts. The first is a 1-hour online lecture on what you can do in OpenRefine, including transformations, expressions, and extensions for the software, held on the 8th of November at 10:00 on Zoom. The second is a practical workshop on how to transform your previously cleaned data with OpenRefine software, delivered as a video lecture that you can follow in your own time. Additionally, we will hold 2 Q&A sessions on Zoom, where you can talk about any problems you encountered with the OpenRefine software.

This is a follow-up workshop for “How to make your messy data usable - PART 1”, where we talked about spreadsheets and also introduced the OpenRefine software. This course will explore OpenRefine in more depth and see what else it can offer. This will include cleaning the data in bigger batches and unifying the data in one sweep, using transformations and expressions. Additionally, we will introduce the option of downloading additional data from other databases and the extensions available for OpenRefine. 

You should have participated in the PART 1 workshop in order to register for PART 2. 

 

Workshop information for PART 2

Lecture: 8th of November, 2022 at 10:00 (lecture, 1h)

Q&A session: 10.11 at 11:00 and 14.11 at 10:00 (Q&A, feedback, 1h)

Place: ZOOM (link will be sent to your email)

Register: CLOSED

Registration closes at 23:59 on 04.11.2022 or when the course gets full.

 

Learning outcomes for the participants: 

  • Understands what OpenRefine software can do with data
  • Knows what kind of transformations OpenRefine software supports
  • Knows how to write transformations and expressions in OpenRefine software

 

ONLINE workshops will be held through Zoom (the meeting link will be sent a couple of days before the workshop). 

We do ask you to register for the lecture responsibly. If you can’t attend the course, please let us know as soon as possible via email at elixir@ut.ee

“How to make your messy data usable?” and “Metadata and README” courses REGISTRATION CLOSED

In April, ELIXIR Estonia will be holding two data management online courses: "How to make your messy data usable?" on the 4th of April and "Metadata and README" on the 11th of April. Both courses will be held online, on Zoom, and in English. 

The "How to make your messy data usable?" course has two parts. The first is a 1.5-hour online lecture on how to make a spreadsheet usable for other people, held on the 4th of April at 10:00 on Zoom. The second is a practical workshop on cleaning your messy data with OpenRefine software, delivered as a video lecture that you can follow in your own time. Additionally, we will hold 3 Q&A sessions on Zoom, where you can talk about any problems you encountered with the OpenRefine software. In the "Metadata and README" lecture, we will go over what exactly metadata is, what minimum information should be included with each scientific result you share, and how to write a README file. 

 

In recent years, more attention has been paid to what researchers do with the data (and other resources) they produce, especially in Europe but also elsewhere. The main idea is that when researchers use taxpayers' money, the taxpayers should also have access to the results, free of charge. This means that the research should be published in open access journals and the data should be made publicly available. 

Good data management can make that process easier. If you think about what to do with your data at the beginning of and during the project, and know what you plan to do with it at the end, the final steps will be easier. However, what counts as “good data management” is up for debate. The FAIR Principles concentrate on making your data findable, accessible, interoperable and reusable, so they are a good start. And let’s be honest, some of these things you are probably already doing. 

 

How to make your messy data usable? course information

In this course, we will go over how to name your files and variables, use version control, compile a data dictionary, and handle empty cells. In the second part, OpenRefine software is introduced, which lets you clean up messy data easily. For the more practical aspect of using the OpenRefine software, I will share a video that teaches the basics. You can watch it anytime and do the lessons yourself. On three days (6.04, 7.04 and 8.04), there will be a 1h slot (10:00-11:00) on Zoom, when you can come and ask any questions you have about tables and OpenRefine software. 

 

Information about the lecture:

Lecture: 4th of April, 2022 at 10:00 (lecture, 1.5h; in English)

Q&A session: 6.04, 7.04 and 8.04 at 10:00 (Q&A, feedback, 1h)

Place: ZOOM (link will be sent to your email)

Register: https://forms.gle/axZTA5rw3bPnKDww9 REGISTRATION IS CLOSED

Registration closes at 23:59 on 31.03.2022 or when the course gets full.

Learning outcomes: 

  • Compile a data table that abides by the FAIR Principles
  • Recognize what a clean table for others to use looks like
  • Explain how to use OpenRefine to clean the messy data

 

Metadata and README lecture information

In general, metadata is the descriptive information about your data. However, what exactly is metadata and how much of it should be included with your data? Good metadata can make up for human fallibility. People forget and misplace things, and they leave research projects, taking their knowledge of the research methodology and the data with them. Metadata ensures that we will be able to find the data, use it, preserve it and reuse it in the future.

  • Finding Data. Metadata makes it much easier to find relevant data. Most searches are done using text (like a Google search), so formats like audio, images, and video are limited unless text metadata is available. Metadata also makes text documents easier to find because it explains exactly what the document is about.
  • Using Data. To use a dataset, researchers need to understand how the data is structured, definitions of terms used, how it was collected, and how it should be read.
  • Reusing Data. Researchers often want to reuse data collected for another project for their own project. The data still needs to be found and used, but often at a higher level of trust and understanding. Reusing data often requires careful preservation and documentation of the metadata.

This means that metadata provides additional information that helps data consumers better understand the meaning and structure of the dataset and clarifies other issues, such as rights and license terms, the organization that generated the data, data quality, data access methods and the update schedule of the dataset. What an actual metadata file includes varies between disciplines and the types of data you are working with. However, the documentation for your data should contain the minimum information required to reuse (or understand) the data described. 

In this lecture, we will go over what metadata about your dataset should be included when you share it. We will also go over some examples of how to write a good README file. 

 

Information about the lecture:

Time: 11th of April, 2022 at 10:00 (lecture, 2h; in English)

Place: ZOOM (link will be sent to your email)

Register: https://forms.gle/YKvQyd8wrx2cvyYf9 REGISTRATION IS CLOSED

Registration closes at 23:59 on 31.03.2022 or when the course gets full.

Learning outcomes: 

  • Understands the importance of good data management
  • Knows what metadata means in data files
  • Knows how to add metadata to the data
  • Knows what should be included in the README file
  • Can write a simple README file to accompany the data

 

How to make your messy data usable? (registration closed)

On the 25th of November 2021, ELIXIR-Estonia will be holding a new data management online course: How to make your messy data usable. The course will be held in English and has two parts. The first is a 1-hour online lecture on what makes a data table usable for other people, held on the 25th of November at 13:00 on Zoom. The second is a practical workshop on cleaning your messy data with OpenRefine software, delivered as a video lecture that you can follow in your own time. Additionally, we will hold 3 Q&A sessions on Zoom, where you can talk about any problems you encountered with the OpenRefine software.

 

In recent years, more attention has been paid to what researchers do with the data (and other resources) they produce, especially in Europe but also elsewhere. Since most of your data needs to be uploaded to a repository, it is essential that the data is tidy and that other people can easily read and understand it.  

In this course, we will go over how to name your files and variables, use version control, compile a data dictionary, and handle empty cells. In the second part, OpenRefine software is introduced, which lets you clean up messy data easily. For the more practical aspect of using the OpenRefine software, I will share a video that teaches the basics. You can watch it anytime and do the lessons yourself. On three days (29.11, 30.11 and 1.12), there will be a 1h slot (11:00-12:00) on Zoom, when you can come and ask any questions you have about tables and OpenRefine software. 

 

Information about the lecture

Lecture: 25th of November, 2021 at 13:00 (lecture, 1h)

Q&A session: 29.11, 30.11 and 1.12 at 11:00 (Q&A, feedback, 1h)

Place: ZOOM (link will be sent to your email)

Register: registration closed

Registration closes at 23:59 on 24.11.2021 or when the course gets full.

Materials: https://doi.org/10.5281/zenodo.5720271 

 

Learning outcomes: 

  • Compile a data table that abides by the FAIR Principles
  • Recognize what a clean table for others to use looks like
  • Explain how to use OpenRefine to clean the messy data