Tips for Data Quality

Publication of information on qualifications and learning opportunities is crucial to ensure that Europass can work as a tool to support lifelong learning and career management (Article 3.2 Europass decision).

Working towards an increase in data quality is fundamental to having relevant, rich, accurate, complete, and trustworthy information on qualifications, learning opportunities and accreditations. This includes ensuring completeness of the data fields, the correct use of data fields from the European Learning Model and a correct structure of the data.

Improving data quality is primarily a responsibility of Member States, as owners and/or providers of the data, while the European Commission provides various support points (information provision, guidance and technical support).

Relevance of the provision of high data quality

The QDR should contain clear, rich, up-to-date and reliable information on qualifications, learning opportunities and accreditations, coming from all countries involved in Europass and in the European Qualifications Framework (EQF). The following elements are crucial to ensure high data quality:

  • Completeness; besides the mandatory data fields, the data should include as many recommended and optional fields indicated in the European Learning Model as possible;
  • Consistency (structured data); the data should be structured to align with the European Learning Model. Structured data are better captured by Artificial Intelligence techniques which are behind the functioning of the “course recommender” system;
  • Accuracy; the data should be reliable and exclude errors;
  • Up to date; the data should be updated regularly. No outdated data, such as wrong starting dates and expired web links, should be present in the database.
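
As an illustration of the completeness and up-to-date criteria above, the following is a minimal sketch of how a data provider might check a record before publication. The field names (title, description, learning_outcomes, thematic_field, language, start_date, homepage) are hypothetical and are not taken from the European Learning Model; the assumption that a start date in the past marks a record for review is also ours, not a rule stated by the Commission.

```python
from datetime import date

# Hypothetical field names for illustration only; they are NOT the actual
# European Learning Model fields, which should be substituted in practice.
MANDATORY_FIELDS = ["title", "description"]
RECOMMENDED_FIELDS = [
    "learning_outcomes", "thematic_field", "language", "start_date", "homepage",
]


def is_filled(value):
    """A field counts as filled if it is not None and not blank or empty."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) > 0
    return True


def completeness(record):
    """Return the missing mandatory fields and the share of recommended fields filled."""
    missing = [f for f in MANDATORY_FIELDS if not is_filled(record.get(f))]
    filled = sum(1 for f in RECOMMENDED_FIELDS if is_filled(record.get(f)))
    return missing, filled / len(RECOMMENDED_FIELDS)


def is_outdated(record, today):
    """Assumption: a start date (ISO format) before `today` needs review."""
    start = record.get("start_date")
    if not is_filled(start):
        return False
    return date.fromisoformat(start) < today


# Invented sample record, used only to show how the helpers are called.
sample = {
    "title": "Introduction to Data Analysis",
    "description": "",
    "language": "en",
    "start_date": "2020-09-01",
}
missing_fields, recommended_share = completeness(sample)
needs_review = is_outdated(sample, date(2024, 1, 1))
```

For the invented sample record, the description is reported as a missing mandatory field, two of the five recommended fields are filled, and the start date is flagged for review. Checking expired web links would require network access and is left out of this sketch.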

Focus on high-quality data regarding the qualifications and learning opportunities in Europass is relevant for several reasons:

  • The search function: when data is placed in the correct fields, complete and structured, the results coming from the search will be more accurate and the visualisation of learning opportunities and qualifications will be better. E.g. when a user searches for a particular “thematic field” and learning opportunities or qualifications in a country’s database are tagged with the wrong thematic field, the results might not be accurate or useful for the end user, and therefore not well visualised;
  • To provide the end-users and stakeholders with qualitative, relevant, and extensive information on the details of a learning opportunity or of a qualification;
  • To suggest courses to end-users based on relevant and correct data (recommender system), using Artificial Intelligence techniques.

Adverse effects of poor data quality

Poor data quality makes the information in the QDR less reliable. This could lead to misunderstandings among users, make learning opportunities and qualifications harder to find through the search, and leave end-users less satisfied in general.

Following an analysis performed on the data currently available in the QDR, the Commission identified the following recurrent issues in the quality of the data, which should be addressed to ensure that the data meet the above-mentioned quality standards:

  • Inappropriate use of data fields: the information provided is not relevant or correct for the specific data field. Examples that have occurred are:
    • wrong ISCED-F or language codes
    • very generic data
    • a single data field containing a mix of multiple items
    • uninformative fields (e.g. "see description")
    • fields containing mark-up language;
  • Poor quality of data in terms of structure, completeness and/or precision. This was observed in particular in the description of learning outcomes: unstructured descriptions (e.g. all learning outcomes jumbled into one text box) or a lack of information on learning outcomes;
  • Duplication of information across several data fields or across different learning opportunities/qualifications. Examples that have occurred are the use of EQF level descriptors to fill in learning outcomes fields, and identical descriptions of learning outcomes for different learning opportunities or qualifications;
  • Many (non-mandatory) data fields left empty;
  • Dummy data provided for certain data fields.
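
Several of the issues listed above can be detected automatically before data is submitted. The following is a minimal sketch, not an official validation tool: the placeholder and dummy value lists, the regular expression for mark-up and the record keys (id, learning_outcomes) are all assumptions made for illustration, and real lists would need to be built from the data actually observed.

```python
import re
from collections import defaultdict

# Illustrative, hypothetical value lists; extend them from real observations.
UNINFORMATIVE = {"see description", "n/a", "tbd"}
DUMMY = {"test", "lorem ipsum", "xxx"}
# Rough pattern for HTML/XML-like tags; it is a heuristic, not a parser.
MARKUP = re.compile(r"<[a-zA-Z/][^>]*>")


def field_issues(value):
    """Return a list of issue labels for a single text field."""
    text = (value or "").strip()
    if not text:
        return ["empty"]
    issues = []
    lowered = text.lower()
    if lowered in UNINFORMATIVE:
        issues.append("uninformative")
    if lowered in DUMMY:
        issues.append("dummy")
    if MARKUP.search(text):
        issues.append("markup")
    return issues


def duplicated_values(records, field):
    """Group the ids of records that share the same normalised text in `field`."""
    groups = defaultdict(list)
    for rec in records:
        # Normalise case and whitespace so trivial variations still match.
        text = " ".join((rec.get(field) or "").lower().split())
        if text:
            groups[text].append(rec["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Invented records, used only to show how the helpers are called.
records = [
    {"id": "a", "learning_outcomes": "Can analyse data."},
    {"id": "b", "learning_outcomes": "can  analyse data."},
    {"id": "c", "learning_outcomes": "Can write reports."},
]
duplicates = duplicated_values(records, "learning_outcomes")
```

With the invented records, the first two entries are reported as sharing the same learning outcomes text. Checks of this kind only flag candidates for review; judging whether a description is too generic or a code is wrong still requires someone who knows the data.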

Next steps to improve data quality

Poor data quality can lead to poor or even wrong results and suggestions, and limits the full potential of Europass. In the short term, the Commission will provide guidelines on how to improve data quality (tips and tricks) and a “golden data example”, which is a course in the Europass interface mapped to the fields in the data model.

In the longer term, the Commission aims to provide targeted, personalised support (data quality reports), or to explore the possibilities of checking the quality of the data within the QDR via key performance indicators (KPIs).