Managing duplicates

Wiki - 2016-02-28

Table of contents

1. How to access the Duplicates function?

2. How to use the tool?

    Which criteria (variables) can be compared?

    How to display a supplementary variable?

3. How to remove a patient?

Duplicate patients may bias your statistics. Remove them before running your analyses!

In practice, duplicates are patients (or biological samples, etc.) that have been included two or more times in your series. Fortunately, EasyMedStat provides a tool to find these duplicates and, if needed, remove them: it is called Duplicates.

How to access the Duplicates function?

Fig. 1 - Access the Duplicates tool by going to "Data Manager" (1.) and then clicking on "Duplicates" (2.)

How to use the tool?

The tool displays the patients who share the same birth date, gender and initials, grouped by birth date.

You can adjust the number of criteria (variables) compared by changing the "Level of Rigor" (Fig. 2b).

  • For example, initials may be wrong for some patients, so you may want to ignore them. Change the level of rigor to "Strict" and the tool will compare only birth date and gender.
  • Alternatively, you can show only patients who match on more criteria by choosing a "softer" level of rigor.
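
The idea of comparing a configurable set of criteria can be sketched in a few lines of Python. This is only an illustration of the principle, not EasyMedStat's actual implementation: the record layout, the field names and the find_duplicates helper are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records; the field names are illustrative, not EasyMedStat's.
patients = [
    {"id": 1, "birth_date": date(1950, 3, 14), "gender": "F", "initials": "AB"},
    {"id": 2, "birth_date": date(1950, 3, 14), "gender": "F", "initials": "AB"},
    {"id": 3, "birth_date": date(1950, 3, 14), "gender": "F", "initials": "AD"},
    {"id": 4, "birth_date": date(1962, 7, 1), "gender": "M", "initials": "CD"},
]


def find_duplicates(records, criteria):
    """Group records that share the same value for every criterion.

    Only groups containing two or more records are returned.
    """
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[name] for name in criteria)
        groups[key].append(record["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Compare birth date, gender and initials.
with_initials = find_duplicates(patients, ["birth_date", "gender", "initials"])

# Ignore initials: compare birth date and gender only.
without_initials = find_duplicates(patients, ["birth_date", "gender"])
```

With these sample records, comparing all three criteria groups patients 1 and 2 together, while ignoring initials also brings patient 3 into the group. Comparing fewer criteria therefore tends to return more candidate duplicates.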

Which criteria (variables) can be compared?

You can search for patients who have the same a. birth date, b. gender, c. initials, d. inclusion date, e. last consultation, f. death date.
For the moment, no other variable can be used to refine the duplicate search. You can still display a variable of your choice (see "How to display a supplementary variable?").

Fig. 2 - Main functions of the Duplicates tool

a. Active filters: if you use filters, you can look for duplicates among a subset of patients.
b. Level of Rigor: defines how many criteria (variables) must match between two patients for them to be considered duplicates.
c. Patient counter: shows the number of duplicate patients found at the current level of rigor. The criteria (variables) compared are also listed on this line.
d. Birth date: always compared between patients (fixed criterion). Patients are grouped by date of birth, which is displayed here.
e. Supplementary variable: this select list lets you display another variable in addition to the criteria you chose. This is useful when you need to check whether entries with matching criteria really belong to the same patient (e.g. if you create one "patient" for every blood sample, and one patient can have several blood samples analyzed).

How to display a supplementary variable?

Let's say you are an orthopedic surgeon and you have created one entry for every shoulder you operated on. You want to remove "shoulders" with the same patient reference, but keep "shoulders" with different patient references.

Next to every patient group, you will find a select list (Fig. 3) showing all the variables of your series.
Choose the variable you want to display and wait for the page to reload.
The variable you have chosen (e.g. Patient reference) will now be displayed next to the criteria.

Fig. 3 - Display a supplementary variable using the select list
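
The reasoning in the shoulder example can be sketched as follows. This is a hypothetical illustration in Python and not something the tool exposes: the entry numbers, the patient_reference field and the entries_to_review helper are assumptions made for the example.

```python
from collections import Counter

# Hypothetical duplicate group: entries that already share the compared criteria.
group = [
    {"entry": 10, "patient_reference": "P-001"},
    {"entry": 11, "patient_reference": "P-001"},
    {"entry": 12, "patient_reference": "P-002"},
]


def entries_to_review(entries, variable):
    """Return the entries whose supplementary variable value appears more than once."""
    counts = Counter(entry[variable] for entry in entries)
    return [entry["entry"] for entry in entries if counts[entry[variable]] > 1]


# Entries sharing a patient reference are candidates for removal;
# entries with a unique reference are kept.
flagged = entries_to_review(group, "patient_reference")
```

Here entries 10 and 11 share the reference "P-001" and would be reviewed as duplicates, while entry 12 has a different reference and would be kept.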

How to remove a patient?

Once you have found duplicates, you should remove them. Click here for more details on how to remove a patient.


Copyright EasyMedStat© 2013

Easy Made Stat - Association Loi 1901 - n° W922009408
Neuilly-sur-Seine - France
