Stata _merge

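Merging datasets. You merge when you want to add more variables to an existing dataset (type help merge in the command window for more details). Stata automatically creates a variable called _merge, which indicates the result of the merge operation for each observation. The variable takes the values 1 (the observation appears in the master dataset only), 2 (the observation appears in the using dataset only) and 3 (the observation appears in both datasets).

What you need:

  • Both files should be in Stata format.
  • Both files should have at least one variable in common (id).

Step 1. Sort both datasets by the id variable or variables common to the files you want to merge, and save the files. The older merge syntax requires this; the merge 1:1 syntax introduced in Stata 11 sorts the data itself when needed.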
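As a rough illustration of the sort, save and merge steps, and of the three _merge codes, here is a minimal sketch meant to be run as a do-file. The variable names (id, income, region), the data values and the temporary file names are all made up for this example.

```stata
* Sketch only: id, income, region and all values are hypothetical.
clear
input id income
1 100
2 200
3 300
end
sort id
tempfile first
save `first'

clear
input id str5 region
2 "north"
3 "south"
4 "east"
end
sort id
tempfile second
save `second'

* Load the first (master) file and merge the second (using) file onto it.
use `first', clear
merge 1:1 id using `second'

* Show how many observations fall into each _merge category.
tabulate _merge
```

With these made-up data, id 1 exists only in the master file (_merge equals 1), id 4 exists only in the using file (_merge equals 2), and ids 2 and 3 are matched (_merge equals 3).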


Here's what you must know about the two datasets you are about to merge.

  • What is the identifier variable on which the files should be combined?
  • Is each observation (row) of the identifier variable unique? In other words, does each row value for the identifier variable occur only once? The answer to this question matters for how you would merge the two datasets, as you will see.

Let's evaluate the two items above in turn.


  • Since we wish to combine data on a person's age and data on a person's sex, the identifier variable is person.
  • In Dataset 1, each person appears only once, so person uniquely identifies each observation in the dataset. The same is true of Dataset 2. This means that we should perform a one-to-one merge of the two datasets based on person.

Before merging, it is good practice to verify that your identifier variable (or variables) is unique across observations, using duplicates report. Here you would type duplicates report person.
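The uniqueness check can be sketched as follows, using the Dataset 1 values from this page typed in by hand. The isid command is an addition not mentioned above: it stops with an error if person does not uniquely identify the observations, which makes it a convenient guard in a do-file.

```stata
* Dataset 1 from this page, entered manually for illustration.
clear
input str1 person age
"A" 21
"B" 57
"C" 35
"D" 23
end

* Report how many copies of each person value exist.
duplicates report person

* Stop with an error if person is not a unique identifier.
isid person
```

If duplicates report shows surplus observations, a one-to-one merge on person is not appropriate for that file.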


Here's what we want to do:

Dataset 1

person  age
A       21
B       57
C       35
D       23

Dataset 2

person  sex
A       male
B       female
C       male
D       female

Merged data

person  age  sex
A       21   male
B       57   female
C       35   male
D       23   female
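A minimal sketch of this merge, meant to be run as a do-file, might look like the following. The data are the values shown on this page; the temporary file name dataset2 is an assumption made so the example does not write permanent files.

```stata
* Dataset 2 (person, sex): enter it and store it in a temporary file.
clear
input str1 person str6 sex
"A" "male"
"B" "female"
"C" "male"
"D" "female"
end
tempfile dataset2
save `dataset2'

* Dataset 1 (person, age): this is the master dataset in memory.
clear
input str1 person age
"A" 21
"B" 57
"C" 35
"D" 23
end

* One-to-one merge on person; Stata adds the _merge variable.
merge 1:1 person using `dataset2'

* Check the merge result, then view the combined data.
tabulate _merge
list person age sex
```

Because every person appears in both datasets, all four observations should have _merge equal to 3.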