Merge Files Stata

  • In the -local tomerge- command to whatever the appropriate directory is. If you choose the latter route, then you must also change the -use `starter', clear- and -merge 1:1 FIPScode year using `f'- commands to refer to the full pathname of the files. (For these reasons, I think it is simpler to -cd- to the directory where the files are.)
  • To create a single string in Stata I know you do this: local x = "a string". But I have about 200 data files I need to loop through, and they are not conveniently named with consecutive suffixes like "2000", "2001", "2002", etc.
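
One way to loop over files that lack a regular naming pattern is to let Stata build the list of file names itself. The sketch below reuses the macro names tomerge, starter and f and the key variables FIPScode and year from the snippet above. It assumes that you have already used -cd- to move to the folder holding the files, that the folder contains only the .dta files to be merged, and that every file is unique on FIPScode and year.

```stata
* Sketch only: assumes the current directory holds just the files to merge.
* Collect the names of all .dta files in the current directory.
local tomerge : dir "." files "*.dta"

* Take the first file name off the list; it becomes the master dataset.
gettoken starter tomerge : tomerge
use "`starter'", clear

* Merge each remaining file onto the data in memory.
foreach f of local tomerge {
    * nogenerate suppresses _merge so the next merge does not fail on it
    merge 1:1 FIPScode year using "`f'", nogenerate
}
```

Save the combined result outside this folder (or under a different extension) if you plan to rerun the loop, since otherwise the saved file would itself be picked up by the -dir- listing.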

Joinby is similar to merge but forms all pairwise combinations of the observations within each group defined by the matching variables. Joinby would be appropriate, for instance, where A contained data on parents and B contained data on their children. Joinby familyid would form a dataset of each parent joined with each of his or her children.
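
A minimal sketch of the parents and children case may make this concrete. The data values and the variable names parent and child are made up for illustration; only familyid comes from the description above.

```stata
* Hypothetical children data (dataset B), saved to a temporary file
clear
input familyid str6 child
1 "Ann"
1 "Bob"
2 "Cy"
end
tempfile children
save `children'

* Hypothetical parents data (dataset A), left in memory
clear
input familyid str6 parent
1 "Mum1"
1 "Dad1"
2 "Mum2"
end

* Pair every parent with every child in the same family
joinby familyid using `children'

* Family 1: 2 parents x 2 children = 4 rows; family 2: 1 x 1 = 1 row
list, sepby(familyid)
```

Note that by default joinby drops observations whose familyid appears in only one of the two datasets.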

Here's what you must know about the two datasets you are about to merge.

  • What is the identifier variable on which the files should be combined?
  • Is each observation (row) of the identifier variable unique? In other words, does each row value for the identifier variable occur only once? The answer to this question matters for how you would merge the two datasets, as you will see.

Let's evaluate the two items above in turn.

  • Since we wish to combine data on a person's age and data on a person's sex, the identifier variable is person.
  • In Dataset 1, each person appears only once, so person uniquely identifies each observation in the dataset. Likewise for Dataset 2. This means that we should perform a one-to-one merge of the two datasets based on person.

Before merging, it is good practice to verify that your identifier variable (or combination of variables) is unique across observations, using duplicates report. Here you would type duplicates report person.
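
As a minimal sketch, the check on a small dataset of persons and ages might look like this. The isid command is an addition not mentioned above: it does the same check but stops with an error if the variable is not unique, which is handy in a do-file.

```stata
* Small example dataset with one row per person
clear
input str1 person age
"A" 21
"B" 57
"C" 35
"D" 23
end

* Tabulate how many observations share each value of person
duplicates report person

* Stop with an error unless person uniquely identifies the observations
isid person
```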

Here's what we want to do:


Dataset 1

person  age
A       21
B       57
C       35
D       23

Dataset 2

person  sex
A       male
B       female
C       male
D       female

Merged data

person  age  sex
A       21   male
B       57   female
C       35   male
D       23   female
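
The one-to-one merge of Dataset 1 and Dataset 2 on person could be sketched as follows. The tempfile name dataset2 is just an illustrative choice, and the data are typed in with -input- so the example runs on its own.

```stata
* Dataset 2: person and sex, saved to a temporary file
clear
input str1 person str6 sex
"A" "male"
"B" "female"
"C" "male"
"D" "female"
end
tempfile dataset2
save `dataset2'

* Dataset 1: person and age, left in memory as the master dataset
clear
input str1 person age
"A" 21
"B" 57
"C" 35
"D" 23
end

* One-to-one merge on the identifier variable
merge 1:1 person using `dataset2'

* Every person should be found in both datasets (_merge equal to 3)
assert _merge == 3
drop _merge

list
```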

Merge Excel Files Stata

    • Mar 2019
    • 4

    Merging Excel Files to Create a Stata Dataset

    Hello all,
    New here, new to statistics, programming, and brand new to Stata so bear with me.
    I am trying to merge three separate Excel files (Location: 'C:\Stata') into a single Stata dataset ('MERGED') for analyses.
    File names:
    AE.xls
    FZ.xls
    WEIGHTS.xls
    Unique Identifier (in all three): ID
    What would be the code to go about creating this new data set? Thank you so much in advance.
    • Mar 2014
    • 2675
    In general, see
    If you are planning to stick with Stata for a while, read those help files, then start step by step. Something like
    A canned alternative is xls2dta (from SSC, see: help ssc). If those three files are the only Excel files in the directory
    If you are interested in the underlying technique, search for similar posts on the forum and see
    Best
    Daniel


    • Mar 2019
    • 4
    Daniel, thank you so much for a prompt reply! Just to clarify, should I be doing just the steps you listed or something like this (because there are three total files):

    import excel using C:\Stata\AE.xls , clear
    save AE.dta
    import excel using C:\Stata\FZ.xls , clear
    save FZ.dta
    import excel using C:\Stata\WEIGHTS.xls , clear
    save WEIGHTS.dta
    merge 1:1 ID using AE , generate(_mergeAE)
    merge 1:1 ID using FZ , generate(_mergeFZ)
    merge 1:1 ID using WEIGHTS , generate(_WEIGHTS)
    save MERGED.dta

    I will give this a shot and check should I need further help. Thanks!


    • Mar 2014
    • 2675
    Saving the last dataset, that is WEIGHTS, is not necessary; you are using this last imported dataset as your master dataset to which the other two are then merged. Likewise, the last merge command is not necessary because you start out with WEIGHTS in memory.
    Best
    Daniel
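
    Putting this advice together, a minimal sketch might look like the following. It is not taken from the thread: it assumes the working directory is the folder that holds the three Excel files, that the first row of each sheet contains the variable names (hence the firstrow option, which the posted code does not use), and that ID is unique in every file.

```stata
* Sketch only: assumes the working directory holds AE.xls, FZ.xls and
* WEIGHTS.xls, and that row 1 of each sheet contains variable names.

* Import the first two files and save them as Stata datasets
import excel using "AE.xls", firstrow clear
save AE.dta, replace

import excel using "FZ.xls", firstrow clear
save FZ.dta, replace

* WEIGHTS is imported last and stays in memory as the master dataset,
* so it is neither saved separately nor merged with itself
import excel using "WEIGHTS.xls", firstrow clear

* Merge the other two files onto WEIGHTS, keeping a separate
* match indicator for each merge
merge 1:1 ID using AE, generate(_mergeAE)
merge 1:1 ID using FZ, generate(_mergeFZ)

save MERGED.dta, replace
```

    Tabulating _mergeAE and _mergeFZ afterwards shows how many IDs were matched in each step.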


    • Mar 2019
    • 4
    Sounds great! Thank you so much!
