Sas Merge If

To perform a one-to-one merge, use the MERGE statement without a BY statement. SAS combines the first observation from all data sets that are named in the MERGE statement into the first observation in the new data set, the second observation from all data sets into the second observation in the new data set, and so on. If data sets don’t share a common variable, you can merge them using a series of merges in separate DATA steps. The data sets must be sorted by the appropriate BY variable. Match-Merging a SAS Data Set and an Excel Workbook LIBNAME libref 'physical-file-name'; You can merge a SAS data set with an Excel workbook. You can use the SAS/ACCESS. SAS Syntax (.sas) Syntax to read the CSV-format sample data and set variable labels and formats/value labels. Combining Datasets by Merging In many practical situations, you may have relevant data in two different datasets, and in order to perform your analysis, you'll need to combine those datasets. There are three steps to match merge the dads file with the faminc file (this is called a one-to-one merge because there is a one to one correspondence between the dads and faminc records). These three steps are illustrated in the SAS program merge1.sas below. Use proc sort to sort dads on famid and save that file (we will call it dads2).

1. Introduction

When you have two data files, you can combine them by merging them side by side, matching up observations based on an identifier. For example, below we have a data file containing information on dads and we have a file containing information on family income called faminc. We would like to match merge the files together so we have the dads observation on the same line with the faminc observation based on the key variable famid.

After match merging the files, they would look like this.

2. One-to-one merge

There are three steps to match merge the dads file with the faminc file (this is called a one-to-one merge because there is a one to one correspondence between the dads and faminc records). These three steps are illustrated in the SAS program merge1.sas below.

  1. Use proc sort to sort dads on famid and save that file (we will call it dads2)
  2. Use proc sort to sort faminc on famid and save that file (we will call it faminc2)
  3. merge the dads2 and faminc2 files based on famid

These three steps are illustrated in the program below.

The output of the program is shown below.

The output from shows that the match merge worked properly. The dad and faminc are merged side by side. The next example considers a one-to-many merge where one observation in one file may have multiple matching records in another file. We will see that kind of merge is really no different from the one-to-one merge we saw here.

3. One-to-many merge

Imagine that we had a file with dads like we saw in the previous example, and we had a file with kids where a dad could have more than one kid. Matching up the 'dads' with the 'kids' is called a 'one-to-many' merge since you are matching one dad observation to possibly many kids records. The dads and kids records are shown below.

After matching the dads with the kids you get a file that looks like the one below. Bill is matched up with his kids Beth,Bob and Barb; Art is matched up with AndyAl, and Ann; and Paul is matched up with Pete,Pam and Phil.

Sas Merge If

Just like the 'one-to-one' merge, we follow the same three steps for a 'one-to-many' merge. These three steps are illustrated in the SAS program merge2.sas below.

  1. Use proc sort to sort dads on famid and save that file (we will call it dads2)
  2. Use proc sort to sort kids on famid and save that file (we will call it kids2)
  3. merge the dads2 and kids2 files based on famid

The program below illustrates these steps.

The output of the program is shown below.

The output shows just what we hoped to see, the dads merged along side of their kids. You might have wondered what would have happened if the merge statement had reversed the order of the files, had we changed step 3 to look like below.

The output with the modified step 3 is shown below.

This output shows what happened when we switched the order of kids2 and dads2 in the merge statement. The merge results are basically the same, except that the order of the variables is modified — the kids variables are on the left and the dads variables are at the right. Other than that, the results are the same.

4. Problems to look out for

These examples cover situations where there are no complications. We show some examples of complications that can arise and how you can solve them below.

4.1 Mismatching records in one-to-one merge

Sas Merge If A And Not B

The two data files havemay have records that do not match. Below we illustrate this by includingan extra dad (Karl in famid 4) that does not have a correspondingfamily, and there are two extra families (5 and 6) in the family filethat do not have a corresponding dad.

As you see above, we usethein option to create a 0/1 variable fromdadx that indicateswhether the resulting file contains a record with data from the dadsfile. Likewise, we use IN option to create a 0/1 variable fromfamxthatindicates if the observation came from the faminc file. The fromdadx and fromfamx variables are temporary, so we make copies of them in fromdad and fromfam so we have copies of these variables that stay with the file. We can then use proc print and proc freq to identify the mismatching records.

PROC PRINT DATA=merge121;
RUN;

PROC FREQ DATA=merge121;
TABLES fromdad*fromfam;
RUN;

The output belowillustratesthat there were mismatching records. For famid 4, the valueof fromdad is 1 and fromfam is 0, as we would expectsince there was data from dads for famid 4, but no data fromfaminc. Also, as we expect, this record has valid data for the variables from thedadsfile (name andinc) and missing data for the variables fromfaminc(faminc96 faminc97and faminc98). We see the reversepattern forfamid‘s5 and 6.

A closer look at the fromdadand fromfam variables reveals that there are three records that have matching data: one that has data from the dads only, and two recordsthat have data from the faminc file only. The crosstab tablebelow confirms this.

Merge

You may want to use thisstrategy to check the matching of the two files. If there are unexpectedmismatched records, then you should investigate to understand the causeof the mismatched records.

Sas Merge If A And B

Use the where statement in a proc print to eliminate some of the non-matching records.

Merge

4.2 Variables with the same name, but different information

Below we have the fileswith the information about the dads and family, but look more closely atthe names of the variables. In the dads file, there is a variablecalled inc98, and in the family file there are variables inc96,inc97 and inc98. Let’s attempt to merge these files and see what happens.

PROC PRINT DATA=merge121;
RUN;

The results are shown below.As you see, the variable inc98 has the data from the dadsfile, the file that appears last on the merge statement. Whenyou merge files that have the same variable, SAS will use the values fromthe file that appears last on the merge statement.

There are a couple of waysyou can solve this problem.

Solution #1. The most obvioussolution is to choose variable names in the original files that will notconflict with each other. However, you may have files wherethe names have already been chosen.

Solution #2. You can renamethe variables in a data step using the rename option (whichrenames the variables before doing the merging). This allows youto select variable names that do not conflict with each other, as illustratedbelow.

DATA merge121;
MERGE faminc(RENAME=(inc96=faminc96inc97=faminc97 inc98=faminc98))
dads(RENAME=(inc98=dadinc98));
BY famid;
RUN;

PROC PRINT DATA=merge121;
RUN;

As you can see below, thevariables were renamed as specified.

Join

5. For more information

Sas Data Merge

  • For information on concatenating datafiles, see the SAS Learning Module on ConcatenatingData Files in SAS.