In our previous tutorial, you learned how to merge multiple CSV files using Python built-in functions. Today, we’ll demonstrate how to use Pandas to merge CSV files and walk through a fully working example.

Let’s start with what Pandas is for. It is a Python library for data munging and analysis. It provides optimized data structures and high-performance functions for working with data.

Pandas handles data from 100MB to 1GB quite efficiently and performs well at that scale. For much larger CSV files, its read functions accept a chunk size so that you can read the data in smaller chunks.
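As a rough illustration of chunked reading, the sketch below counts the rows of a large CSV file without loading it all at once. The helper name count_rows_in_chunks and the default chunk size are assumptions made for this example; the chunksize parameter of read_csv() is the part that comes from Pandas.

```python
import pandas as pd


def count_rows_in_chunks(csv_path, chunk_size=100_000):
    """Count the data rows of a CSV file by reading it piece by piece.

    The function name and the default chunk size are illustrative choices.
    """
    total = 0
    # With chunksize set, read_csv returns an iterator of DataFrames
    # instead of one large DataFrame.
    for chunk in pd.read_csv(csv_path, chunksize=chunk_size):
        total += len(chunk)
    return total
```

Only one chunk is held in memory at a time, so the peak memory use depends on the chunk size rather than on the size of the file.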

Python Using Pandas to Merge CSV Files



When you have hundreds or thousands of CSV files, combining them manually is impractical, and attempting it tends to produce incorrect merges and many errors. In the section below, we provide a step-by-step way to combine multiple CSV files by writing a simple Python script that uses the Pandas library.

Python script to merge CSV using Pandas

Include required Python modules

In our Python script, we’ll use the following modules:

  • OS module – Provides functions for working with the operating system, such as changing the working directory and listing, renaming, or deleting files and directories.
  • Glob module – Provides the glob() function to list files and directories that match a pattern.
  • Pandas – Provides functions to read, merge, and write CSV files quickly.

To sum up, the first part of the script loads the required modules and sets the working directory for our test run.
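A minimal sketch of this first step could look like the following. The helper set_working_dir and the folder name csv_data are hypothetical; point the script at whichever folder holds your CSV files.

```python
import glob  # used in the next step to list the CSV files
import os

import pandas as pd  # used in the last step to merge the files


def set_working_dir(path):
    """Create the folder if needed, switch to it, and return the new cwd.

    The function name is an illustrative choice, not part of any library.
    """
    os.makedirs(path, exist_ok=True)
    os.chdir(path)
    return os.getcwd()


# Example call, assuming a hypothetical folder named "csv_data":
# set_working_dir("csv_data")
```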

Prepare a list of all CSV files

In this step, we need the list of all CSV files. We’ll use the glob() function and give it a pattern such as “*.csv” so that it lists the matching files.

The glob() call returns the paths of all files that match the pattern as a list.
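The listing step can be sketched as below. The helper name list_csv_files is an assumption for this example, and the result is sorted only to make the merge order predictable, since glob() does not promise any particular order.

```python
import glob
import os


def list_csv_files(folder="."):
    """Return the sorted paths of all files in `folder` ending in .csv.

    The function name is an illustrative choice.
    """
    pattern = os.path.join(folder, "*.csv")
    return sorted(glob.glob(pattern))
```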

Concatenate to produce a consolidated file

This is the last step, where we call the Pandas concat() function to return a consolidated object. After that, we write the result back to a single CSV file. The script generates the final output in the current working directory.

This final piece of code completes our task.
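One way to write this step is sketched below. The helper name merge_csv_files and the output name combined.csv are assumptions for this example, and the sketch expects the input files to share the same columns.

```python
import pandas as pd


def merge_csv_files(csv_files, output_path="combined.csv"):
    """Read each CSV, stack the rows, and write one combined CSV file.

    The function name and default output name are illustrative choices.
    """
    frames = [pd.read_csv(path) for path in csv_files]
    # ignore_index=True renumbers the rows of the combined DataFrame from 0.
    combined = pd.concat(frames, ignore_index=True)
    # index=False keeps the row numbers out of the output file.
    combined.to_csv(output_path, index=False)
    return combined
```

If the files do not share the same columns, concat() keeps the union of all columns and fills the gaps with missing values.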

Full script code
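The three steps can be put together as in the sketch below. It is one possible arrangement under stated assumptions, not the only way to write the script: the function name merge_csv_folder, the folder name csv_data, and the output name combined.csv are all hypothetical.

```python
import glob
import os

import pandas as pd


def merge_csv_folder(folder, output_name="combined.csv"):
    """Merge every CSV file in `folder` into one file.

    Returns the path of the merged file, or None if no CSV files were found.
    All names here are illustrative choices.
    """
    output_path = os.path.join(folder, output_name)

    # Step 1: list the CSV files, skipping the output of an earlier run.
    csv_files = sorted(
        path
        for path in glob.glob(os.path.join(folder, "*.csv"))
        if os.path.abspath(path) != os.path.abspath(output_path)
    )
    if not csv_files:
        print("No CSV files found in", folder)
        return None

    # Step 2: read each file and concatenate the rows.
    frames = [pd.read_csv(path) for path in csv_files]
    combined = pd.concat(frames, ignore_index=True)

    # Step 3: write the consolidated result as a single CSV file.
    combined.to_csv(output_path, index=False)
    print(f"Merged {len(csv_files)} files into {output_path}")
    return output_path


if __name__ == "__main__":
    # "csv_data" is a hypothetical folder name; change it to your own.
    merge_csv_folder("csv_data")
```

Skipping the output file in the listing step is a deliberate choice. Without it, running the script a second time in the same folder would merge the previous result into the new one and duplicate every row.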

We hope that you now know how to use the Pandas library to merge CSV files and can write a fully working Python script of your own. It will help you combine multiple files quickly.