Tuesday, April 25, 2023

Python CSV File Handling: A Comprehensive Guide with Examples Part-1

 

Python is a powerful programming language that offers many libraries and tools to handle data in various formats. One such format is CSV (Comma-Separated Values), which is commonly used for storing and exchanging tabular data between different applications.

Python provides several libraries for working with CSV files, including the built-in csv module. In this article, we'll explore how to use the csv module to read and write CSV files in Python, along with some useful tips and tricks.

Types of CSV Files

There are different types of CSV files, depending on the delimiters used to separate the values. The most common types are:

  1. Comma-separated values (CSV)
  2. Tab-separated values (TSV)
  3. Pipe-separated values (PSV)

In a CSV file, each line represents a row of data, and the values in each row are separated by commas. In a TSV file, the values are separated by tabs, and in a PSV file, they are separated by pipes.
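To make the difference concrete, here is a minimal sketch that parses the same record written with each of the three delimiters. The sample strings and the `parse_line` helper are made up for illustration, and `io.StringIO` stands in for a real file on disk.

```python
import csv
import io

def parse_line(text, delimiter):
    """Parse a single line of delimited text into a list of strings."""
    # io.StringIO wraps the string in a file-like object that csv.reader accepts.
    return next(csv.reader(io.StringIO(text), delimiter=delimiter))

# The same record in each format (illustrative data, not from a real file).
csv_row = parse_line("Achinta,35,Kolkata", ",")
tsv_row = parse_line("Achinta\t35\tKolkata", "\t")
psv_row = parse_line("Achinta|35|Kolkata", "|")

print(csv_row)  # ['Achinta', '35', 'Kolkata']
print(tsv_row == csv_row and psv_row == csv_row)  # True
```

Only the delimiter changes between the three calls; the parsed rows are identical.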

Reading CSV Files

To read a CSV file in Python, we can use the csv.reader() function from the csv module, which returns a reader object that yields one row at a time. Here's an example:

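python
import csv

# newline='' lets the csv module handle line endings itself
with open('example.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        print(row)  # each row is a list of strings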

In this example, we're opening a CSV file named example.csv using a context manager (with statement) to ensure that the file is properly closed after reading. We then create a csv.reader object using the csv.reader() function and iterate over each row in the file using a for loop.
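One detail worth keeping in mind is that csv.reader yields every value as a string, so numeric columns need an explicit conversion. Here is a minimal sketch of that, using in-memory sample data in place of example.csv; the three-column layout is an assumption made for the example.

```python
import csv
import io

# Sample content standing in for a file on disk (hypothetical data).
sample = "Achinta,35,Kolkata\nDiganta,30,Bengaluru\n"

ages = []
for row in csv.reader(io.StringIO(sample)):
    # Each row is a list of strings, e.g. ['Achinta', '35', 'Kolkata'].
    name, age, city = row
    ages.append(int(age))  # convert the numeric column explicitly

print(ages)  # [35, 30]
```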

By default, csv.reader() assumes that the values are separated by commas. However, we can specify a different delimiter by passing a delimiter argument to the function.

python
import csv

with open('example.tsv', newline='') as tsvfile:
    # Split each line on tab characters instead of commas
    reader = csv.reader(tsvfile, delimiter='\t')
    for row in reader:
        print(row)

In this example, we're opening a TSV file named example.tsv and specifying the tab character as the delimiter using the delimiter='\t' argument.

Writing CSV Files

To write data to a CSV file in Python, we can use the csv.writer() function from the csv module, which returns a writer object. Here's an example:

python
import csv

# Rows to write; the first row is the header.
data = [
    ['Name', 'Age', 'City'],
    ['Achinta', '35', 'Kolkata'],
    ['Diganta', '30', 'Bengaluru'],
    ['Santi', '40', 'Mumbai'],
]

with open('output.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(data)

In this example, we're creating a list of lists called data that represents the rows of a CSV file. We then open a new file named output.csv in write mode using a context manager, passing newline='' so that the csv module controls line endings itself (without it, extra blank lines can appear between rows on Windows). Finally, we create a csv.writer object using the csv.writer() function and call its writerows() method to write all the rows to the file.
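To see the line endings that csv.writer produces on its own, which is why open() is called with newline='', here is a small sketch that writes to an in-memory buffer instead of a file. The two rows are illustrative sample data.

```python
import csv
import io

# An in-memory text buffer stands in for the output file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['Name', 'Age'])
writer.writerow(['Achinta', '35'])

text = buffer.getvalue()
print(repr(text))  # 'Name,Age\r\nAchinta,35\r\n'
```

Each row already ends in '\r\n', so any further newline translation by the file object would double up the line breaks.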

By default, csv.writer() separates values with commas. However, we can specify a different delimiter by passing a delimiter argument to the function.

python
import csv

data = [
    ['Name', 'Age', 'Gender'],
    ['Achinta', 35, 'Male'],
    ['Diganta', 30, 'Male'],
    ['Santi', 40, 'Female'],
]

with open('output.tsv', 'w', newline='') as tsvfile:
    # Pass the open file object (tsvfile) and use a tab as the delimiter.
    writer = csv.writer(tsvfile, delimiter='\t')
    writer.writerows(data)

In this example, we're creating a TSV file named output.tsv by specifying the tab character as the delimiter using the delimiter='\t' argument.
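A quick way to check that data written with a custom delimiter reads back intact is a round trip. The sketch below does this in memory with `io.StringIO` rather than a real output.tsv file; the rows are sample data chosen for the example.

```python
import csv
import io

data = [['Name', 'Age'], ['Achinta', '35'], ['Diganta', '30']]

# Write the rows as tab-separated text into an in-memory buffer.
buffer = io.StringIO()
csv.writer(buffer, delimiter='\t').writerows(data)

# Rewind and read the rows back with the same delimiter.
buffer.seek(0)
rows = list(csv.reader(buffer, delimiter='\t'))

print(rows == data)  # True
```

The reader and writer must agree on the delimiter; reading this buffer with the default comma would return each line as a single value.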

Handling Headers

CSV files often include a header row that provides the names of the columns. When reading a CSV file, we can use the next() function to skip the header row.

python
import csv

with open('example.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)
    headers = next(reader)  # consume the first row as the header
    for row in reader:      # the loop continues from the second row
        print(row)

In this example, we're reading a CSV file named example.csv and using the next() function to skip the header row and store it in a variable called headers. We then iterate over the remaining rows using a for loop.
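Once the header row is stored, it can be paired with each data row so that values are looked up by column name instead of position. Here is a minimal sketch, assuming in-memory sample data in place of example.csv:

```python
import csv
import io

# Sample content standing in for example.csv (hypothetical data).
sample = "Name,Age,City\nAchinta,35,Kolkata\nDiganta,30,Bengaluru\n"

reader = csv.reader(io.StringIO(sample))
headers = next(reader)  # ['Name', 'Age', 'City']

# Pair each value with its column name.
records = [dict(zip(headers, row)) for row in reader]

print(records[0]['City'])  # Kolkata
```

The csv module's DictReader class performs the same pairing automatically, so this sketch mainly shows what is happening underneath.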

When writing data to a CSV file, we can include the header row by calling the writerow() method with the header values before writing the data rows.

python
import csv

data = [
    ['Name', 'Age', 'City'],
    ['Achinta', '35', 'Delhi'],
    ['Diganta', '30', 'Mumbai'],
    ['Santi', '40', 'Kolkata'],
]

with open('output.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['Name', 'Age', 'City'])  # header row
    writer.writerows(data[1:])                # data rows only

In this example, we're creating a CSV file named output.csv and calling the writerow() method with the header row values ['Name', 'Age', 'City'] before writing the data rows using the writerows() method.

Handling Quotes and Special Characters

CSV files can also contain special characters and quotes that need to be properly escaped when reading and writing data. The csv module in Python provides several options for handling these cases.

When reading a CSV file, we can pass the quotechar and escapechar arguments to csv.reader() to handle quotes and special characters.

python
import csv

with open('example.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, quotechar='"', escapechar='\\')
    for row in reader:
        print(row)

In this example, we're reading a CSV file named example.csv and specifying the double quote character as the quotechar and the backslash character as the escapechar.
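To illustrate what these two arguments do, here is a small sketch that parses one line containing a quoted field with an embedded comma and a field whose inner quotes are escaped with backslashes. The sample line is invented for the example.

```python
import csv
import io

# A quoted field containing a comma, and a field with backslash-escaped quotes.
sample = r'"Kolkata, West Bengal","He said \"hi\""' + "\n"

reader = csv.reader(io.StringIO(sample), quotechar='"', escapechar='\\')
row = next(reader)

print(row)  # ['Kolkata, West Bengal', 'He said "hi"']
```

The comma inside the first field is kept as data because the field is quoted, and the backslash tells the reader to treat the following quote as a literal character.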

When writing data to a CSV file, we can pass the quotechar and escapechar arguments to csv.writer() to properly escape quotes and special characters.

python
import csv

data = [
    ['Name', 'Age', 'City'],
    ['Achinta', '35', 'Delhi'],
    ['Diganta', '30', 'Bengaluru'],
    ['Santi', '40', 'Kolkata'],
]

with open('output.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, quotechar='"', escapechar='\\',
                        quoting=csv.QUOTE_MINIMAL)
    writer.writerows(data)

In this example, we're creating a CSV file named output.csv and specifying the double quote character as the quotechar, the backslash character as the escapechar, and the quoting=csv.QUOTE_MINIMAL option to use the minimum required quoting.
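The effect of the quoting option is easiest to see on a value that actually contains the delimiter. The sketch below writes the same row to in-memory buffers with csv.QUOTE_MINIMAL and, for comparison, csv.QUOTE_ALL; the row is sample data made up for the example.

```python
import csv
import io

row = ['Santi', 'Mumbai, Maharashtra', 'plain']

# QUOTE_MINIMAL quotes only the fields that need it (here, the one with a comma).
minimal = io.StringIO()
csv.writer(minimal, quoting=csv.QUOTE_MINIMAL).writerow(row)

# QUOTE_ALL quotes every field, whether it needs it or not.
quote_all = io.StringIO()
csv.writer(quote_all, quoting=csv.QUOTE_ALL).writerow(row)

print(minimal.getvalue())    # Santi,"Mumbai, Maharashtra",plain
print(quote_all.getvalue())  # "Santi","Mumbai, Maharashtra","plain"
```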

Conclusion

Python's csv module provides a simple and flexible way to handle CSV files. With the examples and techniques covered in this article, you should be able to read and write CSV files, work with different delimiters, handle header rows, and properly escape quotes and special characters.